WO2004112340A2 - Maintaining entity order with gate managers - Google Patents

Maintaining entity order with gate managers

Info

Publication number
WO2004112340A2
Authority
WO
WIPO (PCT)
Prior art keywords
gate
entity
request
resource
queue
Prior art date
Application number
PCT/US2004/018723
Other languages
English (en)
French (fr)
Other versions
WO2004112340A3 (en)
Inventor
Robert E. J. Jeter
John A. Chanak
Original Assignee
Cisco Technology, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cisco Technology, Inc.
Priority to DE602004022288T (DE602004022288D1, de)
Priority to AT04755096T (ATE438140T1, de)
Priority to EP04755096A (EP1631906B1, de)
Publication of WO2004112340A2 (en)
Publication of WO2004112340A3 (en)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/46 Interconnection of networks
    • H04L12/4604 LAN interconnection over a backbone network, e.g. Internet, Frame Relay
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H04L49/901 Buffering arrangements using storage descriptor, e.g. read or write pointers

Definitions

  • the present invention relates to computer networking and in particular to maintaining entity order in a system containing multiple entities.
  • Computer architecture generally defines the functional operation, including the flow of information and control, among individual hardware units of a computer.
  • One such hardware unit is the processor or processing engine, which contains arithmetic and logic processing circuits organized as a set of data paths.
  • the data path circuits may be configured as a central processing unit (CPU) having operations that are defined by a set of instructions.
  • the instructions are typically stored in an instruction memory and specify a set of hardware functions that are available on the CPU.
  • a high-performance computer may be realized by using a number of CPUs or processors to perform certain tasks in parallel.
  • each processor may have shared or private access to resources, such as an external memory coupled to the processors.
  • Access to the external memory is generally handled by a memory controller, which accepts requests from the various processors to access the external memory and processes them in an order that typically is controlled by the memory controller.
  • Certain complex multiprocessor systems may employ many memory controllers where each controller is attached to a separate external memory subsystem.
  • An intermediate node interconnects communication links and sub-networks of a computer network through a series of ports to enable the exchange of data between two or more software entities executing on hardware platforms, such as end nodes.
  • the nodes typically communicate by exchanging discrete packets or frames of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP) or the Internetwork Packet Exchange (IPX) protocol.
  • the forwarding engine is often used by the intermediate node to process packets acquired on the various ports in accordance with various predefined protocols. This processing may include placing a packet in a packet memory where the forwarding engine may access data associated with the packet and perform various functions on the data, such as modifying the contents of the packet.
  • the multiprocessor forwarding engine is organized as a systolic array comprising "m" rows and "n" columns of entities, such as processors or threads.
  • the entities of each row may be configured to process packets in a pipeline fashion, wherein the entity in each column of the row acts as a stage in the pipeline and performs a particular function on the packets.
  • an 8x2 systolic array of processors comprises 16 processors organized as 8 rows containing 2 columns per row wherein the column processors of each row comprise a 2-stage pipeline.
  • packets are processed by the systolic array in a manner where a packet is assigned to a particular row of entities and each entity in a column is configured to perform a function on the packet in a manner as described above.
  • the intermediate node acquires a packet and assigns the packet to a particular row of processors in the array.
  • the processor in the first column of the row may be configured to apply a destination address contained in the packet to a look-up table to determine the destination of the packet.
  • the processor in the second column may be configured to place the packet on an output queue associated with the destination.
  • each entity in a particular column is configured to execute the same code within a fixed amount of time but with a shifted phase.
  • As packets are acquired, they are placed in a shared resource, such as an external memory, and assigned to the next available row of entities, as described above.
  • packets tend to be processed by the intermediate node on a first-in first-out basis such that packets that arrive ahead of later packets exit the forwarding engine ahead of the later packets.
  • it may be possible for packets that arrive later to exit ahead of packets that arrived earlier.
  • the processors of a particular row processing an earlier acquired packet may stall due to various memory events associated with the shared resource, such as memory refresh cycles or being denied access to locked memory locations.
  • the time spent processing the earlier packet may take longer than the time spent processing a later acquired packet processed by a different row of processors and, consequently, the later acquired packet may end up exiting the forwarding engine ahead of the earlier acquired packet.
  • One way to maintain entity order and consequently packet processing order is to employ a synchronization mechanism that synchronizes the entities in the systolic array at certain points during their processing of packets.
  • a prior art synchronization mechanism that may be used involves a special instruction called a "boundary synchronize" (BSYNC) instruction.
  • the BSYNC instruction causes the entities in a particular column to wait (stall) until all the processors in the column have executed the instruction.
  • code executed by the column of entities contains BSYNC instructions at various strategic points in the code.
  • the BSYNC instruction acts to synchronize the entities at certain code boundaries, and can be used to prevent entities that are processing later acquired packets from "getting ahead" of entities processing earlier acquired packets.
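
The boundary synchronization described above behaves much like a software barrier. The following is a minimal sketch, assuming Python's `threading.Barrier` as a stand-in for the hardware BSYNC instruction; the function name `run_column` and the bookkeeping lists are hypothetical and do not come from the source.

```python
import threading

def run_column(num_entities, work):
    """Run work(entity_id) on each entity, then stall at a BSYNC-like barrier.

    Returns a list of (entity_id, finished_count) pairs, where finished_count
    is how many entities had finished their work when this entity passed the
    boundary.
    """
    barrier = threading.Barrier(num_entities)  # stand-in for BSYNC (assumption)
    finished = []   # entities that completed their pre-boundary work
    passed = []     # snapshot taken by each entity after the boundary
    lock = threading.Lock()

    def entity(entity_id):
        work(entity_id)
        with lock:
            finished.append(entity_id)
        barrier.wait()  # stall until every entity in the column arrives
        with lock:
            passed.append((entity_id, len(finished)))

    threads = [threading.Thread(target=entity, args=(i,)) for i in range(num_entities)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return passed
```

No entity passes the boundary until all entities have reached it, so every snapshot reports the full column as finished. The cost is that each entity must execute the synchronizing call itself.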
  • the present invention relates to an efficient technique for maintaining order among entities of an intermediate network node by ensuring orderly access to a resource shared by the entities.
  • a request is generated by an entity to access the resource.
  • the request is placed on a queue associated with the entity.
  • Each queue is further associated with an identifier (ID) that illustratively represents the entity associated with the queue.
  • the request eventually reaches the head of the queue.
  • a gate manager is provided to maintain orderly access to the shared resource.
  • the gate manager generates an ID that illustratively represents an entity allowed to access the resource.
  • the ID generated by the gate manager is compared with the ID associated with the queue to determine if they match. If so, the request is transferred to the resource, which processes the request. Results acquired from the resource (if any) are transferred to the entity.
  • the entities such as processors or threads of execution, access resources, such as external memories, via resource controllers, such as memory controllers coupled to the processors and external memories.
  • the memory controllers contain gate managers that are used to maintain orderly access to the memories.
  • a processor accesses a memory by generating a request and transferring the request to the memory controller coupled to the memory. The request is placed on a request queue associated with the processor. The request queue is associated with an ID that represents the processor.
  • a gate manager ID specified by the processor and contained in the request identifies a gate manager contained in the memory controller.
  • an arbiter of the memory controller determines if the ID generated by the specified gate manager is the same as (matches) the ID associated with the queue. If so, the memory controller removes the request from the queue and issues the request to the memory. The memory processes the request and returns results (if any) to the memory controller, which transfers the results to the processor.
  • the inventive technique is an improvement over prior techniques in that entity order can be maintained in a system containing multiple entities without requiring intervention on the part of the entities.
  • FIG. 1 is a schematic block diagram of a computer network comprising a collection of interconnected communication media attached to a plurality of nodes that may be advantageously used with the present invention
  • Fig. 2 is a high-level schematic block diagram of an intermediate node that may be advantageously used with the present invention
  • Fig. 3 is a partial schematic block diagram of the arrayed-processing engine comprising a symmetric multiprocessor system configured as a multi-dimensioned systolic array that may be advantageously used with the present invention
  • Fig. 4 is a schematic block diagram of a processor cluster of the arrayed-processing engine that may be advantageously used with the present invention
  • Fig. 5 is a schematic block diagram of a processor cluster coupled to a plurality of memory controllers that may be advantageously used with the present invention
  • Fig. 6 is a schematic block diagram of a gate manager selection register that may be advantageously used with the present invention.
  • Fig. 7 is a schematic block diagram of a request that may be advantageously used with the present invention.
  • Fig. 8 is a high-level schematic block diagram of a gate manager that may be advantageously used with the present invention.
  • Fig. 9 is a schematic block diagram of a processor bit mask that may be advantageously used with the present invention
  • Fig. 10 is a flow diagram of a sequence of steps that may be advantageously used to select and configure a gate manager in accordance with the present invention.
  • Fig. 11 is a flow diagram of a sequence of steps that may be advantageously used to process a request in accordance with the present invention.
  • Fig. 1 is a schematic block diagram of a computer network 100 that may be advantageously used with the present invention.
  • the computer network 100 comprises a collection of communication links and segments connected to a plurality of nodes, such as end nodes 110 and intermediate network nodes 200.
  • the network links and segments may comprise local area networks (LANs) 120, wide area networks (WANs), such as Internet 170 and WAN links 130 interconnected by intermediate nodes 200 to form an internetwork of computer nodes.
  • These internetworked nodes communicate by exchanging data packets according to a predefined set of protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP) and the Internetwork Packet eXchange (IPX) protocol.
  • Fig. 2 is a high-level schematic block diagram of intermediate node 200, which illustratively is a router.
  • An example of a router that may be advantageously used with the present invention is the Cisco 10000 Series Internet Router available from Cisco Systems Incorporated, San Jose, CA.
  • Node 200 comprises a plurality of interconnected components including a forwarding engine 300, various memories, queuing logic 210, and network interface cards (line cards) 240. Operations of these components are preferably synchronously controlled by a clock module 270 although the arrayed elements of the forwarding engine 300 may be operatively configured to function asynchronously.
  • the clock module 270 generates clock signals at a frequency of, e.g., 200 megahertz (i.e., 5 nanosecond clock cycles), and globally distributes them via clock lines to the components of the intermediate node 200.
  • the memories generally comprise random-access-memory (RAM) storage locations addressable by the forwarding engine 300 and logic for storing data structures accessed by the components and software programs including programs that may implement aspects of the present invention.
  • An operating system, portions of which are typically resident in memory and executed by the forwarding engine 300, functionally organizes node 200 by, inter alia, invoking network operations in support of software processes executing on node 200. It will be apparent to those skilled in the art that other memory means, including various computer readable media, may be used for storing and executing program instructions pertaining to the inventive technique and mechanism described herein.
  • the buffer and queuing unit (BQU) 210 is connected to a packet memory 220 that is configured to store packets, and a queue memory 230 that is configured to store network and link layer headers of the packets in data structures, such as linked lists, organized as queues.
  • the BQU 210 further comprises interface circuitry for interconnecting the forwarding engine 300 with a plurality of line cards 240 via a selector circuit 250 having an arbiter 255.
  • the line cards 240 may comprise, e.g., Asynchronous Transfer Mode (ATM), Fast Ethernet (FE) and Gigabit Ethernet (GE) ports, each of which includes conventional interface circuitry that may incorporate the signal, electrical and mechanical characteristics, and interchange circuits, needed to interface with the physical media and protocols running over that media.
  • a routing processor 260 executes conventional routing protocols for communication directly with the forwarding engine 300.
  • the routing protocols generally comprise topological information exchanges between intermediate nodes to determine preferred paths through the network based on, e.g., destination IP addresses. These protocols provide information used by the processor 260 to create and maintain forwarding tables.
  • the tables are loaded into the external memories 340 as forwarding information base (FIB) tables, used by the engine 300 to perform, e.g., layer-2 (L2) and layer-3 (L3) forwarding operations.
  • the forwarding engine 300 may comprise a symmetric multiprocessor system having a plurality of processing elements or processors.
  • Fig. 3 is a partial schematic block diagram of forwarding engine 300 comprising a plurality of processors (TMCs) 450 organized as a multi-dimensional systolic array.
  • Each processor 450 is preferably a pipelined processor that includes, inter alia, a plurality of arithmetic logic units (ALUs) and a register file having a plurality of general purpose registers that store intermediate result information processed by the ALUs.
  • the processors 450 may be arrayed into multiple rows and columns.
  • the processors are arrayed as eight (8) rows and two (2) columns in an 8x2 arrayed configuration that is embedded between an input buffer 360 and an output buffer 380.
  • other arrangements, such as 4x4, 4x8, or 8x1 arrayed configurations, may be advantageously used with the present invention.
  • a system containing e.g., a single processor, supporting multiple threads of execution can take advantage of the invention.
  • the forwarding engine 300 is coupled to a plurality of external memory resources 340 via associated external memory controllers 375.
  • external memory 340a is coupled to the forwarding engine 300 via its associated memory controller 375a.
  • the external memory 340 is preferably organized as one or more banks and implemented using reduced-latency-dynamic-random-access-memory (RLDRAM) devices, although other devices, such as fast-cycle-random-access-memory (FCRAM) devices, could be used.
  • the external memory 340 stores non-transient data (e.g., forwarding tables, queues) organized as a series of data structures for use in processing transient data (e.g., packets).
  • Each memory controller 375 contains logic that enables access to memory locations contained in its associated external memory 340.
  • the processors 450 of a particular column are coupled to a particular external memory controller 375 that enables the processors 450 to share and access data contained in the external memory 340 coupled to the controller 375.
  • the processors 450 that comprise column TMC0, i.e., processors 450a-h, are coupled to external memory controller 375a, which enables each of these processors to share and access data contained in external memory 340a.
  • the processors 450 of a row, e.g., processors 450a and 450i, are organized as a cluster 400 containing a context memory 430 configured to hold context information
  • FIG. 4 is a schematic block diagram of a cluster 400.
  • Each processor 450 of the cluster is coupled to an instruction memory (IRAM) 420 that is configured to store instructions for execution by the processor 450, a control registers unit 410, and the context memory 430.
  • the control registers unit 410 comprises various general-purpose and control registers used for storage and to control the operation of the TMCs 450, respectively.
  • Each processor 450 contains an MMU 460 that couples the processor 450 to the memory controller 375 and enables the processor 450 to access the external memory 340 coupled to the controller 375.
  • the processors 450 of each cluster 400 execute operations on transient data loaded into the context memory 430 by the input buffer 360, whereas the processors of each column operate in parallel to perform substantially the same operation on the transient data, but with a shifted phase.
  • Transient (context) data are passed between the input and output buffers of the engine 300 over a data path channel 440 provided by a data mover circuit 470 coupled to the processor 450.
  • the context data flowing through the cluster 400 are stored in the context memory 430 along with other data and pointers that reference data and various data structures (e.g., tables) stored in, e.g., external memory 340, for use by the processor 450.
  • the data mover 470 comprises logic that enables data to be transferred from the context memory 430 to the output buffer 380.
  • the present invention relates to an efficient technique for maintaining order among entities, such as processors 450, by ensuring orderly access to resources shared by the entities, such as memories 340.
  • a request is generated to access a resource.
  • the request is placed on a queue associated with an entity.
  • Each queue is further associated with an ID that illustratively represents the entity associated with the queue.
  • the request eventually reaches the head of the queue.
  • a gate manager is provided to maintain orderly access to the shared resource.
  • the gate manager generates an ID that illustratively represents an entity that is allowed to access the resource.
  • the ID generated by the gate manager is compared with the ID associated with the queue to determine if they match.
  • Fig. 5 is a schematic block diagram of a cluster 400 coupled to a plurality of external memory controllers 375.
  • Each memory controller 375 contains a plurality of request queues 560 coupled to an arbiter 550.
  • Each request queue 560 is illustratively a first-in-first-out (FIFO) queue that is associated with a processor 450 and is configured to hold requests transferred from the processor 450.
  • each request queue 560 is associated with an identifier (ID) that illustratively represents the processor 450 associated with the queue 560.
  • the arbiter 550 contains conventional logic that implements a conventional arbitration policy to select queues 560 and process requests at the head of the selected queues 560.
  • Arbiter 550 contains gate managers 800, which are configured to generate identifiers (IDs) used by the arbiter 550 to process requests at the head of queues 560 in accordance with the inventive technique.
  • Each processor 450 (Fig. 4) illustratively contains a gate manager (MGR) select register 600 that the processor 450 configures to specify (select) one or more gate managers 800 associated with the processor 450.
  • Fig. 6 is a schematic block diagram of a gate manager select register 600 that may be advantageously used with the present invention.
  • Register 600 contains gate manager fields 620a-c, each of which is illustratively a one-bit field associated with a particular gate manager 800 and holds a value that associates the gate manager to the processor. For example, fields 620a-c hold values that associate the processor 450 to gate managers A-C, respectively.
  • a gate manager 800 is associated with the processor 450 by illustratively asserting its corresponding bit in field 620 to, e.g., a one. For example, asserting bits contained in fields 620a and 620c specifies that gate managers A and C are associated with the processor 450.
  • the content of the gate manager select register 600 is transferred to the memory controller 375 coupled to the processor 450 via bus 530.
  • the contents of the gate manager select registers 600 contained in processors 450a-h are transferred to external memory controller 375a via bus 530a.
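
The one-bit fields of the gate manager select register can be modeled as a small bit set. The following is a minimal sketch, assuming field 620a occupies bit 0, 620b bit 1, and 620c bit 2; this layout is an assumption, chosen because it is consistent with the value of five that the description later gives for selecting gate managers A and C. The names `GATE_MANAGER_BITS`, `encode_select`, and `decode_select` are hypothetical.

```python
# Assumed bit positions for fields 620a-c (not stated explicitly in the source).
GATE_MANAGER_BITS = {"A": 0, "B": 1, "C": 2}

def encode_select(managers):
    """Build a select-register value that asserts one bit per selected gate manager."""
    value = 0
    for name in managers:
        value |= 1 << GATE_MANAGER_BITS[name]
    return value

def decode_select(value):
    """Return the gate managers whose bits are asserted in a select-register value."""
    return [name for name, bit in GATE_MANAGER_BITS.items() if value & (1 << bit)]
```

Under these assumptions, `encode_select(["A", "C"])` yields 5 and `decode_select(5)` yields `["A", "C"]`.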
  • Processor 450 accesses an external memory 340 by generating a request and transferring the request to the memory controller 375 associated with the memory.
  • Request 700 contains a memory operation field 710, a gate manager identifier (ID) field 720, and an address field 730.
  • the memory operation 710 and address 730 fields hold values that represent a memory operation, e.g., read or write, to be performed to access the memory, and a memory address where the operation is to be performed, respectively.
  • the gate manager ID field 720 holds a value that represents, inter alia, a gate manager 800 to be used to process a request.
  • field 720 is a two-bit field that holds a value between 0 and 3, where a zero indicates no gate manager is to be used and a non-zero value (e.g., 1 through 3) specifies (selects) a particular gate manager 800.
  • a request 700 containing a value of 1 in the gate manager ID field 720 indicates gate manager A is used to process the request 700.
  • a request 700 containing a value of zero in field 720 indicates no gate manager is used to process the request 700.
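
The three fields of a request can be sketched as a small record type. This is an illustrative model only: the class name `Request`, the string encoding of the memory operation, and the mapping of values 2 and 3 to gate managers B and C are assumptions; the source states only that a value of 1 selects gate manager A and a value of zero selects none.

```python
from dataclasses import dataclass

# Value 1 -> gate manager A is from the text; 2 -> B and 3 -> C are assumed.
GATE_MANAGER_NAMES = {1: "A", 2: "B", 3: "C"}

@dataclass(frozen=True)
class Request:
    operation: str        # memory operation field 710: "read" or "write"
    gate_manager_id: int  # gate manager ID field 720: two-bit value, 0 through 3
    address: int          # address field 730

    def __post_init__(self):
        if self.operation not in ("read", "write"):
            raise ValueError("operation must be 'read' or 'write'")
        if not 0 <= self.gate_manager_id <= 3:
            raise ValueError("gate_manager_id must fit in two bits (0..3)")

    def gate_manager(self):
        """Return the selected gate manager name, or None when field 720 is zero."""
        return GATE_MANAGER_NAMES.get(self.gate_manager_id)
```

A request with `gate_manager_id=0` bypasses the gate managers entirely, which matches the zero case described above.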
  • Fig. 8 is a schematic block diagram of a gate manager 800 that may be advantageously used with the present invention.
  • Gate manager 800 contains entry count logic 820, a processor bit mask 900, and processor identifier (ID) generator logic 850.
  • Entry count logic 820 comprises an entry count 822 and a rollover counter 824.
  • the entry count 822 holds a value that is a count of the number of processors in a particular column that have specified (selected) the gate manager 800 in their gate manager select register 600.
  • the rollover counter 824 is a conventional rollover counter that generates a value illustratively from 1 to the number of processors represented by the entry count 822.
  • Processor bit mask 900 is, illustratively, an 8-bit bit mask that indicates whether a particular processor in a column has specified (selected) the gate manager in the processor's gate manager select register 600.
  • Fig. 9 is a schematic block diagram of a processor bit mask 900 that may be advantageously used with the present invention.
  • Each processor 450 is represented by, illustratively, a one-bit field 920 that indicates whether the processor has specified the gate manager 800. For example, fields P0 920a, P1
  • a value of one in field 920 indicates the processor 450 has specified the gate manager 800 and a value of zero indicates the processor 450 has not specified the gate manager 800.
  • the processor ID generator logic 850 combines the value generated from the rollover counter 824 with the processor mask 900 to generate an identifier (ID) that represents a processor 450. Specifically, logic 850 acquires the value generated by the rollover counter 824, applies the generated value to the processor bit mask 900 to determine the processor 450 associated with the generated value, and generates an ID that represents the processor 450. This generated ID is transferred to the arbiter 550.
  • processors 450a, 450b, and 450h specify gate manager 800a in their gate manager select registers 600 and the content of each processor's gate manager select register 600 is transferred to memory controller 375a.
  • Gate manager 800a acquires the transferred contents and configures its entry count logic 820 to generate count values 1, 2, and 3, corresponding to processors 450a, 450b, and 450h.
  • gate manager 800a configures its processor bit mask 900 to represent these processors by, illustratively, setting bits P0 920a, P1 920b, and P7 920h in its processor bit mask 900 to a one.
  • the rollover counter 824 generates a value of 3, representing processor 450h.
  • the processor ID generator logic 850 applies this value to the processor bit mask 900 and determines that the value generated by the counter 824 corresponds to processor 450h. That is, the processor ID generator logic 850 determines from examining the processor bit mask 900 that processor 450h is, illustratively, the third processor starting from bit P0 920a that has selected gate manager 800a in its processor gate manager select register 600. Logic 850 then generates an ID representing processor 450h, which is transferred to the arbiter 550.
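
The mapping from a rollover counter value to a processor can be expressed as a search for the n-th set bit of the mask. The sketch below assumes that bit P0 is the least significant bit of an integer mask and that the bit position itself serves as the processor ID; both are modeling assumptions, and the function name `processor_for_count` is hypothetical.

```python
def processor_for_count(bit_mask, count):
    """Return the position of the count-th set bit (1-based), scanning from P0 upward."""
    seen = 0
    for position in range(bit_mask.bit_length()):
        if bit_mask & (1 << position):
            seen += 1
            if seen == count:
                return position
    raise ValueError("count exceeds the number of processors in the mask")

# Mask with P0, P1, and P7 set, as in the example for gate manager 800a.
EXAMPLE_MASK = 0b10000011
```

With `EXAMPLE_MASK`, counter values 1, 2, and 3 map to positions 0, 1, and 7, so a counter value of 3 selects the processor at P7, matching the example above.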
  • a processor 450 accesses a memory 340 via requests 700.
  • the processor 450 accesses a memory 340 by generating a request 700 and transferring the request 700 to the appropriate external memory controller 375 coupled to the memory 340.
  • the memory controller 375 places the request 700 at the end (tail) of the queue 560 associated with the processor 450.
  • the request 700 reaches the top (head) of the queue 560 and the arbiter 550 processes the request.
  • the arbiter 550 implements a conventional "polling" algorithm that polls the queues 560 in, e.g., a round robin fashion, selects the queue 560 containing the request, and determines that the request 700 is at the head of the selected queue 560.
  • the arbiter 550 then processes the request 700 including examining the content of the request's gate manager ID field 720 and determining if a gate manager 800 has been specified. If not, the request is removed from the queue and transferred to the external memory 340 coupled to the memory controller 375. Otherwise, the arbiter 550 determines if the ID generated by the specified gate manager 800 is the same as the ID associated with the queue 560. If so, the arbiter 550 removes the request 700 from the queue 560 and transfers it to the external memory 340 coupled to the memory controller 375. Otherwise, the request 700 remains at the head of the queue 560.
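
The arbiter's decision for a request at the head of a queue reduces to a two-step check. The following is a minimal sketch of that check; the function name `should_issue` and the dictionary `generated_ids`, which maps each gate manager ID to the processor ID that gate manager currently generates, are hypothetical modeling choices.

```python
def should_issue(request_gate_id, queue_id, generated_ids):
    """Decide whether the request at the head of a queue may go to memory.

    request_gate_id: value of gate manager ID field 720 (0 means no gate manager).
    queue_id: ID associated with the queue holding the request.
    generated_ids: gate manager ID -> ID currently generated by that gate manager.
    """
    if request_gate_id == 0:
        return True  # no gate manager specified: transfer immediately
    # Otherwise the request is transferred only when the IDs match;
    # if they differ, it stays at the head of its queue.
    return generated_ids[request_gate_id] == queue_id
```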
  • When the arbiter 550 transfers a request 700 from a queue 560 to the memory 340, it sends a signal to the entry count logic 820 to update the rollover counter 824.
  • Logic 850 then generates an ID based on the updated counter's value, as described above. For example, suppose the value in the rollover counter 824 for gate manager 800a is 1 and a request containing a gate manager ID associated with gate manager 800a is at the head of queue 560a. Further, suppose the gate manager 800a generates an ID that is the same as the ID associated with queue 560a.
  • the arbiter 550 (i) polls the queue 560a, (ii) determines the ID generated by the gate manager 800a is the same as the ID associated with queue 560a, (iii) removes the request from the queue 560a, (iv) transfers the request 700 to memory 340a, and (v) notifies the entry count logic 820 to update the rollover counter's value to, e.g., a value of 2.
  • the gate manager 800 then generates an ID based on the updated rollover counter value, e.g., generates an ID that is the same as the ID associated with processor 450b's queue 560.
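
The rollover counter's update can be sketched as a counter that cycles from 1 up to the entry count and then wraps. The class name `RolloverCounter` and the initial value of 1 are assumptions made for illustration; the source states only the range of values the counter generates.

```python
class RolloverCounter:
    """Counter that cycles through 1..entry_count and wraps back to 1."""

    def __init__(self, entry_count):
        if entry_count < 1:
            raise ValueError("entry_count must be at least 1")
        self.entry_count = entry_count  # entry count 822
        self.value = 1                  # assumed starting value

    def advance(self):
        """Move to the next value; called when a gated request goes to memory."""
        self.value = self.value % self.entry_count + 1
        return self.value
```

With an entry count of 3, successive calls to `advance()` produce 2, 3, and then 1 again, so each selecting processor gets its turn in a fixed cyclic order.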
  • a processor 450 selects one or more gate managers 800 by placing an appropriate value in its gate manager select register 600.
  • the register's content is transferred to the memory controller 375 coupled to the processor where the selected gate managers 800 acquire the content and configure the entry count logic 820 and processor bit mask 900, as described above.
  • Fig. 10 is a flow diagram of a sequence of steps that may be used to configure the entry count logic 820 and processor bit mask 900 in accordance with the inventive technique. The sequence begins at Step 1005 and proceeds to Step 1020 where an entity, such as processor 450, selects one or more gate managers 800 by placing an appropriate value in the gate manager selection register 600, as described above.
  • processor 450a selects gate managers 800a and 800c by illustratively placing a value of five in its gate manager select register 600.
  • the content of the gate manager selection register 600 is transferred to the external memory controller 375 coupled to the processor 450, which, in turn, transfers the contents of register 600 to the gate managers 800.
  • the content of processor 450a's gate manager selection register 600, e.g., five, is transferred to external memory controller 375a, which transfers this content to gate managers 800a, 800b, and 800c.
  • each gate manager 800 updates its entry count 822 and processor bit mask 900 based on the content of the transferred gate manager selection register 600.
  • gate managers 800a and 800c update their entry counts 822 and processor bit masks 900 to reflect the content of processor 450a's gate manager select register 600 by, e.g., incrementing their entry counts 822 by one and setting field PO 920a in their bit masks to one.
  • each gate manager 800 generates an ID using the processor bit mask 900 and the entry count logic 820, as described above.
  • the sequence ends at Step 1095.
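The configuration sequence of Fig. 10 can be sketched in a few lines. This is a minimal illustration, not the patented hardware: it assumes that bit i of the select-register value selects the i-th gate manager (so five, binary 101, selects 800a and 800c, as in the example above) and that each processor owns one bit of the mask. The names `GateManager` and `configure` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GateManager:
    name: str
    entry_count: int = 0      # models entry count 822
    processor_mask: int = 0   # models processor bit mask 900, one bit per processor


def configure(gate_managers, select_value, processor_index):
    """Apply one processor's select-register value to every gate manager.

    Assumptions: bit i of select_value selects gate_managers[i], and bit
    processor_index of the mask stands for the selecting processor.
    """
    for i, gm in enumerate(gate_managers):
        if not select_value & (1 << i):
            continue  # this gate manager was not selected
        if gm.processor_mask & (1 << processor_index):
            continue  # processor already registered; do not count it twice
        gm.entry_count += 1
        gm.processor_mask |= 1 << processor_index


managers = [GateManager("800a"), GateManager("800b"), GateManager("800c")]
configure(managers, select_value=5, processor_index=0)  # processor 450a writes five
selected = [gm.name for gm in managers if gm.entry_count == 1]
```

All three gate managers receive the register content, but only the two whose bits are set change their state, which mirrors the 800a/800c example.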
  • Fig. 11 is a flow diagram of a sequence of steps that may be used to process a request 700 in accordance with the inventive technique.
  • The sequence begins at Step 1105 and proceeds to Step 1110 where a request 700 associated with an entity, e.g., processor 450, is generated to access a resource, e.g., memory 340.
  • the generated request 700 is transferred from the entity to the resource controller associated with the resource, e.g., the memory controller 375 coupled to the memory 340 (Step 1112).
  • the resource controller places the request 700 on the request queue 560 associated with the entity.
  • At Step 1120, the request reaches the head of the queue 560.
  • the arbiter 550 polls the queues 560 in a manner as described above, selects the queue 560 containing the request 700, and examines the gate manager field 720 in the request 700 to determine if the request 700 specifies a gate manager (Steps 1122-1130). If not, the sequence proceeds to Step 1140. Otherwise, the sequence proceeds to Step 1135 where the arbiter 550 determines if the identifier (ID) generated by the gate manager specified in the request 700 is the same as the ID associated with the queue 560. If not, the arbiter selects the next queue 560, in a manner as described above, and returns to Step 1125.
  • At Step 1140, the resource controller removes the request 700 from the queue 560 and transfers the request 700 to the resource.
  • the gate manager is notified by the resource controller that the request has been transferred to the resource and the gate manager generates the next ID, in a manner as described above.
  • the resource processes the request 700 and responds to the resource controller with results, if any (Step 1145). For example, if the request's memory operation field 710 specifies a read operation, memory 340 processes the request, including returning the data read from the location represented in the request's address field 730. If, instead, the request's memory operation is a write operation, memory 340 processes the request but does not respond.
  • the resource controller transfers the results (if any) acquired from the resource to the entity. The sequence ends at Step 1195.
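The request-processing sequence of Fig. 11 can be approximated in software as follows. This is a hedged sketch only: the `Gate` class, its fixed repeating grant order, the dictionary-based request layout, and the `arbitrate` function are all assumptions made for illustration and do not reproduce the arbiter 550 or the ID generation described in the text.

```python
from collections import deque


class Gate:
    """Hypothetical gate manager that grants queue IDs in a fixed repeating order."""

    def __init__(self, order):
        self.order = list(order)
        self.pos = 0

    def current_id(self):
        return self.order[self.pos]

    def notify_transferred(self):
        # Called once a gated request has been handed to the resource.
        self.pos = (self.pos + 1) % len(self.order)


def arbitrate(queues, gates):
    """Poll per-entity queues and release head requests in gate order.

    queues maps a queue ID to a deque of requests; each request is a dict with
    a 'gate' key (gate name or None) and an 'op' key (free-form description).
    """
    served = []
    while any(queues.values()):
        progressed = False
        for qid, q in queues.items():
            if not q:
                continue
            req = q[0]  # only the request at the head of the queue is examined
            gate = gates.get(req["gate"]) if req["gate"] is not None else None
            if gate is not None and gate.current_id() != qid:
                continue  # IDs differ: leave the request queued, try the next queue
            q.popleft()
            served.append((qid, req["op"]))
            if gate is not None:
                gate.notify_transferred()
            progressed = True
        if not progressed:
            break  # nothing can be granted; stop instead of spinning forever
    return served


queues = {
    2: deque([{"gate": "800a", "op": "read X"}]),
    1: deque([{"gate": "800a", "op": "write X"}, {"gate": None, "op": "read Y"}]),
}
gates = {"800a": Gate(order=[1, 2])}
served = arbitrate(queues, gates)
```

In this run queue 2 is polled first, but its gated request is held until queue 1's gated write has been transferred, while the request without a gate manager is released as soon as it reaches the head of its queue.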
  • Although the above-described embodiment describes the invention as used in a system containing a plurality of gate managers, this is not intended to be a limitation of the invention. Rather, a system employing a single gate manager may take advantage of the inventive technique.
  • the above-described embodiment of the invention describes the invention as used in a system containing one or more memories as resources. However, this too is not intended to be a limitation of the invention. Rather, the inventive technique may be applied to other types of resources shared by a plurality of entities, such as an input and/or output device.
  • the invention may be implemented in whole or in part in software comprising computer executable code stored in a computer readable medium, such as a flash RAM or a disk file.
  • threads of execution generate requests 700 that are transferred to a resource controller (e.g., memory controller 375) implemented as a software routine that processes the requests as described above.
  • the entity is "blocked" from execution while the request is processed.
  • the entity is a thread of execution that generates and transfers a request to a resource controller. While the request is processed, the thread of execution is blocked from further execution.
  • If the request contains an operation that does not return results from the resource (e.g., a write operation), the resource controller notifies a scheduler contained in the system when the request is transferred to the resource (e.g., a memory). If the request contains an operation that returns results from the resource (e.g., a read operation), the resource controller transfers the results to the thread and notifies the scheduler. When the scheduler receives the notification, it unblocks the thread and reschedules it for execution.
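The software embodiment, in which a thread is blocked until its request has been handled, might look roughly like this. The sketch is an assumption-laden simplification: a `threading.Event` per request stands in for the scheduler notification, a dictionary stands in for the memory, and the names `resource_controller` and `access` are hypothetical.

```python
import queue
import threading

memory = {}                # stand-in for a memory resource
requests = queue.Queue()   # stand-in for a request queue


def resource_controller():
    """Software routine that processes requests until it receives None."""
    while True:
        req = requests.get()
        if req is None:
            break
        if req["op"] == "write":
            memory[req["addr"]] = req["data"]         # a write returns no results
        else:
            req["result"] = memory.get(req["addr"])   # a read returns the stored data
        req["done"].set()  # plays the role of notifying the scheduler


def access(op, addr, data=None):
    """Issue a request; the calling thread is blocked until it has been processed."""
    req = {"op": op, "addr": addr, "data": data,
           "result": None, "done": threading.Event()}
    requests.put(req)
    req["done"].wait()     # blocked here until the controller signals completion
    return req["result"]


controller = threading.Thread(target=resource_controller, daemon=True)
controller.start()
access("write", 0x10, 42)
value = access("read", 0x10)
requests.put(None)         # ask the controller routine to stop
controller.join()
```

A real scheduler would reschedule the unblocked thread explicitly; here the operating system's thread scheduler does that once the event is set.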

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multi Processors (AREA)
  • Debugging And Monitoring (AREA)
  • Catching Or Destruction (AREA)
  • Revetment (AREA)
PCT/US2004/018723 2003-06-11 2004-06-10 Maintaining entity order with gate managers WO2004112340A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
DE602004022288T DE602004022288D1 (de) 2003-06-11 2004-06-10 Maintaining entity order with gate managers
AT04755096T ATE438140T1 (de) 2003-06-11 2004-06-10 Maintaining entity order with gate managers
EP04755096A EP1631906B1 (de) 2003-06-11 2004-06-10 Maintaining entity order with gate managers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/458,869 2003-06-11
US10/458,869 US7257681B2 (en) 2003-06-11 2003-06-11 Maintaining entity order with gate managers

Publications (2)

Publication Number Publication Date
WO2004112340A2 true WO2004112340A2 (en) 2004-12-23
WO2004112340A3 WO2004112340A3 (en) 2005-07-28

Family

ID=33510675

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/018723 WO2004112340A2 (en) 2003-06-11 2004-06-10 Maintaining entity order with gate managers

Country Status (6)

Country Link
US (1) US7257681B2 (de)
EP (1) EP1631906B1 (de)
CN (1) CN100361084C (de)
AT (1) ATE438140T1 (de)
DE (1) DE602004022288D1 (de)
WO (1) WO2004112340A2 (de)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007038445A2 (en) 2005-09-26 2007-04-05 Advanced Cluster Systems, Llc Clustered computer system
US8453147B2 (en) * 2006-06-05 2013-05-28 Cisco Technology, Inc. Techniques for reducing thread overhead for systems with multiple multi-threaded processors
US8082289B2 (en) 2006-06-13 2011-12-20 Advanced Cluster Systems, Inc. Cluster computing support for application programs
US8041929B2 (en) 2006-06-16 2011-10-18 Cisco Technology, Inc. Techniques for hardware-assisted multi-threaded processing
US8010966B2 (en) 2006-09-27 2011-08-30 Cisco Technology, Inc. Multi-threaded processing using path locks
US9276868B2 (en) * 2012-12-17 2016-03-01 Marvell Israel (M.I.S.L) Ltd. Maintaining packet order in a parallel processing network device
CN105164984B (zh) * 2013-03-13 2019-01-08 Marvell Israel (M.I.S.L) Ltd. Method and apparatus for maintaining packet order in a parallel processing network device
US9846658B2 (en) * 2014-04-21 2017-12-19 Cisco Technology, Inc. Dynamic temporary use of packet memory as resource memory
CN104899008B (zh) * 2015-06-23 2018-10-12 北京玉华骢科技股份有限公司 Shared storage structure and method in a parallel processor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002029511A2 (en) 2000-10-05 2002-04-11 Wintegra Ltd. Method system and apparatus for multiprocessing
US20020118692A1 (en) 2001-01-04 2002-08-29 Oberman Stuart F. Ensuring proper packet ordering in a cut-through and early-forwarding network switch

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05120239A (ja) * 1991-10-30 1993-05-18 Seiko Epson Corp Parallel processing circuit
US6026464A (en) * 1997-06-24 2000-02-15 Cisco Technology, Inc. Memory control system and method utilizing distributed memory controllers for multibank memory
US6119215A (en) * 1998-06-29 2000-09-12 Cisco Technology, Inc. Synchronization and control system for an arrayed processing engine
US6230241B1 (en) * 1998-09-09 2001-05-08 Cisco Technology, Inc. Apparatus and method for transferring data in a data communications device
US6330645B1 (en) * 1998-12-21 2001-12-11 Cisco Technology, Inc. Multi-stream coherent memory controller apparatus and method
US7194568B2 (en) * 2003-03-21 2007-03-20 Cisco Technology, Inc. System and method for dynamic mirror-bank addressing

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002029511A2 (en) 2000-10-05 2002-04-11 Wintegra Ltd. Method system and apparatus for multiprocessing
US20020118692A1 (en) 2001-01-04 2002-08-29 Oberman Stuart F. Ensuring proper packet ordering in a cut-through and early-forwarding network switch

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BUX W ET AL., TECHNOLOGIES AND BUILDING BLOCKS FOR FAST PACKET FORWARDING, 1 January 2001 (2001-01-01)
MINKENBERG C ET AL.: "A Combined Input and Output Queued Packet-Switched System based on Prizma Switch-On-A-Chip Technology", IEEE COMMUNICATIONS MAGAZINE, 1 December 2000 (2000-12-01)

Also Published As

Publication number Publication date
WO2004112340A3 (en) 2005-07-28
ATE438140T1 (de) 2009-08-15
US20040252710A1 (en) 2004-12-16
DE602004022288D1 (de) 2009-09-10
EP1631906A2 (de) 2006-03-08
CN100361084C (zh) 2008-01-09
US7257681B2 (en) 2007-08-14
CN1781079A (zh) 2006-05-31
EP1631906B1 (de) 2009-07-29

Similar Documents

Publication Publication Date Title
US7039914B2 (en) Message processing in network forwarding engine by tracking order of assigned thread in order group
US7047370B1 (en) Full access to memory interfaces via remote request
US6804815B1 (en) Sequence control mechanism for enabling out of order context processing
US7434016B2 (en) Memory fence with background lock release
EP0992056B1 (de) Search engine architecture for a high-performance multi-layer switch element
US7017020B2 (en) Apparatus and method for optimizing access to memory
US7349399B1 (en) Method and apparatus for out-of-order processing of packets using linked lists
US7065050B1 (en) Apparatus and method for controlling data flow in a network switch
US7158964B2 (en) Queue management
US7853951B2 (en) Lock sequencing to reorder and grant lock requests from multiple program threads
US6546010B1 (en) Bandwidth efficiency in cascaded scheme
US7590785B2 (en) Systems and methods for multi-tasking, resource sharing, and execution of computer instructions
EP0947926A2 (de) System and method for multi-tasking, resource sharing, and execution of computer instructions
US7290105B1 (en) Zero overhead resource locks with attributes
US7293158B2 (en) Systems and methods for implementing counters in a network processor with cost effective memory
US20070044103A1 (en) Inter-thread communication of lock protected data
US7254687B1 (en) Memory controller that tracks queue operations to detect race conditions
EP1631906B1 (de) Maintaining entity order with gate managers
JP3605574B2 (ja) Method for updating an existing frame classifier tree in a network processing system, and network processing system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 20048113038

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2004755096

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2004755096

Country of ref document: EP