WO2017151138A1 - Atomic memory operation - Google Patents

Atomic memory operation

Info

Publication number
WO2017151138A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
ring buffer
interrupt
buffer
atomic
Prior art date
Application number
PCT/US2016/020719
Other languages
French (fr)
Inventor
Jean Tourrilhes
Michael Schlansker
Original Assignee
Hewlett Packard Enterprise Development Lp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett Packard Enterprise Development Lp
Priority to PCT/US2016/020719
Publication of WO2017151138A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/167 Interprocessor communication using a common memory, e.g. mailbox
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/382 Information transfer, e.g. on bus using universal interface adapter
    • G06F13/385 Information transfer, e.g. on bus using universal interface adapter for adaptation of a particular data processing system to different peripheral devices

Abstract

An atomic memory operation system comprises a memory fabric coupling a receiving node and a sending node. The memory fabric executes an atomic memory operation by identifying a control word of a ring buffer using a memory address of the memory operation. The memory fabric then retrieves a tail index from the control word, wherein the tail index indicates a position in a ring buffer, and inserts a memory word in the ring buffer at the position indicated by the tail index, wherein the memory word corresponds to the memory operation. Finally, the memory fabric updates the tail index in the control word to point to a next slot in the ring buffer.

Description

ATOMIC MEMORY OPERATION

Background

[0001] Messaging allows two or more entities to exchange information across a network and represents a foundational aspect of networking. As networks become larger and more complex, challenges may arise with respect to successfully messaging across networks.

Brief Description of the Drawings

[0002] Figure 1 is a block diagram of an example system for an atomic memory operation consistent with the present disclosure.

[0003] Figure 2 is a block diagram of an example system for an atomic memory operation consistent with the present disclosure.

[0004] Figure 3 is a block diagram of an example system for an atomic memory operation consistent with the present disclosure.

[0005] Figure 4 is a block diagram of an example system for an atomic memory operation consistent with the present disclosure.

[0006] Figure 5 is a block diagram of an example method for an atomic memory operation consistent with the present disclosure.

[0007] Messaging may allow two or more entities to exchange information across a network. Messaging may be accomplished using either hardware or processor instructions. Hardware may be used for messaging between multiple computers. In some cases, the message may be sent via a cable, such as a copper or optical cable, and may be received by a Network Interface Card (NIC). The NIC may define a hardware receiving queue to receive the message and notify the recipient that the message has been received.

[0008] When messaging is done within a single computer, processor instructions may be used to control and implement the messaging. In some cases, the operations for queueing a received message may be encoded as memory write and read operations. The memory write and read operations may be performed as a single transaction, known as an atomic operation. This may allow multiple senders to send messages within the computer.

[0009] Figure 1 is a block diagram of an example system 100 for atomic queueing consistent with the present disclosure. System 100 may include multiple components, as illustrated in Figure 1. For example, system 100 may include a memory fabric 102. As used herein, a memory fabric refers to a framework that connects a plurality of computing nodes. Memory fabric 102 may consist of connected storage, a connected network, and/or connected processing. In some instances, memory fabric 102 may connect a plurality of nodes to a pool of global, shared memory 103. Memory fabric 102 may execute an atomic operation consistent with the present disclosure.

[0010] Further, as illustrated in Figure 1, system 100 may include a sending node 104. As used herein, a sending node refers to a node that transmits a message to a separate node. As used herein, a message refers to a piece of data with a sender and a receiver. Sending node 104 may be composed of a plurality of processor cores and local memory. Sending node 104 may act like a computer. For instance, sending node 104 may have its own operating system and/or a local memory domain. Sending node 104 may further have its own power supply and/or its own fault domain. As shown in Figure 1, sending node 104 may be coupled to the memory fabric 102. Although a single sending node 104 is shown in Figure 1, it is contemplated that a plurality of sending nodes may be included within system 100, and a plurality of sending nodes 104 may be coupled to memory fabric 102.

[0011] As further illustrated in Figure 1, system 100 may include a receiving node 106. As used herein, a receiving node refers to a node that receives a message from another node, such as sending node 104. Receiving node 106 may be composed of a plurality of processor cores and local memory. Receiving node 106 may act like a computer. For instance, receiving node 106 may have its own operating system and/or a local memory domain. Receiving node 106 may further have its own power supply and/or its own fault domain. As shown in Figure 1, receiving node 106 may be coupled to the memory fabric 102. Although a single receiving node 106 is shown in Figure 1, it is contemplated that a plurality of receiving nodes 106 may be included within system 100, and a plurality of receiving nodes 106 may be coupled to memory fabric 102. In some examples, a single node may behave as both a receiving node 106 and a sending node 104. In such examples, the single node may act as a receiving node 106 for a first message and as a sending node 104 for a second message.

[0012] As shown in Figure 1, receiving node 106 may include a memory fabric interface 108. As used herein, a memory fabric interface refers to a hardware interface that couples a memory fabric to other components of a system. Receiving node 106 may also include local memory 114. As used herein, local memory refers to the memory specific to a particular node. For instance, in Figure 1, local memory 114 corresponds to the memory specific to receiving node 106.

[0013] Figure 2 is a block diagram of an example system 200 for atomic queueing consistent with the present disclosure. System 200 may include multiple components, as illustrated in Figure 2.

[0014] For example, system 200 may include a memory fabric 202. Memory fabric 202 is analogous to memory fabric 102 shown in Figure 1. Memory fabric 202 may consist of connected storage, a connected network, and/or connected processing. In some instances, memory fabric 202 may connect a plurality of nodes to a pool of global, shared memory. Memory fabric 202 may execute an atomic operation consistent with the present disclosure.

[0015] System 200 may further include a sending node 204. Sending node 204 is analogous to sending node 104 shown in Figure 1. Sending node 204 may be composed of a plurality of processor cores and local memory. Sending node 204 may act like a computer. For instance, sending node 204 may have its own operating system and/or a local memory domain. Sending node 204 may further have its own power supply and/or its own fault domain. As shown in Figure 2, sending node 204 may be coupled to the memory fabric 202. Although a single sending node 204 is shown in Figure 2, it is contemplated that a plurality of sending nodes 204 may be included within system 200, and a plurality of sending nodes 204 may be coupled to memory fabric 202.

[0016] System 200 may further include a receiving node 206. Receiving node 206 is analogous to receiving node 106 shown in Figure 1. Although a single receiving node 206 is shown, it is contemplated that a plurality of receiving nodes 206 may be included within system 200 and that a plurality of receiving nodes 206 may be coupled to memory fabric 202.

[0017] As shown in Figure 2, receiving node 206 may include a memory fabric interface 208. As used herein, a memory fabric interface refers to a hardware interface that couples a memory fabric to other components of a system. For instance, as shown in Figure 2, memory fabric interface 208 may couple the memory fabric 202 to other components of receiving node 206. Memory fabric interface 208 may include an enqueue atomic handler 210. As used herein, an enqueue atomic handler refers to the portion of the memory fabric interface responsible for executing atomic queueing instructions. Enqueue atomic handler 210 may be entirely contained within memory fabric interface 208.

[0018] Receiving node 206 may further contain a ring buffer 212. As used herein, a ring buffer refers to a fixed-size memory structure used to temporarily store data. Ring buffer 212 may be contained within local memory 214. As used herein, local memory refers to the memory specific to a particular node. For example, in Figure 2, local memory 214 corresponds to the memory specific to receiving node 206. Although a single ring buffer 212 is shown, it is contemplated that multiple ring buffers may be present. Additional ring buffers may be ring buffers similar to ring buffer 212, or they may be reserve ring buffers, described further herein.

[0019] Ring buffer 212 may be composed of a single control word followed by an array of message slots. As used herein, a control word refers to metadata that allows an atomic queueing handler, such as enqueue atomic handler 210, to identify and operate on a particular ring buffer. As used herein, a message slot refers to a position within a ring buffer where a message handle may be written. In some examples, the control word may be a 64-bit control word and the array may be composed of 64-bit message slots. The size of the array, N, may be a power of two and in some implementations may be set within the control word. The overall size of the ring buffer 212 would be N+1 words, with each word being 64 bits. In some embodiments, the ring buffer may be composed of a 128-bit control word, with an array composed of 128-bit message slots.
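The layout described above, one control word followed by N message slots, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the disclosure says the control word may hold the tail index, a head index, and the ring size, but it does not give field widths, so the 16-bit fields and the names `pack_control_word`, `unpack_control_word`, and `make_ring` are hypothetical.

```python
# Hypothetical field layout: the disclosure does not specify bit positions.
FIELD_BITS = 16                      # assumed width of each field
FIELD_MASK = (1 << FIELD_BITS) - 1


def pack_control_word(tail, head, log2_size):
    """Pack tail index, head index and log2(N) into one 64-bit control word."""
    return ((log2_size & FIELD_MASK) << 32) | ((head & FIELD_MASK) << 16) | (tail & FIELD_MASK)


def unpack_control_word(word):
    """Return (tail, head, N) decoded from a packed control word."""
    tail = word & FIELD_MASK
    head = (word >> 16) & FIELD_MASK
    size = 1 << ((word >> 32) & FIELD_MASK)   # N is a power of two
    return tail, head, size


def make_ring(log2_size):
    """Build a ring as N+1 words: the control word, then N empty slots."""
    n = 1 << log2_size
    return [pack_control_word(0, 0, log2_size)] + [0] * n


ring = make_ring(3)   # N = 8 slots, so the ring occupies 9 words
```

Storing log2(N) rather than N is one possible way to enforce the power-of-two size; an implementation could equally store N directly.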

[0020] In some embodiments, an individual message slot may have an index corresponding to its position within the ring buffer 212. For instance, as shown in Figure 2, a message slot may have an index of Slot #0 or Slot #1. A message slot may have a maximum index of Slot #(N-1), where N represents the size of the array.

[0021] When the sending node 204 wants to insert a message handle into the ring buffer 212, it generates an enqueue atomic memory operation, containing the memory address of the ring buffer 212 and the message handle, over the memory fabric 202. The enqueue memory operation is received by the enqueue atomic handler 210. When executing the enqueue atomic memory operation, the enqueue atomic handler 210 may identify a control word of a ring buffer 212 using a memory address of the memory operation. Enqueue atomic handler 210 may then retrieve a tail index from the control word. As used herein, a tail index refers to a position of a slot within a ring buffer. In some embodiments, the tail index may indicate a next slot to be filled in a ring buffer. Once the tail index has been retrieved, enqueue atomic handler 210 may insert a memory word in ring buffer 212 at the position indicated by the tail index. In some instances, the memory word may be a message handle. In other examples, the memory word may be a queue handle which identifies a specific message queue. In still other examples, the memory word may be a sender identifier. Enqueue atomic handler 210 may complete the memory operation by updating the tail index in the control word. In some embodiments, updating the tail index may include advancing a pointer to point to a next slot in ring buffer 212.
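The four steps of the enqueue operation (identify the control word, retrieve the tail index, insert the memory word, update the tail index) can be modelled in software as follows. This is a toy sketch, not the hardware handler: the class name `EnqueueAtomicHandler`, the dictionary standing in for local memory, and the lock standing in for hardware atomicity are all assumptions, and the sketch does not check for a full ring.

```python
import threading


class EnqueueAtomicHandler:
    """Toy software stand-in for the hardware enqueue atomic handler."""

    def __init__(self, local_memory):
        self.memory = local_memory       # maps ring base address -> ring state
        self._lock = threading.Lock()    # assumption: models hardware atomicity

    def enqueue(self, address, memory_word):
        with self._lock:
            ring = self.memory[address]            # 1. identify the control word by address
            tail = ring["control"]["tail"]         # 2. retrieve the tail index
            ring["slots"][tail] = memory_word      # 3. insert the memory word at that slot
            size = len(ring["slots"])
            ring["control"]["tail"] = (tail + 1) % size   # 4. point to the next slot
            return tail                            # slot index that was filled


# Hypothetical address and handles, chosen only for illustration.
memory = {0x1000: {"control": {"tail": 0}, "slots": [None] * 4}}
handler = EnqueueAtomicHandler(memory)
first = handler.enqueue(0x1000, "handle-A")
second = handler.enqueue(0x1000, "handle-B")
```

Because all four steps run under one lock, two senders can never observe the same tail index, which is the property that lets multiple sending nodes share one ring.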

[0022] Receiving node 206 may further include an interrupt controller 218. As used herein, an interrupt controller refers to a hardware component which collects interrupts from various sources and interrupts the processor. Interrupt controller 218 may be coupled to memory fabric interface 208 as well as to interrupt handler 220. In some embodiments, interrupt controller 218 may generate an interrupt. As used herein, an interrupt refers to a signal indicating an event that requires attention. An interrupt may be generated by hardware. For example, the enqueue atomic handler 210 may generate an interrupt to an interrupt handler 220. In some instances, enqueue atomic handler 210 may generate an interrupt in response to an updating of the tail index.

[0023] Figure 3 is a block diagram of an example system 300 for atomic queueing consistent with the present disclosure. System 300 may be used for interrupt virtualization and management, and may include multiple components, as illustrated in Figure 3.

[0024] For example, system 300 may include a memory fabric 302. Memory fabric 302 is analogous to memory fabric 102 shown in Figure 1 and memory fabric 202 shown in Figure 2. Memory fabric 302 may consist of connected storage, a connected network, and/or connected processing. In some instances, memory fabric 302 may connect a plurality of nodes to a pool of global, shared memory. Memory fabric 302 may execute an atomic operation consistent with the present disclosure.

[0025] System 300 may further include a sending node 304. Sending node 304 is analogous to sending node 104 shown in Figure 1 and sending node 204 shown in Figure 2. Sending node 304 may be composed of a plurality of processor cores and local memory. Sending node 304 may act like a computer. For instance, sending node 304 may have its own operating system and/or a local memory domain. Sending node 304 may further have its own power supply and/or its own fault domain. As shown in Figure 3, sending node 304 may be coupled to the memory fabric 302. Although a single sending node 304 is shown in Figure 3, it is contemplated that a plurality of sending nodes may be included within system 300, and a plurality of sending nodes 304 may be coupled to memory fabric 302.

[0026] System 300 may further include a receiving node 306. Receiving node 306 is analogous to receiving node 106 shown in Figure 1 and receiving node 206 shown in Figure 2. As shown in Figure 3, receiving node 306 may include multiple components. For instance, receiving node 306 may include a memory fabric interface 308. Memory fabric interface 308 is analogous to memory fabric interface 208 shown in Figure 2. Memory fabric interface 308 may include an enqueue atomic handler 310. As used herein, an enqueue atomic handler refers to the portion of the memory fabric interface responsible for executing atomic queueing instructions. Enqueue atomic handler 310 may be entirely contained within memory fabric interface 308. Receiving node 306 may further contain a local memory 314 and a processor 316. Local memory 314 and processor 316 are analogous to local memory 214 and processor 216, respectively, as shown in Figure 2.

[0027] Receiving node 306 may further include an interrupt controller 318. Interrupt controller 318 is analogous to interrupt controller 218, shown in Figure 2. Interrupt controller 318 may be coupled to memory fabric interface 308 as well as to interrupt handler 320. System 300 may further contain an interrupt handler 320. Interrupt handler 320 is analogous to interrupt handler 220, shown in Figure 2. As used herein, an interrupt handler refers to a set of instructions executable to prioritize and respond to interrupts occurring on a system. Interrupt handler 320 may be located on a processor, such as processor 316. Processor 316 is analogous to processor 216 shown in Figure 2.

[0028] Interrupt handler 320 may be coupled to interrupt controller 318 such that interrupt handler 320 is activated upon assertion of an interrupt on processor 316 by interrupt controller 318. Interrupt handler 320 may further be coupled to virtual interrupt queue 324. As used herein, a virtual interrupt queue refers to a stored series of virtual interrupts. Sending node 304 may use the enqueue atomic memory operation over the memory fabric 302 to insert a handle into a message queue, such as message queue 326. Message queue 326 may correspond to a ring buffer, such as ring buffer 212 shown in Figure 2. After inserting a handle into message queue 326, the sending node 304 may use the enqueue atomic memory operation over the memory fabric 302 to insert the queue handle corresponding to the message queue into the virtual interrupt queue 324. The handle inserted into the virtual interrupt queue 324 may correspond to a message queue 326 and may serve to identify the message queue 326. The handle may be a memory address of the message queue, an index corresponding to the message queue, or another identifier that serves to specify the specific message queue. Message queue 326 may in turn correspond to ring buffer 212, shown in Figure 2.

[0029] Once enqueue atomic handler 310 has inserted a handle into interrupt queue 324, enqueue atomic handler 310 may generate an interrupt to interrupt controller 318. Interrupt handler 320 may then activate in response to the interrupt generated to interrupt controller 318. Interrupt handler 320 may then consult interrupt queue 324 to locate the handle stored within the interrupt queue 324. In some instances, the handle stored in interrupt queue 324 may be the address of a message queue, such as message queue 326. Message queue 326 may correspond to a ring buffer, such as ring buffer 212 as shown in Figure 2. In such instances, interrupt handler 320 may then service the message queue 326. In some instances, servicing the message queue may include resolving the event that caused generation of the original interrupt.
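The two-level flow described for Figure 3 can be illustrated with a short simulation: the sender first places a message handle in a message queue, then places that queue's handle in the virtual interrupt queue, and the interrupt handler uses the queue handle to find which message queue to service. In this sketch, plain `collections.deque` objects replace the atomically updated ring buffers, a direct function call replaces the hardware interrupt, and the identifier `"queue-326"` and the function names are hypothetical.

```python
from collections import deque

message_queues = {"queue-326": deque()}   # hypothetical queue identifier
virtual_interrupt_queue = deque()
serviced = []                             # record of (queue handle, message handle)


def interrupt_handler():
    """Receiver side: consult the virtual interrupt queue, then service each named queue."""
    while virtual_interrupt_queue:
        queue_id = virtual_interrupt_queue.popleft()   # locate the queue handle
        queue = message_queues[queue_id]
        while queue:                                   # service the message queue
            serviced.append((queue_id, queue.popleft()))


def send(queue_id, message_handle):
    """Sender side: enqueue the message handle, then the queue handle."""
    message_queues[queue_id].append(message_handle)
    virtual_interrupt_queue.append(queue_id)
    interrupt_handler()   # stands in for the interrupt raised after the enqueue


send("queue-326", "msg-1")
send("queue-326", "msg-2")
```

Keeping queue handles in a separate queue means the handler does not have to scan every message queue to discover which one has work pending.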

[0030] Figure 4 is a block diagram of an example system 400 for atomic queueing consistent with the present disclosure. System 400 may be used for buffer management and may include multiple components, as illustrated in Figure 4.

[0031] For example, system 400 may include a memory fabric 402. Memory fabric 402 is analogous to memory fabric 102, 202, and 302, shown in Figures 1, 2, and 3, respectively. Memory fabric 402 may consist of connected storage, a connected network, and/or connected processing. In some instances, memory fabric 402 may connect a plurality of nodes to a pool of global, shared memory. Memory fabric 402 may execute an atomic operation consistent with the present disclosure.

[0032] System 400 may further include a sending node 404. Sending node 404 is analogous to sending nodes 104, 204, and 304, shown in Figures 1, 2, and 3, respectively. Sending node 404 may be composed of a plurality of processor cores and local memory. Sending node 404 may act like a computer. For instance, sending node 404 may have its own operating system and/or a local memory domain. Sending node 404 may further have its own power supply and/or its own fault domain. As shown in Figure 4, sending node 404 may be coupled to the memory fabric 402. Although a single sending node 404 is shown in Figure 4, it is contemplated that a plurality of sending nodes may be included within system 400, and a plurality of sending nodes 404 may be coupled to memory fabric 402.

[0033] System 400 may further include a receiving node 406. Receiving node 406 is analogous to receiving nodes 106, 206, and 306, shown in Figures 1, 2 and 3, respectively. As shown in Figure 4, receiving node 406 may include multiple components. For instance, receiving node 406 may include a memory fabric interface 408. Memory fabric interface 408 is analogous to memory fabric interfaces 208 and 308, shown in Figures 2 and 3, respectively. Memory fabric interface 408 may include an enqueue atomic handler 410. Enqueue atomic handler 410 is analogous to enqueue atomic handler 310, shown in Figure 3.

[0034] Receiving node 406 may further contain a local memory 414 and a processor 416. Local memory 414 and processor 416 are analogous to local memories 214 and 314, and processors 216 and 316, respectively, as shown in Figures 2 and 3. As shown in Figure 4, local memory 414 may contain a ring buffer 412. Ring buffer 412 is analogous to ring buffers 212 and 312, shown in Figures 2 and 3, respectively. Local memory 414 may further contain a reserve ring buffer 428 and a buffer array 430. The reserve ring buffer 428 is analogous to ring buffer 412 and may be the same size as ring buffer 412 or it may be a different size. However, reserve ring buffer 428 and buffer array 430 are to have the same size. Buffer array 430 includes a plurality of buffers 432-1, 432-2 ... 432-N.

[0035] To reserve a buffer using system 400, receiving node 406 may set the reserve ring buffer 428 up to allow the atomic memory operation to proceed on the reserve ring buffer 428. Receiving node 406 may then pre-allocate the plurality of buffers 432-1 through 432-N and populate the buffer array 430 with pointers to the pre-allocated buffers 432-1 through 432-N. Pre-allocation may be done using a malloc library, allocating on a stack, or splitting a large memory region into smaller, equally-sized pieces.

[0036] Once the buffer array 430 has been populated, the sending node 404 may use the enqueue atomic memory operation over the memory fabric 402 to insert a sender identification into the reserve ring buffer 428. Upon insertion, a unique index may be returned by the atomic memory operation to the sending node 404. The index may be used to point to a slot position in the buffer array 430. The slot position in the buffer array 430 indicated by the index may further contain a pointer that points to the reserved buffer.
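The reservation scheme can be sketched as follows: the receiver pre-allocates a set of buffers, keeps references to them in a buffer array, and pairs that array with a reserve ring of the same size; a sender that enqueues its identifier gets back a unique index that selects its buffer. The class name `BufferReservation`, the use of `bytearray` objects in place of pointers, and the choice to return `None` once every slot is taken are assumptions made for illustration; slot recycling is not modelled.

```python
class BufferReservation:
    """Toy model of a reserve ring paired with a same-sized buffer array."""

    def __init__(self, count, buffer_size):
        # Pre-allocate the buffers; the list plays the role of the buffer array of pointers.
        self.buffer_array = [bytearray(buffer_size) for _ in range(count)]
        # The reserve ring has the same number of slots as the buffer array.
        self.reserve_ring = [None] * count
        self.tail = 0

    def reserve(self, sender_id):
        """Insert the sender identification; return the unique slot index, or None if none is free."""
        if self.tail >= len(self.reserve_ring):
            return None               # assumption: no slot recycling in this sketch
        index = self.tail
        self.reserve_ring[index] = sender_id
        self.tail += 1
        return index


pool = BufferReservation(count=2, buffer_size=64)
a = pool.reserve("node-A")
b = pool.reserve("node-B")
c = pool.reserve("node-C")              # no buffer left for a third sender
pool.buffer_array[a][:5] = b"hello"     # the returned index selects the reserved buffer
```

Since each successful insertion returns a distinct index, two senders can never be handed the same buffer, without any further coordination between them.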

[0037] Figure 5 is a block diagram of an example method 540 for atomic queueing consistent with the present disclosure. At 542, method 540 may include receiving an enqueue atomic memory operation. The memory operation may be received on the receiving node shown in Figures 1-4 and may be sent by the sending node shown in Figures 1-4.

[0038] At 544, method 540 may include identifying a memory address. In some embodiments, identifying a memory address may include identifying the memory address of the atomic memory operation. At 546, method 540 may include identifying a control word. In some embodiments, the control word may be identified using the memory address identified at 544. In such embodiments, the control word may be thought of as a memory word. In some embodiments, identifying a control word may include identifying a ring buffer among a plurality of ring buffers, wherein the ring buffer will be used for storing a message.

[0039] At 548, method 540 may include retrieving a tail index. In some embodiments, the tail index may be retrieved from the control word identified at 546.

[0040] At 550, method 540 may include inserting a memory word of the atomic memory operation into a slot within the ring buffer. In some embodiments, inserting the memory word into a slot within the ring buffer may include storing a message handle in the slot.

[0041] At 552, method 540 may include updating the tail index. In some instances, the tail index may be updated by advancing the ring buffer. Advancing the ring buffer may include advancing a pointer to point to the next open and available slot in the ring buffer. As the ring buffer has a fixed size, advancing the ring buffer may further include increasing the tail index by one. If the increased tail index exceeds the size of the ring buffer, advancing the ring buffer may include resetting the tail index to zero or one. The size of the ring buffer may be extracted from a control word.
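The wrap-around rule for the tail index can be written out directly. The sketch below assumes zero-based slot indices, so the index wraps to zero once it would pass slot N-1 (the disclosure also allows resetting to one, which would suit one-based indexing). The second function shows an equivalent form that applies only when N is a power of two; both function names are hypothetical.

```python
def advance_tail(tail, size):
    """Increase the tail index by one, wrapping to zero past the last slot."""
    tail += 1
    if tail >= size:
        tail = 0
    return tail


def advance_tail_masked(tail, size):
    """Equivalent form when size is a power of two: mask instead of compare."""
    return (tail + 1) & (size - 1)


trace = []
tail = 0
for _ in range(5):
    tail = advance_tail(tail, 4)   # N = 4 slots, indices 0..3
    trace.append(tail)
# trace is [1, 2, 3, 0, 1]
```

The masked form is one reason a power-of-two ring size is convenient in hardware: the wrap needs no comparison, only a bitwise AND.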

[0042] Method 540 may further include returning the result of the insertion into the ring buffer to the sending node. In some instances, returning the result of the insertion may include returning an index showing the location within the ring buffer of the inserted message. In other embodiments, returning the results of the insertion into the ring buffer may include returning an error message. The error message may indicate that the ring buffer is full and that insertion is therefore unable to proceed. In such instances, method 540 may include checking whether the ring buffer is full, so as to know if an error message is to be returned. Checking whether the ring buffer is full may include extracting a head index from the control word associated with the ring buffer and comparing it with the tail index of the ring buffer.
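A full check that compares the head and tail indices, together with the two possible results (a slot index or an error), might look like the sketch below. The disclosure does not say which head/tail convention is used; this example assumes the common one in which one slot is left unused, so the ring is full when advancing the tail would make it equal to the head. The names `try_enqueue` and `RING_FULL` are hypothetical.

```python
RING_FULL = "ring buffer full"   # hypothetical error value


def try_enqueue(ring, word):
    """Return (index, None) on success, or (None, RING_FULL) if the ring is full."""
    size = len(ring["slots"])
    next_tail = (ring["tail"] + 1) % size
    if next_tail == ring["head"]:        # assumed convention: one slot stays empty
        return None, RING_FULL
    index = ring["tail"]
    ring["slots"][index] = word
    ring["tail"] = next_tail
    return index, None


ring = {"head": 0, "tail": 0, "slots": [None] * 4}
results = [try_enqueue(ring, f"msg-{i}") for i in range(4)]
# The first three insertions succeed; the fourth reports a full ring.

ring["head"] = 1                         # receiver consumes one message
retry = try_enqueue(ring, "msg-3")       # now succeeds at slot 3
```

Returning the index on success lets the sender know exactly where its word landed, which the buffer reservation scheme of Figure 4 relies on.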

[0043] In the foregoing detailed description of the present disclosure, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration how examples of the disclosure may be practiced. These examples are described in sufficient detail to enable those of ordinary skill in the art to practice the examples of this disclosure, and it is to be understood that other examples may be utilized and that process, electrical, and/or structural changes may be made without departing from the scope of the present disclosure.

[0044] The figures herein follow a numbering convention in which the first digit corresponds to the drawing figure number and the remaining digits identify an element or component in the drawing. Elements shown in the various figures herein may be added, exchanged, and/or eliminated so as to provide a number of additional examples of the present disclosure. In addition, the proportion and the relative scale of the elements provided in the figures are intended to illustrate the examples of the present disclosure, and should not be taken in a limiting sense. Further, as used herein, "a number of" an element and/or feature can refer to one or more of such elements and/or features.

[0045] As used herein, "logic" is an alternative or additional processing resource to perform a particular action and/or function, etc., described herein, which includes hardware, e.g., various forms of transistor logic, application specific integrated circuits (ASICs), etc., as opposed to computer executable instructions, e.g., software, firmware, etc., stored in memory and executable by a processor.

Claims

What is claimed:
1. An atomic memory operation system comprising:
a memory fabric to couple a receiving node and a sending node, wherein the memory fabric is to execute an atomic memory operation by:
identifying a control word of a ring buffer using a memory address of the memory operation;
retrieving a tail index from the control word, wherein the tail index indicates a position in a ring buffer;
inserting a memory word in the ring buffer at the position indicated by the tail index, wherein the memory word corresponds to the memory operation; and
updating the tail index in the control word to point to a next slot in the ring buffer.
2. The system of claim 1, wherein updating the tail index in the control word to point to a next slot in the ring buffer is based on a ring size in the control word.
3. The system of claim 1, wherein the memory fabric is to execute the atomic memory operation by:
returning the results of the insertion into the ring buffer, wherein returning the results of the insertion includes returning an index in the ring buffer showing where the message was inserted.
4. The system of claim 3, wherein returning the results of the insertion into the ring buffer includes returning an error message indicating that the ring buffer is full and the insertion may not proceed.
5. The system of claim 1, wherein a plurality of sending nodes insert a plurality of memory words into the ring buffer.
6. The system of claim , further comprising the memory fabric to execute the atomic memory operation by: generating an interrupt via an interrupt controller on the receiving node; and storing a handle corresponding to the memory address, wherein the sending node inserts, via the memory fabric, a message queue identifier corresponding to the ring buffer into an interrupt queue.
7. The system of claim 1, further comprising:
an interrupt handler, wherein the interrupt handler:
activates responsive to insertion of a handle corresponding to a memory address into an interrupt queue;
alerts the system upon receipt of an interrupt;
consults the interrupt queue;
locates the address of a message queue corresponding to the ring buffer stored in the interrupt queue; and
services the message queue, wherein servicing the message queue includes resolving an event causing an interrupt to be generated.
8. The system of claim 1, further comprising the memory fabric to execute the atomic memory operation by:
generating an interrupt, wherein:
the interrupt is generated to the receiving node; and
the interrupt is generated in response to the updating of the tail index.
9. A system comprising:
a memory fabric coupling a receiving node and a sending node, wherein the receiving node, via the memory fabric, is to execute an atomic memory operation by:
defining a ring buffer in a memory fabric interface to receive a message from the sending node;
defining a reserve buffer ring and a buffer array;
configuring the reserve buffer ring to allow the atomic memory operation to proceed on the reserve buffer ring;
pre-allocating a plurality of buffers; and
populating the buffer array with pointers to the pre-allocated buffers.
10. The system of claim 9, further comprising the memory fabric to execute the atomic memory operation by:
inserting a sender identification into the reserve ring; and
returning a unique index to the sender, wherein:
the unique index reserves a buffer within the buffer array; and
the unique index points to the reserved buffer within the buffer array.
11. A method for atomic memory operation, comprising:
receiving, by a memory fabric interface on a receiving node, a memory operation;
identifying a memory address of the memory operation;
using the identified memory address to identify a control word;
retrieving a tail index from the control word;
inserting the control word of the memory operation into a slot within a ring buffer; and
updating the tail index of the control word by advancing the ring buffer.
12. The method of claim 11, further comprising:
returning, via the memory fabric interface, the result of the insertion into the ring buffer to a sending node.
13. The method of claim 11, wherein inserting the control word of the memory operation into the ring buffer includes:
identifying a ring buffer among a plurality of ring buffers on the receiving node using the address of the control word.
14. The method of claim 12, wherein returning the results of the insertion into the ring buffer includes returning an index in the ring buffer showing where the message was inserted.
15. The method of claim 12, wherein returning the results of the insertion into the ring buffer includes returning an error message indicating that the ring buffer is full and the insertion may not proceed.
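The method of claims 11 to 15 can be sketched in software under several assumptions. The names `FabricInterface`, `ControlWord`, `atomic_enqueue`, and the `RING_FULL` error value are hypothetical; the sketch stores the operation's payload in the slot; and a lock stands in for the atomicity that the memory fabric interface would provide.

```python
import threading
from typing import Any, Dict, List

RING_FULL = -1  # hypothetical error value returned when the ring is full


class ControlWord:
    """Hypothetical control word holding the ring's head and tail indices."""

    def __init__(self) -> None:
        self.head = 0  # count of entries consumed by the receiver
        self.tail = 0  # count of entries inserted by senders


class RingBuffer:
    def __init__(self, size: int) -> None:
        self.size = size
        self.control = ControlWord()
        self.slots: List[Any] = [None] * size


class FabricInterface:
    """Hypothetical memory fabric interface on the receiving node."""

    def __init__(self) -> None:
        # Several ring buffers, each identified by its control word's address.
        self.rings: Dict[int, RingBuffer] = {}
        self._lock = threading.Lock()  # models atomic execution

    def define_ring(self, address: int, size: int) -> None:
        self.rings[address] = RingBuffer(size)

    def atomic_enqueue(self, address: int, payload: Any) -> int:
        """Insert payload into the ring at `address`.

        Returns the slot index used, or RING_FULL if no slot is free.
        """
        with self._lock:
            ring = self.rings[address]   # address identifies the control word
            ctrl = ring.control
            tail = ctrl.tail             # retrieve the tail index
            if tail - ctrl.head >= ring.size:
                return RING_FULL         # insertion may not proceed
            slot = tail % ring.size
            ring.slots[slot] = payload   # insert into the slot
            ctrl.tail = tail + 1         # advance the ring buffer
            return slot
```

A sending node would treat a non-negative return value as the index at which its message was inserted, and `RING_FULL` as the signal to retry later or report the failure.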
PCT/US2016/020719 2016-03-03 2016-03-03 Atomic memory operation WO2017151138A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2016/020719 WO2017151138A1 (en) 2016-03-03 2016-03-03 Atomic memory operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2016/020719 WO2017151138A1 (en) 2016-03-03 2016-03-03 Atomic memory operation

Publications (1)

Publication Number Publication Date
WO2017151138A1 true WO2017151138A1 (en) 2017-09-08

Family

ID=59744283

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/020719 WO2017151138A1 (en) 2016-03-03 2016-03-03 Atomic memory operation

Country Status (1)

Country Link
WO (1) WO2017151138A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030061417A1 (en) * 2001-09-24 2003-03-27 International Business Machines Corporation Infiniband work and completion queue management via head and tail circular buffers with indirect work queue entries
US20050262215A1 (en) * 2004-04-30 2005-11-24 Kirov Margarit P Buffering enterprise messages
US20060143373A1 (en) * 2004-12-28 2006-06-29 Sanjeev Jain Processor having content addressable memory for block-based queue structures
US20100088424A1 (en) * 2008-10-06 2010-04-08 Gidon Gershinsky Efficient Buffer Utilization in a Computer Network-Based Messaging System
US9003131B1 (en) * 2013-03-27 2015-04-07 Parallels IP Holdings GmbH Method and system for maintaining context event logs without locking in virtual machine

Similar Documents

Publication Publication Date Title
US7757232B2 (en) Method and apparatus for implementing work request lists
EP0106670A2 (en) CPU with multiple execution units
EP0116047B1 (en) Multiplexed first-in, first-out queues
DE202010017668U1 (en) Command and interrupt grouping on a data storage device
EP0258453B1 (en) Instruction prefetch control apparatus
DE112010004187T5 (en) Method and system for processing network events
EP0118446B1 (en) First-in, first-out (fifo) memory configuration for queue storage
US6591342B1 (en) Memory disambiguation for large instruction windows
US20080028116A1 (en) Event Queue in a Logical Partition
CN106294229B (en) It is read while in serial interface memory and write-in storage operation
US8996611B2 (en) Parallel serialization of request processing
US7827536B2 (en) Critical path profiling of threaded programs
US7363434B2 (en) Method, system, and computer-readable medium for updating memory devices in a multi-processor computer system
US8745292B2 (en) System and method for routing I/O expansion requests and responses in a PCIE architecture
US8375007B2 (en) Status tool to expose metadata read and write queues
TW201032077A (en) Devices, systems, and methods for communicating pattern matching results of a parallel pattern search engine
US8762790B2 (en) Enhanced dump data collection from hardware fail modes
US8707331B2 (en) RDMA (remote direct memory access) data transfer in a virtual environment
KR20090094256A (en) System and method for multicore communication processing
JPH08297626A (en) Network interface and method for processing packet in network interface
US7219198B2 (en) Facilitating communication within shared memory environments using lock-free queues
US4354232A (en) Cache memory command buffer circuit
US5555396A (en) Hierarchical queuing in a system architecture for improved message passing and process synchronization
US20050138230A1 (en) Method, apparatus and program product for low latency I/O adapter queuing in a computer system
US8156503B2 (en) System, method and computer program product for accessing a memory space allocated to a virtual machine

Legal Events

Date Code Title Description
NENP Non-entry into the national phase in:

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16892885

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16892885

Country of ref document: EP

Kind code of ref document: A1