WO2016122602A1 - Systems and methods for sharing non-volatile memory between multiple access models - Google Patents

Systems and methods for sharing non-volatile memory between multiple access models

Info

Publication number
WO2016122602A1
Authority
WO
WIPO (PCT)
Prior art keywords
access request
memory modules
processor
request
memory
Application number
PCT/US2015/013795
Other languages
French (fr)
Inventor
Gregg B. Lesartre
Derek A. Sherlock
Siamak Tavallaei
Original Assignee
Hewlett Packard Enterprise Development Lp
Application filed by Hewlett Packard Enterprise Development Lp filed Critical Hewlett Packard Enterprise Development Lp
Priority to PCT/US2015/013795 priority Critical patent/WO2016122602A1/en
Publication of WO2016122602A1 publication Critical patent/WO2016122602A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2211/00 Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
    • G06F2211/10 Indexing scheme relating to G06F11/10
    • G06F2211/1002 Indexing scheme relating to G06F11/1076
    • G06F2211/1088 Scrubbing in RAID systems with parity


Abstract

A computing system including a processor and a memory controller coupled to a plurality of remote memory modules, which implement a redundancy protocol and support a direct access request. The memory controller is to receive a block access request from the processor and, based on the redundancy model, reformat the block access request into a direct access request and transmit the request to the plurality of remote memory modules. The memory controller reformats the block access request to maintain data consistency in accordance with the redundancy protocol.

Description

SYSTEMS AND METHODS FOR SHARING
NON-VOLATILE MEMORY BETWEEN MULTIPLE ACCESS MODELS
BACKGROUND
[0001] Current data storage devices often include fault tolerance to ensure that data is not lost in the event of a device error or failure. An example of fault tolerance provided to current data storage devices is a redundant array of independent disks. A redundant array of independent disks (RAID) is a storage technology that controls multiple disk drives and provides fault tolerance by storing data with redundancy. RAID technology can store data with redundancy in a variety of ways. Examples of redundant data storage methods include duplicating the data and storing it in multiple locations, and adding bits that store calculated error recovery information.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] For a detailed description of various examples, reference will now be made to the accompanying drawings in which:
[0003] FIG. 1a shows a block diagram of a system including multiple computing nodes and associated memory controllers, which are configured to access a remote, redundant memory in accordance with various examples of the present disclosure;
[0004] FIG. 1b shows a block diagram of a computing system in accordance with various examples of the present disclosure;
[0005] FIGS. 2a and 2b show flow charts of various method steps in accordance with various examples of the present disclosure; and
[0006] FIG. 3 shows another block diagram of a system for accessing remote, redundant memory in accordance with various examples of the present disclosure.
DETAILED DESCRIPTION
[0007] Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, different companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms "including" and "comprising" are used in an open-ended fashion, and thus should be interpreted to mean "including, but not limited to... ." Also, the term "couple" or "couples" is intended to mean either an indirect or direct wired or wireless connection. Thus, if a first device couples to a second device, that connection may be through a direct connection or through an indirect connection via other devices and connections.
[0008] Techniques described herein relate generally to redundant data storage. More specifically, techniques described herein relate to redundant data storage in persistent main memory. Main memory is primary storage that directly or indirectly serves a central processing unit (CPU) and is directly accessible to the CPU.
[0009] New system architectures take advantage of dense, persistent, low latency memory devices to provide for large storage arrays accessed directly by a processor and cached in a processor's caches. New solid state persistent memory devices with densities like flash memory and access times like DRAM memories allow the design of systems that treat this memory as storage, but access it as memory, i.e., through direct memory access, allowing the solid state persistent memory devices to be used as persistent main memory, also known as the random-access method. To protect data stored in this persistent main memory, capabilities are integrated into the paths to access this memory which, in addition to routing write requests to memory, also route the data in a mirrored or RAID fashion to multiple storage locations in separate persistent memory devices. This routing ensures data recovery in the case of a persistent memory device failure while maintaining current programming paradigms.
[0010] By adding fault tolerance functionality in the main memory access path that operates on small granularities of data, for example individual cache lines, at main memory access speeds, this type of data protection (e.g., data duplication or RAIDing) can be extended to direct memory access, such as persistent main memory, without awareness of the protection mechanism at a software application level.
[0011] However, entry-level or legacy systems or applications may not be configured to take advantage of direct memory access, and instead rely on a traditional block storage model utilizing PCI Express (PCIe), Serial Attached SCSI (SAS), or other input/output (IO) interfaces, which do not utilize direct memory access. Directly accessed storage gains performance benefit not only from its short latency, but also from the absence of block-transfer IO handlers. Directly accessed storage also gains performance benefit from the efficiencies of moving only data that is actually modified or requested, rather than the entire block. Thus, it is advantageous to implement a direct access model (random-access) to a redundant, persistent main memory, while also enabling IO block access, for example for sequential access and transfer of data.
[0012] Disclosed herein are examples of methods and systems to provide access to such redundant, persistent memory modules implementing a direct access model (random-access) to both applications utilizing a single-transaction direct access model as well as those making IO block accesses (sequential access).
[0013] FIG. 1a is a block diagram of a computing system 100 including fault tolerance and permitting both direct access and IO block access. In an example, computing system 100 is a server cluster. The computing system 100 includes a number of nodes, such as computing node 102. In a further example, computing system 100 may also include a number of remote memories 110. The remote memories 110 form a memory pool, which is a collection of memory, such as a collection of memory devices, for storing a large amount of data. The computing nodes 102 are communicably coupled to each other through a network 104. The computing system 100 can include several computing nodes, such as several tens or even thousands of computing nodes.
[0014] The computing nodes 102 include a Central Processing Unit (CPU) 106 to execute stored instructions. The CPU 106 can be a single core processor, a multicore processor, or any other suitable processor. In an example, a computing node 102 includes a single CPU. In another example, a computing node 102 includes multiple CPUs, such as two CPUs, three CPUs, or more. Applications executing on the CPU 106 may generate memory requests in the form of a direct access request 114 or an IO block access request 116, which will be explained in further detail below. Different CPUs 106 may offer differing capabilities, such as differing numbers of processor cores, or different special processing units. Some CPUs 106 may be best suited to quickly executing many operations on individual data items. Other CPUs 106 may be lower cost, less capable processors that are appropriate for managing data in blocks, such as would be used for backing up data, duplicating data for other processors, migrating data, or other such data service operations.
[0015] The computing node 102 includes a main memory, which is not shown in FIG. 1a for simplicity. The main memory may include volatile dynamic random access memory (DRAM) with battery backup, non-volatile phase change random access memory (PCRAM), spin transfer torque-magnetoresistive random access memory (STTMRAM), resistive random access memory (reRAM), memristor, FLASH, or other types of memory devices. For example, the main memory can be solid state, persistent, dense, fast memory. Fast memory can be memory having an access time similar to DRAM memory, for example.
[0016] Computing node 102 further includes a memory controller 108. The memory controller 108 communicates with local main memory and controls access to the main memory by the CPU 106. Persistent memory is non-volatile storage, such as storage on a storage device. In an example, the memory controller 108 is a RAID memory controller.
[0017] Computing system 100 also includes remote memory 110. Remote memory 110 can be persistent memory, and may be similar to main memory, although it is not located local to any computing node 102. Remote memory 110 is communicably coupled to the computing nodes 102 through a network 104, such as a server cluster fabric. Remote memory 110 is remote and separate from main memory. For example, remote memory 110 can be physically separate from local main memory. In an example, remote memory 110 can be persistent memory divided into regions or ranges of memory address spaces. Each region can be assigned to a computing node 102. Each region can additionally be accessed by computing nodes 102 other than the assigned computing node 102. In the event of a failure of the assigned computing node 102, another computing node 102 can access the region of remote memory 110, or the region can be reassigned in order to preserve access to the data in remote memory 110 by other computing nodes 102. Remote memory 110 may also be assigned to multiple computing nodes 102 to allow shared access, including to use remote memory 110 as a communication channel, for example.
[0018] Remote memory 110 includes redundant data 112. Remote memory 110 thus provides a fault tolerance capability (i.e., providing a system and/or method of data recovery in order to ensure data integrity) to persistent main memory 110 via redundant data 112. When a memory controller 108 receives a write operation, to ensure the integrity of the data, the memory controller 108 will generate a transaction to the remote memory 110, resulting in generation and storage of redundant data 112. In some cases, such as a RAID 1 array of remote memory 110, redundant data 112 represents a copy of the data subject of a write operation. In other cases, such as a RAID 5 array of remote memory 110, redundant data 112 may represent a combination of the data subject of a write operation for some remote memory 110 and parity data for other remote memory 110. The scope of the present disclosure is not limited by the particular type of redundancy array employed by remote memory 110. By storing redundant data 112 to the remote memory 110, the data is effectively spread across multiple devices such that the data can be recovered when a device 102, or even multiple devices 102, fails. The redundant data 112 stored by the remote memory(s) 110 can be accessed by the computing nodes 102. The redundant data 112 stored by the remote memory(s) 110 can also be accessed by additional computing nodes 102, such as in the event of a failure of a computing node 102 or data corruption, or if multiple compute nodes 102 are allowed simultaneous access.
[0019] As explained above, in some examples an application executing on the CPU 106 generates a direct access request 114 and transmits this request to the memory controller 108. Similarly, another application executing on the CPU 106 or on a separate instantiation of CPU 106 generates an IO block access request 116 and transmits this request to the memory controller 108. Typically, the granularity or size of a direct access request 114 is related to the size of a cacheline. However, the granularity or size of an IO block request 116 is normally larger than the granularity or size of the direct access request 114. In one example, the granularity or size of the direct access request 114 is one cacheline and the granularity or size of the IO block request 116 is several or more cachelines.
[0020] The remote memory 110, as explained previously, implements a redundancy model that is based on the direct access model, which provides numerous performance benefits. However, since some applications executing on a CPU 106 may generate IO block access memory requests, the memory controller 108 is configured to reformat such requests such that they conform with both the direct access model and the implemented redundancy model. For example, IO block access requests 116 received at the memory controller 108 are broken down into a granularity expected by the remote memory 110. In the example where the granularity of a direct access request 114 is a cacheline, the memory controller 108 breaks down the IO block access request 116 into a cacheline granularity. The memory controller 108 maintains the redundancy model's data consistency protocol during such reformatting such that other CPUs 106 may continue direct access through the reformatting.
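To make the break-down step above concrete, the following Python sketch splits an IO block access request into cacheline-granularity direct access requests. The request classes, the 64-byte cacheline size, and the alignment assumptions are illustrative choices, not details specified in the disclosure; in hardware the same conversion would be performed by the memory controller 108 rather than by software.

    from dataclasses import dataclass
    from typing import List

    CACHELINE_BYTES = 64  # assumed direct-access granularity

    @dataclass
    class BlockAccessRequest:      # hypothetical stand-in for an IO block request (116)
        op: str                    # "read" or "write"
        address: int               # starting byte address (assumed cacheline-aligned)
        length: int                # block length in bytes (assumed multiple of a cacheline)
        data: bytes = b""          # payload for writes

    @dataclass
    class DirectAccessRequest:     # hypothetical stand-in for a direct access request (114)
        op: str
        address: int
        data: bytes = b""

    def reformat_block_request(block: BlockAccessRequest) -> List[DirectAccessRequest]:
        """Break one block request into cacheline-granularity direct accesses."""
        assert block.address % CACHELINE_BYTES == 0
        assert block.length % CACHELINE_BYTES == 0
        requests = []
        for i in range(block.length // CACHELINE_BYTES):
            addr = block.address + i * CACHELINE_BYTES
            if block.op == "write":
                chunk = block.data[i * CACHELINE_BYTES:(i + 1) * CACHELINE_BYTES]
                requests.append(DirectAccessRequest("write", addr, chunk))
            else:
                requests.append(DirectAccessRequest("read", addr))
        return requests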
[0021] The memory controller 108 may reformat the IO block access request 116 in accordance with the redundancy model applied at the remote memory 110. For example, the memory controller 108 may break the IO block access request 116 down into multiple cache line access transactions. In particular, the memory controller 108 ensures that each access follows the redundancy consistency model required to provide consistency of the data and redundant data 112 (or parity) on the remote memories 110. For instance, the memory controller 108 may ensure that writes to the remote memories 110 enforce a RAID stripe lock before proceeding to modify the data, then modify the redundant data 112 (or parity data), and finally unlock the RAID stripe, thus preventing multiple active writes to the same RAID stripe from breaking the consistency of the data and redundant data 112 (or parity data) by allowing the updates to occur out of order. The above-described atomicity enforcement scheme is exemplary and other suitable schemes may be employed to prevent atomicity violations with respect to a RAID stripe or region of the remote memories 110. Further, the memory controller 108 will follow an error recovery protocol dictated by the direct access recovery model to correct encountered errors while still maintaining shared data consistency.
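The lock/modify/update-parity/unlock ordering described above can be pictured with the minimal sketch below. It assumes XOR parity with a fixed parity module (a RAID-4-like placement rather than rotating RAID 5 parity) and an invented module interface; it is an illustration of the ordering, not the controller's actual implementation.

    import threading
    from collections import defaultdict

    CACHELINE = 64  # assumed granularity in bytes

    class MemoryModule:
        """Toy stand-in for a remote persistent memory module (110)."""
        def __init__(self):
            self.lines = defaultdict(lambda: bytes(CACHELINE))
        def read(self, index):
            return self.lines[index]
        def write(self, index, data):
            self.lines[index] = data

    def xor_lines(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    class RedundantWriter:
        """Sketch of the write ordering: lock the stripe, modify the data,
        update the parity, then unlock so stripe updates cannot interleave."""
        def __init__(self, data_modules, parity_module):
            self.data = data_modules
            self.parity = parity_module
            self.stripe_locks = defaultdict(threading.Lock)

        def write_cacheline(self, line_index, new_data):
            module = self.data[line_index % len(self.data)]
            stripe = line_index // len(self.data)
            with self.stripe_locks[stripe]:              # 1. enforce the RAID stripe lock
                old_data = module.read(stripe)
                old_parity = self.parity.read(stripe)
                module.write(stripe, new_data)           # 2. modify the data
                new_parity = xor_lines(old_parity, xor_lines(old_data, new_data))
                self.parity.write(stripe, new_parity)    # 3. update the redundant/parity data
            # 4. the stripe lock is released here (unlock)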
[0022] Similarly, those applications executing on the CPU 106 that generate IO block access requests 116 also expect to receive responses in the form of a block access, rather than direct access. Thus, in some examples, the memory controller 108 receives responses from the remote memory 110 having a granularity associated with a direct access request, and re-assembles those responses into a block access response and forwards the re-assembled response to the CPU 106. The memory controller 108 may include an accumulation buffer to gather the multiple direct access completions, including data for reads, to assemble the elements of a block response before forwarding the block response to the CPU 106. The memory controller 108 may allow direct access responses to individually complete in any order from remote memories 110. The memory controller 108 may be configured to accumulate direct access responses even in the event that one, some, or all direct accesses encounter errors that result in error recovery utilizing the redundancy coherency model.
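A possible shape for the accumulation buffer described above is sketched below; the field names and the completion callback are assumptions made for illustration only, since the disclosure does not specify the buffer's interface.

    CACHELINE = 64  # assumed granularity in bytes

    class AccumulationBuffer:
        """Gathers cacheline-granularity completions, which may arrive in any
        order, and releases one block access response once all are present."""
        def __init__(self, base_address, num_cachelines, on_complete):
            self.base = base_address
            self.expected = num_cachelines
            self.pieces = {}                   # cacheline slot -> data
            self.on_complete = on_complete

        def direct_access_completion(self, address, data):
            slot = (address - self.base) // CACHELINE
            self.pieces[slot] = data           # completions accepted in any order
            if len(self.pieces) == self.expected:
                block = b"".join(self.pieces[i] for i in range(self.expected))
                self.on_complete(block)        # forward the reassembled block response

    # Example: four direct-access read completions arriving out of order.
    buf = AccumulationBuffer(0x1000, 4, on_complete=lambda blk: print(len(blk), "bytes"))
    for addr in (0x10C0, 0x1000, 0x1080, 0x1040):
        buf.direct_access_completion(addr, bytes(CACHELINE))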
[0023] As one example of a redundancy model, certain implementations may employ RAID 1, where the write data is duplicated, or mirrored, to produce an identical copy of the data. In this mirroring mode, the data is written to the remote memories 110, becoming redundant data 112. In mirroring mode, the memory controller 108 accesses the remote memories 110, and particularly regions associated with the requesting CPU 106, in response to requests by CPU 106. Similarly, in examples that employ RAID 5, the data and associated parity data are written to the remote memories 110, becoming redundant data 112. In this way, redundancy provides a safeguard against failure events for data stored in the remote memories 110, while permitting shared access to the data stored in the remote memories 110.
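In the mirroring mode, each write simply fans out to every copy, as in the short sketch below; the dictionary-backed mirrors are a simplification standing in for the remote memories 110 and are not part of the disclosure.

    def raid1_write(mirrors, address, data):
        """Duplicate the write to every mirror so each holds identical data."""
        for mirror in mirrors:          # mirrors: list of dicts, address -> bytes
            mirror[address] = data

    def raid1_read(mirrors, address):
        """Service the read from the first mirror that still holds the line."""
        for mirror in mirrors:
            if address in mirror:
                return mirror[address]
        raise LookupError("no surviving copy of address 0x%x" % address)

    mirrors = [{}, {}]                  # two mirrored remote memories
    raid1_write(mirrors, 0x2000, b"\x11" * 64)
    assert raid1_read(mirrors, 0x2000) == b"\x11" * 64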
[0024] In some examples, remote memory 110 may be rarely or less frequently accessed. An implementation may choose to occasionally access remote memory 110 to confirm that remote memory 110 remains accessible and able to provide correct data. By confirming the accessibility of remote memory 110, the integrity of the redundant data 112 is ensured. In an example, memory accesses, such as read requests, can occasionally be serviced by accessing the redundant data 112 of remote memory 110 rather than a local main memory. By occasionally servicing a memory access from remote memory 110, the system 100 can verify that remote memory 110 and redundant data 112 have not failed.
[0025] Memory controllers 108 often scrub stored data in order to detect and correct any soft errors and detect any hard errors that may have occurred during a period of infrequent access. In an example, scrubbing of redundant data 112 in remote memory 110 is supported by memory controller 108. In another example, remote memory 110 provides scrubbing of redundant data 112 without involving memory controller 108.
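One way to picture a single scrub step over an XOR-parity stripe is shown below; the stripe layout and return convention are assumptions for illustration, since the disclosure leaves the scrubbing mechanism open.

    def xor_lines(lines):
        """XOR a list of equal-length byte strings together."""
        out = bytes(len(lines[0]))
        for line in lines:
            out = bytes(a ^ b for a, b in zip(out, line))
        return out

    def scrub_stripe(data_lines, stored_parity):
        """Recompute parity from the data members and compare it with the
        stored parity; a mismatch flags a soft or hard error for repair."""
        return xor_lines(data_lines) == stored_parity

    data = [b"\x01" * 64, b"\x02" * 64, b"\x04" * 64]
    parity = xor_lines(data)
    assert scrub_stripe(data, parity)            # clean stripe
    assert not scrub_stripe(data, b"\x00" * 64)  # corrupted parity detected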
[0026] The computing system 100 can be adapted to employ other standard RAID levels. Further, it is to be understood that the block diagram of FIG. 1a is not intended to indicate that computing system 100 is to include all of the components shown in FIG. 1a in every case. Further, any number of additional components may be included within computing system 100, depending on the details of the specific implementation.
[0027] FIG. 1a also shows accesses originating from a CPU 106 of a computing node 102. However, many different entities may require access to remote memory 110, including I/O devices, application-specific accelerators, state machines and FPGAs, hardware table walkers, and the like. While the CPU 106 is a more common example, the computing node 102 is not so limited.
[0028] Interfaces between the CPU 106 and the memory controller 108 are explained broadly as being a direct access request 114 or an IO block access request 116. Examples of interfaces that may communicate direct access requests 114 include QPI, HyperTransport, AMBA, or DDR. Examples of IO interfaces communicating IO block access include PCIe, Ethernet, InfiniBand, SCSI, SAS, or SATA. These are merely exemplary. Any suitable interface may be used within the scope of the present disclosure, including proprietary interfaces.
[0029] Further, FIG. 1a and the above disclosure are not intended to restrict the functionality of the memory controller 108 solely to direct access or block access requests. For example, while reference is made to the block access model and the cacheline-oriented direct access model, the memory controller 108 could be extended to handle different access models. As noted above, direct accesses need not necessarily be cacheline sized; rather, they could be byte-sized, 32-bit word sized, and the like.
[0030] Turning briefly to FIG. 1b, an exemplary computing system 120 is shown. The computing system 120 is similar to the computing nodes 102 shown in FIG. 1a. In particular, the computing system 120 includes a processor 106 coupled to a memory controller 108. The memory controller 108 is, in turn, coupled to multiple memory modules 110, which may comprise remote memory modules 110 as described with respect to FIG. 1a.
[0031] The memory controller 108 also includes a receive request functional block 122 and a reformat request functional block 124. The receive request functional block 122 receives a block access request or transaction for memory modules 110 from the processor 106. The reformat request functional block 124 reformats the block transaction into direct access transactions that are targeted to memory modules 110. The reformat request functional block 124 may also generate a sequence of operations to issue transactions to the memory modules 110 according to an implemented redundancy consistency model.
[0032] For example, a read transaction may be issued to a first remote memory module 110 to lock a RAID stripe and access the RAID parity data, a write operation may be issued to a second remote memory module 110 to write the data, and a final write to the first remote memory module 110 may update the RAID parity and unlock the RAID stripe. In this way, the reformat request functional block 124 ensures correct operation according to the implemented redundancy model.
[0033] FIG. 2a shows a flow chart of a method 200 in accordance with various examples of the present disclosure. The method 200 begins in block 202 with a memory controller 108 receiving a block access request from a processor 106, where the block access request is directed to a plurality of remote memory modules 110. Based on a redundancy model implemented across the remote memory modules 110 (e.g., RAID 1 or RAID 5), the method 200 continues in block 204 with reformatting the block access request into direct access requests. As explained above, this is performed such that devices generating IO block access requests, for example, are able to interface with a redundant memory 110 that is implemented to receive requests using a direct access model. The method 200 continues in block 206 with transmitting the reformatted request to the plurality of remote memory modules 110. As explained above, reformatting is performed to reformat the block access request to maintain data consistency in accordance with the redundancy protocol, such that memory 110 collisions are avoided and mutually dependent accesses are not processed in a way that results in an atomicity violation.
[0034] FIG. 2b shows an additional flow chart of a method 210 in accordance with various examples of the present disclosure. The method steps embodied in method 210 may be performed in conjunction with or in addition to those method steps described above with respect to method 200. The method 210 may, for example, include the memory module 110 receiving concurrent direct access requests for a particular region of the memory modules 110 from multiple processors 106 through a memory controller 108 as shown in block 212 and, as in block 214, implementing an atomicity enforcement scheme that forces write accesses to that region of the memory modules 110 to occur in a correct order as determined by the redundancy model.
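Block 214 leaves the atomicity enforcement scheme open; one possible scheme, sketched below under that assumption, grants writers access to a region strictly in arrival order using a ticket lock, so concurrent updates to the same region cannot complete out of order.

    import threading

    class RegionOrderEnforcer:
        """Per-region ticket lock: writers proceed one at a time, in the order
        their requests arrived (one assumed notion of 'correct order')."""
        def __init__(self):
            self.next_ticket = 0
            self.now_serving = 0
            self.cv = threading.Condition()

        def acquire(self):
            with self.cv:
                ticket = self.next_ticket
                self.next_ticket += 1
                while self.now_serving != ticket:
                    self.cv.wait()

        def release(self):
            with self.cv:
                self.now_serving += 1
                self.cv.notify_all()

    # Usage: one enforcer per protected region of the memory modules.
    region_lock = RegionOrderEnforcer()
    region_lock.acquire()
    # ... perform the data write and redundant-data update for this region ...
    region_lock.release()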
[0035] The method 210 may continue in block 216 with the memory controller 108 receiving a response from the memory modules 110 as a result of its transmitting the request reformatted as a direct access request and, in block 218, with reassembling the response into a block access response in an associated accumulation buffer. The block access response is then returned to the requesting processor 106.
[0036] Still further, the method 210 can continue in block 220 with the memory controller 108 identifying an error condition during reassembly of the response (e.g., as in block 218) and correcting the error condition according to the redundancy model implemented by the memory controller 108 and memory modules 110 (e.g., accessing a duplicate of the data as in RAID 1 or correcting using parity data as in RAID 5). The method 210 also may continue in block 222 with reformatting the block access request into a request having a granularity that corresponds to a direct access request, which may be, for example, a cache line granularity or size.
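The RAID 5 correction path mentioned for block 220 can be illustrated with a small XOR reconstruction, assuming the controller can still read the parity and the surviving data members; the function below is a sketch of that idea, not the recovery protocol claimed here.

    def reconstruct_cacheline(surviving_data, parity):
        """Rebuild a cacheline lost to a failed module by XOR-ing the parity
        with the cachelines read from the surviving data modules."""
        rebuilt = parity
        for line in surviving_data:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, line))
        return rebuilt

    # Example: three data members plus parity; member d1 is lost and rebuilt.
    d0, d1, d2 = b"\x10" * 64, b"\x22" * 64, b"\x0c" * 64
    parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))
    assert reconstruct_cacheline([d0, d2], parity) == d1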
[0037] FIG. 3 shows another example of a system 300 to implement fault tolerant memory 110 access that permits both direct access and IO block access. The system 300 may include at least one computing device that is capable of accessing multiple remote memories. The system 300 may be similar to the computing node 102 of FIG. 1a or the computing system 120 of FIG. 1b. In the example shown in FIG. 3, the system 300 includes a processor 302 and a computer-readable storage medium 304. Although the following description refers to a single processor and a single computer-readable storage medium, systems having multiple processors, multiple computer-readable storage mediums, or both are within the scope of the present disclosure. In such examples, instructions may be distributed (e.g., stored) across multiple computer-readable storage mediums and the instructions may be distributed across (e.g., executed by) multiple processors.
[0038] The processor 302 may be one or more central processing units (CPUs), microprocessors, and/or other hardware devices suitable for retrieval and execution of instructions stored in machine-readable storage medium 304. In the particular embodiment shown in FIG. 3, processor 302 may fetch, decode, and execute instructions 306, 308 to perform fault tolerant memory access that permits both direct access and IO block access. As an alternative or in addition to retrieving and executing instructions, processor 302 may include one or more electronic circuits comprising a number of electronic components for performing the functionality of one or more of the instructions in computer-readable storage medium 304. With respect to the executable instruction representations (e.g., boxes) described and shown herein, it should be understood that part or all of the executable instructions and/or electronic circuits included within one box may, in alternate embodiments, be included in a different box shown in the figures or in a different box not shown.
[0039] The computer-readable storage medium 304 may be any electronic, magnetic, optical, or other physical storage device that stores executable instructions. Thus, the computer-readable storage medium 304 may be, for example, Random Access Memory (RAM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a storage drive, an optical disc, and the like. The computer-readable storage medium 304 may be disposed within system 300, as shown in FIG. 3. In this situation, the executable instructions may be "installed" on the system 300. Alternatively, the computer-readable storage medium 304 may be a portable, external or remote storage medium, for example, that allows system 300 to download the instructions from the portable/external/remote storage medium. In this situation, the executable instructions may be part of an "installation package". As described herein, the computer-readable storage medium 304 may be encoded with executable instructions to perform fault tolerant memory access that permits both direct access and IO block access.
[0040] Referring to FIG. 3, memory access receiving instructions 306, when executed by a processor (e.g., 302), may cause system 300 to receive a block access request or transaction for remote, redundant memory modules (e.g., 110). The memory access reformatting and transmission instructions 308, when executed by a processor (e.g., 302), may cause system 300 to reformat a block transaction into direct access transactions that are targeted to remote memory modules (e.g., 110). The reformatting and transmission instructions 308, when executed, may cause the generation of a sequence of operations to issue transactions to remote memory modules (e.g., 110) according to a redundancy consistency model.
[0041] For example, a read transaction may be issued to a first remote memory module 110 to lock a RAID stripe and access the RAID parity data, a write operation may be issued to a second remote memory module 110 to write the data, and a final write to the first remote memory module 110 may update the RAID parity and unlock the RAID stripe. In this way, the reformatting and transmission instructions 308, when executed, ensure correct operation according to the implemented redundancy model.
[0042] This allows sharing access to the data at the remote memory modules 110 with other processor nodes (e.g., CPUs 106 shown in FIG. 1a) while adhering to the redundancy model, whether implemented with software through executed instructions contained in computer-readable storage medium 304, or with a hardware memory controller 108, and with CPUs 106 accessing the data by direct access 114 or block access 116 processor interfaces.
[0043] The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.

Claims

CLAIMS
What is claimed is:
1. A computing system, comprising:
a processor; and
a memory controller coupled to a plurality of remote memory modules, the plurality of memory modules to implement a redundancy protocol and support a direct access request, and the memory controller to:
receive a block access request from the processor; and
based on the redundancy model, reformat the block access request into a direct access request and transmit the request to the plurality of remote memory modules;
wherein the memory controller reformats the block access request to maintain data consistency in accordance with the redundancy protocol.
2. The computing system of claim 1 wherein the memory controller is further to:
transmit a direct access request for a region of the memory modules from the processor, wherein the memory modules receive at least one other direct access request for the region of the memory modules from at least one other processor; and
implement, based on the redundancy protocol, an atomicity enforcement scheme to enforce an order of access to the region of the memory modules.
3. The computing system of claim 1 wherein the memory controller comprises an accumulation buffer and is further to:
receive a response from the plurality of remote memory modules as a result of transmittal of the request reformatted as a direct access request; and reassemble the response into a block access response in the accumulation buffer and return the block access response to the processor.
4. The computing system of claim 3 wherein the memory controller is further to identify an error condition during reassembly of the response into the block access response and correct the error condition according to the redundancy model.
5. The computing system of claim 1 wherein a direct access request comprises a granularity having a first size and a block access request comprises a granularity having a second size larger than the first size, and the memory controller reformats the block access request into a request comprising a granularity having the first size according to the redundancy model.
6. A method, comprising:
receiving, by a memory controller and from a processor, a block access request directed to a plurality of remote memory modules, the plurality of memory modules to implement a redundancy protocol and support a direct access request;
reformatting, by the memory controller and based on the redundancy model, the block access request into a direct access request; and transmitting the request to the plurality of remote memory modules;
wherein reformatting comprises reformatting the block access request to maintain data consistency in accordance with the redundancy protocol.
7. The method of claim 6 further comprising:
receiving, at the memory modules, concurrent direct access requests for a region of the memory modules from the processor and at least one other processor; and implementing, by the memory controller and based on the redundancy protocol, an atomicity enforcement scheme to enforce an order of access to the region of the memory modules.
8. The method of claim 6 wherein the memory controller comprises an accumulation buffer, the method further comprising:
receiving, by the memory controller, a response from the plurality of remote memory modules as a result of transmitting the request reformatted as a direct access request; and
reassembling the response into a block access response in the accumulation buffer and returning the block access response to the processor.
9. The method of claim 8 further comprising identifying, by the memory controller, an error condition during reassembly of the response into the block access response and correcting the error condition according to the redundancy protocol.
10. The method of claim 6 wherein a direct access request comprises a granularity having a first size and a block access request comprises a granularity having a second size larger than the first size, and the method comprises reformatting the block access request into a request comprising a granularity having the first size according to the redundancy protocol.
11. A non-transitory computer-readable storage medium containing instructions that, when executed by a processor, cause the processor to:
receive, from a requesting entity, a block access request directed to a plurality of remote memory modules, the plurality of memory modules to implement a redundancy protocol and support a direct access request;
reformat, based on the redundancy protocol, the block access request into a direct access request; and
transmit the request to the plurality of remote memory modules;
wherein reformatting comprises reformatting the block access request to maintain data consistency in accordance with the redundancy protocol.
12. The non-transitory computer-readable storage medium of claim 11 wherein the instructions, when executed, further cause the processor to:
transmit a direct access request for a region of the memory modules, wherein the memory modules receive at least one other direct access request for the region of the memory modules from at least one other processor; and
implement, based on the redundancy protocol, an atomicity enforcement scheme to enforce an order of access to the region of the memory modules.
13. The non-transitory computer-readable storage medium of claim 11 wherein the instructions, when executed, further cause the processor to:
receive a response from the plurality of remote memory modules as a result of transmitting the request reformatted as a direct access request; and
reassemble the response into a block access response in an accumulation buffer and return the block access response to the requesting entity.
14. The non-transitory computer-readable storage medium of claim 13 wherein the instructions, when executed, further cause the processor to:
identify an error condition during reassembly of the response into the block access response and correct the error condition according to the redundancy protocol.
15. The non-transitory computer-readable storage medium of claim 11 wherein a direct access request comprises a granularity having a first size and a block access request comprises a granularity having a second size larger than the first size, and the instructions, when executed, further cause the processor to reformat the block access request into a request comprising a granularity having the first size according to the redundancy protocol.
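As an informal companion to the claims above (not a statement of their scope), the following sketch shows one way a block access request at a larger granularity might be reformatted into direct access requests at a smaller granularity and the responses reassembled in an accumulation buffer, in the spirit of claims 1, 3, 5, 8, 10, 13 and 15. The 64-byte direct-access granularity, the round-robin striping, and all class and function names are assumptions made for the illustration.

```python
# Minimal sketch (assumptions only): split a block request into direct-access
# sized requests, then reassemble the responses in an accumulation buffer.

from dataclasses import dataclass
from typing import Dict, List, Tuple

DIRECT_ACCESS_GRANULARITY = 64  # bytes per direct access request (assumed)


@dataclass
class DirectAccessRequest:
    module_id: int   # remote memory module targeted by this request
    address: int     # address within the shared address space
    length: int      # first-size granularity, <= DIRECT_ACCESS_GRANULARITY


def reformat_block_read(block_address: int, block_length: int,
                        num_modules: int) -> List[DirectAccessRequest]:
    """Split one second-size block read into first-size direct reads,
    striped round-robin across the remote memory modules."""
    requests = []
    for rel in range(0, block_length, DIRECT_ACCESS_GRANULARITY):
        chunk = min(DIRECT_ACCESS_GRANULARITY, block_length - rel)
        stripe = (block_address + rel) // DIRECT_ACCESS_GRANULARITY
        requests.append(DirectAccessRequest(module_id=stripe % num_modules,
                                            address=block_address + rel,
                                            length=chunk))
    return requests


def reassemble_block_response(responses: List[Tuple[int, bytes]]) -> bytes:
    """Collect (offset-within-block, payload) responses in an accumulation
    buffer and return one contiguous block response; a gap is treated as an
    error condition that a real controller would repair from redundant data."""
    accumulation_buffer: Dict[int, bytes] = dict(responses)
    block = bytearray()
    expected = 0
    for offset in sorted(accumulation_buffer):
        payload = accumulation_buffer[offset]
        if offset != expected:
            raise ValueError(f"missing data at block offset {expected}")
        block += payload
        expected = offset + len(payload)
    return bytes(block)


# Hypothetical usage: a 256-byte block read split across four modules.
reqs = reformat_block_read(block_address=0x1000, block_length=256, num_modules=4)
fake_responses = [(i * 64, bytes(64)) for i in range(4)]
assert reassemble_block_response(fake_responses) == bytes(256)
```

A write path would run the same split in reverse, and the gap check in reassemble_block_response marks the point where a controller could instead reconstruct the missing chunk from parity rather than raising an error.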

Priority Applications (1)

Application Number: PCT/US2015/013795 (published as WO2016122602A1)
Priority Date: 2015-01-30
Filing Date: 2015-01-30
Title: Systems and methods for sharing non-volatile memory between multiple access models

Publications (1)

Publication Number: WO2016122602A1 (en)
Publication Date: 2016-08-04

Family

ID=56544021

Family Applications (1)

Application Number: PCT/US2015/013795 (published as WO2016122602A1)
Title: Systems and methods for sharing non-volatile memory between multiple access models
Priority Date: 2015-01-30
Filing Date: 2015-01-30

Country Status (1)

Country Link
WO (1) WO2016122602A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080270688A1 (en) * 2001-04-09 2008-10-30 Hitachi, Ltd. Direct access storage system with combined block interface and file interface access
US20070011401A1 (en) * 2005-07-06 2007-01-11 Exavio, Inc. System and method for adaptive operation of storage capacities of RAID systems
US20080177803A1 (en) * 2007-01-24 2008-07-24 Sam Fineberg Log Driven Storage Controller with Network Persistent Memory
US20100262762A1 (en) * 2009-04-08 2010-10-14 Google Inc. Raid configuration in a flash memory data storage device
WO2013165546A1 (en) * 2012-04-30 2013-11-07 Tightdb, Inc. Method and apparatus for database

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110709844A (en) * 2017-05-26 2020-01-17 微软技术许可有限责任公司 Flash memory recovery mode
CN110709844B (en) * 2017-05-26 2023-07-21 微软技术许可有限责任公司 Method and device for data security
CN114697372A (en) * 2022-05-31 2022-07-01 深圳市泛联信息科技有限公司 Data transmission processing and storage method, system and medium in distributed system

Similar Documents

Publication Publication Date Title
US10452498B2 (en) Fault tolerance for persistent main memory
US10698818B2 (en) Storage controller caching using symmetric storage class memory devices
US8560772B1 (en) System and method for data migration between high-performance computing architectures and data storage devices
KR102102728B1 (en) Scalable storage protection
US8024525B2 (en) Storage control unit with memory cache protection via recorded log
JP2694099B2 (en) Large fault tolerant non-volatile multi-port memory
US8478835B2 (en) Method and system for using shared memory with optimized data flow to improve input/output throughout and latency
US9298617B2 (en) Parallel destaging with replicated cache pinning
US20100262772A1 (en) Transfer control of a storage volume between storage controllers in a cluster
US10901626B1 (en) Storage device
JP2015532985A (en) Large-scale data storage and delivery system
US10735500B2 (en) Application server to NVRAM path
CN112912851B (en) System and method for addressing, and media controller
US10402113B2 (en) Live migration of data
US10303396B1 (en) Optimizations to avoid intersocket links
TW201107981A (en) Method and apparatus for protecting the integrity of cached data in a direct-attached storage (DAS) system
WO2016122602A1 (en) Systems and methods for sharing non-volatile memory between multiple access models
US10719238B1 (en) Memory fabric with reliability zone comprising two or more fabric attached memory endpoints
US20190042372A1 (en) Method and apparatus to recover data stored in persistent memory in a failed node of a computer cluster
US10474380B2 (en) External memory controller
US11126372B2 (en) External memory controller
US10191690B2 (en) Storage system, control device, memory device, data access method, and program recording medium
JP5773446B2 (en) Storage device, redundancy recovery method, and program
JP2004164666A (en) Memory controller
JP5464347B2 (en) Memory failure processing apparatus, memory failure processing method, and memory failure processing program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15880493

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15880493

Country of ref document: EP

Kind code of ref document: A1