CN107479833A - A remote non-volatile memory access and management method for key-value storage - Google Patents

Info

Publication number
CN107479833A
Authority
CN
China
Prior art keywords
memory
memory block
area
data
client
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710716667.5A
Other languages
Chinese (zh)
Other versions
CN107479833B (en)
Inventor
肖侬
余松平
邓明翥
邢玉轩
刘芳
陈薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-08-21
Filing date
2017-08-21
Publication date
2017-12-15
Application filed by National University of Defense Technology
Priority to CN201710716667.5A
Publication of CN107479833A
2017-12-15
Application granted; publication of CN107479833B
2020-04-17
Current legal status
Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System (AREA)

Abstract

The invention relates to a remote non-volatile memory access and management method for key-value storage. A non-volatile memory cache area is set up for the key-value storage system and registered with a network card that supports remote direct memory access. The cache area is divided into two parts, a sparse area and a compact area. Memory blocks in the sparse area mainly receive data from remote clients; a different memory block is allocated for each remote write, and each sparse-area memory block belongs to only one key-value pair. Memory blocks in the compact area are the main cache for key-value pairs; each block holds multiple key-value pairs and serves client read requests. At regular intervals, the server uses a compression mechanism to pack the key-value pairs from multiple allocated sparse-area blocks into one or more compact-area blocks. When the compact area runs short of available blocks, data is evicted to the data storage area. The invention mitigates the fixed-location write wear of remote non-volatile memory and improves remote read and write performance.

Description

A remote non-volatile memory access and management method for key-value storage

Technical field

The invention relates to a method for accessing and managing remote non-volatile memory, and in particular to such a method designed for key-value storage.

Background

In big-data environments, systems must meet the storage demands of massive data sets. On the one hand, this places higher demands on application scalability; NoSQL systems built on a key-value storage architecture store data as independent key-value pairs, which reduces the coupling between data items and scales well. On the other hand, computer storage systems built on the traditional memory and external-storage architecture struggle to process massive data efficiently. New storage devices such as PCM (phase-change memory) and 3D XPoint combine the non-volatility and high density of external storage with memory-level access speed. Given that existing DRAM (dynamic random-access memory) is volatile, power-hungry, and limited in capacity, non-volatile memory (NVM) built from these devices has become a popular way for computer designers and researchers to improve storage-system performance.

Although NVM is denser than DRAM, the NVM capacity of a single machine is still limited, and current NVM technology is not yet mature. According to an Intel report, future computer systems will deploy commercial NVM products (such as 3D XPoint) with four times the capacity of traditional memory. Even so, the NVM of a single machine falls far short of the storage capacity that big-data applications need, so scaling out over the network becomes necessary.

A traditional TCP/IP network becomes the performance bottleneck of remote memory expansion, mainly because of how TCP/IP handles data. The application prepares the data to send in a user-space buffer, and a system call copies it into kernel space, where the headers required by each protocol layer are added. The network card then sends the data to the remote machine. The remote machine receives the data through its network card, parses the packet by reversing the sender's steps, and finally copies the data into the user-space buffer of the remote application. Transferring data this way involves several copies, which consume CPU resources and reduce network bandwidth utilization. In contrast, RDMA (Remote Direct Memory Access) allows direct access to remote user memory without involving the remote CPU, which can run other programs in parallel during the transfer. Remote memory access today is based on DRAM. Given NVM's memory-like access characteristics, accessing NVM remotely is reasonable, and NVM is expected to play an important role in future RDMA technology.

However, alongside these advantages of memory and external storage, NVM suffers from asymmetric read and write performance and from write wear. RDMA lets clients access remote non-volatile memory directly, and the mapping between the virtual addresses that RDMA accesses and the physical addresses is fixed, which accelerates NVM write wear. Designing the access mode and management mechanism for NVM therefore becomes a new challenge.

Summary of the invention

To solve the above technical problems, the invention provides a remote non-volatile memory access and management method for key-value storage.

The method sets up a non-volatile memory cache area for key-value storage and registers it with a network card that supports remote direct memory access. The cache area is divided into two parts: a sparse area and a compact area. Memory blocks in the sparse area mainly receive data from remote clients. A different memory block is allocated for each remote write, and each sparse-area memory block belongs to only one key-value pair. This improves remote write performance and extends the life of the non-volatile memory.
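
The effect of handing out a different block for every write can be sketched in a few lines of Python. This is an illustration only and is not taken from the patent: the `SparseAllocator` class, its FIFO free queue, and the `simulate` helper are assumptions chosen to show how rotating the target block spreads writes evenly.

```python
from collections import deque


class SparseAllocator:
    """Hands out sparse-area blocks in rotation (hypothetical FIFO policy)."""

    def __init__(self, num_blocks: int) -> None:
        self.free = deque(range(num_blocks))   # ids of blocks available for binding
        self.write_count = [0] * num_blocks    # per-block write tally, a proxy for wear

    def allocate(self) -> int:
        """Take the block that has been free the longest."""
        if not self.free:
            raise MemoryError("no free sparse blocks")
        return self.free.popleft()

    def record_write(self, block_id: int) -> None:
        self.write_count[block_id] += 1

    def release(self, block_id: int) -> None:
        """Return a block to the tail of the free queue."""
        self.free.append(block_id)


def simulate(num_blocks: int, num_writes: int) -> list:
    """Issue num_writes remote writes, each to a freshly allocated block."""
    alloc = SparseAllocator(num_blocks)
    for _ in range(num_writes):
        block = alloc.allocate()
        alloc.record_write(block)
        alloc.release(block)  # assumption: the block is emptied again right away
    return alloc.write_count


counts = simulate(num_blocks=8, num_writes=64)
print(counts)
```

With 8 blocks and 64 writes, every block receives the same number of writes, whereas a single fixed target block would have absorbed all 64.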

Memory blocks in the compact area are the main cache for key-value pairs. Each block holds multiple key-value pairs and serves client read requests. At regular intervals, the server uses a compression mechanism to pack the key-value pairs from multiple allocated sparse-area blocks into one or more compact-area blocks. When the compact area runs short of available blocks, data is evicted to the data storage area. This improves cache efficiency.
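
A minimal sketch of the packing step follows. The patent does not say how pairs are placed into compact blocks, so the first-fit policy, the `CompactBlock` type, and the `compact` function are hypothetical; the sketch also assumes distinct keys and counts only key and value bytes.

```python
from dataclasses import dataclass, field


@dataclass
class CompactBlock:
    capacity: int
    used: int = 0
    pairs: dict = field(default_factory=dict)  # key -> value bytes


def compact(sparse_pairs: list, block_size: int) -> list:
    """Pack (key, value) pairs, one per sparse block, into compact blocks (first fit)."""
    compact_blocks = []
    for key, value in sparse_pairs:
        need = len(key) + len(value)
        if need > block_size:
            raise ValueError("pair is larger than one compact block")
        # First fit: reuse the earliest compact block with enough room left.
        target = next((b for b in compact_blocks if b.capacity - b.used >= need), None)
        if target is None:
            target = CompactBlock(capacity=block_size)
            compact_blocks.append(target)
        target.pairs[key] = value
        target.used += need
    return compact_blocks


sparse_pairs = [
    (b"k1", b"a" * 1000),
    (b"k2", b"b" * 2000),
    (b"k3", b"c" * 3000),
    (b"k4", b"d" * 500),
]
blocks = compact(sparse_pairs, block_size=4096)
print(len(blocks), [sorted(b.pairs) for b in blocks])
```

Here four sparse blocks, each holding a single pair, collapse into two compact blocks, which frees the four sparse blocks for new remote writes.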

The technical solution adopted by the invention is:

An RDMA-based remote non-volatile memory access and management method, comprising:

Step 1: The server divides memory into three classes: a data storage area, a non-volatile memory cache area, and a command buffer. The data storage area and the non-volatile memory cache area reside in NVM; the command buffer resides in DRAM.

Step 2: The server registers the memory of the non-volatile memory cache area and the command buffer with a network card that supports remote direct memory access; the data storage area is registered selectively, according to user-set parameters. The server divides the non-volatile memory cache area into a compact area and a sparse area. Sparse-area memory blocks have a uniform size, set to a multiple of the page size in bytes, from 1 to n times, where n is a positive integer.

Step 3: The server accepts client connection requests and allocates one sparse-area memory block and one command-buffer memory block to each client.

Step 4: To transfer data, the client writes it into the sparse memory block through RDMA (Remote Direct Memory Access) and, at the same time, sends control information to the command-buffer memory block. The server returns the address of the sparse memory block for the next write as the write acknowledgment (ACK). If one transfer needs multiple sparse memory blocks, the blocks are chained together according to the control information to preserve data integrity.

Step 5: When the number of available memory blocks in the sparse area drops to a threshold (the threshold is the ratio of available blocks to total blocks, in the range 0.1 to 0.6), the server uses the memory compression mechanism to pack the data from several selected sparse memory blocks into compact-area memory blocks and reclaims the sparse memory blocks that contain garbage data.

Step 6: When the number of available memory blocks in the compact area drops to a threshold (the threshold is the ratio of available blocks to total blocks, in the range 0.1 to 0.6), the server uses the data migration mechanism to copy the data from several selected compact-area memory blocks to the data storage area.
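
The threshold test in steps 5 and 6 is a simple ratio check, sketched below. The function names are hypothetical, and treating "drops to the threshold" as "less than or equal to" is an assumption; only the ratio definition and the 0.1 to 0.6 range come from the text.

```python
MIN_THRESHOLD, MAX_THRESHOLD = 0.1, 0.6  # range stated in the text


def free_ratio(free_blocks: int, total_blocks: int) -> float:
    """Ratio of available blocks to total blocks in one area."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    return free_blocks / total_blocks


def should_reclaim(free_blocks: int, total_blocks: int, threshold: float) -> bool:
    """True when the area has run low enough to trigger compression or migration."""
    if not MIN_THRESHOLD <= threshold <= MAX_THRESHOLD:
        raise ValueError("threshold must lie in the range 0.1 to 0.6")
    return free_ratio(free_blocks, total_blocks) <= threshold


# Sparse area with 100 blocks and a threshold of 0.25:
print(should_reclaim(free_blocks=30, total_blocks=100, threshold=0.25))
print(should_reclaim(free_blocks=20, total_blocks=100, threshold=0.25))
```

The same check serves both areas: for the sparse area a positive result would start the compression mechanism, and for the compact area it would start migration to the data storage area.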

The invention achieves the following benefits. First, because the on-chip cache of an RDMA network card is limited, the server's non-volatile memory is divided into a data storage area and a data cache area; registering the cache area and only selectively registering the storage area reduces on-chip cache misses on the network card while preserving data-operation performance. Second, the data cache area is divided into a sparse area and a compact area. From the client's point of view, each transfer lands in a different sparse-area memory block, so client writes do not repeatedly hit a fixed location, which extends the write life of the non-volatile memory. Third, allocating a sparse memory block after each client write and returning it as the write-completion acknowledgment ensures reliable data transfer and improves the performance of consecutive writes. Finally, the memory compression mechanism packs the valid data occupying sparse memory blocks into compact-area memory blocks, which improves memory utilization.

Description of drawings

The invention is further described below with reference to the drawings and embodiments.

Figure 1 is a flow chart of remote non-volatile memory access and management for key-value storage;

Figure 2 shows how sparse and compact memory blocks are organized in the data cache area.

Detailed description

Referring to Figure 1, the invention provides a remote non-volatile memory access and management method for key-value storage, comprising:

Step 1: The server registers communication memory regions with the RDMA network card. These regions include non-volatile memory and volatile (traditional) memory: the non-volatile memory caches key-value pairs, and the volatile memory receives the command information for user data transfers.

Step 2: The server divides the non-volatile memory into a sparse area and a compact area. By default, each area consists of multiple memory blocks of uniform size; sparse-area blocks are between 4096 and 16384 bytes.

Step 3: A client sends a connection request to the server. The server allocates to each client one buffer memory block for receiving commands and one sparse-area memory block, and returns this information to the client.

Step 4: The client first writes the key-value pair into the sparse memory block with an RDMA remote write and sends a control message to the command buffer.

Step 5: The server parses the control message in the command buffer and allocates another sparse-area memory block to the client to hold the data of the next send. If the block just written holds the last piece of data, the server writes that block's own identifier into it to mark the end of the transfer; otherwise, it writes the identifier of the sparse-area block allocated for the next send at the end of the block just written.

Step 6: If the remaining length of the client's key-value data exceeds the size of one bound memory block, repeat from step 4.

Step 7: At regular intervals, the server counts the available memory blocks in the sparse area. If the count drops to a threshold, the server selects a number of memory blocks and uses the data compression mechanism to copy their data into one or more compact-area memory blocks.

Step 8: At regular intervals, the server counts the available memory blocks in the compact area. If the count drops to a threshold, the server selects a number of memory blocks and uses the data migration mechanism to copy their data to the data storage area, which increases the number of available memory blocks.
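
Steps 4 to 6 describe a value that spans several sparse blocks linked by identifiers. The sketch below simulates that chaining in plain Python without any RDMA library: ordinary byte copies stand in for the RDMA write and the control message. The 4-byte trailer, the `Server` class, and the helper names are assumptions for illustration, and the sketch assumes non-empty data.

```python
from typing import Optional

BLOCK_SIZE = 4096               # smallest sparse block size given in the text
TRAILER = 4                     # assumption: a 4-byte block id closes each block
PAYLOAD = BLOCK_SIZE - TRAILER  # bytes of key-value data per block


class Server:
    def __init__(self, num_blocks: int) -> None:
        self.blocks = {i: bytearray(BLOCK_SIZE) for i in range(num_blocks)}
        self.free = list(range(num_blocks))

    def allocate(self) -> int:
        return self.free.pop(0)

    def acknowledge(self, written: int, last: bool) -> Optional[int]:
        """Stamp the block just written and return the next block id, if any."""
        # Last block: store its own id. Otherwise: store the id of the next block.
        next_id = written if last else self.allocate()
        self.blocks[written][PAYLOAD:] = next_id.to_bytes(TRAILER, "big")
        return None if last else next_id


def write_value(server: Server, first_block: int, data: bytes) -> list:
    """Client side: fill one block per round, following the ids the server returns."""
    chain, block = [], first_block
    for offset in range(0, len(data), PAYLOAD):
        chunk = data[offset:offset + PAYLOAD]
        server.blocks[block][:len(chunk)] = chunk  # stands in for the RDMA write
        chain.append(block)
        last = offset + PAYLOAD >= len(data)
        block = server.acknowledge(block, last)    # stands in for control message and ACK
    return chain


def read_value(server: Server, first_block: int, length: int) -> bytes:
    """Follow the trailers until a block points at itself, then trim to length."""
    out, block = bytearray(), first_block
    while True:
        out += server.blocks[block][:PAYLOAD]
        next_id = int.from_bytes(server.blocks[block][PAYLOAD:], "big")
        if next_id == block:
            break
        block = next_id
    return bytes(out[:length])


server = Server(num_blocks=8)
first = server.allocate()
payload = bytes(range(256)) * 40  # 10240 bytes, more than two blocks' worth
chain = write_value(server, first, payload)
restored = read_value(server, first, len(payload))
print(chain, restored == payload)
```

The sketch leaves out what happens after the last block, where the text has the server prepare a further block for the client's next transfer.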

In a further preferred embodiment, step 2 is specifically:

The memory of the registered cache area is shared by all clients. By default, compact-area and sparse-area memory blocks have the same size. The main role of sparse-area memory blocks is to cache more key-value pairs, and clients have read-only permission on them. For ease of management, all memory blocks in this area have a uniform size; however, so that a single block can cache more data, a sparse-area memory block can be at most several times the size of a compact-area memory block. Because every client binds one compact-area memory block, setting the block size of this area too large would consume a lot of memory as the number of clients grows. Compact-area memory blocks are therefore given several size classes, for example 4096 bytes, 8192 bytes, and 16384 bytes. Referring to Figure 2, when a client establishes a connection, or each time a key-value pair finishes transferring, only the smallest block (4096 bytes) is bound; other memory blocks are allocated dynamically according to the client's needs.
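
How a size class might be chosen for a client-bound block can be sketched as follows. Only the three sizes and the rule that the smallest block is bound first come from the text; the "smallest class that fits" policy and the function names are assumptions, since the text says only that later allocations adapt to the client's needs.

```python
SIZE_CLASSES = (4096, 8192, 16384)  # byte sizes listed in the text


def initial_block_size() -> int:
    """Size bound at connection time or after a key-value pair completes."""
    return SIZE_CLASSES[0]


def pick_block_size(remaining_bytes: int) -> int:
    """Smallest class that holds the remaining data, else the largest class."""
    if remaining_bytes < 0:
        raise ValueError("remaining_bytes must not be negative")
    for size in SIZE_CLASSES:
        if remaining_bytes <= size:
            return size
    return SIZE_CLASSES[-1]


for remaining in (100, 5000, 16384, 50000):
    print(remaining, pick_block_size(remaining))
```

Binding the smallest class by default keeps the memory cost per idle connection low, and larger classes are handed out only while a client actually has more data to send.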

In a further preferred embodiment, step 3 is specifically:

Client send requests are intermittent. If a memory block stayed bound to each connection indefinitely, other clients that need memory blocks would be starved of resources. Each connection is therefore given a time limit: if no data is transferred within it, the bound compact-area memory block is reclaimed. Reclamation happens in one of two ways: the client proactively gives up its permission on the block, or the server forcibly reclaims the block and notifies the client that it must apply for a memory block again before its next send.
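
Both reclamation paths can be sketched with a small binding table. The `BindingTable` class, its method names, and the 30-second timeout are hypothetical; time is passed in explicitly so that the example is deterministic.

```python
class BindingTable:
    """Tracks which block each client holds and when the client last sent data."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.bindings = {}  # client id -> (block id, time of last transfer)
        self.free = []      # reclaimed block ids

    def bind(self, client: str, block_id: int, now: float) -> None:
        self.bindings[client] = (block_id, now)

    def touch(self, client: str, now: float) -> None:
        """Record a data transfer, which resets the idle timer."""
        block_id, _ = self.bindings[client]
        self.bindings[client] = (block_id, now)

    def release(self, client: str) -> int:
        """Path 1: the client gives up its block voluntarily."""
        block_id, _ = self.bindings.pop(client)
        self.free.append(block_id)
        return block_id

    def reclaim_idle(self, now: float) -> list:
        """Path 2: the server reclaims blocks from idle clients.

        Returns the clients that must be told to apply for a block again.
        """
        expired = [c for c, (_, last) in self.bindings.items()
                   if now - last >= self.idle_timeout]
        for client in expired:
            self.release(client)
        return expired


table = BindingTable(idle_timeout=30.0)
table.bind("client-a", block_id=0, now=0.0)
table.bind("client-b", block_id=1, now=0.0)
table.touch("client-b", now=20.0)          # client-b is still active
notified = table.reclaim_idle(now=35.0)    # client-a has been idle for 35 s
print(notified, table.free)
```

In this run only `client-a` loses its block; `client-b` keeps its binding because its last transfer was 15 seconds before the check.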

Claims (5)

1. A remote non-volatile memory access and management method for key-value storage, in which a non-volatile memory cache area is set up for key-value storage and registered with a network card that supports remote direct memory access, characterized in that the non-volatile memory cache area is divided into two parts, a sparse area and a compact area; memory blocks in the sparse area receive data from remote clients, a different memory block is allocated for each remote data write, and each sparse-area memory block belongs to only one key-value pair, so as to improve remote write performance and extend the life of the non-volatile memory;
memory blocks in the compact area are the main cache for key-value pairs, and each memory block holds multiple key-value pairs and serves client read requests; at regular intervals, the server uses a compression mechanism to pack the key-value pairs from multiple allocated sparse-area memory blocks into one or more compact-area memory blocks, improving cache efficiency; when the compact area has too few available memory blocks, data is evicted to the data storage area;
the specific steps comprise:
Step 1: the server divides memory into three classes: a data storage area, a non-volatile memory cache area, and a command buffer, wherein the memory of the command buffer is DRAM;
Step 2: the server registers the memory of the non-volatile memory cache area and the command buffer with a network card that supports remote direct memory access, and registers the data storage area selectively according to user-set parameters; the non-volatile memory cache area is divided into a compact area and a sparse area, and the size of sparse-area memory blocks is uniformly set to a multiple of the page size in bytes, the multiple being 1 to n times, where n is a positive integer;
Step 3: the server accepts client connection requests and allocates one sparse-area memory block and one command-buffer memory block to each client;
Step 4: when transferring data, the client writes it into the sparse memory block through RDMA and at the same time sends control information to the command-buffer memory block, and the server returns the address of the sparse memory block for the next write as the write acknowledgment message; if one data transfer needs multiple sparse memory blocks, the blocks are chained together according to the control information to ensure data integrity;
Step 5: when the number of available memory blocks in the sparse area drops to a threshold, the server uses the memory compression mechanism to pack the data from several selected sparse memory blocks into compact-area memory blocks and reclaims the sparse memory blocks that contain garbage data;
Step 6: when the number of available memory blocks in the compact area drops to a threshold, the server uses the data migration mechanism to copy the data from several selected compact-area memory blocks to the data storage area.
2. The remote non-volatile memory access and management method for key-value storage according to claim 1, characterized in that the role of the sparse-area memory blocks is to cache more key-value pairs, and clients have read-only permission; for ease of management, all memory blocks in the sparse area have a uniform size; so that a single memory block can cache more data, a sparse-area memory block can be at most several times the size of a compact-area memory block; because every client binds one compact-area memory block, and setting compact-area memory blocks too large would consume more memory as the number of clients grows, compact-area memory blocks are given several size classes.
3. The remote non-volatile memory access and management method for key-value storage according to claim 1, characterized in that, in step 4, if no data is transferred, the bound compact-area memory block is reclaimed in one of two ways: the client proactively gives up its permission on the memory block, or the server forcibly reclaims the memory block and notifies the client that it must apply for a memory block again before its next send.
4. The remote non-volatile memory access and management method for key-value storage according to claim 1, characterized in that, in step 4, before transferring data the client must confirm that its memory-block binding is still valid; if the binding has been released, the client first applies for a memory block in the sparse memory area; when the server determines that a key-value pair has been completely sent, it binds a new sparse-area memory block to the connection for the transfer of the next key-value pair.
5. The remote non-volatile memory access and management method for key-value storage according to claim 1, characterized in that the threshold in step 5 and step 6 is the ratio of the number of available memory blocks to the total number of memory blocks, in the range 0.1 to 0.6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710716667.5A CN107479833B (en) 2017-08-21 2017-08-21 Key value storage-oriented remote nonvolatile memory access and management method

Publications (2)

Publication Number Publication Date
CN107479833A true CN107479833A (en) 2017-12-15
CN107479833B CN107479833B (en) 2020-04-17

Family

ID=60601682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710716667.5A Active CN107479833B (en) 2017-08-21 2017-08-21 Key value storage-oriented remote nonvolatile memory access and management method

Country Status (1)

Country Link
CN (1) CN107479833B (en)

Also Published As

Publication number Publication date
CN107479833B (en) 2020-04-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant