CN114244857A - Memory pool management method of distributed storage system - Google Patents

Info

Publication number
CN114244857A
Authority
CN
China
Prior art keywords
distributed storage
storage system
memory pool
memory
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110387038.9A
Other languages
Chinese (zh)
Inventor
何晓斌
肖伟
陈德训
高洁
余婷
陈起
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-04-12
Filing date
2021-04-12
Publication date
2022-03-25
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN202110387038.9A
Publication of CN114244857A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306 Intercommunication techniques
    • G06F15/17331 Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a memory pool management method for a distributed storage system, comprising the following steps: S1, start initialization of the distributed storage system and open the network card device; S2, allocate memory, register the memory pool of the distributed storage system, and record the start address and size of the memory pool; S3, register the network card memory pool, passing the start address of the distributed storage system's memory pool as the input parameter, organize the network card memory pool according to the data block size of the distributed storage system, record the offsets, and return a network card handle number; S4, finish initialization of the distributed storage system. When reading data, the invention eliminates the memory copy from the distributed storage memory pool to the network card memory pool; when writing data, it eliminates the memory copy from the network card memory pool to the distributed storage memory pool. This further reduces overheads such as context switching and improves the data transmission performance of the distributed storage system.

Description

Memory pool management method of distributed storage system
Technical Field
The invention relates to a memory pool management method of a distributed storage system, and belongs to the technical field of high-performance computing.
Background
Remote Direct Memory Access (RDMA) was developed to address server-side data-processing delays in network transmission. It moves data directly from the memory of one system into the memory of a remote system without involving the operating system and with little demand on the processor. This eliminates the overhead of extra memory copies and context switches, freeing memory bandwidth and CPU cycles and improving application performance. RDMA offers low latency, low CPU load, and high bandwidth. In a large-scale distributed storage system, many clients and servers exchange file-operation traffic, which places high demands on network bandwidth and latency. The traditional TCP/IP protocol is increasingly unable to meet these demands, and many distributed storage systems, such as GlusterFS and Ceph, have begun to use RDMA for network communication.
In distributed storage communication, RDMA eliminates that overhead and frees the processor, but data is still copied between the distributed storage system and the network card. Under the existing framework, the distributed storage system must frequently copy memory during reads and writes: from the memory pool managed by the distributed storage system to the network card memory pool, and from the network card memory pool to the memory pool managed by the distributed storage system. Because the I/O volume of a distributed storage system is large, these frequent copies can become a performance bottleneck for its communication.
Memory pool management in current distributed storage systems is mostly limited to managing the system's own pool. Taking a read as an example: data is received from the underlying storage device and placed into the memory pool of the distributed storage system according to the system's own data organization. To send the data to a remote end via RDMA, it must first move from the memory pool of the distributed storage system to the network card memory pool, and in most systems this is done by memory copy. When a distributed storage system reads and writes massive amounts of data, the time spent on frequent memory copies grows, as does the overhead of context switching; this lowers the data transmission efficiency of the distributed storage system and degrades overall performance.
Communication protocols based on RDMA have become the preferred choice for high-speed, high-performance data transmission, and the data transmission modules of distributed storage systems now also use RDMA to improve network transfer efficiency. Even so, both reads and writes involve two memory copies. During a read, data is copied from the underlying storage device to the memory pool managed by the distributed storage system, and then from that pool to the network card memory pool. During a write, data is copied from the user application's data space to the network card memory pool, and then from the network card memory pool to the memory pool managed by the distributed storage system. When large amounts of data are transferred, these frequent memory copies reduce the transmission performance of the system.
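The two-copy read path of the prior art can be modelled in a few lines. This is a minimal sketch under stated assumptions, not any real system's implementation: `storage_pool`, `nic_pool`, `memcopy`, and `prior_art_read` are hypothetical names, plain `bytearray` objects stand in for the two pools, and the counter tracks only the two logical copies named in the text.

```python
BLOCK = 8                          # hypothetical block size in bytes
storage_pool = bytearray(BLOCK)    # pool managed by the storage system
nic_pool = bytearray(BLOCK)        # separate pool owned by the network card
copies = 0                         # number of logical memory copies performed


def memcopy(dst: bytearray, src) -> None:
    """Copy the bytes of src to the start of dst and count the copy."""
    global copies
    dst[:len(src)] = src
    copies += 1


def prior_art_read(device_data: bytes) -> bytes:
    n = len(device_data)
    memcopy(storage_pool, device_data)                 # copy 1: device -> storage pool
    memcopy(nic_pool, memoryview(storage_pool)[:n])    # copy 2: storage pool -> NIC pool
    return bytes(nic_pool[:n])                         # bytes the NIC would transmit


sent = prior_art_read(b"payload")
print(sent, copies)
```

The second call to `memcopy` is the copy that the method described in this document sets out to remove.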
Disclosure of Invention
The invention aims to provide a memory pool management method of a distributed storage system that overcomes the shortcomings of memory pool management in existing distributed storage systems.
To achieve this aim, the invention adopts the following technical solution: a memory pool management method of a distributed storage system, comprising the following steps:
S1, start initialization of the distributed storage system and open the network card device;
S2, allocate memory, register the memory pool of the distributed storage system, and record the start address and size of the memory pool;
S3, register the network card memory pool, passing the start address of the distributed storage system's memory pool as the input parameter, organize the network card memory pool according to the data block size of the distributed storage system, record the offsets, and return a network card handle number;
S4, finish initialization of the distributed storage system;
when the distributed storage system reads data, the following operations are performed:
S11, the underlying storage device returns data to the distributed storage system;
S12, the distributed storage system splits and processes the data according to the data block size and copies it into the memory pool of the distributed storage system;
S13, using the recorded offset, find the start address of the data in the memory pool of the distributed storage system and the network card handle number;
S14, the network card sends the data to the remote end via RDMA, using the given network card handle number, memory start address, and memory block size;
when the distributed storage system writes data, the following operations are performed:
S21, the upper-layer application sends data to the network card memory pool of the distributed storage system via RDMA;
S22, using the network card handle number and the recorded offset, find the start address and memory block size of the data in the memory pool of the distributed storage system;
S23, the distributed storage system copies the data from its memory pool to the underlying storage device, using the memory start address and memory block size.
By applying the above technical solution, the invention has the following advantages over the prior art:
The memory pool management method provided by the invention combines the memory pool managed by the distributed storage system with the network card memory pool. When reading data it eliminates the memory copy from the distributed storage memory pool to the network card memory pool, and when writing data it eliminates the memory copy from the network card memory pool to the distributed storage memory pool. This further reduces overheads such as context switching and improves the data transmission performance of the distributed storage system.
Drawings
Fig. 1 is a first flowchart of the memory pool management method of the distributed storage system according to the present invention;
Fig. 2 is a second flowchart of the memory pool management method of the distributed storage system according to the present invention;
Fig. 3 is a third flowchart of the memory pool management method of the distributed storage system according to the present invention.
Detailed Description
Embodiment: the invention provides a memory pool management method of a distributed storage system, which specifically comprises the following steps:
S1, start initialization of the distributed storage system and open the network card device;
S2, allocate memory, register the memory pool of the distributed storage system, and record the start address and size of the memory pool;
S3, register the network card memory pool, passing the start address of the distributed storage system's memory pool as the input parameter, organize the network card memory pool according to the data block size of the distributed storage system, record the offsets, and return a network card handle number;
S4, finish initialization of the distributed storage system;
when the distributed storage system reads data, the following operations are performed:
S11, the underlying storage device returns data to the distributed storage system;
S12, the distributed storage system splits and processes the data according to the data block size and copies it into the memory pool of the distributed storage system;
S13, using the recorded offset, find the start address of the data in the memory pool of the distributed storage system and the network card handle number, which eliminates the memory copy of the data from the memory pool of the distributed storage system to the network card memory pool;
S14, the network card sends the data to the remote end via RDMA, using the given network card handle number, memory start address, and memory block size;
when the distributed storage system writes data, the following operations are performed:
S21, the upper-layer application sends data to the network card memory pool of the distributed storage system via RDMA;
S22, using the network card handle number and the recorded offset, find the start address and memory block size of the data in the memory pool of the distributed storage system, which eliminates the memory copy of the data from the network card memory pool to the memory pool of the distributed storage system;
S23, the distributed storage system copies the data from its memory pool to the underlying storage device, using the memory start address and memory block size.
The above embodiment is further explained as follows:
Compared with existing memory pool management methods, when a distributed storage system reads and writes massive amounts of data, the method combines the network card memory pool with the memory pool managed by the distributed storage system: the distributed storage system creates the memory pool and registers it with the network card at the same time. This eliminates the memory copy from the memory pool managed by the distributed storage system to the network card memory pool when data is read, and the memory copy from the network card memory pool to the memory pool managed by the distributed storage system when data is written. Overheads such as context switching are further reduced, data transmission takes less time, and the data transmission efficiency and overall performance of the distributed storage system improve.
The method mainly comprises three parts: memory pool initialization, data reading, and data writing of the distributed storage system.
The memory pool initialization flow of the distributed storage system is shown in Fig. 1:
(1) start initialization of the distributed storage system and open the network card device;
(2) allocate memory, register the memory pool of the distributed storage system, and record its start address and size;
(3) when registering the network card memory pool, unlike the prior art, the method does not allocate separate memory for the network card; instead it registers the network card memory pool with the start address of the distributed storage system's memory pool as the input parameter, organizes the network card memory pool according to the data block size of the distributed storage system, records the offsets, and returns a network card handle number;
(4) finish initialization of the distributed storage system.
The data reading flow of the distributed storage system is shown in Fig. 2:
(1) the underlying storage device returns data to the distributed storage system;
(2) the distributed storage system splits and processes the data according to the block size and copies it into the memory pool it manages;
(3) using the recorded offset, find the start address of the data in the memory pool managed by the distributed storage system and the network card handle number, which eliminates the memory copy of the data from the memory pool of the distributed storage system to the network card memory pool;
(4) the network card sends the data to the remote end via RDMA, using the given network card handle number, memory start address, and memory block size.
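A minimal sketch of this read flow, assuming the pool has already been registered with the network card. The names `fake_rdma_send` and `read_path`, the handle value, and the 8-byte block size are all hypothetical; `fake_rdma_send` only models the network card reading from registered memory and is not a real RDMA call.

```python
BLOCK = 8                                  # hypothetical block size, tiny for readability
pool = bytearray(BLOCK * 4)                # shared pool, assumed already registered
NIC_HANDLE = 1                             # hypothetical handle returned at registration
offsets = [i * BLOCK for i in range(4)]    # offsets recorded during initialization


def fake_rdma_send(handle: int, region: memoryview) -> bytes:
    """Stand-in for the NIC: reads straight from the registered region."""
    return bytes(region)   # models the transfer on the wire, not a pool-to-pool copy


def read_path(data: bytes) -> list:
    sent = []
    for slot, start in enumerate(range(0, len(data), BLOCK)):
        # Steps (1)-(2): split the device data into blocks and copy each into the pool.
        chunk = data[start:start + BLOCK]
        off = offsets[slot]
        pool[off:off + len(chunk)] = chunk
        # Step (3): the recorded offset gives the address; no second copy is made.
        region = memoryview(pool)[off:off + len(chunk)]
        # Step (4): pass handle, address, and size to the NIC.
        sent.append(fake_rdma_send(NIC_HANDLE, region))
    return sent


result = read_path(b"abcdefghijkl")
print(result)
```

The sketch assumes the data fits in the four available blocks; a real pool would also need block allocation and release, which the source does not describe.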
The data writing flow of the distributed storage system is shown in Fig. 3:
(1) the upper-layer application sends data to the network card memory pool of the distributed storage system via RDMA;
(2) using the network card handle number and the recorded offset, find the start address and block size of the data in the memory pool managed by the distributed storage system, which eliminates the memory copy of the data from the network card memory pool to the memory pool of the distributed storage system;
(3) the distributed storage system copies the data from its memory pool to the underlying storage device, using the memory start address and block size.
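The write flow can be sketched in the same style. Again this is an assumption-laden model: `fake_rdma_receive`, `write_path`, and the `device` dictionary are hypothetical names, the dictionary stands in for the underlying storage device, and each payload is assumed to fit in one block.

```python
BLOCK = 8                                  # hypothetical block size in bytes
pool = bytearray(BLOCK * 4)                # shared pool, assumed already registered
offsets = [i * BLOCK for i in range(4)]    # offsets recorded during initialization
NIC_HANDLE = 1                             # hypothetical handle returned at registration
device = {}                                # stand-in for underlying storage: slot -> bytes


def fake_rdma_receive(handle: int, slot: int, payload: bytes) -> int:
    """Stand-in for the NIC placing incoming data directly in the registered pool."""
    if len(payload) > BLOCK:
        raise ValueError("payload must fit in one block in this sketch")
    off = offsets[slot]
    pool[off:off + len(payload)] = payload
    return len(payload)


def write_path(slot: int, payload: bytes) -> None:
    # Step (1): data arrives in the NIC pool, which is also the storage pool.
    size = fake_rdma_receive(NIC_HANDLE, slot, payload)
    # Step (2): handle and recorded offset give address and size; no pool-to-pool copy.
    off = offsets[slot]
    region = memoryview(pool)[off:off + size]
    # Step (3): the one remaining copy, from the pool to the underlying device.
    device[slot] = bytes(region)


write_path(2, b"hello")
print(device)
```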
When the memory pool management method of the distributed storage system is adopted, the memory pool managed by the distributed storage system is combined with the network card memory pool, so that the memory copy from the distributed storage memory pool to the network card memory pool and from the network card memory pool to the distributed storage memory pool can be respectively eliminated in the scene of reading and writing data, the overhead of context switching and the like is further reduced, and the data transmission performance of the distributed storage system is improved.
To facilitate understanding of the invention, the terms used herein are briefly explained as follows:
Distributed storage system: a storage system that stores data in a dispersed manner and supports concurrent access by multiple clients over a network.
RDMA (Remote Direct Memory Access): a technique developed to address server-side data-processing delays in network transmission; it allows data to be transferred directly from the memory of one computer to the memory of another without the involvement of either operating system.
Memory copy: copying n bytes of content from a source memory address to a target memory address.
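The difference between a memory copy and shared access to the same memory can be illustrated with Python's built-in `bytearray` and `memoryview`. This is only an analogy for the term defined above, with variable names chosen for illustration.

```python
src = bytearray(b"storage-block")

copied = bytes(src[:7])        # memory copy: 7 bytes duplicated into a new object
view = memoryview(src)[:7]     # view: refers to the same 7 bytes, nothing duplicated

src[0:7] = b"STORAGE"          # overwrite the source bytes in place

print(copied)                  # the copy keeps the bytes as they were when copied
print(bytes(view))             # the view reflects the bytes now in the source
```

A shared pool behaves like the view: both parties see the same bytes, so no n-byte transfer between two addresses is needed.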
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims (1)

1. A memory pool management method of a distributed storage system, characterized by comprising the following steps:
S1, start initialization of the distributed storage system and open the network card device;
S2, allocate memory, register the memory pool of the distributed storage system, and record the start address and size of the memory pool;
S3, register the network card memory pool, passing the start address of the distributed storage system's memory pool as the input parameter, organize the network card memory pool according to the data block size of the distributed storage system, record the offsets, and return a network card handle number;
S4, finish initialization of the distributed storage system;
when the distributed storage system reads data, the following operations are performed:
S11, the underlying storage device returns data to the distributed storage system;
S12, the distributed storage system splits and processes the data according to the data block size and copies it into the memory pool of the distributed storage system;
S13, using the recorded offset, find the start address of the data in the memory pool of the distributed storage system and the network card handle number;
S14, the network card sends the data to the remote end via RDMA, using the given network card handle number, memory start address, and memory block size;
when the distributed storage system writes data, the following operations are performed:
S21, the upper-layer application sends data to the network card memory pool of the distributed storage system via RDMA;
S22, using the network card handle number and the recorded offset, find the start address and memory block size of the data in the memory pool of the distributed storage system;
S23, the distributed storage system copies the data from its memory pool to the underlying storage device, using the memory start address and memory block size.
CN202110387038.9A 2021-04-12 2021-04-12 Memory pool management method of distributed storage system Pending CN114244857A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110387038.9A CN114244857A (en) 2021-04-12 2021-04-12 Memory pool management method of distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110387038.9A CN114244857A (en) 2021-04-12 2021-04-12 Memory pool management method of distributed storage system

Publications (1)

Publication Number Publication Date
CN114244857A true CN114244857A (en) 2022-03-25

Family

ID=80742814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110387038.9A Pending CN114244857A (en) 2021-04-12 2021-04-12 Memory pool management method of distributed storage system

Country Status (1)

Country Link
CN (1) CN114244857A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577716A (en) * 2009-06-10 2009-11-11 中国科学院计算技术研究所 Distributed storage method and system based on InfiniBand network
US20170293588A1 (en) * 2016-04-12 2017-10-12 Samsung Electronics Co., Ltd. Piggybacking target buffer address for next rdma operation in current acknowledgement message
CN105978985A (en) * 2016-06-07 2016-09-28 华中科技大学 Memory management method of user-state RPC over RDMA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
金浩; 梅君君: "Research on an RDMA transmission control method" (一种RDMA传输控制方法研究), 网络安全技术与应用 (Network Security Technology & Application), no. 05 *

Similar Documents

Publication Publication Date Title
CN110402568B (en) Communication method and device
US20190073132A1 (en) Method and system for active persistent storage via a memory bus
US7234006B2 (en) Generalized addressing scheme for remote direct memory access enabled devices
CN109388590B (en) Dynamic cache block management method and device for improving multichannel DMA (direct memory access) access performance
CN108600053B (en) Wireless network data packet capturing method based on zero copy technology
CN112632069B (en) Hash table data storage management method, device, medium and electronic equipment
CN113986791B (en) Method, system, equipment and terminal for designing intelligent network card fast DMA
EP4369171A1 (en) Method and apparatus for processing access request, and storage device and storage medium
CN109857545B (en) Data transmission method and device
CN117312201B (en) Data transmission method and device, accelerator equipment, host and storage medium
KR102471966B1 (en) Data input and output method using storage node based key-value srotre
CN113296691B (en) Data processing system, method and device and electronic equipment
KR100449806B1 (en) A network-storage apparatus for high-speed streaming data transmission through network
US6108694A (en) Memory disk sharing method and its implementing apparatus
CN114244857A (en) Memory pool management method of distributed storage system
US9069821B2 (en) Method of processing files in storage system and data server using the method
CN111338570B (en) Parallel file system IO optimization method and system
CN114265791A (en) Data scheduling method, chip and electronic equipment
CN110209343B (en) Data storage method, device, server and storage medium
Dalessandro et al. iSER storage target for object-based storage devices
CN111274189A (en) USB device and real-time communication method thereof
Liang et al. High performance block I/O for global file system (GFS) with infiniband RDMA
WO2024217333A1 (en) Io access method and apparatus based on block storage, and electronic device and medium
CN118034615B (en) Data access method and device
US20240152476A1 (en) Data access method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination