CN118312099A - Raid acceleration method based on GPU operation - Google Patents

Raid acceleration method based on GPU operation

Info

Publication number
CN118312099A
Authority
CN
China
Prior art keywords
kernel
data
raid
request
gpu
Prior art date
2024-04-12
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410439927.9A
Other languages
Chinese (zh)
Inventor
陆建海
苏健威
田军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baicifang Data Storage Technology Beijing Co ltd
Original Assignee
Baicifang Data Storage Technology Beijing Co ltd
Filing date
2024-04-12
Publication date
2024-07-09
Application filed by Baicifang Data Storage Technology Beijing Co ltd
Publication of CN118312099A

Abstract

The invention relates to the technical field of Raid data processing, and in particular to a Raid acceleration method based on GPU operation, which comprises a system architecture consisting of a server side, a kernel side and middleware; the server side is a user-level program that receives requests from the kernel, performs GPU operations using a GPU computing library, and finally returns the results to the kernel; the kernel side is a kernel-level program that sends Raid operation requests to the server side and receives the processed data; the middleware transmits data between the kernel side and the server side; the GPU is designed to process large amounts of data simultaneously, and compared with a traditional central processing unit (CPU) it has stronger parallel processing capability and can process more data at the same time, so the computing speed is remarkably improved; the GPU is also more computationally cost-effective than the CPU, which can reduce costs while improving performance.

Description

Raid acceleration method based on GPU operation
Technical Field
The invention relates to the technical field of Raid data processing, in particular to a Raid acceleration method based on GPU operation.
Background
RAID (Redundant Array of Independent Disks) is a storage technology that provides higher performance, capacity, and data redundancy by combining multiple independent physical hard disks.
The development of RAID technology can be traced back to the 1980s. At that time, hard disk capacities were relatively small, performance was limited, and data backup and redundancy were pressing problems. RAID emerged to address these issues.
RAID technology is widely used in data centers, servers, and mass storage systems to provide high performance, high availability, and data protection. It has become an important component of storage technology and has made an important contribution to the reliability and recoverability of data.
The basic principle of RAID is to combine multiple hard disks into one logical volume group and use specific data distribution and redundancy policies to provide higher performance and data protection. RAID technology is typically implemented by a hardware controller or software.
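To make the redundancy policy concrete: under Raid5, the parity block is the exclusive-or (XOR) of the data blocks in a stripe, which is what allows a missing disk to be rebuilt from the survivors. A minimal illustration (not code from the patent):

```c
/* Illustrative only: RAID5-style XOR parity. With data blocks A and B
 * and parity P = A ^ B, any one block can be rebuilt from the other two. */
#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint8_t a = 0x5A, b = 0xC3;
    uint8_t p = a ^ b;      /* parity written to the check disk     */
    assert((p ^ b) == a);   /* disk A lost: rebuild it from P and B */
    assert((p ^ a) == b);   /* disk B lost: rebuild it from P and A */
    return 0;
}
```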
Soft RAID (Software RAID): soft RAID is a RAID function implemented by the software of an operating system. The operating system provides RAID drivers and management tools that combine several independent hard disks into a logical volume group and realize data distribution and redundancy. Soft RAID typically relies on the CPU and memory of the host to perform RAID computation and management tasks. Its advantages are lower cost, easy configuration and management, and no need for additional hardware. However, the performance of soft RAID may be affected by the load on the host system, and RAID functionality may be limited or unusable in the event of a host failure.
Hard RAID (Hardware RAID): hard RAID is a RAID function implemented by a dedicated hardware RAID controller. A hardware RAID controller is a separate piece of hardware, typically built into a server, storage array, or disk controller. It has its own processor, cache, and interfaces, can independently execute RAID computation and management tasks, and reduces the burden on the host system. Hard RAID generally provides higher performance and stability, supports more RAID levels and advanced functions, and maintains RAID functionality in the event of a host failure. However, hard RAID is costly, requires additional hardware, and may require specialized management software to configure and monitor the RAID array.
Disclosure of Invention
The invention aims to provide a Raid acceleration method based on GPU operation that remedies the defects of the prior art.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
The Raid acceleration method based on GPU operation comprises a system architecture, wherein the system architecture comprises a server side, a kernel side and middleware;
the server side is a user-level program that receives requests from the kernel, performs GPU operations using a GPU computing library, and finally returns the results to the kernel;
the kernel side is a kernel-level program that sends Raid operation requests to the server side and receives the processed data;
the middleware transmits data between the kernel side and the server side;
The Raid acceleration method comprises the following steps:
Step one: compiling kernel codes based on a system version of a production server, wherein the generated kernel module replaces the original kernel module and is used for realizing a self-defined computing interface; the production server uses soft Raid, namely a mdadm soft Raid environment is created, and a Raid hard disk is generated for reading and writing;
Step two: when the system recovers and reads the Raid hard disk, the kernel module MD finally calls an exclusive or computing interface for hard disk deletion, and then returns after computing the data through a user-level program, i.e. a server;
Step three: when the system writes the Raid hard disk, the kernel module MD finally calls the exclusive-or computing interface, and then calculates the data through the server and returns the data;
Step four: when the system performs common reading on the Raid hard disk, an exclusive or computing interface is not required.
Further: the server side is used for calling the GPU computing library to execute operation, and the specific operation steps are as follows:
Step one: firstly, transmitting data to be processed to CUDA, wherein the CUDA is a GPU computing library special for NVIDIA, wherein the CUDA is transmitted by using cudaMemcpyAsync interfaces, and if the Raid to be processed at present is Raid5, the number of data disks is 2 and the number of check disks is 1, transmitting data of Len' 2 to the CUDA, wherein Len is data transmitted by a kernel, and the data is 4KB;
Step two: invoking a custom CUDA global function to perform operation, wherein the CUDA performs multi-core operation according to parameters when invoking the function, and if the number of currently used logic threads is 1024, each logic thread only needs to process exclusive OR of 4 bytes of data;
Step three: recall cudaMemcpyAsync returns the data from the CUDA to the current program.
Further: the kernel end is used for modifying the code of the kernel Raid part and comprises the following steps:
step one: the exclusive or code of Raid5 is mainly under kernel\linux-3.10.0-862.el7\ crypto\async_tx\async_xor.c, wherein the function called by the core is a do_sync_xor function, and the source code calls the CPU to carry out exclusive or, and instead calls the custom function gpu_sync_xor_blocks;
Step two: a request is generated in the gpu_sync_xor_blocks, the structure of the request comprises the type of the request, the length of the request, data to be processed, a check code of the request, and the request is written into a/dev/cgpuN file, wherein cgpuN is a communication channel file of the middleware, a plurality of/dev/cgpuN files are recorded in a kernel, and the request is sent by the kernel in a multithreading manner;
Step three: after the request is sent, the file is read/dev/cgpuN, the reading is synchronously blocked, the read result is processed data, when the processing error of the server side occurs, the reading returns an error code, the number of bytes read is 0, and the complete operation flow is completed.
Further: the middleware is used for transmitting data and comprises the following specific steps:
step one: the kernel is communicated with a user-level program, a character equipment file is used as a socket for communication, and an open source code library FUSE is selected;
Step two: the middleware establishes a communication file, wherein the communication file is the/dev/cgpuN file, and the communication between the kernel and the user-level program is realized by registering the read-write function of the file;
step three: when the writing interface receives the writing request, the writing interface detects whether all the writing is completed, if yes, the GPU operation of the server is called, and the code of the server is directly placed in the middleware;
Step four: when the reading interface receives a reading request, the reading interface waits for the GPU operation to finish and returns the data of the last request; when an error or overtime occurs, an error code is returned, and if the kernel does not call the read interface, the result of the processing is reserved until the next write request comes.
The invention has the beneficial effects that: the GPU is designed to process large amounts of data simultaneously and has thousands of processing cores, such as NVIDIA's CUDA cores, which enables it to perform multiple tasks or operations at once. Compared with a traditional central processing unit (CPU), the GPU has stronger parallel processing capability and can process more data at the same time, thereby remarkably improving the computing speed. The GPU also offers better performance per unit cost than the CPU, so costs can be reduced while performance is improved. In addition, the GPU has a clear price advantage over the array card: because of its specialized, high-performance design, the array card is generally more expensive than the GPU, so a large amount of purchasing cost can be saved.
Detailed Description
A Raid acceleration method based on GPU operation comprises a system architecture, wherein the system architecture comprises a server side, a kernel side and middleware.
The server side is a user-level program that receives requests from the kernel, performs GPU operations using a GPU computing library, and finally returns the results to the kernel;
the kernel side is a kernel-level program that sends Raid operation requests to the server side and receives the processed data;
the middleware transmits data between the kernel side and the server side.
Specifically, the server side mainly calls the GPU computing library to execute the operation.
In one embodiment, an exclusive-or operation performed on an NVIDIA graphics card serves as an example:
1. First, the data to be processed is transmitted to CUDA, NVIDIA's dedicated GPU computing library, using the cudaMemcpyAsync interface. For example, if the Raid currently being processed is Raid5 with 2 data disks and 1 check disk, then Len × 2 bytes of data are transmitted to CUDA, where Len is the length of the data passed by the kernel, generally 4 KB.
2. A custom CUDA global function is invoked to perform the operation; CUDA performs the multi-core operation according to the launch parameters. For example, if the number of logical threads currently used is 1024, each logical thread only needs to process the exclusive-or of 4 bytes of data.
3. cudaMemcpyAsync is called again to return the data from CUDA to the current program.
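A minimal sketch of this three-step flow is given below, assuming one CUDA stream and preallocated device buffers; the names xor_parity and gpu_xor are illustrative and do not come from the patent's source.

```cuda
// Sketch: RAID5 parity XOR for 2 data disks of 4 KB each, computed by
// 1024 logical threads, each XOR-ing one 4-byte word, as described above.
#include <cuda_runtime.h>
#include <stdint.h>

#define LEN   4096        // bytes per data disk passed by the kernel side
#define NDATA 2           // Raid5 here: 2 data disks + 1 check disk
#define WORDS (LEN / 4)   // 1024 four-byte words -> 1024 logical threads

__global__ void xor_parity(const uint32_t *data, uint32_t *parity)
{
    int i = threadIdx.x;                      // one word per thread
    parity[i] = data[i] ^ data[WORDS + i];    // XOR across the data disks
}

// Host side: async copy in, launch the custom global function, copy back.
// d_in (NDATA * LEN bytes) and d_out (LEN bytes) are device buffers the
// caller is assumed to have allocated with cudaMalloc.
void gpu_xor(const void *src, void *dst,
             uint32_t *d_in, uint32_t *d_out, cudaStream_t s)
{
    cudaMemcpyAsync(d_in, src, NDATA * LEN, cudaMemcpyHostToDevice, s);
    xor_parity<<<1, WORDS, 0, s>>>(d_in, d_out);
    cudaMemcpyAsync(dst, d_out, LEN, cudaMemcpyDeviceToHost, s);
    cudaStreamSynchronize(s);    // wait so dst is valid on return
}
```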
Specifically, the kernel side mainly modifies the code of the kernel Raid part. In one embodiment, the Raid5 code of the CentOS system serves as an example:
1. The exclusive-or code of Raid5 resides mainly under kernel/linux-3.10.0-862.el7/crypto/async_tx/async_xor.c, where the core function called is do_sync_xor; the original code calls the CPU to carry out the exclusive-or, and this call is replaced with the custom function gpu_sync_xor_blocks.
2. A request is generated in gpu_sync_xor_blocks; its structure includes the type of the request, the length of the request, the data to be processed, the check code of the request (not necessary, used for testing), and so on. The request is written to a /dev/cgpuN file, where cgpuN is a communication channel file of the middleware; several /dev/cgpuN files, such as /dev/cgpu0, /dev/cgpu1, /dev/cgpu2, and so on, are recorded in the kernel so that the kernel can send requests in a multithreaded manner.
3. After the request is sent, the /dev/cgpuN file is read; the read blocks synchronously, and the result read back is the processed data. When a server-side processing error occurs, the read returns an error code and the number of bytes read is 0. This completes the full operation flow.
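The request layout and the write-then-blocking-read exchange can be pictured as follows. This is a user-space analogue written with POSIX I/O for clarity (the patent's real code runs inside the kernel module), and the struct fields and opcode are assumptions, since the patent only names the request type, length, payload, and optional check code.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed layout of a middleware request; field names are illustrative. */
struct cgpu_request {
    uint32_t type;      /* operation type, e.g. an XOR opcode    */
    uint32_t len;       /* payload length in bytes               */
    uint32_t check;     /* optional check code, used for testing */
    uint8_t  data[];    /* blocks of the data disks follow       */
};

int main(void)
{
    size_t len = 2 * 4096;     /* two 4 KB data blocks (Raid5, 2+1) */
    struct cgpu_request *req = calloc(1, sizeof *req + len);
    uint8_t result[4096];      /* the computed parity block         */

    req->type = 1;             /* assumed XOR opcode                */
    req->len  = (uint32_t)len;
    /* ... fill req->data with the stripe's data blocks ... */

    int fd = open("/dev/cgpu0", O_RDWR);   /* one middleware channel */
    if (fd < 0) { perror("open"); return 1; }

    write(fd, req, sizeof *req + len);           /* send the request */
    ssize_t n = read(fd, result, sizeof result); /* blocks until the
                                                    GPU result is ready */
    if (n <= 0)
        fprintf(stderr, "server-side processing error\n"); /* 0 bytes */

    close(fd);
    free(req);
    return 0;
}
```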
Specifically, the middleware transmits the data, with the following specific steps:
1. To enable the kernel to communicate with the user-level program, a character device file can be used as the communication channel. In practice there are many mature solutions for this type of code; here the open-source library FUSE is selected. "FUSE" stands for "Filesystem in Userspace", an interface and toolset that allows a user to implement a file system in user space. Traditionally, file systems are implemented in the operating system kernel, whereas FUSE lets developers move the implementation into user space, giving more flexibility in developing and customizing the file system. Using FUSE, a developer interacts with the FUSE interface by writing a user-space program that realizes the custom file-system behavior.
2. The middleware establishes a communication file, namely the above-mentioned /dev/cgpuN file; communication between the kernel and the user-level program is then realized by registering read and write functions for this file.
3. When the write interface receives a write request, it checks whether the write has completed in full; if so, the GPU operation of the server side is invoked, the server-side code being placed directly in the middleware.
4. When the read interface receives a read request, it waits for the GPU operation to finish and returns the data of the last request. When an error or timeout occurs, an error code is returned, and if the kernel does not call the read interface, the result of this processing is preserved until the next write request arrives; since multiple channels, i.e. multiple /dev/cgpuN files, are used, thread safety of concurrent data access is not a concern.
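Within the libfuse toolset, character devices such as /dev/cgpuN are served by the CUSE ("character device in userspace") interface rather than the regular file-system API. A minimal skeleton of the middleware, modelled on libfuse's CUSE example, might look like the sketch below; the handler bodies, the device name cgpu0, and the gpu_xor hook are assumptions, not the patent's actual code.

```c
#define FUSE_USE_VERSION 31
#include <cuse_lowlevel.h>
#include <string.h>

static char   result[4096];    /* kept until the next write request */
static size_t result_len;

/* Write handler: receives a request from the kernel side; once it has
 * arrived in full, the server's GPU operation would be invoked here. */
static void cgpu_write(fuse_req_t req, const char *buf, size_t size,
                       off_t off, struct fuse_file_info *fi)
{
    (void)off; (void)fi;
    /* result_len = gpu_xor(buf, size, result);  assumed server call */
    result_len = size < sizeof(result) ? size : sizeof(result);
    memcpy(result, buf, result_len);             /* placeholder echo */
    fuse_reply_write(req, size);
}

/* Read handler: hands back the processed data of the last request; on
 * error or timeout the real code would reply with an error instead. */
static void cgpu_read(fuse_req_t req, size_t size, off_t off,
                      struct fuse_file_info *fi)
{
    (void)off; (void)fi;
    fuse_reply_buf(req, result, size < result_len ? size : result_len);
}

static const struct cuse_lowlevel_ops cgpu_ops = {
    .read  = cgpu_read,
    .write = cgpu_write,
};

int main(int argc, char *argv[])
{
    const char *dev_info_argv[] = { "DEVNAME=cgpu0" }; /* -> /dev/cgpu0 */
    struct cuse_info ci = {
        .dev_info_argc = 1,
        .dev_info_argv = dev_info_argv,
    };
    return cuse_lowlevel_main(argc, argv, &ci, &cgpu_ops, NULL);
}
```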
Specifically, the Raid acceleration method comprises the following steps:
1. Compile the kernel code against the system version of the production server, and replace the original kernel module with the generated one to realize the custom computing interface. The production server uses soft Raid, i.e. mdadm is used to create a soft-Raid environment, generally Raid5, and a Raid hard disk is generated for reading and writing.
2. When the system performs a recovery read of the Raid hard disk, a disk is typically missing; the kernel module MD ultimately calls the exclusive-or computing interface, and the data is computed by the user-level program, i.e. the server side, and then returned.
3. When the system writes to the Raid hard disk (the common case), the kernel module MD ultimately calls the exclusive-or computing interface, and the data is computed by the server side and then returned.
4. When the system performs an ordinary read of the Raid hard disk (also the common case), the exclusive-or computing interface is not required.
Since current RAID array cards, i.e. RAID controllers, are expensive, saving server cost is a question that every data-center deployment must consider, and replacing the CPU while effectively increasing operation speed is important. Raid operations consist mainly of a large number of logically simple but repetitive computations, for which a GPU is very well suited: it is designed to process large amounts of data simultaneously and has thousands of processing cores, such as NVIDIA's CUDA cores, enabling it to perform multiple tasks or operations at once. Compared with a traditional central processing unit (CPU), the GPU has stronger parallel processing capability and can process more data at the same time, thereby remarkably improving the computing speed. Furthermore, the GPU is more cost-effective than the CPU, so costs can be reduced while performance is improved. The GPU also has a clear price advantage over the array card, which, because of its specialized, high-performance design, is generally more expensive than the GPU. Replacing the CPU with the GPU to realize soft Raid is therefore a feasible scheme at present.
In view of the above, the present invention possesses the excellent characteristics described herein, can be used to improve on the performance and practicality of the prior art, and is a product of great practical value.
The foregoing is merely exemplary of the present invention and is not intended to limit it; those skilled in the art may make modifications to the specific embodiments and application scope in light of the teachings of the present invention.

Claims (4)

1. A Raid acceleration method based on GPU operation, comprising a system architecture, wherein the system architecture comprises a server side, a kernel side and middleware, characterized in that:
the server side is a user-level program that receives requests from the kernel, performs GPU operations using a GPU computing library, and finally returns the results to the kernel;
the kernel side is a kernel-level program that sends Raid operation requests to the server side and receives the processed data;
the middleware transmits data between the kernel side and the server side;
the Raid acceleration method comprises the following steps:
Step one: compile the kernel code against the system version of the production server; the generated kernel module replaces the original kernel module and implements a custom computing interface; the production server uses soft Raid, i.e. an mdadm soft-Raid environment is created and a Raid hard disk is generated for reading and writing;
Step two: when the system performs a recovery read of the Raid hard disk because a disk is missing, the kernel module MD ultimately calls the exclusive-or computing interface, and the data is computed by the user-level program, i.e. the server side, and then returned;
Step three: when the system writes to the Raid hard disk, the kernel module MD ultimately calls the exclusive-or computing interface, and the data is computed by the server side and then returned;
Step four: when the system performs an ordinary read of the Raid hard disk, the exclusive-or computing interface is not required.
2. The method according to claim 1, characterized in that the server side calls the GPU computing library to execute the operation, with the following specific steps:
Step one: first, transmit the data to be processed to CUDA, NVIDIA's dedicated GPU computing library, using the cudaMemcpyAsync interface; if the Raid currently being processed is Raid5 with 2 data disks and 1 check disk, then Len × 2 bytes of data are transmitted to CUDA, where Len is the length of the data passed by the kernel, which is 4 KB;
Step two: invoke a custom CUDA global function to perform the operation, wherein CUDA performs the multi-core operation according to the launch parameters; if the number of logical threads currently used is 1024, each logical thread only needs to process the exclusive-or of 4 bytes of data;
Step three: call cudaMemcpyAsync again to return the data from CUDA to the current program.
3. The method according to claim 2, characterized in that the kernel side modifies the code of the kernel Raid part, comprising the following steps:
Step one: the exclusive-or code of Raid5 resides mainly under kernel/linux-3.10.0-862.el7/crypto/async_tx/async_xor.c, where the core function called is do_sync_xor; the original code calls the CPU to carry out the exclusive-or, and this call is replaced with the custom function gpu_sync_xor_blocks;
Step two: a request is generated in gpu_sync_xor_blocks, the structure of the request comprising the type of the request, the length of the request, the data to be processed, and a check code for the request; the request is written into a /dev/cgpuN file, where cgpuN is a communication channel file of the middleware, and several /dev/cgpuN files are recorded in the kernel so that the kernel can send requests in a multithreaded manner;
Step three: after the request is sent, the /dev/cgpuN file is read; the read blocks synchronously, the result read back is the processed data, and when a server-side processing error occurs, the read returns an error code and the number of bytes read is 0, completing the full operation flow.
4. The method according to claim 3, characterized in that the middleware transmits the data, with the following specific steps:
Step one: for the kernel to communicate with the user-level program, a character device file is used as the communication channel, and the open-source library FUSE is selected;
Step two: the middleware establishes a communication file, namely the /dev/cgpuN file, and communication between the kernel and the user-level program is realized by registering read and write functions for this file;
Step three: when the write interface receives a write request, it checks whether the write has completed in full; if so, the GPU operation of the server side is invoked, the server-side code being placed directly in the middleware;
Step four: when the read interface receives a read request, it waits for the GPU operation to finish and returns the data of the last request; when an error or timeout occurs, an error code is returned, and if the kernel does not call the read interface, the result of this processing is retained until the next write request arrives.
CN202410439927.9A — filed 2024-04-12 — Raid acceleration method based on GPU operation — Pending

Publications (1)

Publication Number — Publication Date
CN118312099A — 2024-07-09



Legal Events

PB01 — Publication