WO2024113925A1

WO2024113925A1 - Storage optimization method and system, device, and readable storage medium

Info

Publication number: WO2024113925A1
Application number: PCT/CN2023/109987
Authority: WO
Inventors: 贾猛
Original assignee: 苏州元脑智能科技有限公司
Priority date: 2022-11-30
Filing date: 2023-07-28
Publication date: 2024-06-06
Also published as: CN115543222A; CN115543222B

Abstract

The present application relates to the field of computers, and in particular, to a storage optimization method and system, a device, and a non-volatile readable storage medium. The method comprises: establishing a node and disk allocation policy according to node distribution and the number of disks on a server; and on the basis of the allocation policy, allocating the nodes that execute operations on the disks, and optimizing the allocation policy on the basis of a load of a network interface card. By means of the storage optimization method provided by the present application, an NVME solid-state drive and processor core binding policy based on NUMA technology is provided. By means of the policy, performance optimization of an all-flash server is achieved, existing resources are reasonably utilized, there is no need to upgrade to better hardware, and the read/write IOPS, latency and other performance of a storage server can be effectively and greatly improved, thereby better facilitating cost saving and reasonable use of resources, and maximizing the value of the existing resources.

Description

A storage optimization method, system, device and readable storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to the Chinese patent application filed with the China Patent Office on November 30, 2022, with application number 202211515911.9, and application name “A Storage Optimization Method, System, Device and Readable Storage Medium”, all contents of which are incorporated by reference in this application.

Technical Field

The present application belongs to the field of computers, and specifically relates to a storage optimization method, system, device and non-volatile readable storage medium.

Background

Driven by advances in science and technology, the era of flash storage has fully arrived, and the large-scale adoption of flash arrays is now unstoppable. The deployment of artificial intelligence, big data, cloud computing, 5G (fifth-generation mobile communication technology), and the Internet of Things has brought an explosion of data and an unprecedented demand for extreme performance.

As the demand for all-flash storage gradually increases, how to improve the performance of all-flash servers and maximize their capabilities has become a huge challenge that major storage manufacturers need to face.

Summary of the invention

To solve the above problems, the present application proposes a storage optimization method, including:

Establish node and disk allocation strategy based on node distribution and number of disks on the server;

The nodes that perform operations on the disk are allocated based on the allocation policy, and the allocation policy is optimized based on the load of the network card.

In some implementations of the present application, establishing a node and disk allocation strategy based on the node distribution and the number of disks on the server includes:

In response to the number of nodes being greater than or equal to the number of disks, at least one node is allocated to each disk, and a corresponding mapping table is established based on the allocation relationship between the disks and the nodes.

In some implementations of the present application, allocating nodes that perform operations on disks based on an allocation strategy includes:

In response to receiving an operation request for a disk, the node assigned to the disk is confirmed through a mapping table, and the operation request is assigned to the node for execution.

In some embodiments of the present application, the method further comprises:

Identify the node on the server that is closest to the disk interface and set it as the disk priority node.

In some embodiments of the present application, the method further comprises:

In response to receiving an operation request for a disk, determining whether the operation request is sensitive to a delay requirement based on the type of application generating the operation request;

In response to the operation request being sensitive to the delay requirement, the operation request is assigned to the disk priority node for execution.

In some embodiments of the present application, the method further comprises:

In response to receiving an operation request for a disk, confirming whether the operation request is sensitive to a delay requirement based on a data operation mode of the operation request;

In some implementations of the present application, optimizing the allocation strategy based on the load of the network card includes:

Identify the nodes that process network card tasks, monitor the utilization of each core in the node in real time, and confirm the utilization of the node based on the utilization of each core;

The usage rate of the node is compared with a first predetermined threshold, and in response to the usage rate of the node being less than the first predetermined threshold, the operation request is allocated to the node according to a predetermined policy.

In some implementations of the present application, allocating an operation request to a node according to a predetermined strategy includes:

The operation requests are distributed to some cores of the node based on the utilization of the node according to a predetermined algorithm.

In some embodiments of the present application, the method further comprises:

In response to the usage rate of the node being greater than a first predetermined threshold, determining whether the usage rate of the node is greater than a second predetermined threshold;

In response to the usage rate of the node being greater than a second predetermined threshold, allocating the operation request to the node is prohibited.

In some embodiments of the present application, the method further comprises:

After allocating operation requests to some cores of a node, the utilization rate of the node is obtained in real time;

In response to the usage rate of the node exceeding the second predetermined threshold and the task requests of the network card to the node increasing, the allocation of operation requests to the node is suspended, and the task requests of the network card to the node are processed preferentially.

Another aspect of the present application further provides a storage optimization system, comprising:

The policy formulation module is used to establish node and disk allocation strategies based on the node distribution and the number of disks on the server;

The optimized execution module is used to allocate nodes that perform operations on the disk based on an allocation strategy, and optimize the allocation strategy based on the load of the network card.

In some embodiments of the present application, the policy formulation module is further configured to:

In response to receiving an operation request for a disk, confirming whether the operation request is sensitive to a delay requirement based on a type of application generating the operation request;

In some embodiments of the present application, the optimization execution module is further configured to:

Another aspect of the present application also provides a computer device, comprising:

at least one processor; and

The memory stores computer instructions executable on the processor, and when the instructions are executed by the processor, the steps of any one of the methods in the above-mentioned implementation manner are implemented.

Another aspect of the present application further provides a computer non-volatile readable storage medium, which stores a computer program. When the computer program is executed by a processor, the steps of any one of the methods in the above-mentioned embodiments are implemented.

The storage optimization method proposed in this application provides an NVME (Non-Volatile Memory Express) solid-state drive and processor core binding strategy based on NUMA (Non-Uniform Memory Access) technology. This strategy optimizes the performance of the all-flash server and makes reasonable use of existing resources. Without upgrading to better hardware, the read and write IOPS (Input/Output Operations Per Second) and latency of the storage server can be effectively and significantly improved, which helps save costs, use resources reasonably, and maximize the value of existing resources.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

FIG1 is a schematic diagram of an embodiment of a storage optimization method provided in an embodiment of the present application;

FIG2 is a schematic diagram of the structure of a storage optimization system provided in an embodiment of the present application;

FIG3 is a schematic diagram of the structure of a computer device provided in an embodiment of the present application;

FIG4 is a schematic diagram of the structure of a computer non-volatile readable storage medium provided in an embodiment of the present application;

FIG5 is a schematic diagram of an existing scheduling strategy for NVME hard disks, provided in an embodiment of the present application;

FIG6 is a schematic diagram of the scheduling method for NVME hard disks proposed in the present application.

Detailed Description

In order to make the objectives, technical solutions and advantages of the present application more clearly understood, the embodiments of the present application are further described in detail below in combination with specific embodiments and with reference to the accompanying drawings.

It should be noted that all expressions using "first" and "second" in the embodiments of the present application are for distinguishing two non-identical entities with the same name or non-identical parameters. It can be seen that "first" and "second" are only for the convenience of expression and should not be understood as limitations on the embodiments of the present application. The subsequent embodiments will not explain this one by one.

This application provides an optimization solution for all-flash storage, aiming to address the insufficient utilization of hardware resources in existing storage solutions. All-flash storage, as the name suggests, uses solid-state drives (SSDs) or other flash media to replace traditional hard disk drives (HDDs). Its most obvious feature is high IOPS, which all-flash technology initially achieved by using higher-performance SSDs. In traditional all-flash solutions, storage performance is usually improved by "stacking" hardware: increasing the hardware bandwidth of the all-flash platform, selecting SSDs with higher IOPS, or adopting a CPU platform with higher bandwidth and faster processing. In this approach, no effective optimization is made based on the characteristics of the hardware resources of the selected platform. Admittedly, the field of solid-state drives has developed rapidly: the write speed of consumer-grade solid-state drives has reached 7000 MB/s, and with the rapid iteration of flash memory chips, the bandwidth of the PCIE (Peripheral Component Interconnect Express) bus has also been upgraded. Improving all-flash storage therefore only requires fitting better solid-state drives to obtain higher IOPS, and research aimed at squeezing more performance out of the platform yields less than simply installing new solid-state drives. However, this brute-force stacking wastes the other resources of the all-flash platform, and new solid-state drives also mean higher costs.

As shown in FIG1 , to solve the above problem, the present application proposes a storage optimization method, including:

Step S1: Establish a node and disk allocation strategy based on the node distribution and the number of disks on the server;

Step S2: Allocate nodes that perform operations on the disk based on an allocation strategy, and optimize the allocation strategy based on the load of the network card.

In the embodiments of the present application, a node on a server refers to a unit of the server CPU containing several processor cores. Server-class multi-core processors generally have far more cores than desktop-class processors; for example, with improvements in the manufacturing process, core counts exceed 32. In the design, multiple cores are grouped into one node. Therefore, a node in the present application refers to a module containing multiple processor cores. The number of disks refers to the number of solid-state drives mounted on the server, which are generally solid-state drives using the NVME protocol, although other types of storage media can also be used.

Therefore, in step S1, the number of NVME solid-state drives mounted on the server and the node distribution of the CPU on the server are determined. The node distribution includes the number of nodes and the number of cores. An allocation strategy for NVME solid-state drive data processing is then formulated from the number of CPU nodes and cores, that is, which processor cores access which NVME solid-state drives and under which access policy.

It should be noted that in the traditional implementation, the node closest to the NVME drive interface is responsible for the data operations of the NVME solid-state drive. As described above, a CPU with a large number of cores groups several cores into a node, and a CPU contains several nodes. In the layout of the CPU chip, one or more nodes are necessarily closest to the peripheral interface. For example, as shown in Figure 5, the CPU selected in the embodiment of the present application is the Haiguang CS5250H, which has 128 processing cores organized into 8 nodes of 16 processor cores each, and node 1 and node 4 are closest to the interface connected to the NVME solid-state drives. Under the traditional implementation, node 1 and node 4 serve as the processing nodes of the NVME solid-state drives; that is, all operations on the NVME solid-state drives are processed by the cores of node 1 and node 4. When an operation on an NVME solid-state drive is generated, the corresponding storage software sends the operation request to node 1 or node 4.

In the present application, core binding based on NUMA (Non-Uniform Memory Access, a multiprocessor memory architecture in which memory access time depends on the location of the memory relative to the processor) is performed in software, and each processor node is bound to its corresponding solid-state drive through NUMA.

Therefore, in step S1, the corresponding allocation strategy needs to be determined according to the number of nodes on the server and the number of mounted NVME hard disks. In this step, the number of nodes and cores of the processor on the server can be obtained by software detection, and then an allocation strategy suitable for the server can be generated according to a general allocation method, or the user can manually set the allocation strategy of the processor node and NVME solid-state hard disk as needed.
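The software detection of nodes and cores described above can be sketched as follows. This is a minimal illustration, not the implementation disclosed in the application: it assumes the topology is available as Linux-style CPU list strings (the format used by files such as /sys/devices/system/node/node0/cpulist), and the function names `parse_cpulist` and `node_distribution`, as well as the contiguous core numbering in the example, are hypothetical.

```python
def parse_cpulist(text):
    """Expand a Linux-style CPU list such as "0-3,8,10-11" into sorted core IDs."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cores.update(range(int(low), int(high) + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def node_distribution(cpulists):
    """Turn {node_id: cpulist string} into {node_id: [core IDs]}."""
    return {node: parse_cpulist(text) for node, text in cpulists.items()}


# Assumed layout mirroring the example CPU: 8 nodes of 16 cores each,
# with cores numbered contiguously (an assumption, not stated in the text).
example = {n: f"{16 * n}-{16 * n + 15}" for n in range(8)}
dist = node_distribution(example)
print(len(dist), "nodes,", len(dist[0]), "cores per node")
```

The resulting dictionary gives both quantities the text calls the node distribution: the number of nodes (its length) and the number of cores per node.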

In step S2, after the allocation strategy has been formulated for the server nodes and NVME solid-state drives, when an operation request for a solid-state drive is generated, the request is sent to the processor node specified by the allocation strategy, and a processor core in that node executes it.
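The application does not name the mechanism used to make a node's cores execute a request. One plausible way to approximate it on Linux is to restrict the worker process that handles a drive to the cores of its assigned node using Python's `os.sched_setaffinity`. The sketch below is an assumption in that spirit; the function name `pin_current_process` is hypothetical, and CPU affinity alone does not bind memory to the node.

```python
import os


def pin_current_process(node_cores):
    """Restrict the calling process to the given cores, where the platform allows it.

    Returns the set of cores actually applied, or None when pinning is not
    possible (no affinity API, no overlap with the allowed cores, or denied).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity API is not available on this platform
    allowed = os.sched_getaffinity(0)       # cores this process may currently use
    target = allowed & set(node_cores)      # only request cores we are allowed to use
    if not target:
        return None
    try:
        os.sched_setaffinity(0, target)     # pid 0 means the calling process
    except OSError:
        return None
    return target
```

A worker serving a drive bound to a node whose cores are 16 to 31 would call `pin_current_process(range(16, 32))` once at start-up.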

Furthermore, in this application, the server is a server for all-flash services, and its workload is not limited to traffic between NVME solid-state drives. Most of the data of an all-flash server comes from the network, and network data must be received and sent through the network card. The network card's data must in turn be processed by the CPU, which consumes CPU capacity. Therefore, after the processor nodes have been allocated based on NUMA technology and the allocation strategy, a change in network card load occupies some processor nodes, and the network card and the NVME solid-state drives compete for processor computing resources. It is therefore necessary to dynamically adjust the NVME drive allocation strategy according to the load of the network card. For example, when the load of the network card is high, the operation requests of the NVME solid-state drive that shares a processor node with the network card are allocated to its processor node.

In this embodiment, if the number of processor nodes is greater than or equal to the number of NVME solid-state drives, it is recommended to allocate a processor node to each NVME solid-state drive. For example, Figure 6 shows the situation where the number of processor nodes exactly matches the number of mounted NVME solid-state drives, and each node is bound to one NVME solid-state drive through NUMA technology. A mapping table between the processor nodes and the NVME solid-state drives is then created.
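A minimal sketch of building such a mapping table is shown below. The text only requires that each drive receive at least one node when nodes are at least as numerous as drives; handing out surplus nodes round-robin is an assumption made here for illustration, and `build_mapping_table` and the drive names are hypothetical.

```python
def build_mapping_table(node_ids, disk_ids):
    """Assign at least one node to every disk; requires len(node_ids) >= len(disk_ids)."""
    if not disk_ids:
        return {}
    if len(node_ids) < len(disk_ids):
        raise ValueError("this sketch only covers the case nodes >= disks")
    table = {disk: [] for disk in disk_ids}
    # Round-robin: the first len(disk_ids) nodes give every disk one node,
    # and any surplus nodes are spread over the disks (assumed policy).
    for i, node in enumerate(node_ids):
        table[disk_ids[i % len(disk_ids)]].append(node)
    return table


# Eight nodes and eight drives, as in the one-to-one case of Figure 6.
table = build_mapping_table(list(range(8)), [f"nvme{i}" for i in range(8)])
print(table["nvme3"])  # [3]
```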

In this embodiment, when the system or upper-layer application generates an operation request for a certain NVME solid-state drive, the NVME solid-state drive accessed by the operation request is confirmed based on the operation request, and then the processing node corresponding to the NVME solid-state drive is found according to the mapping table, and the operation request is allocated to the corresponding processor node through NUMA technology.
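The lookup-and-dispatch step can be illustrated with a small in-memory dispatcher. This is a sketch under stated assumptions: the mapping table is a plain dictionary from drive name to a list of node IDs, per-node queues stand in for actual execution on the node's cores, and the class name `Dispatcher` and the round-robin choice among several nodes of one drive are hypothetical.

```python
from collections import defaultdict


class Dispatcher:
    """Route each operation request to a node assigned to its target drive."""

    def __init__(self, mapping_table):
        self.mapping_table = mapping_table   # drive name -> list of node IDs
        self.queues = defaultdict(list)      # node ID -> pending requests
        self._cursor = defaultdict(int)      # per-drive round-robin position

    def submit(self, disk, request):
        nodes = self.mapping_table[disk]     # raises KeyError for an unknown drive
        node = nodes[self._cursor[disk] % len(nodes)]
        self._cursor[disk] += 1
        self.queues[node].append(request)
        return node


dispatcher = Dispatcher({"nvme0": [0], "nvme1": [1, 2]})
print(dispatcher.submit("nvme0", "read block 42"))  # 0
```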

In some embodiments of the present application, the method further comprises:

In this embodiment, as before, there are multiple processing nodes in the processor on the server. When arranging these processing nodes, there must be a node closest to the NVME solid-state drive interface. Therefore, in this embodiment, the processor node closest to the NVME solid-state drive interface is used as the disk priority node.
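Selecting the disk priority node reduces to finding the node with the smallest distance to the drive interface. The sketch below assumes such distances are already known as numbers per node; how they are measured is not specified in the text, and the function name and the distance values are hypothetical, chosen so that nodes 1 and 4 come out nearest as in the eight-node example.

```python
def disk_priority_nodes(distance_to_disk_interface):
    """Return the IDs of the node or nodes nearest to the drive interface."""
    if not distance_to_disk_interface:
        return []
    best = min(distance_to_disk_interface.values())
    return sorted(n for n, d in distance_to_disk_interface.items() if d == best)


# Hypothetical distances for an eight-node CPU.
distances = {0: 2, 1: 1, 2: 2, 3: 3, 4: 1, 5: 2, 6: 3, 7: 3}
print(disk_priority_nodes(distances))  # [1, 4]
```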

In some embodiments of the present application, the method further comprises:

In this embodiment, when an operation request for any NVME solid-state drive is received, it can be determined whether the application generating the operation request is sensitive to delay. For example, if it is a query service of a database type application, the operation request can be sent to the disk priority node for processing. Even if the NVME solid-state drive to be accessed by the operation request is not bound to the disk priority node in the pre-allocation strategy, the operation request can still be allocated to the disk priority node. The operation request is directed to the corresponding disk priority node through the NUMA command.
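As an illustration of routing by application type, the sketch below overrides the pre-allocated node with the disk priority node whenever the requesting application is classed as delay-sensitive. The set of sensitive types, the label `"database_query"`, and the function name `choose_node` are assumptions; the text gives database query services only as an example.

```python
# Hypothetical labels for application types treated as delay-sensitive.
LATENCY_SENSITIVE_APP_TYPES = {"database_query"}


def choose_node(app_type, disk, mapping_table, priority_node):
    """Pick the node for a request: priority node if delay-sensitive, else the bound node."""
    if app_type in LATENCY_SENSITIVE_APP_TYPES:
        # Applies even when the drive is bound to a different node.
        return priority_node
    return mapping_table[disk][0]


mapping = {"nvme0": [0], "nvme6": [6]}
print(choose_node("database_query", "nvme6", mapping, priority_node=1))  # 1
print(choose_node("backup", "nvme6", mapping, priority_node=1))          # 6
```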

In some embodiments of the present application, the method further comprises:

In response to receiving an operation request for a disk, confirming whether the operation request is sensitive to a delay requirement based on a data operation mode of the operation request;

In this embodiment, confirming whether the operation request is sensitive to the delay requirement based on its data operation mode means determining whether the operation request, when executed, would occupy the processor node for a long time. For example, for a write or read, it is determined whether the size of the data written or read meets a preset value. If the data is small, the operation request can be processed on the disk priority node, depending on the usage rate of that node. If the data is large, such as a transfer of a large number of files, the request is refused use of the disk priority node.
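The size-based check can be sketched as a two-part test. Both constants below are assumed values, since the text mentions only "a preset value" for the data size and does not say how the usage rate of the disk priority node limits admission; the function name is also hypothetical.

```python
SMALL_IO_BYTES = 128 * 1024      # assumed preset size limit for "small data"
PRIORITY_NODE_MAX_USAGE = 0.50   # assumed usage ceiling for the priority node


def may_use_priority_node(io_bytes, priority_node_usage):
    """Decide whether a read or write may run on the disk priority node."""
    if io_bytes > SMALL_IO_BYTES:
        return False   # large transfers would occupy the node for a long time
    # Small requests are admitted only while the priority node has headroom.
    return priority_node_usage < PRIORITY_NODE_MAX_USAGE


print(may_use_priority_node(4096, 0.30))           # True
print(may_use_priority_node(512 * 1024**2, 0.10))  # False
```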

In this embodiment, as described above, the present application binds each processor node to its corresponding NVME solid-state drive through NUMA. If a processor node that has been assigned the tasks of an NVME solid-state drive also processes network card tasks, the utilization rate of each core of the processor node is obtained, and the utilization rate of the processor node is confirmed from it. If the utilization rate of the processor node is lower than 50%, the number of the node's cores allocated to the corresponding NVME solid-state drive is adjusted through NUMA based on the binary method. Assuming that the processor node has 16 cores, 8 cores are allocated to the NVME solid-state drive for binding.

If the processor node usage is still low, for example, less than 25%, the remaining 4 processor cores can be allocated to the corresponding NVME solid-state drives through NUMA.
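One reading of the binary method is a halving rule: half of the node's cores are lent to the drive when usage is below 50%, and half of the remainder when usage is below 25%. The sketch below implements that interpretation, which is an assumption; the function name `cores_to_lend` and the cap of two halving rounds are hypothetical and simply match the two thresholds mentioned in the text.

```python
def cores_to_lend(total_cores, node_usage, max_rounds=2):
    """Number of a node's cores to bind to the drive under a halving rule.

    Round k lends total_cores / 2**k cores if node_usage is below 1 / 2**k.
    """
    lent = 0
    share = total_cores // 2
    threshold = 0.5
    for _ in range(max_rounds):
        if share < 1 or node_usage >= threshold:
            break
        lent += share
        share //= 2      # next round lends half as many cores
        threshold /= 2   # and requires usage below half the previous threshold
    return lent


print(cores_to_lend(16, 0.40))  # 8
print(cores_to_lend(16, 0.20))  # 12
print(cores_to_lend(16, 0.60))  # 0
```

With 16 cores, usage of 40% yields the 8 cores of the example, and usage of 20% adds a further 4 for a total of 12.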

In some embodiments of the present application, the method further comprises:

In this embodiment, if the usage rate of the processor node exceeds 90%, in this case, sending operation requests to the processor node will be stopped.
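The two thresholds can be combined into a single admission decision. In this sketch the node usage is taken as the mean of the per-core usages, which is an assumption because the text does not say how the per-core figures are aggregated. The behavior between the two thresholds is also not described, so the sketch leaves the allocation unchanged there. All names are hypothetical.

```python
from statistics import mean

FIRST_THRESHOLD = 0.50    # below this, spare cores may be lent to the drive
SECOND_THRESHOLD = 0.90   # above this, no requests are sent to the node


def node_usage(core_usages):
    """Aggregate per-core usage into one figure (mean is an assumed choice)."""
    return mean(core_usages)


def admission(usage):
    """Classify a node shared with the network card by its usage rate."""
    if usage < FIRST_THRESHOLD:
        return "lend_cores"
    if usage > SECOND_THRESHOLD:
        return "reject"
    return "no_change"    # assumed: the band in between is not specified


print(admission(node_usage([0.2] * 16)))  # lend_cores
print(admission(0.95))                    # reject
```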

In some embodiments of the present application, the method further comprises:

In this embodiment, after allocating operation requests to the processor node processing the network data of the network card, the state of the processor node is monitored, and if the utilization rate of the processor node exceeds 90%, the allocation of operation requests to the processor node is suspended.
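The monitoring behavior can be sketched as a small gate that is updated on every sample. It suspends allocation when usage exceeds the second threshold while network card tasks are rising, in line with the summary of this method. Resuming once usage drops back to the threshold or below is an assumption, as are the class name `NicNodeGate` and the use of a simple task count to detect rising network load.

```python
class NicNodeGate:
    """Track whether drive requests may be sent to a node that serves the network card."""

    def __init__(self, second_threshold=0.90):
        self.second_threshold = second_threshold
        self.suspended = False
        self._last_nic_tasks = None

    def update(self, usage, nic_tasks):
        """Feed one monitoring sample; return True while allocation is suspended."""
        nic_rising = (
            self._last_nic_tasks is not None and nic_tasks > self._last_nic_tasks
        )
        self._last_nic_tasks = nic_tasks
        if usage > self.second_threshold and nic_rising:
            self.suspended = True     # give priority to network card tasks
        elif usage <= self.second_threshold:
            self.suspended = False    # assumed resume condition
        return self.suspended


gate = NicNodeGate()
print(gate.update(0.50, nic_tasks=10))  # False
print(gate.update(0.95, nic_tasks=20))  # True
```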

The storage optimization method proposed in this application provides an NVME solid-state drive and processor core binding strategy based on NUMA technology. This strategy optimizes the performance of the all-flash server and makes reasonable use of existing resources. Without upgrading to better hardware, the read and write IOPS and latency of the storage server can be effectively and significantly improved, which helps save costs, use resources reasonably, and maximize the value of existing resources.

As shown in FIG. 2 , another aspect of the present application further provides a storage optimization system, including:

A strategy formulation module 1, used to establish a node and disk allocation strategy based on the node distribution and the number of disks on the server;

The optimized execution module 2 is used to allocate nodes that perform operations on the disk based on an allocation strategy, and optimize the allocation strategy based on the load of the network card.

In some embodiments of the present application, the policy formulation module 1 is further configured to:

In some embodiments of the present application, the optimization execution module 2 is further configured to:

As shown in FIG3 , another aspect of the present application further provides a computer device, comprising:

at least one processor 21; and

The memory 22 stores computer instructions 23 that can be run on the processor 21. When the instructions 23 are executed by the processor 21, a storage optimization method is implemented, including:

In some embodiments of the present application, establishing a node and disk allocation strategy based on the node distribution and the number of disks on the server includes:

In some embodiments of the present application, the method further comprises:

As shown in FIG. 4 , another aspect of the present application further provides a computer non-volatile readable storage medium 401, wherein the computer non-volatile readable storage medium 401 stores a computer program 402, and when the computer program 402 is executed by a processor, a storage optimization method is implemented, including:

In some embodiments of the present application, the method further comprises:

The above are exemplary embodiments disclosed in the present application, but it should be noted that various changes and modifications may be made without departing from the scope disclosed in the embodiments of the present application as defined in the claims. The functions, steps and/or actions of the method claims according to the disclosed embodiments described herein do not need to be performed in any particular order. In addition, although the elements disclosed in the embodiments of the present application may be described or required in individual form, they may also be understood as multiple unless explicitly limited to the singular.

It should be understood that, as used herein, the singular forms "a", "an" are intended to include the plural forms as well, unless the context clearly supports an exception. It should also be understood that, as used herein, "and/or" refers to any and all possible combinations including one or more of the associated listed items.

The serial numbers of the embodiments disclosed in the above-mentioned embodiments of the present application are only for description and do not represent the advantages or disadvantages of the embodiments.

A person skilled in the art will appreciate that all or part of the steps to implement the above embodiments may be accomplished by hardware or by instructing related hardware through a program, and the program may be stored in a computer non-volatile readable storage medium, and the above-mentioned storage medium may be a read-only memory, a disk or an optical disk, etc.

It should be understood by those skilled in the art that the discussion of any of the above embodiments is only exemplary and is not intended to imply that the scope of the disclosure of the embodiments of the present application (including the claims) is limited to these examples. Under the idea of the embodiments of the present application, the technical features in the above embodiments or in different embodiments can also be combined, and there are many other variations of the different aspects of the embodiments described above, which are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of the present application should be included in the protection scope of the embodiments of the present application.

Claims

A storage optimization method, characterized by comprising:

Establish node and disk allocation strategy based on node distribution and number of disks on the server;

The nodes that perform operations on the disk are allocated based on the allocation strategy, and the allocation strategy is optimized based on the load of the network card.
The method according to claim 1, characterized in that the step of establishing a node and disk allocation strategy based on the node distribution and the number of disks on the server comprises:

In response to the number of nodes being greater than or equal to the number of disks, at least one node is allocated to each disk, and a corresponding mapping table is established based on the allocation relationship between the disks and the nodes.
The method according to claim 2, characterized in that the allocating nodes that perform operations on the disk based on the allocation strategy comprises:

In response to receiving an operation request for a disk, the node assigned to the disk is confirmed through the mapping table, and the operation request is assigned to the node for execution.
The method according to claim 1, further comprising:

A node closest to the disk interface among the nodes on the server is identified, and the node is used as a disk priority node.
The method according to claim 4, further comprising:

In response to receiving an operation request for a disk, confirming whether the operation request is sensitive to a delay requirement based on the type of application generating the operation request;

In response to the fact that the operation request is sensitive to a delay requirement, the operation request is allocated to the disk priority node for execution.
The method according to claim 5, further comprising:

In response to receiving an operation request for a disk, confirming whether the operation request is sensitive to a delay requirement based on a data operation mode of the operation request;

In response to the fact that the operation request is sensitive to a delay requirement, the operation request is allocated to the disk priority node for execution.
The method according to claim 1, characterized in that the optimizing the allocation strategy based on the load of the network card comprises:

Confirm the node that processes the network card task, monitor the utilization rate of each core in the node in real time, and confirm the utilization rate of the node based on the utilization rate of each core;

The usage rate of the node is compared with a first predetermined threshold, and in response to the usage rate of the node being less than the first predetermined threshold, the operation request is allocated to the node according to a predetermined policy.
The method according to claim 7, characterized in that the allocating operation requests to the nodes according to a predetermined strategy comprises:

The operation requests are distributed to part of the cores of the node based on the utilization rate of the node according to a predetermined algorithm.
The method according to claim 7, further comprising:

In response to the usage rate of the node being greater than a first predetermined threshold, determining whether the usage rate of the node is greater than a second predetermined threshold;

In response to the usage rate of the node being greater than a second predetermined threshold, allocating operation requests to the node is prohibited.
The method according to claim 8, further comprising:

After allocating the operation request to some cores of the node, obtaining the utilization rate of the node in real time;

In response to the usage rate of the node exceeding a second predetermined threshold and the task requests from the network card to the node increasing, allocating operation requests to the node is suspended, and the task requests from the network card to the node are processed preferentially.
The method according to claim 1, wherein the node is a processor node, the disk is a non-volatile memory architecture solid state disk, and the optimization of the allocation strategy based on the load of the network card further comprises:

The operation request of the non-volatile memory architecture solid state drive that shares a processor node with the network card is allocated to the processor node.
The method according to claim 5, characterized in that, in response to the operation request being sensitive to a delay requirement, allocating the operation request to the disk priority node for execution, further comprises:

When the operation request is a query service of a database type application, the operation request is sent to the disk priority node for processing.
The method according to claim 8, wherein the node is a processor node, the disk is a non-volatile memory architecture solid state disk, and the allocating operation requests to some cores of the node based on the usage rate of the node according to a predetermined algorithm comprises:

When the utilization rate of the processor node is lower than 50%, the allocated quantity of the processor node and the solid state disk corresponding to the non-volatile memory architecture is adjusted through the non-uniform memory access architecture based on the binary method.
The method according to claim 13, characterized in that when the utilization rate of the processor node is lower than 50%, adjusting the allocation quantity of the processor node and the solid state drive corresponding to the non-volatile memory architecture through the non-uniform memory access architecture based on the binary method comprises:

When the utilization rate of the processor node is lower than 50% and the processor node has 16 cores, 8 cores are allocated to the non-volatile memory architecture solid state drive for binding.
The method according to claim 14, further comprising:

When the processor node utilization rate is lower than 25%, the remaining four cores are allocated to the corresponding non-volatile memory architecture solid state drives through a non-uniform memory access architecture.
The method according to claim 9, wherein the node is a processor node, and in response to the usage rate of the node being greater than a second predetermined threshold, prohibiting the allocation of operation requests to the node, further comprises:

When the utilization rate of the processor node exceeds 90%, sending the operation request to the processor node is stopped.
The method according to claim 1, characterized in that the node distribution includes the number of nodes and the number of cores.
A storage optimization system, characterized by comprising:

A strategy formulation module, the strategy formulation module is used to establish a node and disk allocation strategy based on the node distribution and the number of disks on the server;

The optimized execution module is used to allocate nodes that perform operations on the disk based on the allocation strategy, and optimize the allocation strategy based on the load of the network card.
A computer device, comprising:

at least one processor; and

A memory storing computer instructions executable on the processor, wherein the instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 17.
A computer non-volatile readable storage medium, wherein the computer non-volatile readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 17.