CN113014662A - Data processing method and storage system based on NVMe-oF protocol - Google Patents

Publication number
CN113014662A
Authority
CN
China
Legal status
Pending
Application number
CN202110266793.1A
Other languages
Chinese (zh)
Inventor
张胜玉
Current Assignee
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The disclosure relates to a data processing method and a storage system based on the NVMe-oF protocol. The data processing method, applied to a first node in the storage system, comprises: receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read or written, and the read-write control information is used for determining a target storage node of the target data; determining that a second node is the target storage node; and sending a second read-write request to the second node, wherein the second read-write request carries second client identification information, and the second client identification information is used for performing a read-write operation on the second node so as to write the target data into the second node or read the target data from the second node. When data is read or written, the method can greatly reduce the volume of data transmitted between storage nodes, reduce the bandwidth consumed between storage nodes, and reduce read-write latency.

Description

Data processing method and storage system based on NVMe-oF protocol
Technical Field
The disclosure relates to the technical field of distributed file storage, and in particular to a data processing method and a storage system based on the NVMe-oF protocol.
Background
In distributed storage and in dual-controller and multi-controller storage, volumes (defined storage spaces) are exported to clients by means of NVMe over Fabrics (NVMe-oF) so that the clients can use them. A client can mount the same volume from multiple storage nodes (nodes for short) at the same time; that is, the client can access any node in the storage system, and high availability of applications is achieved through multipath: when one node stops working, the other nodes can still provide service. Multipath operating modes mainly include polling (round robin) and failover. Polling may perform a read of the memory space to determine whether a new I/O (a write or read of data) is available for processing. In polling mode, the client sends different I/Os to the connected nodes in turn according to its own needs. Under the standard protocol the client does not know where a data block is located, so data may have to be forwarded between nodes, which consumes additional bandwidth and increases latency. For example, suppose a client is connected to two nodes at the same time through NVMe over Fabrics over RDMA and, following its multipath policy, writes data to node 1. If the data block resides on node 2, or needs to be replicated to node 2, the data must be forwarded from node 1 to node 2. Forwarding data between nodes requires more inter-node bandwidth, and it also delays the write or read of the data.
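The forwarding cost described above can be made concrete with a minimal sketch. It counts the payload bytes that cross the inter-node link when a client sends I/Os in round-robin order without knowing where each block lives. The placement list, the 4096-byte payload size, and the function name `forwarded_bytes` are hypothetical and chosen only for illustration.

```python
from itertools import cycle


def forwarded_bytes(ios, nodes):
    """Return the payload bytes forwarded between nodes under round-robin multipath.

    ios   -- list of (owner_node, payload_size); owner_node is where the block lives
    nodes -- node names the client is connected to, visited in round-robin order
    """
    total = 0
    for (owner, size), entry in zip(ios, cycle(nodes)):
        # The client picks the entry node without knowing the owner, so a
        # mismatch forces the entry node to forward the whole payload.
        if entry != owner:
            total += size
    return total


# Four 4096-byte writes; the first two blocks live on node1, the last two on node2.
ios = [("node1", 4096), ("node1", 4096), ("node2", 4096), ("node2", 4096)]
print(forwarded_bytes(ios, ["node1", "node2"]))  # 8192
```

In this toy placement, half of the I/Os land on the wrong node, so half of the payload has to be sent a second time over the inter-node link.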
Disclosure of Invention
The embodiments of the disclosure provide a data processing method and a storage system based on the NVMe-oF protocol, which can solve the above problems in the prior art.
According to one aspect of the present disclosure, there is provided a data processing method applied to a first node in a storage system based on the NVMe-oF protocol, the storage system further including a second node different from the first node, the first node and the second node being connected, the method including:
receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data;
determining that the second node is the target storage node;
and sending a second read-write request to the second node, wherein the second read-write request carries second client identification information, and the second client identification information is used for performing read-write operation on the second node and writing the target data into the second node or reading the target data from the second node.
In some embodiments, the read-write control information includes at least one of read-write position information of the target data to be read-written, a memory key, a memory length, a target write or read offset, a type of a data block, or a number of data blocks.
In some embodiments, determining that the second node is the target storage node comprises:
searching the second node from a preset node-data block mapping table according to the read-write control information, wherein the node-data block mapping table comprises each storage node and corresponding data block information in the storage system;
and determining the second node corresponding to the second control information in the read-write control information as the target storage node.
In some embodiments, the method further comprises:
acquiring the second client identification information based on the determined second node;
and packaging the second client identification information and second control information corresponding to the second node in the read-write control information to generate the second read-write request.
In some embodiments, the method further comprises:
receiving a read-write execution result of the target data returned by the second node, wherein the read-write execution result is generated after the read-write operation is completed;
and sending the read-write execution result to a client.
In some embodiments, there are multiple second nodes, and sending a second read-write request to the second node includes:
and sending the corresponding second read-write request to each second node, wherein the second read-write request carries second client identification information corresponding to the second node and second control information corresponding to the second node, and the second control information is obtained from the read-write control information carried by the first read-write request.
In some embodiments, the method further comprises:
determining that the first node is the target storage node;
acquiring first client identification information corresponding to the first node;
and performing read-write operation on the first node based on the first client identification information, and writing the target data into the first node or reading the target data from the first node.
According to one of the aspects of the present disclosure, there is also provided a data processing method applied to a second node in a storage system based on the NVMe-oF protocol, where the storage system further includes a first node different from the second node, and the first node and the second node are connected, the method including:
receiving a second read-write request sent by the first node, wherein the second read-write request carries second client identification information;
based on the second client identification information, performing read-write operation on the second node, and writing target data into the second node or reading the target data from the second node;
the second read-write request is determined based on a first read-write request sent to the first node, the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data.
In some embodiments, the method further comprises:
and after the read-write operation is finished, sending the read-write execution result of the target data to the first node.
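The second-node behaviour summarized above can be sketched as a toy model. The sketch assumes that the second client identification information is a handle naming a connection to the client, and it stands in for that connection with a plain dictionary acting as the client's buffer. The class `SecondNode`, the request fields, and the handle string `conn-client-node2` are all hypothetical.

```python
class SecondNode:
    """Toy model of a second node; all names here are hypothetical."""

    def __init__(self, client_channels):
        # Maps a client identification handle to that client's buffer (a dict
        # of block id -> bytes), standing in for an RDMA or TCP connection.
        self.client_channels = client_channels
        self.blocks = {}

    def handle(self, request):
        """Execute a second read-write request and return the execution result."""
        channel = self.client_channels[request["client_id_info"]]
        block = request["block"]
        if request["op"] == "write":
            # Fetch the payload straight from the client, not from the first node.
            self.blocks[block] = channel[block]
        elif request["op"] == "read":
            # Deliver the payload straight to the client.
            channel[block] = self.blocks[block]
        else:
            raise ValueError(f"unknown operation: {request['op']}")
        # Only this small result travels back to the first node.
        return {"block": block, "op": request["op"], "ok": True}


client_buffer = {"block2": b"payload"}
node2 = SecondNode({"conn-client-node2": client_buffer})
result = node2.handle(
    {"client_id_info": "conn-client-node2", "block": "block2", "op": "write"}
)
print(result["ok"], node2.blocks["block2"])
```

The point of the sketch is the data path: the payload moves between the client buffer and the second node, while the first node only sees the request and the returned result.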
According to one of the aspects of the present disclosure, there is also provided a storage system based on the NVMe-oF protocol, including a first node and a second node, the first node being connected with the second node, wherein,
the first node is configured to:
receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data;
determining that the second node is the target storage node;
sending a second read-write request to the second node, where the second read-write request carries second client identification information, and the second client identification information is used to perform read-write operation on the second node, and write the target data into the second node or read the target data from the second node;
the second node is configured to:
receiving a second read-write request sent by the first node, wherein the second read-write request carries second client identification information;
based on the second client identification information, performing read-write operation on the second node, and writing target data into the second node or reading the target data from the second node;
the second read-write request is determined based on a first read-write request sent to the first node, the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data.
According to one of the aspects of the present disclosure, there is also provided a computer-readable storage medium having stored thereon computer-executable instructions, which, when executed by a processor, implement the above-mentioned data processing method.
In the data processing method and the storage system based on the NVMe-oF protocol provided in various embodiments of the present disclosure, when data is read or written and the second node is determined to be the target storage node, the first node may, based on the received first read-write request, send to the second node the second client identification information that enables data transmission between the second node and the client. The second node can then, according to the second client identification information, write the client's target data directly into the second node, or read the target data from the second node and send it to the client, so that the target data is read or written quickly on the corresponding node in the storage system. Target data with a large data volume no longer needs to be forwarded between storage nodes, which greatly reduces the volume of data transmitted between storage nodes, reduces the bandwidth consumed between storage nodes, and also reduces read-write latency.
Drawings
FIG. 1 illustrates a system architecture diagram of distributed storage of an embodiment of the present disclosure;
FIG. 2 shows a flow diagram of a data processing method of an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of data (including request data and target data) transmission in the data processing method according to the embodiment of the disclosure;
FIG. 4 is a schematic diagram of another data (including request data and target data) transmission of the data processing method according to the embodiment of the disclosure;
FIG. 5 shows a flow diagram of another data processing method of an embodiment of the present disclosure;
FIG. 6 shows a data (including request data and target data) transmission diagram of another data processing method of an embodiment of the present disclosure;
FIG. 7 shows a schematic structural diagram of a storage system based on the NVMe-oF protocol according to an embodiment of the present disclosure.
Detailed Description
Various aspects and features of the disclosure are described herein with reference to the drawings.
It will be understood that various modifications may be made to the embodiments of the present application. Accordingly, the foregoing description should not be construed as limiting, but merely as exemplifications of embodiments. Other modifications will occur to those skilled in the art within the scope and spirit of the disclosure.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the disclosure and, together with a general description of the disclosure given above, and the detailed description of the embodiments given below, serve to explain the principles of the disclosure.
These and other characteristics of the present disclosure will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It is also to be understood that although the present disclosure has been described with reference to certain specific examples, those skilled in the art will be able to ascertain many other equivalents to the present disclosure.
The above and other aspects, features and advantages of the present disclosure will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present disclosure are described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely examples of the disclosure, which may be embodied in various forms. Well-known and/or repeated functions and structures have not been described in detail so as not to obscure the present disclosure with unnecessary or redundant detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure.
FIG. 1 shows a system architecture diagram of distributed storage according to an embodiment of the present disclosure. As shown in FIG. 1, the system architecture includes a client 110 and a storage system 120. Here, the client 110 may act as a client node. The storage system 120 includes a plurality of storage nodes and, in this embodiment, may include at least a first node 121 and a second node 122. The client 110 and the storage nodes (e.g., first node 121, second node 122) in the storage system 120 may support the NVMe over Fabrics protocol to enable writing or reading of data on the storage system 120. When data from the client 110 is written into the storage system 120, the written data file is usually divided into a plurality of data blocks, and the data blocks are distributed across the nodes; when the client 110 reads data from the storage system 120, each data block is read from the node that holds it.
In the NVMe-oF system architecture, the client 110 is responsible for initiating reading and writing of data, the storage node in the storage system 120 is responsible for receiving and executing commands sent by the client 110, and the transmission of the commands and data is realized through multiple paths between the client 110 and the storage node.
NVMe (Non-Volatile Memory Express) transport is an abstract protocol layer intended to provide reliable delivery of NVMe commands and data. To support networked storage in the data center, NVMe over Fabrics (NVMe-oF) extends the NVMe standard, originally defined on the PCIe bus, to network fabrics. NVMe-oF uses a message-based model to send requests and responses between a host and a target storage device over a network.
NVMe over Fabrics provides a way to access NVMe other than PCIe, namely over fabrics, and adds no more than 10 μs of latency between NVMe hosts (clients 110) and NVMe storage targets (storage nodes). NVMe over Fabrics replaces PCIe transport with fabric technologies such as RDMA or Fibre Channel (FC). It supports mapping NVMe onto multiple fabric transport options, mainly FC, InfiniBand, RoCE v2, iWARP, and TCP. Among these options, InfiniBand, RoCE v2 (routable RoCE), and iWARP support RDMA and are ideal fabrics.
RDMA (Remote Direct Memory Access) is a memory access technology that allows a computer to directly access the memory of another computer without time-consuming processing by the processor. RDMA moves data quickly from one system into the memory of a remote system without affecting the operating system. Put simply, with the relevant hardware and network technology, the network card of the client 110 can directly read and write the memory of the storage node, achieving high bandwidth, low latency, and low resource utilization. For example, the InfiniBand-based NVMe mentioned above uses RDMA and tends to attract high-performance computing workloads that require very high bandwidth and low latency.
The embodiments of the disclosure solve the following problems of the existing NVMe-oF system architecture: when data is accessed, it cannot be transmitted directly to the corresponding storage node and must be forwarded between nodes, so more bandwidth is needed between the nodes, bandwidth is lost, and the writing or reading (I/O) of the data is delayed.
It should be noted that FIG. 1 is only an example of an architecture to which the embodiments of the present disclosure can be applied, intended to help those skilled in the art understand the technical content of the present disclosure; it does not mean that the embodiments of the present disclosure cannot be applied to other devices or systems.
FIG. 2 shows a flow diagram of a data processing method of an embodiment of the present disclosure; FIG. 3 shows a data transmission diagram of a data processing method according to an embodiment of the present disclosure. As shown in FIG. 2 and FIG. 3, an embodiment of the present disclosure provides a data processing method, which is applied to a first node in a storage system based on the NVMe-oF protocol, where the storage system further includes a second node different from the first node, and the first node and the second node are connected, where the method includes:
S201: receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data.
The first read-write request comes from a client, which may include a host. The client 110 may send a read-write request to the storage system; the request asks to connect to storage nodes in the storage system so that the client's data can be written to the corresponding storage nodes or the required data can be read from them.
In this embodiment, the first read-write request is sent to the first node first; that is, the first node serves as the access node for the first read-write request and determines the target storage node of the target data to be read or written. The target data may be divided into at least one data block, and each data block has a corresponding target storage node. For example, a first data block may be stored on the first node and a second data block on the second node, in which case both the first node and the second node are target storage nodes of the target data.
The read-write control information comprises at least one of read-write position information of the target data to be read-written, a memory key, a memory length, a target write-in or read-out offset, a type of a data block or the number of the data blocks, so as to determine a target storage node of the target data to be read-written.
The read-write position information of the target data is the target access address of each data block, and the target storage node of the target data can be directly determined through the read-write position information of the target data.
When the data blocks are stored on different target storage nodes, and the memory key and/or memory length of each data block is distinct and serves as identification information for that data block, the target storage node of each data block may be determined based on its memory key and/or memory length.
The target write or read offset refers to an offset of each data block at a target access address, and a target storage node of each data block in the target data can also be determined based on the target write or read offset.
In this embodiment, the target storage node of the target data may be quickly determined by obtaining the type or the number of the data blocks, for example, when the number of the data blocks is one and the data blocks can only be stored on one storage node, after the first node is determined to be the target storage node, it is not necessary to consider whether the second node is the target storage node.
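The fields listed above can be pictured as a small record. The following is a minimal sketch, not the format used by the disclosure or by NVMe-oF: the class name, field names, and field types are hypothetical, and every field is optional because the text only requires "at least one of" them to be present.

```python
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass(frozen=True)
class ReadWriteControlInfo:
    """Hypothetical container for the read-write control information."""

    position: Optional[int] = None       # read-write position (target access address)
    memory_key: Optional[int] = None     # memory key identifying the data block
    memory_length: Optional[int] = None  # memory length of the data block
    offset: Optional[int] = None         # target write or read offset
    block_type: Optional[str] = None     # type of the data block
    block_count: Optional[int] = None    # number of data blocks

    def is_valid(self) -> bool:
        # "At least one of" the fields must be supplied; an offset of 0 counts.
        return any(value is not None for value in astuple(self))


print(ReadWriteControlInfo(offset=0, block_count=1).is_valid())
print(ReadWriteControlInfo().is_valid())
```

Checking against `None` rather than truthiness matters here, since a legitimate offset or position can be zero.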
S202: determining that the second node is the target storage node.
The first node parses the received first read-write request and determines the target storage node of the target data. When it determines that the second node is the target storage node, the corresponding data block of the target data needs to be written to the second node or read from the second node. In this embodiment, the second node may be any node other than the first node; that is, one or more second nodes may serve as target storage nodes.
In some embodiments, determining that the second node is the target storage node in step S202 includes:
Step S2021: searching the second node from a preset node-data block mapping table according to the read-write control information, wherein the node-data block mapping table comprises each storage node and corresponding data block information in the storage system;
Step S2022: determining the second node corresponding to the second control information in the read-write control information as the target storage node.
Specifically, the read-write control information includes control information corresponding to each target storage node, and if the first node is a target storage node, the read-write control information includes first control information corresponding to the first node; and if the second node is the target storage node, the read-write control information comprises second control information corresponding to the second node.
After the first node obtains the read-write control information of the target data, the read-write control information can be matched with a preset node-data block mapping table, and a second node corresponding to the read-write control information is searched.
If the read-write control information includes a second node and the information (for example, the data block 2) of the target data to be stored in the second node matches with the information of the data block that can be stored by the second node in the node-data block mapping table, it is determined that the second node is found from the node-data block mapping table and is a target storage node.
During the search, the second node is determined to be the target storage node only if the storage relationship between the second node and the data block in the read-write control information fully matches the node-data block relationship in the mapping table; in that case the read-write control information contains second control information corresponding to the second node. For example, if the read-write control information names data block 1 as the block to be accessed on the second node, but the mapping table associates the second node with data block 2, then no second node corresponding to the second control information exists in the preset node-data block mapping table, and the second node is not a target storage node. In other words, the first node extracts the read-write control information related to the second node from the first read-write request and matches it against the preset node-data block mapping table to decide whether the second node is a target storage node.
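The matching step can be sketched as a table lookup. This is a minimal sketch under stated assumptions: both the read-write control information and the node-data block mapping table are modelled as dictionaries from node name to a set of block identifiers, and "matching" is taken to mean that every block the request names for a node appears in that node's table entry. The function name and the data layout are hypothetical.

```python
def find_target_nodes(control_info, node_block_table):
    """Return the nodes whose requested blocks all appear in the mapping table.

    control_info     -- dict: node name -> set of block ids the request names for it
    node_block_table -- dict: node name -> set of block ids the node may hold
    """
    targets = []
    for node, wanted in control_info.items():
        allowed = node_block_table.get(node, set())
        # A node qualifies only if the request names at least one block for it
        # and every named block is listed for that node in the table.
        if wanted and wanted <= allowed:
            targets.append(node)
    return targets


table = {"node1": {"block1"}, "node2": {"block2"}}
print(find_target_nodes({"node2": {"block2"}}, table))  # ['node2']
print(find_target_nodes({"node2": {"block1"}}, table))  # []
```

The second call mirrors the example in the text: the request names data block 1 for the second node while the table lists data block 2, so the second node is not a target storage node.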
In the embodiment of the present disclosure, each storage node may store one or more data blocks, and the same data block may be stored in one or more storage nodes as long as the corresponding node-data block mapping relationship is satisfied.
In some embodiments, when the storage nodes in the storage system or the data block information that can be stored on them changes, the storage system may update the node-data block mapping table promptly. After obtaining the first read-write request, the first node may consult the updated node-data block mapping table in real time to determine the target storage node of the target data, which ensures that the target storage node is determined accurately.
In addition, the storage system can use algorithms such as Raft and Paxos to meet the consistency requirements of storage across the storage nodes and to ensure the integrity of target data reads and writes.
S203: sending a second read-write request to the second node, wherein the second read-write request carries second client identification information, and the second client identification information is used for performing read-write operation on the second node and writing the target data into the second node or reading the target data from the second node.
The second client identification information is connection information between the second node and the client, so that the second node establishes data transmission connection with the client, and writes target data into the second node or reads the target data from the second node.
In a specific implementation, a client-node mapping table may be constructed based on the data transmission relationship between the client and each storage node in the storage system. The table stores the data connection mode between the client and each storage node. The connection may be an RDMA (Remote Direct Memory Access) connection, which enables direct data access; in that case the second client identification information may be the RDMA connection channel between the second node and the client. The connection may also be a TCP/IP connection. For example, the NVMe/TCP protocol (NVMe over TCP) may be used; although it may introduce some network latency, it places lower requirements on hardware than RDMA and is therefore simpler and more efficient to deploy.
When data is read or written and the second node is determined to be the target storage node, the first node may, based on the received first read-write request, send to the second node the second client identification information that enables data transmission between the second node and the client. The second node can then, according to the second client identification information, write the client's target data directly into the second node, or read the target data from the second node and send it to the client, so that the target data is read or written quickly on the corresponding node in the storage system. The target data is usually several hundred bytes to several megabytes in size, whereas the first and second read-write requests are control commands of only a few to a few tens of bytes. The target data therefore no longer needs to be forwarded between storage nodes (for example, a read-write request containing the target data no longer has to be sent to the first node first), which greatly reduces the volume of data transmitted between storage nodes, reduces the bandwidth consumed between storage nodes, and also reduces read-write latency.
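The size comparison above can be worked through with concrete numbers. The figures below are illustrative assumptions chosen inside the ranges the text gives: a 1 MiB payload and a 64-byte control command. The function name is hypothetical.

```python
def inter_node_bytes(payload, control, forward_payload):
    """Bytes crossing the link between the first and second node for one I/O."""
    # The control command always travels; the payload travels only when the
    # first node has to forward the target data itself.
    return control + (payload if forward_payload else 0)


payload = 1 * 1024 * 1024  # assumed target data size: 1 MiB
control = 64               # assumed control command size: 64 bytes

with_forwarding = inter_node_bytes(payload, control, forward_payload=True)
control_only = inter_node_bytes(payload, control, forward_payload=False)
print(with_forwarding, control_only, with_forwarding / control_only)
# 1048640 64 16385.0
```

Under these assumed sizes the inter-node traffic for one I/O drops by a factor of about sixteen thousand; smaller payloads give a smaller factor, so the benefit depends on the workload.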
In some embodiments, after determining that the second node is the target storage node through step S202, the method further comprises:
acquiring the second client identification information based on the determined second node;
and packaging the second client identification information and second control information corresponding to the second node in the read-write control information to generate the second read-write request.
Specifically, after the second node is determined to be the target storage node, the second client identification information of the second node is acquired from a preset client-node mapping table, and the second control information corresponding to the second node is extracted from the read-write control information. The second control information and the second client identification information are then encapsulated into a second read-write request, which is sent to the second node. After receiving the second read-write request, the second node establishes a preliminary link with the client based on the second control information in the request, thereby determining the target data to be read or written, and performs the read-write operation based on the second client identification information in the request.
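The encapsulation step just described can be sketched as follows. The sketch assumes the client-node mapping table is a dictionary from node name to a connection handle and that the read-write control information is a dictionary keyed by node; the `SecondRequest` class, its fields, and the handle string are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecondRequest:
    """Hypothetical layout of a second read-write request."""

    node: str
    client_id_info: str  # handle naming the connection between node and client
    control_info: dict   # the part of the control information meant for this node


def build_second_request(node, rw_control_info, client_node_table):
    """Encapsulate client identification and per-node control info for one node."""
    client_id_info = client_node_table[node]  # look up the client-node mapping table
    control_info = rw_control_info[node]      # extract the second control information
    return SecondRequest(node, client_id_info, control_info)


client_node_table = {"node2": "conn-client-node2"}
rw_control_info = {"node2": {"block": "block2", "offset": 4096, "op": "write"}}
request = build_second_request("node2", rw_control_info, client_node_table)
print(request.client_id_info, request.control_info["block"])
```

Note that the request carries no payload: it contains only the connection handle and the control fields, which is what keeps it small.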
As shown in fig. 2 and 3, in some embodiments, the method further comprises:
S204: receiving a read-write execution result of the target data returned by the second node, wherein the read-write execution result is generated after the read-write operation is completed;
S205: sending the read-write execution result to a client.
It should be noted that, because the storage nodes in the storage system are strongly consistent, a write operation is considered complete only when all data blocks have been written. That is, completion of the read-write operation means that the read-write operation has been completed for every data block of the target data.
In the embodiment of the present disclosure, the read-write execution result is first sent to the first node for confirmation and is then sent to the client as the response to the first read-write request, which ensures the integrity of the entire read-write operation.
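The "complete only when every block is done" rule can be sketched as a small aggregation performed at the first node before it answers the client. The result format and the `"ok"` status string are assumptions made for illustration; `overall_result` is a hypothetical helper.

```python
def overall_result(expected_blocks, block_results):
    """Combine per-block results into the single result returned to the client.

    expected_blocks: block identifiers that make up the target data.
    block_results:   mapping of block identifier -> status reported by a node.
    """
    expected = set(expected_blocks)
    reported = set(block_results)
    # Blocks for which no node has reported a result yet.
    missing = sorted(expected - reported)
    # Blocks whose node reported anything other than success.
    failed = sorted(b for b in expected & reported if block_results[b] != "ok")
    # The operation counts as completed only if nothing is missing or failed.
    return {"completed": not missing and not failed, "missing": missing, "failed": failed}


print(overall_result(["blk-a", "blk-b"], {"blk-a": "ok", "blk-b": "ok"}))
print(overall_result(["blk-a", "blk-b"], {"blk-a": "ok"}))
```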
In some embodiments, there are a plurality of second nodes, and sending the second read-write request to the second node in step S203 comprises:
sending the corresponding second read-write request to each second node, wherein each second read-write request carries the second client identification information corresponding to that second node and the second control information corresponding to that second node, and the second control information is obtained from the read-write control information carried by the first read-write request.
When there are a plurality of second nodes serving as target storage nodes, the second client identification information of each second node may be acquired, and the second control information corresponding to each second node may be extracted from the read-write control information carried by the first read-write request. The second client identification information and the second control information corresponding to each node are encapsulated to generate a second read-write request for that node, and the second read-write requests are sent to their respective second nodes. That is, each second read-write request carries the information of a data block to be read or written, the read-write address information of the target storage node (second node) corresponding to that data block, and the connection information (for example, an RDMA connection channel) that enables the data block to be transmitted to or from the corresponding target storage node, so that each second node can perform the corresponding data read-write operation after receiving its second read-write request.
In the prior art, all of the read-write control information in the first read-write request is forwarded directly to every second node. In this embodiment, by contrast, each second node is sent only its own second read-write request, which effectively reduces the volume of data transmitted between the first node and the second nodes. In addition, because the second read-write request for each second node is generated at the first node, a second node no longer has to search read-write control information covering multiple second nodes for its own second control information, which reduces the read-write delay and effectively improves data processing efficiency.
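The per-node splitting can be sketched as a grouping step at the first node. This is an illustrative sketch only: the descriptor fields, the block-to-node lookup, and the helper `split_by_node` are assumptions, not structures defined by the disclosure.

```python
from collections import defaultdict


def split_by_node(client, block_descriptors, block_to_node, client_node_map):
    """Build one second read-write request per target node."""
    grouped = defaultdict(list)
    for descriptor in block_descriptors:
        # Route each block descriptor to the node that stores the block.
        grouped[block_to_node[descriptor["block_id"]]].append(descriptor)
    # Each node receives only its own descriptors plus its own client
    # identification information, never the full control information.
    return {
        node_id: {
            "client_identification": client_node_map[(client, node_id)],
            "control_info": descriptors,
        }
        for node_id, descriptors in grouped.items()
    }


# Assumed example data.
descriptors = [
    {"block_id": "blk-b", "offset": 0, "length": 512},
    {"block_id": "blk-c", "offset": 512, "length": 512},
    {"block_id": "blk-d", "offset": 1024, "length": 256},
]
block_to_node = {"blk-b": "node-2", "blk-c": "node-2", "blk-d": "node-3"}
client_node_map = {("client-1", "node-2"): "conn-12", ("client-1", "node-3"): "conn-13"}

requests = split_by_node("client-1", descriptors, block_to_node, client_node_map)
for node_id, request in requests.items():
    print(node_id, request)
```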
As shown in FIG. 4, in some embodiments, the method further comprises:
S206: determining that the first node is the target storage node;
S207: acquiring first client identification information corresponding to the first node;
S208: performing a read-write operation on the first node based on the first client identification information, to write the target data into the first node or read the target data from the first node.
Since the first node may itself be a target storage node, in this embodiment of the present disclosure, after the first node is determined to be the target storage node, the first client identification information may be obtained from the client-node mapping table so as to establish a data transmission connection with the client, and the target data is then written into or read from the first node.
After step S208 is completed, the first node may directly generate a read-write execution result and send it to the client.
Step S206 specifically includes the following steps:
S2061: searching for the first node in a preset node-data block mapping table according to the read-write control information;
S2062: determining the found first node, which corresponds to the first control information in the read-write control information, as the target storage node.
Specifically, the read-write control information related to the first node may be extracted from the first read-write request, and the first node corresponding to the first control information may be looked up in the preset node-data block mapping table based on that read-write control information. The method for determining that the first node is the target storage node is similar to the method for determining that the second node is the target storage node, and is not described again here.
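The lookup in the node-data block mapping table, and the resulting split between blocks the first node serves itself and blocks it delegates, can be sketched as follows. The table layout and the helpers `find_target_nodes` and `plan` are hypothetical; the disclosure states only that the table contains each storage node and its corresponding data block information.

```python
# Hypothetical node-data block mapping table: node -> blocks stored on it.
NODE_BLOCK_MAP = {
    "node-1": {"blk-a"},
    "node-2": {"blk-b", "blk-c"},
}


def find_target_nodes(requested_blocks, node_block_map):
    """Return {node: [blocks]} for every node holding a requested block."""
    requested = set(requested_blocks)
    targets = {}
    for node_id, blocks in node_block_map.items():
        hits = sorted(requested & blocks)
        if hits:
            targets[node_id] = hits
    return targets


def plan(first_node_id, requested_blocks, node_block_map):
    """Separate blocks served locally from blocks delegated to other nodes."""
    targets = find_target_nodes(requested_blocks, node_block_map)
    # Blocks stored on the first node are handled by the first node itself.
    local = targets.pop(first_node_id, [])
    return {"local": local, "delegated": targets}


print(plan("node-1", ["blk-a", "blk-c"], NODE_BLOCK_MAP))
```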
The embodiments of the present disclosure are particularly applicable to a scenario in which a read-write request is sent to the wrong node. For example, when target data needs to be written into a second node but the data read-write request is sent to a first node by mistake, the first node can parse the request to determine the target storage node (the second node). After the target storage node is determined, the client identification information of the target storage node is obtained from the preset client-node mapping table, so that the target storage node establishes a data transmission connection with the client and the data read-write operation is performed on the target storage node, without the first node having to receive the target data and then retransmit it to the target storage node.
FIG. 5 shows a flow diagram of another data processing method according to an embodiment of the present disclosure; FIG. 6 shows a data transmission diagram of that data processing method. As shown in FIG. 5 and FIG. 6, an embodiment of the present disclosure further provides a data processing method applied to a second node in a storage system based on the NVMe-oF protocol, where the storage system further includes a first node different from the second node, and the first node and the second node are connected. The method includes:
S501: receiving a second read-write request sent by the first node, wherein the second read-write request carries second client identification information;
S502: based on the second client identification information, performing a read-write operation on the second node, to write target data into the second node or read the target data from the second node;
wherein the second read-write request is determined based on a first read-write request sent to the first node, the first read-write request carries read-write control information of the target data to be read or written, and the read-write control information is used for determining a target storage node of the target data.
When a first read-write request from a client is sent to the first node, the first node determines, according to the read-write control information of the target data carried in the first read-write request, that the second node is the target storage node of at least one data block of the target data. The first node then sends a second read-write request to the second node, so that the second node establishes a data transmission connection with the client and the target data is written into or read from the second node.
The second read-write request also carries second control information corresponding to the second node. After the second node is determined to be the target storage node, the second control information can be extracted from the read-write control information carried by the first read-write request and sent to the second node as a data read-write command; after receiving the command, the second node performs the read-write operation on itself using the second client identification information.
With the data processing method provided by the embodiment of the present disclosure, when the first node determines from the received first read-write request that the second node is the target storage node, the second node obtains from the first node the second client identification information that enables data transmission between the second node and the client. According to the second client identification information, the second node can write the client's target data directly into the second node, or read the target data from the second node and send it to the client.
In some embodiments, the method further comprises:
S503: after the read-write operation is completed, sending the read-write execution result of the target data to the first node.
After the read-write operation on the second node is completed, the second node generates a read-write execution result and sends it to the first node for confirmation. Once the first node has confirmed the read-write execution result, the result can also be sent to the client for confirmation, which completes the entire data read-write process.
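Steps S501 to S503 can be sketched from the second node's point of view. This is a simplified in-memory simulation under stated assumptions: the direct client connection is modeled as a dictionary of buffers keyed by the client identification information, and the class `SecondNode`, its field names, and the request layout are hypothetical rather than taken from the disclosure.

```python
class SecondNode:
    def __init__(self, node_id, client_channels):
        self.node_id = node_id
        self.blocks = {}                        # local block storage
        self.client_channels = client_channels  # identification -> client buffers

    def handle(self, request):
        """Process a second read-write request and return the execution result."""
        # S501: the request carries the client identification information.
        channel = self.client_channels[request["client_identification"]]
        control = request["control_info"]
        block_id = control["block_id"]
        # S502: move the data directly between the client and this node.
        if control["op"] == "write":
            self.blocks[block_id] = channel["to_node"]
        elif control["op"] == "read":
            channel["to_client"] = self.blocks[block_id]
        else:
            raise ValueError(f"unknown operation: {control['op']}")
        # S503: this result is what would be reported back to the first node.
        return {"node_id": self.node_id, "block_id": block_id, "op": control["op"], "status": "ok"}


# Assumed example: the client has staged b"hello" for writing.
channels = {"conn-12": {"to_node": b"hello", "to_client": None}}
node = SecondNode("node-2", channels)

write_result = node.handle(
    {"client_identification": "conn-12", "control_info": {"op": "write", "block_id": "blk-b"}}
)
read_result = node.handle(
    {"client_identification": "conn-12", "control_info": {"op": "read", "block_id": "blk-b"}}
)
print(write_result, read_result)
```

In this sketch the first node never touches the payload; it would only receive the returned result dictionaries.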
FIG. 7 shows a schematic structural diagram of a storage system based on the NVMe-oF protocol according to an embodiment of the present disclosure. As shown in FIG. 7, an embodiment of the present disclosure provides a storage system based on the NVMe-oF protocol, including a first node 701 and a second node 702, where the first node 701 is connected to the second node 702, and wherein,
the first node 701 is configured to:
receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data;
determining that the second node is the target storage node;
sending a second read-write request to the second node, where the second read-write request carries second client identification information, and the second client identification information is used to perform read-write operation on the second node, and write the target data into the second node or read the target data from the second node;
the second node 702 is configured to:
receiving a second read-write request sent by the first node, wherein the second read-write request carries second client identification information;
based on the second client identification information, performing read-write operation on the second node, and writing target data into the second node or reading the target data from the second node;
the second read-write request is determined based on a first read-write request sent to the first node, the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data.
The storage system based on the NVMe-oF protocol provided in the embodiment of the present disclosure corresponds to the data processing methods applied to the first node and the second node in the above embodiments. Based on those data processing methods, a person skilled in the art can understand the specific implementation of the storage system in the embodiment of the present disclosure and its various variations; any optional feature of the data processing method embodiments also applies to the storage system and is not described again here.
In the embodiment of the present disclosure, each of the first node 701 and the second node 702 may be an independent device, that is, each node includes its own processor and memory. The memory stores computer-executable instructions and target data, and the processor, when executing the computer-executable instructions, implements the data processing method applied to the first node 701 or the second node 702.
In other embodiments, the storage system 600 is a device such as a stand-alone host or a data server, the first node 701 and the second node 702 are storage modules or components respectively disposed therein, the storage system further includes a memory distinct from the first node 701 and the second node 702, and a processor coupled to the first node 701 and the second node 702 respectively, the memory is used for storing computer-executable instructions, and the processor is configured to call the first node 701 and/or the second node 702 to execute the data processing method when executing the computer-executable instructions.
The processor may be a general-purpose processor, such as a central processing unit (CPU) or a network processor (NP); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The memory may include random access memory (RAM) and may also include non-volatile memory, such as at least one disk memory.
The embodiment of the present disclosure also provides a computer-readable storage medium, on which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the data processing method is implemented.
The above embodiments are merely exemplary embodiments of the present disclosure and are not intended to limit the present disclosure, the scope of which is defined by the claims. Various modifications and equivalents may occur to those skilled in the art within the spirit and scope of the present disclosure, and such modifications and equivalents are considered to fall within the scope of the present disclosure.

Claims (10)

1. A data processing method applied to a first node in a storage system based on NVMe-oF protocol, the storage system further including a second node different from the first node, the first node and the second node being connected, the method comprising:
receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data;
determining that the second node is the target storage node;
and sending a second read-write request to the second node, wherein the second read-write request carries second client identification information, and the second client identification information is used for performing read-write operation on the second node and writing the target data into the second node or reading the target data from the second node.
2. The method of claim 1, wherein the read-write control information comprises at least one of read-write location information, a memory key, a memory length, a target write or read offset, a type of data block, or a number of data blocks of the target data to be read-written.
3. The method of claim 1, wherein determining that the second node is the target storage node comprises:
searching the second node from a preset node-data block mapping table according to the read-write control information, wherein the node-data block mapping table comprises each storage node and corresponding data block information in the storage system;
and determining the second node corresponding to the second control information in the read-write control information as the target storage node.
4. The method of claim 3, wherein the method further comprises:
acquiring the second client identification information based on the determined second node;
and packaging the second client identification information and second control information corresponding to the second node in the read-write control information to generate the second read-write request.
5. The method of claim 1, wherein the method further comprises:
receiving a read-write execution result of the target data returned by the second node, wherein the read-write execution result is generated after the read-write operation is completed;
and sending the read-write execution result to a client.
6. The method of claim 1, wherein there are a plurality of second nodes, and sending a second read-write request to the second node comprises:
and sending the corresponding second read-write request to each second node, wherein the second read-write request carries second client identification information corresponding to the second node and second control information corresponding to the second node, and the second control information is obtained from the read-write control information carried by the first read-write request.
7. The method of claim 1, wherein the method further comprises:
determining that the first node is the target storage node;
acquiring first client identification information corresponding to the first node;
and performing read-write operation on the first node based on the first client identification information, and writing the target data into the first node or reading the target data from the first node.
8. A data processing method applied to a second node in a storage system based on NVMe-oF protocol, the storage system further including a first node different from the second node, the first node and the second node being connected, the method comprising:
receiving a second read-write request sent by the first node, wherein the second read-write request carries second client identification information;
based on the second client identification information, performing read-write operation on the second node, and writing target data into the second node or reading the target data from the second node;
the second read-write request is determined based on a first read-write request sent to the first node, the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data.
9. The method of claim 8, wherein the method further comprises:
and after the read-write operation is finished, sending the read-write execution result of the target data to the first node.
10. A storage system based on NVMe-oF protocol, comprising a first node and a second node, the first node being connected with the second node, wherein,
the first node is configured to:
receiving a first read-write request, wherein the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data;
determining that the second node is the target storage node;
sending a second read-write request to the second node, where the second read-write request carries second client identification information, and the second client identification information is used to perform read-write operation on the second node, and write the target data into the second node or read the target data from the second node;
the second node is configured to:
receiving a second read-write request sent by the first node, wherein the second read-write request carries second client identification information;
based on the second client identification information, performing read-write operation on the second node, and writing target data into the second node or reading the target data from the second node;
the second read-write request is determined based on a first read-write request sent to the first node, the first read-write request carries read-write control information of target data to be read and written, and the read-write control information is used for determining a target storage node of the target data.
CN202110266793.1A 2021-03-11 2021-03-11 Data processing method and storage system based on NVMe-oF protocol Pending CN113014662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110266793.1A CN113014662A (en) 2021-03-11 2021-03-11 Data processing method and storage system based on NVMe-oF protocol

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110266793.1A CN113014662A (en) 2021-03-11 2021-03-11 Data processing method and storage system based on NVMe-oF protocol

Publications (1)

Publication Number Publication Date
CN113014662A true CN113014662A (en) 2021-06-22

Family

ID=76405468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110266793.1A Pending CN113014662A (en) 2021-03-11 2021-03-11 Data processing method and storage system based on NVMe-oF protocol

Country Status (1)

Country Link
CN (1) CN113014662A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239098A (en) * 2021-07-14 2021-08-10 腾讯科技(深圳)有限公司 Data management method, computer and readable storage medium
CN113656683A (en) * 2021-07-12 2021-11-16 北京旷视科技有限公司 Subscription data pushing method, device and system, electronic equipment and storage medium
CN114327903A (en) * 2021-12-30 2022-04-12 苏州浪潮智能科技有限公司 NVMe-oF management system, resource allocation method and IO read-write method
CN115550377A (en) * 2022-11-25 2022-12-30 苏州浪潮智能科技有限公司 NVMF (network video and frequency) storage cluster node interconnection method, device, equipment and medium
WO2023174341A1 (en) * 2022-03-16 2023-09-21 中兴通讯股份有限公司 Data read-write method, and device, storage node and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103984662A (en) * 2014-05-29 2014-08-13 华为技术有限公司 Method and device for reading and writing data and storage system
CN106569739A (en) * 2016-10-09 2017-04-19 南京中新赛克科技有限责任公司 Data writing optimization method
CN107608627A (en) * 2017-08-21 2018-01-19 云宏信息科技股份有限公司 A kind of remote data classification storage method, electronic equipment and storage medium
WO2018137217A1 (en) * 2017-01-25 2018-08-02 华为技术有限公司 Data processing system, method, and corresponding device
CN110389986A (en) * 2019-07-18 2019-10-29 上海达梦数据库有限公司 Method for writing data, device, equipment and the storage medium of distributed system
CN110691062A (en) * 2018-07-06 2020-01-14 浙江大学 Data writing method, device and equipment
CN111857602A (en) * 2020-07-31 2020-10-30 重庆紫光华山智安科技有限公司 Data processing method, data processing device, data node and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656683A (en) * 2021-07-12 2021-11-16 北京旷视科技有限公司 Subscription data pushing method, device and system, electronic equipment and storage medium
CN113239098A (en) * 2021-07-14 2021-08-10 腾讯科技(深圳)有限公司 Data management method, computer and readable storage medium
CN114327903A (en) * 2021-12-30 2022-04-12 苏州浪潮智能科技有限公司 NVMe-oF management system, resource allocation method and IO read-write method
CN114327903B (en) * 2021-12-30 2023-11-03 苏州浪潮智能科技有限公司 NVMe-oF management system, resource allocation method and IO read-write method
WO2023174341A1 (en) * 2022-03-16 2023-09-21 中兴通讯股份有限公司 Data read-write method, and device, storage node and storage medium
CN115550377A (en) * 2022-11-25 2022-12-30 苏州浪潮智能科技有限公司 NVMF (network video and frequency) storage cluster node interconnection method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN113014662A (en) Data processing method and storage system based on NVMe-oF protocol
US8281081B2 (en) Shared memory architecture
US10423332B2 (en) Fibre channel storage array having standby controller with ALUA standby mode for forwarding SCSI commands
CN111212141A (en) Shared storage system
EP4318251A1 (en) Data access system and method, and device and network card
US9910753B1 (en) Switchless fabric based atomics via partial-proxy
KR20200008483A (en) METHOD OF ACCESSING A DUAL LINE SSD DEVICE THROUGH PCIe EP AND NETWORK INTERFACE SIMULTANEOUSLY
US11809290B2 (en) Storage system and storage queue processing following port error
EP4177763A1 (en) Data access method and related device
CN114201421A (en) Data stream processing method, storage control node and readable storage medium
CN113360077B (en) Data storage method, computing node and storage system
CN115080479B (en) Transmission method, server, device, bare metal instance and baseboard management controller
CN115270033A (en) Data access system, method, equipment and network card
US10154079B2 (en) Pre-boot file transfer system
US20060039405A1 (en) Systems and methods for frame ordering in wide port SAS connections
CN112148206A (en) Data reading and writing method and device, electronic equipment and medium
CN109783002B (en) Data reading and writing method, management equipment, client and storage system
CN106933497B (en) Management scheduling device, system and method based on SAS
WO2022073399A1 (en) Storage node, storage device and network chip
US11720442B2 (en) Memory controller performing selective and parallel error correction, system including the same and operating method of memory device
US7657717B1 (en) Coherently sharing any form of instant snapshots separately from base volumes
CN116032498A (en) Memory area registration method, device and equipment
US9501290B1 (en) Techniques for generating unique identifiers
WO2022267909A1 (en) Method for reading and writing data and related apparatus
CN109376014B (en) Distributed lock manager implementation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210622
