WO2023174341A1 - Data read/write method, and device, storage node and storage medium - Google Patents

Data read/write method, and device, storage node and storage medium

Info

Publication number
WO2023174341A1
WO2023174341A1 · PCT/CN2023/081675 · CN2023081675W
Authority
WO
WIPO (PCT)
Prior art keywords
read
storage node
write
storage
network link
Prior art date
Application number
PCT/CN2023/081675
Other languages
English (en)
Chinese (zh)
Inventor
金浩
屠要峰
韩银俊
许军宁
陈正华
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2023174341A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • the present disclosure relates to the field of communications, and in particular, to a data reading and writing method, device, storage node and storage medium.
  • a distributed storage system usually contains multiple storage nodes.
  • Each storage node contains one or more storage devices that support the NVMe (non-volatile memory express, non-volatile memory host controller interface specification) storage layer protocol.
  • Each storage node provides a logical address space.
  • the target space of IO (read and write) operations may be located on any one or more storage nodes.
  • the cluster topology may be updated at any time.
  • the present disclosure provides a data reading and writing method, device, storage node and storage medium, to solve the problem that, when a second server forwards requests and response results during access to a distributed storage cluster, the network transmission path is long and new network delays are introduced.
  • the second server includes a proxy server.
  • a data reading and writing method is provided, which is applied to a client device.
  • the method includes: sending a first read and write request to the first storage node through a first network link established with the first storage node, the first storage node being any storage node in the storage system; and receiving a read and write response from the second storage node through a second network link established with the second storage node, and using the read and write response as the response to the first read and write request.
  • the second storage node is a node, determined by the first storage node, that can execute the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • a data reading and writing method is provided, which is applied to a second storage node.
  • the method includes: receiving a first read and write request from a first storage node, where the first storage node is any storage node in the storage system; and, in response to the first read and write request, returning a read and write response to the client device through the second network link established between the client device and the second storage node.
  • the second network link and the first network link belong to the same storage protocol channel.
  • the first network link is a network link established between the client device and the first storage node.
  • a data reading and writing system including: a client device, a first storage node and a second storage node.
  • the client device is configured to send a first read and write request to the first storage node through the first network link established with the first storage node, the first storage node being any storage node in the storage system; and to receive the read and write response from the second storage node through the second network link established with the second storage node, using the read and write response as the response to the first read and write request.
  • the second storage node is a node, determined by the first storage node, that is capable of executing the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • a client device including: a first sending unit and a first receiving unit.
  • the first sending unit is configured to send the first read and write request to the first storage node through the first network link established with the first storage node, the first storage node being any storage node in the storage system;
  • the first receiving unit is configured to receive a read and write response from the second storage node through the second network link established with the second storage node and use the read and write response as the response to the first read and write request, the second storage node being a node, determined by the first storage node, that can execute the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • a second storage node including: a second receiving unit and a second sending unit.
  • the second receiving unit is configured to receive the first read and write request from the first storage node, which is any storage node in the storage system; the second sending unit is configured to, in response to the first read and write request, return a read and write response to the client device through the second network link established between the client device and the second storage node.
  • the second network link and the first network link belong to the same storage protocol channel.
  • the first network link is the network link established between the client device and the first storage node.
  • an electronic device including: a processor, a memory and a communication bus, wherein the processor and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor is used to execute the program stored in the memory to implement the data reading and writing method described in the first aspect or the data reading and writing method described in the second aspect.
  • a computer-readable storage medium which stores a computer program. When the computer program is executed by a processor, the data reading and writing method described in the first aspect or the data reading and writing method described in the second aspect is implemented.
  • Figure 1 is a schematic flow chart of the data reading and writing method in the present disclosure
  • Figure 2 is another schematic flow chart of the data reading and writing method in the present disclosure
  • Figure 3 is a schematic structural diagram of the data reading and writing system in the present disclosure
  • Figure 4 is a schematic diagram of the link layering principle between nodes and clients in the distributed storage system in the present disclosure
  • Figure 5 is an interaction flow chart between node X, node Y and the client device in the distributed storage system in the present disclosure;
  • Figure 6 is a schematic structural diagram of a client device in the present disclosure.
  • Figure 7 is a schematic structural diagram of the second storage node in the present disclosure.
  • FIG. 8 is a schematic structural diagram of an electronic device in the present disclosure.
  • a distributed storage system usually contains multiple storage nodes.
  • Each storage node contains one or more storage devices that support the NVMe (non-volatile memory express, non-volatile memory host controller interface specification) interface specification.
  • Multiple storage nodes together provide a logical address space.
  • the target space of IO (read and write) operations may be located on any one or more storage nodes.
  • the topology of the cluster may be updated at any time.
  • the first method is that the client device copies a cluster partition table and can calculate the target storage node to be accessed.
  • the client device directly establishes an NVMe link with the target storage device to implement IO operations.
  • This method requires a customized NVMe client device that synchronizes the cluster's partition information in real time; the partition information, routing calculation rules, client devices and cluster services are highly coupled, and the implementation cost is very high.
  • the client device sends an IO request to the first server.
  • the first server determines the second server where the request's target address is located and forwards the IO request to the second server. After the second server completes the IO request, it needs to notify the first server, and the first server then sends the response result to the client device.
  • in this method, the response message needs to be sent from the second server to the first server, and the first server then sends it to the client device.
  • the network transmission path is long and new network delays are introduced.
  • the NVMe client device and each storage node in the distributed storage system have matching network links.
  • An NVMe Target server can only use the network link that matches the NVMe client device to send read and write requests or receive read and write responses. If the link used by the NVMe client device to send read and write requests and the link used to receive read and write responses do not belong to the same path, then even if a received read and write response matches the read and write request that was sent, the NVMe client device cannot recognize it.
  • the NVMe Target server and the NVMe client device establish network links through queue mapping.
  • the NVMe client device includes the NVMe layer and the network layer.
  • the NVMe layer includes the submission queue and the completion queue
  • the network layer includes the submission queue and the completion queue.
  • Each NVMe Target server in the distributed storage system also includes an NVMe layer and a network layer.
  • the NVMe layer includes a submission queue and a completion queue
  • the network layer includes a submission queue and a completion queue.
  • the establishment process of the network link between the NVMe client device and the NVMe Target server is: on the NVMe client device side, the submission queue of the NVMe layer is mapped to the submission queue of the network layer; the submission queue of the network layer on the NVMe client device side is connected, through network transmission, to the completion queue of the network layer on the NVMe Target server; and on the NVMe Target server side, the completion queue of the network layer is mapped to the completion queue of the NVMe layer.
  • the establishment process of the network link between the NVMe client device and the NVMe Target server is: on the NVMe client device side, the completion queue of the NVMe layer is mapped to the completion queue of the network layer, and the completion queue of the network layer The completion queue is connected to the submission queue of the network layer of the NVMe Target server through network transmission; on the NVMe Target server side, the submission queue of the network layer is mapped to the submission queue of the NVMe layer.
  • when sending a read and write request, the NVMe layer submission queue of the NVMe client device sends the read and write request to the network layer submission queue, and the network layer submission queue sends it to the completion queue of the NVMe Target server's network layer; after the completion queue of the NVMe Target server's network layer receives the read and write request, it sends the request to the completion queue of the NVMe layer.
  • correspondingly, when sending a read and write response, the NVMe layer submission queue of the NVMe Target server sends the read and write response to the network layer submission queue, and the network layer submission queue sends it to the completion queue of the NVMe client device's network layer; after receiving the read and write response, the completion queue of the NVMe client device's network layer sends the response to the completion queue of the NVMe layer.
  • the NVMe layer of each NVMe client device is set up with only one submission queue and one completion queue, and its network layer is likewise set up with only one submission queue and one completion queue. Therefore, one NVMe client device can only be bound to one NVMe Target server and can only interact with that NVMe Target server; to interact with other NVMe Target servers, requests must be forwarded through the bound NVMe Target server.
  • for a read and write request sent by the NVMe client device to the first NVMe Target server, only the first NVMe Target server can send the read and write response to the NVMe client device through its own NVMe layer submission queue and network layer submission queue. If the read and write response is instead sent by a second NVMe Target server to the NVMe client device through that server's NVMe layer submission queue and network layer submission queue, the NVMe client device will not be able to recognize the read and write response.
  • therefore, when the read and write response corresponding to a read and write request can only be obtained through the second NVMe Target server, the response must be forwarded to the NVMe client device through the first NVMe Target server so that the NVMe client device can recognize it.
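  • The forwarding constraint described above can be sketched as follows. This is an illustrative model only, not NVMe driver code; the names (`BoundClient`, `respond_forwarded`, etc.) are hypothetical.

```python
# Sketch of the conventional 1:1 binding between an NVMe client and one
# NVMe Target server: a response arriving over any other server's link is
# not recognized, so a second server's response must be forwarded through
# the bound (first) server. All names here are illustrative.

class BoundClient:
    def __init__(self, bound_server):
        self.bound_server = bound_server  # the single matching network link

    def recognize(self, response):
        # The client only accepts responses that arrive on the link of the
        # server it is bound to.
        return response["via"] == self.bound_server

def respond_direct(server, data):
    return {"via": server, "data": data}

def respond_forwarded(first_server, second_server, data):
    # Server Y produced the data, but server X relays it over its own link.
    produced = {"via": second_server, "data": data}
    return {"via": first_server, "data": produced["data"]}

client = BoundClient(bound_server="X")
# A response sent directly by server Y over Y's own link is not recognized:
assert not client.recognize(respond_direct("Y", b"blk"))
# Forwarding through the bound server X makes it recognizable,
# at the cost of an extra network hop:
assert client.recognize(respond_forwarded("X", "Y", b"blk"))
```

  • The extra hop in `respond_forwarded` is exactly the added network delay that the disclosed method aims to eliminate.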
  • the present disclosure provides a data reading and writing method, which can be applied to client devices.
  • the method may include the following steps 101 to 102.
  • Step 101 Send a first read and write request to the first storage node through the first network link established with the first storage node.
  • the first storage node is any storage node in the storage system.
  • Step 102 Receive a read and write response from the second storage node through the second network link established with the second storage node, and use the read and write response as a response to the first read and write request.
  • the second storage node is a node, determined by the first storage node, that can execute the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • a read-write response from the second storage node is received, and the read-write response is used as a response to the first read-write request according to the storage protocol layer identifier.
  • the first network link and the second network link are both underlying network links of the same storage protocol layer link channel.
  • the read and write requests in this embodiment include read requests and/or write requests, and accordingly, the read and write responses include read responses and/or write responses.
  • when the read and write request sent by the client device is specifically a read request, the storage node returns a read response; when the read and write request sent by the client device is specifically a write request, the storage node returns a write response.
  • the network link established between the client device and the storage node can be used to send requests and receive responses at the same time.
  • the client device can send a read and write request to the first storage node through the first network link, and the first storage node can also return a read and write response to the client device through the first network link.
  • when the network links between the client device and different storage nodes are aggregated into the same storage protocol channel, then once the client device sends a read and write request to a storage node, the client device can recognize the read and write response no matter which storage node it comes from. Specifically in this embodiment, since the first network link and the second network link belong to the same storage protocol channel, the client device can still recognize the read and write response even though it comes from the second storage node.
  • the data reading and writing method includes: the client and the storage cluster establish a storage protocol layer link and multiple network layer links (the first network link, the second network link, and so on); the storage protocol layer selects any network link to send an IO read and write request and can receive the IO read and write response from any network link.
  • the storage protocol layer identifier of the read and write response matches the storage protocol layer identifier of the read and write request.
  • This embodiment provides the following two methods to aggregate the first network link and the second network link to implement the same storage protocol channel.
  • different storage nodes still have uniquely matching submission queues and completion queues at the NVMe layer of the client device, but an added storage protocol management layer can mark each submission queue and completion queue to indicate that the submission queue may send read and write requests to different storage nodes and that any read and write response received by the completion queue can be recognized.
  • although the storage protocol management layer marking method can aggregate the first network link and the second network link into one storage protocol channel, a storage protocol management layer needs to be added, resulting in a relatively large workload.
  • therefore, this embodiment specifies that for every storage node that has a network link relationship with the client device, the completion queue mapped to the NVMe layer of the client device is the same completion queue, and the submission queue mapped to the NVMe layer of the client device is the same submission queue.
  • the focus is on enabling the client device to recognize the read and write response returned by the second storage node. Therefore, when the first network link and the second network link belong to the same storage protocol channel, the completion queues of the first storage node and the second storage node mapped to the NVMe layer of the client device are the same completion queue. In this way, no matter which storage node sends the read and write response, the NVMe layer of the client device receives it from the same completion queue and can recognize it.
  • the first completion queue and the second completion queue in the network layer are obtained; the first completion queue is used to receive read and write responses from the first storage node, and the second completion queue is used to receive read and write responses from the second storage node.
  • the completion queue of the storage protocol layer is also obtained; it is used to receive read and write responses from either the first completion queue or the second completion queue.
  • the first completion queue and the second completion queue are both mapped to the completion queue of the storage protocol layer.
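  • A minimal sketch of this many-to-one completion queue mapping follows. The class names are hypothetical, and real NVMe queues are ring buffers with doorbell registers, which is elided here.

```python
# Sketch: each per-node network-layer completion queue delivers into one
# shared storage-protocol-layer (NVMe-layer) completion queue, so the
# client sees responses from any storage node in a single place.
from collections import deque

class StorageProtocolCQ:
    """Single NVMe-layer completion queue shared by all network links."""
    def __init__(self):
        self.entries = deque()

class NetworkCQ:
    """Per-storage-node network-layer completion queue, mapped onto the
    shared storage protocol layer completion queue."""
    def __init__(self, node, protocol_cq):
        self.node = node
        self.protocol_cq = protocol_cq

    def deliver(self, response):
        # Whichever node the response came from, it lands in the same
        # NVMe-layer completion queue, so the client can always see it.
        self.protocol_cq.entries.append((self.node, response))

cq = StorageProtocolCQ()
cq_node_x = NetworkCQ("X", cq)   # first network link
cq_node_y = NetworkCQ("Y", cq)   # second network link

cq_node_y.deliver("read-response")   # response arrives from node Y
node, resp = cq.entries.popleft()
assert (node, resp) == ("Y", "read-response")
```

  • The design choice here mirrors the text: aggregation happens at the queue-mapping level, so no extra management layer is required.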
  • since the client device may send multiple read and write requests in a short period of time, in order to determine whether a received read and write response is the response to the first read and write request, in this embodiment the read and write response returned by the second storage node may be verified before it is used as the response to the first read and write request.
  • specifically, before using the read and write response as the response to the first read and write request, the storage protocol layer session identifier in the read and write response and the storage protocol layer session identifier in the first read and write request are parsed, and it is determined whether the two session identifiers are the same.
  • when the session identifier in the read and write response and the session identifier in the first read and write request are the same, it is confirmed that the read and write response and the first read and write request belong to the same session; that is, the read and write response matches the first read and write request and is its response.
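  • The session-identifier check can be sketched as follows; the field names (`sid`, `op`, `lba`) are illustrative and not taken from the NVMe wire format.

```python
# Sketch of matching a read/write response to an outstanding request by
# the storage protocol layer session identifier, regardless of which
# storage node (network link) the response arrived on.

outstanding = {}  # session id -> original request

def send_request(session_id, op, lba):
    req = {"sid": session_id, "op": op, "lba": lba}
    outstanding[session_id] = req
    return req

def on_response(resp):
    # Parse the session identifier from the response and compare it with
    # the identifier recorded when the request was sent.
    req = outstanding.get(resp["sid"])
    if req is None:
        return None  # not a response to any request we sent
    del outstanding[resp["sid"]]
    return req  # verified: response and request share one session

send_request(7, "read", lba=0x1000)
assert on_response({"sid": 9, "data": b""}) is None        # wrong session
matched = on_response({"sid": 7, "data": b"\x00" * 512})   # same session
assert matched["op"] == "read"
```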
  • so that the second storage node can later be accessed directly, reducing the probability of request forwarding, this embodiment may also, after determining the response to the first read and write request, establish a mapping relationship between the second storage node and the client device. In an exemplary embodiment, this embodiment may instead establish, after determining the response to the first read and write request, a mapping relationship between the second storage node and the target data of the read and write request.
  • in either case, the identifier of the second storage node is extracted from the read and write response; based on this identifier, a mapping relationship is established (between the second storage node and the client device, or between the second storage node and the target data of the read and write request) indicating that the data of the read and write response is stored in the second storage node.
  • the data stored in the second storage node may be hotspot data; that is, the mapping relationship indicates that the hotspot data is stored in the second storage node. Therefore, when the client device needs to access the hotspot data, it can send the access request directly to the second storage node in order to reduce the probability of request forwarding.
  • the specific implementation of the client device accessing the read-write response in the second storage node may be: based on the mapping relationship, a second read-write request is generated, and the second read-write request is used to access the read-write response; through the second network link, send a second read-write request to the second storage node, and receive a read-write response returned by the second storage node in response to the second read-write request.
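  • This routing behavior can be sketched as a small lookup table on the client; the function and key names are illustrative only.

```python
# Sketch of the mapping relationship used to avoid forwarding on later
# accesses: after the first response reveals that the target data lives
# on node Y, the client records that fact and routes the second read/write
# request directly to Y over the second network link.

data_location = {}  # target-data key -> storage node identifier

def on_first_response(target_key, resp):
    # Extract the identifier of the node that actually served the data.
    data_location[target_key] = resp["node_id"]

def route_request(target_key, default_node):
    # Prefer the recorded node; fall back to any node when no mapping
    # exists yet (that node will forward if needed).
    return data_location.get(target_key, default_node)

on_first_response("hot-object", {"node_id": "Y", "data": b"..."})
assert route_request("hot-object", default_node="X") == "Y"   # direct hit
assert route_request("cold-object", default_node="X") == "X"  # no mapping yet
```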
  • the first network link and the second network link belong to the same storage protocol channel.
  • the first storage node and the second storage node are mapped to the same submission queue of the NVMe layer of the client device.
  • the first submission queue and the second submission queue in the network layer are obtained; the first submission queue is used to send read and write requests to the first storage node, and the second submission queue is used to send read and write requests to the second storage node.
  • the submission queue of the storage protocol layer is also obtained; it is used to dispatch read and write requests to the first submission queue and the second submission queue.
  • the first submission queue and the second submission queue are both mapped to the submission queue of the storage protocol layer.
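  • The submission side is the mirror image of the completion side and can be sketched as one protocol-layer queue fanning out to per-node network queues. Names are hypothetical; plain lists stand in for real network submission queues.

```python
# Sketch of the submission side: one storage-protocol-layer submission
# queue dispatches each request onto the network-layer submission queue
# of whichever link reaches the request's target storage node.

class StorageProtocolSQ:
    def __init__(self):
        self.network_sqs = {}  # node id -> list standing in for a network SQ

    def map_network_sq(self, node):
        # Map another per-node network submission queue under this single
        # NVMe-layer submission queue.
        self.network_sqs[node] = []

    def submit(self, node, request):
        self.network_sqs[node].append(request)

sq = StorageProtocolSQ()
sq.map_network_sq("X")  # first network link
sq.map_network_sq("Y")  # second network link

sq.submit("X", "read lba 0")
sq.submit("Y", "write lba 8")
assert sq.network_sqs["X"] == ["read lba 0"]
assert sq.network_sqs["Y"] == ["write lba 8"]
```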
  • in summary, the first read and write request is sent to the first storage node through the first network link established with the first storage node, the first storage node being any storage node in the storage system; the read and write response from the second storage node is received through the second network link established with the second storage node and used as the response to the first read and write request.
  • the second storage node is a node, determined by the first storage node, that is capable of executing the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • because the storage protocol layer identification information of the read and write request and of the read and write response is the same, although the client device sends the first read and write request to the first storage node, the second storage node can, after executing the first read and write request, send the read and write response directly to the client device, and the client device can recognize that the read and write response answers the first read and write request.
  • the present disclosure provides a data reading and writing method, which can be applied to the second storage node; as shown in Figure 2, the method can include the following steps 201 to 202.
  • Step 201 Receive a first read and write request from a first storage node, which is any storage node in the storage system;
  • Step 202: In response to the first read and write request, return a read and write response to the client device through the second network link established between the client device and the second storage node.
  • the second network link and the first network link belong to the same storage protocol channel.
  • the first network link is the network link established between the client device and the first storage node.
  • the second network link and the first network link are both underlying network links of the same storage protocol channel.
  • the method further includes: receiving a second read and write request from the client device through the second network link, the second read and write request being used to access the data of the read and write response, which is stored on the second storage node.
  • a read and write response is returned to the client device over the second network link.
  • the present disclosure provides a data reading and writing system.
  • the system mainly includes a client device 301 and a storage cluster; the storage cluster includes a first storage node 302, a second storage node 303 and other storage nodes. The client device 301 establishes a storage protocol layer link channel with the storage cluster and establishes a separate network layer link with each storage node in the cluster.
  • the client device 301 is configured to send a first read and write request to the first storage node 302 through the first network link established with the first storage node 302, the first storage node 302 being any storage node in the storage system; and to receive the read and write response from the second storage node 303 through the second network link established with the second storage node 303, using the read and write response as the response to the first read and write request.
  • the second storage node 303 is a node, determined by the first storage node 302, that is capable of executing the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • the client device 301 can also send read and write requests to the second storage node 303 through the second network link established with the second storage node 303, and receive the read and write responses of the second storage node 303 through that same link.
  • In an exemplary embodiment, the first network link and the second network link are both underlying network links of the same storage protocol channel.
  • the following takes the client device as an NVMe client device and the storage node as an NVMe Target server as an example to describe the application environment in this disclosure.
  • the existing NVM Express over Fabrics Revision 1.1a specification requires that the NVMe layer IO queue and the underlying network link have a one-to-one correspondence.
  • This disclosure proposes to decouple the storage layer and the network layer, so that the NVMe protocol channel and network links support a 1:N (N not less than 1) mapping relationship.
  • NVMe request messages and response messages support transmission on different network links.
  • the client establishes network links with all servers in the distributed cluster, and these network connections are mapped to the same storage layer NVMe protocol channel. One NVMe Path (path) is thus mapped to multiple network links, which solves the problem of standard NVMe clients accessing distributed storage clusters and yields higher performance and a better user experience.
  • This disclosure is different from the existing MultiPath (multipath) function. Multipath means that there are multiple path channels at the NVMe protocol level, and that, according to the protocol specification requirements, one NVMe link is selected to carry out the IO interaction.
  • Figure 4 is a schematic diagram of the link principle between nodes and clients in the distributed storage system.
  • the client creates a network link with each server node in the distributed storage system respectively, and multiple network links are aggregated to implement one NVMe access path; the client reads and writes the entire cluster through this single NVMe path.
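The aggregation just described can be sketched in a few lines; this is an illustrative Python model, not part of the disclosure, and names such as `NvmePath` and the sequential FID assignment scheme are assumptions introduced only to show one path fronting N links:

```python
class NvmePath:
    """One storage-protocol channel backed by N network links (1:N)."""

    def __init__(self):
        self.links = {}      # FID -> server address
        self._next_fid = 0

    def add_link(self, server_addr):
        # Each new link gets a link identifier (FID) within the same path.
        fid = self._next_fid
        self._next_fid += 1
        self.links[fid] = server_addr
        return fid

# The client establishes one link per server node in the cluster,
# all mapped to the same NVMe path.
path = NvmePath()
fids = [path.add_link(addr) for addr in ("node-x", "node-y", "node-z")]
```

Each link carries its own network connection, but the storage layer sees only the single path object.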
  • Figure 5 is an interaction flow chart between node X, node Y and client device in the distributed storage system.
  • Step 501: the client sends an IO request to node X; node X is required to hold a data partition table and be able to determine the location information of the target space;
  • Step 502: node X forwards the request message to storage node Y, after which the NVMe session life cycle on node X ends;
  • Step 503: storage node Y receives the NVMe request, parses and processes the NVMe message, and executes the corresponding IO request;
  • Step 504: node Y sends the NVMe response message to the client.
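The four-step flow of Figure 5 can be modeled with a small sketch; the `Node` class, the partition table layout and the response fields are hypothetical, introduced only to illustrate the forward-then-respond-directly pattern:

```python
class Node:
    def __init__(self, name, partition_table, cluster):
        self.name = name
        self.partition_table = partition_table  # target address -> owning node name
        self.cluster = cluster                  # shared map: node name -> Node

    def handle(self, request):
        owner = self.partition_table[request["addr"]]
        if owner != self.name:
            # Step 502: forward the request as-is; this node's NVMe
            # session for the IO ends here.
            return self.cluster[owner].handle(request)
        # Steps 503-504: execute the IO and answer the client directly.
        return {"from": self.name, "status": "ok", "addr": request["addr"]}

cluster = {}
table = {"lba-7": "Y"}                 # hypothetical partition table
cluster["X"] = Node("X", table, cluster)
cluster["Y"] = Node("Y", table, cluster)

# Step 501: the client sends to node X; the response arrives from node Y.
resp = cluster["X"].handle({"addr": "lba-7"})
```

Note that the response identifies node Y as the sender even though the client addressed node X, which is exactly why the client needs the session-matching and address-caching mechanisms described below.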
  • this disclosure makes the following extensions to the NVMe client device and the NVMe Target server respectively.
  • NVMe client device side: according to the NVMe protocol definition, the client establishes a network link with the server through the connect command, and the submission queue and completion queue of the NVMe protocol are mapped to the submission queue and completion queue of the network link respectively.
  • the connect command specifies the server address list, and the client establishes network links with all service nodes. Taking RDMA as an example, all links share the same RDMA Protection Domain (PD), and the server shares the client's memory region (Memory Region) through the RDMA network.
  • when the client sends a message, it uses a reserved field of the NVMe message header to save the link identifier FID (Fabric ID).
  • the network layer sends a standard NVMe request message to the correct server based on the FID carried in the message header.
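A minimal sketch of carrying the FID in a reserved header byte follows; the 64-byte header length and the offset chosen for the reserved field are assumptions for illustration, not values taken from the NVMe-oF specification:

```python
HEADER_LEN = 64      # assumed capsule header size
FID_OFFSET = 40      # hypothetical reserved-field offset

def set_fid(header: bytearray, fid: int) -> None:
    # Stash the link identifier in the reserved byte of the header.
    header[FID_OFFSET] = fid & 0xFF

def get_fid(header: bytes) -> int:
    # The network layer reads this back to pick the right link.
    return header[FID_OFFSET]

hdr = bytearray(HEADER_LEN)
set_fid(hdr, 3)
```

The rest of the message stays a standard NVMe capsule, so a server that ignores the reserved field still processes it normally.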
  • the completion queue notifies the NVMe layer protocol stack of messages received by all links, which are then received and processed by the client.
  • the enhancement of the client in this disclosure also includes the address mapping table caching function.
  • the response messages of all read and write requests are returned by the target node.
  • when the client receives the response, it immediately updates the mapping relationship between the request's target address and the target node. Subsequent accesses to that address can then initiate requests directly to the target node, reducing the probability of request forwarding and greatly improving read and write efficiency.
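The address mapping table cache described above can be sketched as follows; the class and method names are invented for illustration:

```python
class AddressMapCache:
    def __init__(self):
        self._map = {}   # target address -> node identifier

    def update(self, addr, node_id):
        # Called when a response reveals which node actually served addr.
        self._map[addr] = node_id

    def route(self, addr, default_node):
        # A cache hit avoids a forwarding hop on the server side.
        return self._map.get(addr, default_node)

cache = AddressMapCache()
first_target = cache.route("lba-7", default_node="X")   # cold: goes to X
cache.update("lba-7", "Y")                              # response came from Y
second_target = cache.route("lba-7", default_node="X")  # warm: goes straight to Y
```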
  • NVMe Target server side: the distributed storage cluster implements a logical storage volume and provides external storage volume services through the NVMe Target. This disclosure enhances the NVMe Target server side as follows.
  • a fabric link dedicated to the NVMe Target service is established between NVMe Target servers. This link supports forwarding NVMe requests to other nodes as they are, so the NVMe Target server can receive NVMe requests from the client and other NVMe Target servers at the same time.
  • each NVMe Target server can calculate the server node where the target is located based on the request.
  • the NVMe Target server only sends NVMe response messages to the client, regardless of whether the request is forwarded by other nodes.
  • This disclosure extends the client and server of the NVMe-oF protocol.
  • the protocol layer messages transmitted by the network layer are consistent with the existing standard protocols. Therefore, the enhanced NVMe client of this disclosure is compatible with a standard NVMe-oF Target server, and a standard NVMe client can access a Target server enhanced by this disclosure.
  • the network layer is not limited to RDMA and is compatible with network types supported by the standard NVMe protocol.
  • the client device mainly includes: a first sending unit 601 and a first receiving unit 602.
  • the first sending unit 601 is configured to send a first read and write request to the first storage node through the first network link established with the first storage node, where the first storage node is any storage node in the storage system; the first receiving unit 602 is configured to receive a read and write response from the second storage node through the second network link established with the second storage node, and use the read and write response as the response to the first read and write request.
  • the second storage node is a node determined by the first storage node to be capable of executing the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • the client device includes an IO client device.
  • the client device includes: a storage protocol layer sending unit, a storage protocol layer receiving unit, a first network layer sending unit, a first network layer receiving unit, a second network layer sending unit, a second network layer receiving unit.
  • the storage protocol layer sending unit sends the first read and write request;
  • the first network layer sending unit sends the first read and write request to the first storage node;
  • the second network layer receiving unit receives the read and write response from the second storage node;
  • the storage protocol layer receiving unit determines the read-write response to be the response to the first read-write request.
  • the second storage node is a node determined by the first storage node that can execute the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • the client device includes: a storage protocol layer sending unit, a storage protocol layer receiving unit, a first sending unit of the network layer, a first receiving unit of the network layer, a second sending unit of the network layer, a second receiving unit of the network layer, etc.
  • the storage protocol layer sending unit selects the first sending unit of the network layer to send the IO read and write request.
  • the storage protocol layer receiving unit receives the storage layer read and write response through the second receiving unit of the network layer.
  • the first sending unit of the network layer is any sending unit of the network layer.
  • the second receiving unit of the network layer is any receiving unit of the network layer.
  • read and write requests include storage protocol layer information and network layer link information.
  • the second receiving unit of the network layer transfers the read and write response from the second storage node to the storage protocol layer receiving unit; the storage protocol layer receiving unit parses the storage protocol layer identifier and the network layer identifier of the read-write response, compares the storage protocol layer identifier of the read-write response with that of the first read-write request, and thereby determines that the read-write response received by the second receiving unit of the network layer is the response to the first read-write request sent by the sending unit of the network layer.
  • the storage protocol layer records the second network layer identifier and establishes a mapping relationship between the target data of the read and write request and the second network link identifier.
  • the first network link and the second network link are any network link channels of the storage protocol layer.
  • the client device is further configured to: after using the read and write response as the response to the first read and write request, extract the identifier of the second storage node from the read and write response, and, based on that identifier, establish a mapping relationship between the second storage node and the client device, the mapping relationship indicating that the read and write response is stored in the second storage node.
  • the client device is further configured to: after establishing the mapping relationship between the second storage node and the client device based on the identifier of the second storage node, generate a second read-write request based on the mapping relationship, where the second read-write request is used to access the read-write response; send the second read-write request to the second storage node through the second network link, and receive the read-write response returned by the second storage node in response to the second read-write request.
  • the client device is further configured to: before sending the second read and write request to the second storage node through the second network link, obtain the first submission queue and the second submission queue in the network layer, where the first submission queue is used to send read and write requests to the first storage node and the second submission queue is used to send read and write requests to the second storage node; obtain the submission queue of the storage protocol layer, which is used to send read and write requests to the first submission queue and the second submission queue; and map the first submission queue and the second submission queue of the network layer to the submission queue of the storage protocol layer.
  • the client device is further configured to: before receiving the read and write response from the second storage node through the second network link established with the second storage node, obtain the first completion queue and the second completion queue in the network layer, where the first completion queue is used to receive read and write responses from the first storage node and the second completion queue is used to receive read and write responses from the second storage node; obtain the completion queue of the storage protocol layer, which is used to receive read and write responses from the first completion queue or the second completion queue of the network layer; and map the first completion queue and the second completion queue of the network layer to the completion queue of the storage protocol layer.
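The submission and completion queue mapping described above can be sketched as a fan-out/fan-in structure; the queue names and the dict-of-deques layout are illustrative assumptions, not a statement of how the disclosure implements the mapping:

```python
from collections import deque

class MappedQueuePair:
    """One storage-protocol SQ/CQ pair fanned out over per-link network queues."""

    def __init__(self, link_ids):
        self.net_sq = {lid: deque() for lid in link_ids}  # one network SQ per link
        self.net_cq = deque()  # completions from every link funnel into one CQ

    def submit(self, link_id, request):
        # A storage-layer SQ entry is placed on the chosen link's network SQ.
        self.net_sq[link_id].append(request)

    def complete(self, response):
        # Any link's completion is delivered to the single storage-layer CQ.
        self.net_cq.append(response)

qp = MappedQueuePair(["link-x", "link-y"])
qp.submit("link-x", {"op": "read", "addr": "lba-7"})
qp.complete({"from": "link-y", "addr": "lba-7"})
```

The request leaves on one link's submission queue and its completion may arrive via another link, yet the storage protocol layer observes a single queue pair.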
  • the client device is further configured to: before using the read-write response as the response to the first read-write request, parse the session identifier in the read-write response and the session identifier in the first read-write request, and confirm that the session identifier in the read-write response is the same as that in the first read-write request.
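The session-identifier check can be sketched as a simple comparison; the field names are hypothetical, and the point is only that a response is accepted as the answer to a request even though it arrived on a different network link:

```python
def matches(request: dict, response: dict) -> bool:
    # The session identifier, not the link, pairs a response with its request.
    return request["session_id"] == response["session_id"]

req = {"session_id": 0x1234, "link": "to-node-X"}
rsp = {"session_id": 0x1234, "link": "from-node-Y"}
```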
  • the present disclosure provides a second storage node.
  • the second storage node mainly includes: a second receiving unit 701 and a second sending unit 702.
  • the second receiving unit 701 is configured to receive the first read and write request from the first storage node, where the first storage node is any storage node in the storage system; the second receiving unit includes an internal network layer receiving unit.
  • the second sending unit 702 is configured to, in response to the first read and write request, return a read and write response to the client device through the second network link established between the client device and the second storage node, where the second network link and the first network link belong to the same storage protocol channel, and the first network link is the network link established between the client device and the first storage node.
  • the second sending unit includes a client network layer sending unit.
  • a second storage node includes: a cluster internal network layer sending unit, a cluster internal network layer receiving unit, a client network layer sending unit, and a client network layer receiving unit.
  • the cluster internal network layer receiving unit receives the first read and write request from the first storage node, which is any storage node in the storage system; the client network layer sending unit returns a read and write response to the client device through the second network link established between the client device and the second storage node.
  • the second network link belongs to the storage protocol layer link channel established between the client and the storage cluster.
  • the second storage node is any storage node in the cluster.
  • the second storage node is further configured to: in response to the first read and write request, after returning a read and write response to the client device through the second network link established between the client device and the second storage node, receive a second read and write request from the client device through the second network link, where the second read and write request is used to access the read and write response; and, in response to the second read and write request, return the read and write response to the client device through the second network link.
  • the present disclosure also provides an electronic device.
  • the electronic device mainly includes: a processor 801, a memory 802 and a communication bus 803.
  • the processor 801 and the memory 802 complete mutual communication through the communication bus 803.
  • the memory 802 stores a program that can be executed by the processor 801.
  • the processor 801 executes the program stored in the memory 802 to implement the following steps: sending a first read and write request to the first storage node through the first network link established with the first storage node, where the first storage node is any storage node in the storage system; receiving, through the second network link established with the second storage node, the read and write response from the second storage node, and using the read and write response as the response to the first read and write request, where the second storage node is a node determined by the first storage node to be capable of executing the first read and write request; the first network link and the second network link belong to the same storage protocol channel;
  • the second network link and the first network link belong to the same storage protocol channel.
  • the first network link is the network link established between the client device and the first storage node.
  • the communication bus 803 mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
  • the communication bus 803 can be divided into an address bus, a data bus, a control bus, etc. For ease of presentation, only one thick line is used in Figure 8, but it does not mean that there is only one bus or one type of bus.
  • the memory 802 may include random access memory (RAM) or non-volatile memory (non-volatile memory), such as at least one disk memory.
  • the memory may also be at least one storage device located remotely from the aforementioned processor 801.
  • the above-mentioned processor 801 can be a general-purpose processor, including a Central Processing Unit (CPU for short) and a Network Processor (NP for short); it can also be a Digital Signal Processor (DSP for short), an Application Specific Integrated Circuit (ASIC for short), a Field-Programmable Gate Array (FPGA for short) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a computer-readable storage medium stores a computer program.
  • when the computer program is run on a computer, it causes the computer to execute the data reading and writing methods described in the above embodiments.
  • the solution provided by this disclosure establishes multiple network links with multiple storage nodes in the storage cluster while establishing a single storage protocol layer link; the multiple network links belong to the same storage protocol link.
  • the method provided by the present disclosure sends a first read and write request to the first storage node through the first network link established with the first storage node, where the first storage node is any storage node in the storage system; receives, through the second network link established with the second storage node, the read and write response from the second storage node, and uses the read and write response as the response to the first read and write request; the second storage node is a node determined by the first storage node to be capable of executing the first read and write request; the first network link and the second network link belong to the same storage protocol channel.
  • even though the client device sends the first read and write request to the first storage node, the first read and write request can still be executed by the second storage node, and the read-write response is finally sent to the client device, so that the client device can recognize that the read-write response is the response to the first read-write request.
  • the computer program product includes one or more computer instructions.
  • the computer instructions when loaded and executed on a computer, produce processes or functions in accordance with the present disclosure, in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, e.g., from a website, computer, server, or data center to another website, computer, server, or data center via wireline (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • the available media may be magnetic media (such as floppy disks, hard disks, magnetic tapes, etc.), optical media (such as DVDs), or semiconductor media (such as solid state drives).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present disclosure relates to a data read/write method, a device, a storage node and a storage medium. The method comprises: sending a first read/write request to a first storage node by means of a first network link established with the first storage node; and receiving a read/write response from a second storage node by means of a second network link established with the second storage node, and using the read/write response as the response to the first read/write request, the first network link and the second network link belonging to the same storage protocol channel.
PCT/CN2023/081675 2022-03-16 2023-03-15 Procédé de lecture/d'écriture de données, et dispositif, nœud de stockage et support d'enregistrement WO2023174341A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210258761.1A CN116804908A (zh) 2022-03-16 2022-03-16 数据读写方法、设备、存储节点及存储介质
CN202210258761.1 2022-03-16

Publications (1)

Publication Number Publication Date
WO2023174341A1 true WO2023174341A1 (fr) 2023-09-21

Family

ID=88022398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/081675 WO2023174341A1 (fr) 2022-03-16 2023-03-15 Procédé de lecture/d'écriture de données, et dispositif, nœud de stockage et support d'enregistrement

Country Status (2)

Country Link
CN (1) CN116804908A (fr)
WO (1) WO2023174341A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102111448A (zh) * 2011-01-13 2011-06-29 华为技术有限公司 分布式哈希表dht存储系统的数据预取方法、节点和系统
CN108701004A (zh) * 2017-01-25 2018-10-23 华为技术有限公司 一种数据处理的系统、方法及对应装置
US10244069B1 (en) * 2015-12-24 2019-03-26 EMC IP Holding Company LLC Accelerated data storage synchronization for node fault protection in distributed storage system
CN110286849A (zh) * 2019-05-10 2019-09-27 深圳物缘科技有限公司 数据存储系统的数据处理方法和装置
CN113014662A (zh) * 2021-03-11 2021-06-22 联想(北京)有限公司 数据处理方法及基于NVMe-oF协议的存储系统


Also Published As

Publication number Publication date
CN116804908A (zh) 2023-09-26

Similar Documents

Publication Publication Date Title
US11487690B2 (en) Universal host and non-volatile memory express storage domain discovery for non-volatile memory express over fabrics
US20210084537A1 (en) Load balance method and apparatus thereof
WO2020186909A1 (fr) Procédé, appareil et système de traitement de service de réseau virtuel, contrôleur et support de stockage
US7969989B2 (en) High performance ethernet networking utilizing existing fibre channel arbitrated loop HBA technology
US11544001B2 (en) Method and apparatus for transmitting data processing request
US8725879B2 (en) Network interface device
US10574477B2 (en) Priority tagging based solutions in fc sans independent of target priority tagging capability
CN112130748B (zh) 一种数据访问方法、网卡及服务器
US11489921B2 (en) Kickstart discovery controller connection command
CN107608632B (zh) 一种分布式存储集群的通信方法、装置及系统
US20220222016A1 (en) Method for accessing solid state disk and storage device
US20190158627A1 (en) Method and device for generating forwarding information
WO2020134144A1 (fr) Procédé, nœud et système de transfert de données ou de messages
WO2017185322A1 (fr) Procédé et dispositif de découverte d'élément de réseau de stockage
JP7126021B2 (ja) OpenFlowインスタンスの構成
WO2018107433A1 (fr) Procédé et dispositif de traitement d'informations
WO2021175105A1 (fr) Procédé et appareil de connexion, dispositif, et support de stockage
WO2020187124A1 (fr) Procédé et dispositif de traitement de données
WO2023174341A1 (fr) Procédé de lecture/d'écriture de données, et dispositif, nœud de stockage et support d'enregistrement
US9077741B2 (en) Establishing communication between entities in a shared network
TW201006191A (en) UPnP/DLNA device support apparatus, system, and method
CN111865801B (zh) 一种基于Virtio端口传输数据的方法和系统
WO2015123986A1 (fr) Procédé et système d'enregistrement de données, et serveur d'accès
WO2024001549A9 (fr) Procédé de configuration d'adresse et dispositif électronique
WO2023078210A1 (fr) Procédé et appareil de traitement de paquets, et système de communication

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23769843

Country of ref document: EP

Kind code of ref document: A1