WO2022267909A1 - Method for reading and writing data and related apparatus - Google Patents

Method for reading and writing data and related apparatus

Info

Publication number
WO2022267909A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage controller
message
switch
storage
direct path
Prior art date
Application number
PCT/CN2022/098309
Other languages
French (fr)
Chinese (zh)
Inventor
晏思宇
刘世兴
曲迪
冀智刚
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2022267909A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G06F3/0622 Securing storage systems in relation to access
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • the present application relates to the technical field of storage, and in particular to a method for reading and writing data and related devices.
  • Enterprise storage or distributed storage systems typically include clusters of client servers, networks, and storage arrays.
  • A client server cluster (referred to as a client for short) may carry various applications, such as a Structured Query Language (SQL) service or a database.
  • the data read and written by the client is actually stored in the back-end storage array cluster.
  • A data read or write access by the client is also called an input/output (I/O) request, and each such access requires the client to perform the actual data read and write operations in the storage array cluster through the network.
  • a storage array cluster consists of many storage controllers and storage arrays. Considering factors such as high availability and data access consistency, each storage controller is often responsible for data read and write operations on some data disks. For each I/O request for data, a corresponding storage controller in the storage array cluster is required to be responsible for reading and writing the I/O request to the data disk. That is, each I/O request has a corresponding attributable storage controller, and only the attributable storage controller has the right to access the data in the storage array.
  • the client and the storage array are usually two independent systems, and the client cannot obtain the storage controller corresponding to each I/O request in advance. The corresponding relationship between I/O requests and storage controllers is stored in each storage controller of the storage array.
  • the client sends the I/O request to a storage controller in the storage array according to its own algorithm (such as load balancing, round robin, etc.).
  • the storage controller determines the target storage controller by searching the corresponding relationship between the I/O request and the storage controller, and reroutes the I/O request to the target storage controller for data read and write operations.
  • the present application proposes a method for reading and writing data.
  • A switch receives address information of a storage controller from the storage controller and assigns corresponding exchange address information to the storage controller. According to the address information of the storage controller and the exchange address information corresponding to the storage controller, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and an I/O message is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller. Because the direct path connection established between the switch and the storage controller can transmit I/O messages, repeated addressing during I/O processing is avoided, the forwarding path of I/O messages is shortened, the completion time of data read and write operations is reduced, input/output operations per second (IOPS) performance is improved, and data read and write efficiency is improved.
  • the first aspect of the present application provides a method for reading and writing data.
  • The switch receives the address information of the storage controller, and the address information of the storage controller includes one or more of the following: the queue pair port number of the storage controller, the Internet Protocol (IP) address of the storage controller, the Transmission Control Protocol (TCP) port number of the storage controller, or the protocol number of the storage controller. The switch allocates corresponding exchange address information for the storage controller, and the exchange address information includes one or more of the following: the queue pair port number of the switch, the IP address of the switch, or the port number of the switch. The switch establishes a direct path connection with the storage controller; the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • the switch establishes a communication connection with the storage controller.
  • the address information of the storage controller is transmitted to the switch via the communication link.
  • This communication connection may also be referred to as a control plane connection.
  • The specific process of establishing the communication connection is as follows: (a) The switch starts a monitoring task related to the direct path connection. (b) After the storage controller goes online, the storage controller locally allocates the queue pair port number of the storage controller (or the IP address and port number of the storage controller). (c) After the switch detects that the storage controller is online, the switch and the storage controller establish a control plane connection through a remote direct memory access (RDMA) connection establishment process (or a Transmission Control Protocol (TCP) connection establishment process).
  • the TCP connection is used to transmit address information of the storage controller.
  • the address information of the storage controller includes: the IP address of the storage controller, the TCP port number and the protocol number of the storage controller.
  • the RDMA connection is used to transmit address information of the storage controller.
  • the address information of the storage controller includes: the queue pair port number QP of the storage controller and the IP address of the storage controller.
  • When the client goes online, the client establishes a network connection with the storage controller, and the switch establishes a network connection with the storage controller.
  • the switch receives address information of the storage controller from the storage controller, and assigns corresponding exchange address information to the storage controller.
  • The switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O packets, and an I/O packet is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • Because the direct path connection established between the switch and the storage controller can transmit I/O packets, repeated addressing during I/O processing is avoided, the forwarding path of I/O packets is shortened, the completion time of data read and write operations is reduced, IOPS performance is improved, and data read and write efficiency is improved.
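The registration step described above (the switch receives a storage controller's address information, allocates exchange address information for it, and records the resulting direct path connection) can be sketched as follows. This is a minimal illustration assuming TCP-style addressing; the class names, the starting port 50000, and the example IP addresses are hypothetical and do not come from the application, and no real network connection is opened.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class ControllerAddress:
    """Address information reported by a storage controller (TCP variant)."""
    ip: str
    tcp_port: int
    protocol: int  # IP protocol number, e.g. 6 for TCP


@dataclass(frozen=True)
class ExchangeAddress:
    """Exchange address information the switch allocates for one controller."""
    ip: str
    port: int


@dataclass(frozen=True)
class DirectPath:
    """Bookkeeping record for one direct path connection."""
    path_id: int
    controller: ControllerAddress
    exchange: ExchangeAddress


class Switch:
    def __init__(self, ip: str, first_port: int = 50000):
        self.ip = ip
        self._ports = count(first_port)  # hypothetical port allocator
        self._ids = count(1)
        self.paths = {}

    def register_controller(self, addr: ControllerAddress) -> DirectPath:
        # Allocate exchange address information, then record the direct path.
        exchange = ExchangeAddress(self.ip, next(self._ports))
        path = DirectPath(next(self._ids), addr, exchange)
        self.paths[path.path_id] = path
        return path


switch = Switch("10.0.0.1")
p1 = switch.register_controller(ControllerAddress("10.0.1.11", 4420, 6))
p2 = switch.register_controller(ControllerAddress("10.0.1.12", 4420, 6))
```

Each controller ends up with its own switch-side port, so the pair (controller address, exchange address) identifies one direct path connection in this sketch.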
  • The switch receives routing information from the storage controller, where the routing information includes one or more of the following: identification information of the storage controller, the input/output (I/O) address of the storage controller, whether an I/O message whose destination is the storage controller needs to be copied, or the load information of the storage controller. The I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  • The load information of the storage controller includes, but is not limited to: the total storage space of the storage controller, the remaining available storage space of the storage controller, the used storage space of the storage controller, the IOPS of the storage controller, whether the storage controller supports active-active mode, or the temperature of the storage controller.
  • The switch receives the routing information of the storage controller, so that the switch can track the status of the storage controller. After receiving an I/O message, the switch can quickly determine the corresponding storage controller for the I/O message, which improves data read and write efficiency.
  • the switch generates a first mapping relationship, where the first mapping relationship includes a mapping relationship between the identification information of the direct path connection and the I/O address.
  • the first mapping relationship may be a hash (hash) table of key-value pairs (key-value).
  • the key (key) of the table is the I/O address
  • the value (value) of the table is the identification information of the direct path connection.
  • the identification information of the direct path connection may be the identification information of the storage controller.
  • the first mapping relationship may include: identification information of the direct path connection, identification information of the storage controller, and an I/O address.
  • the first mapping relationship may include: identification information of the direct path, an I/O address, identification information of the storage controller, and identification information of the client.
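The first mapping relationship in its simplest form (a hash table whose key is the I/O address and whose value is the identification information of the direct path connection) might look like the sketch below. The `IOAddress` fields follow the logical unit number, namespace identifier, and logical block address named in the text; the function names and path identifiers are hypothetical, and exact-match lookup on a single block address is a simplifying assumption.

```python
from typing import Dict, NamedTuple, Optional


class IOAddress(NamedTuple):
    lun: int           # logical unit number
    namespace_id: int  # namespace identifier
    lba: int           # logical block address


# Key: I/O address. Value: identification information of the direct path.
first_mapping: Dict[IOAddress, str] = {}


def add_route(io_address: IOAddress, direct_path_id: str) -> None:
    first_mapping[io_address] = direct_path_id


def find_direct_path(io_address: IOAddress) -> Optional[str]:
    # Returns None when the switch has no direct path for this address.
    return first_mapping.get(io_address)


add_route(IOAddress(lun=0, namespace_id=1, lba=4096), "path-ctrl-A")
add_route(IOAddress(lun=1, namespace_id=1, lba=0), "path-ctrl-B")
```

A tuple key keeps the lookup at one hash probe per I/O message, which fits the goal of determining the storage controller quickly.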
  • The switch receives the first I/O message from the client, and the first I/O message is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • The switch determines the direct path connection according to the first I/O message and the first mapping relationship, and the switch sends the first I/O message to the storage controller through the direct path connection.
  • the switch parses the address information in the first I/O packet.
  • the switch determines the destination of the first I/O message according to the address information in the first I/O message, that is, the storage controller corresponding to the first I/O message.
  • the switch determines the corresponding identification information of the direct path connection according to the identification information of the storage controller.
  • the switch sends the first I/O message to the storage controller by using the direct path connection.
  • After receiving the first I/O message, the switch first determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Second, the switch determines the identification information of the direct path connection according to the address information of the storage controller. The switch then sends the first I/O packet by using the direct path connection.
  • the first mapping relationship includes: identification information of the direct path, an I/O address, identification information of the storage controller, and identification information of the client.
  • After receiving the first I/O message, the switch first determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Second, the switch determines the source address information of the I/O message according to the first I/O message, that is, the address information of the client corresponding to the first I/O message. Third, the switch determines the identification information of the client according to the address information of the client. In a possible implementation manner, the address information of the client is consistent with the identification information of the client. Fourth, the switch determines the corresponding identification information of the direct path connection according to the identification information of the client. The switch then sends the first I/O packet by using the direct path connection.
  • the first mapping relationship includes: identification information of the direct path, an I/O address, identification information of the storage controller, and identification information of the client.
  • After receiving the first I/O message, the switch first determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Second, the switch detects the number of direct path connections corresponding to the storage controller. When the storage controller corresponds to only one direct path connection, the switch directly uses that direct path connection to send the first I/O packet.
  • When the storage controller corresponds to multiple direct path connections, the switch determines which client the first I/O message comes from according to the source address information in the first I/O message, that is, the address information of the client corresponding to the first I/O message. Next, the switch determines the identification information of the client according to the address information of the client. In a possible implementation manner, the address information of the client is consistent with the identification information of the client. Finally, the switch determines the corresponding identification information of the direct path connection according to the identification information of the client. The switch then sends the first I/O packet by using the direct path connection.
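The lookup sequence above (find the storage controller for the I/O address, count its direct path connections, and fall back to the client identity only when there is more than one) can be sketched as follows. The table contents, identifiers, and function name are hypothetical, and the sketch assumes that the client's address information equals its identification information, which the text mentions as one possible implementation.

```python
# I/O address -> storage controller identification (hypothetical contents).
io_to_controller = {("lun0", 0): "ctrl-A", ("lun1", 0): "ctrl-B"}

# Storage controller -> list of (direct path id, client id) entries.
paths_by_controller = {
    "ctrl-A": [("path-1", "client-1"), ("path-2", "client-2")],
    "ctrl-B": [("path-3", "client-1")],
}


def select_direct_path(io_address, source_client):
    """Return the direct path id for an I/O message."""
    controller = io_to_controller[io_address]
    paths = paths_by_controller[controller]
    if len(paths) == 1:
        # Only one direct path: use it without inspecting the source.
        return paths[0][0]
    for path_id, client_id in paths:
        # Several direct paths: pick the one associated with this client.
        if client_id == source_client:
            return path_id
    raise LookupError(f"no direct path for {source_client!r} on {controller}")


single = select_direct_path(("lun1", 0), "client-9")
per_client = select_direct_path(("lun0", 0), "client-2")
```

In this sketch the source address is consulted only when it is needed, which keeps the common single-path case to two dictionary lookups.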
  • the switch may dynamically select an appropriate storage controller to receive the first I/O packet according to status information reported by each storage controller.
  • The suitable storage controller here may be a storage controller with more available space, a storage controller with stronger IOPS performance, or a storage controller whose central processing unit (CPU) has a lower processing load; this is not limited here. The purpose is to balance the workload of the storage controllers.
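A load-aware choice among candidate storage controllers could be sketched as below. The text lists available space, IOPS performance, and CPU load as possible criteria without ranking them, so the ordering used here (lowest CPU load first, then most free space, then highest IOPS) is purely an assumption for illustration, as are the field names and sample values.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LoadInfo:
    controller_id: str
    free_bytes: int   # remaining available storage space
    iops: int         # reported IOPS capability
    cpu_load: float   # 0.0 (idle) to 1.0 (saturated)


def pick_controller(candidates: Iterable[LoadInfo]) -> LoadInfo:
    # Assumed policy: prefer low CPU load, then more free space, then more IOPS.
    return min(candidates, key=lambda c: (c.cpu_load, -c.free_bytes, -c.iops))


reported = [
    LoadInfo("ctrl-A", free_bytes=100, iops=5000, cpu_load=0.9),
    LoadInfo("ctrl-B", free_bytes=500, iops=4000, cpu_load=0.2),
    LoadInfo("ctrl-C", free_bytes=800, iops=4000, cpu_load=0.2),
]
chosen = pick_controller(reported)
```

Any other weighting of the reported status fields would fit the same structure; only the key function changes.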
  • The switch inspects the first I/O message. When the destination (storage controller) of the first I/O message is a storage controller operating in active-active mode, the switch copies the first I/O message and sends the copy to a backup storage controller of the storage controller; the storage controller and the backup storage controller work in active-active mode. The switch adds, to the first I/O message that it sends to the backup storage controller, an identifier indicating that the message is a copy. This improves data security.
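The copy step for active-active destinations can be illustrated with the sketch below: the original message goes to the storage controller, and a copy carrying a copy identifier goes to the backup storage controller. The message fields, the `is_copy` flag name, and the controller identifiers are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IOMessage:
    io_address: tuple
    payload: bytes
    is_copy: bool = False  # identifier marking a copy made by the switch


def fan_out(msg: IOMessage, primary: str,
            backup: Optional[str] = None) -> List[Tuple[str, IOMessage]]:
    """Return (destination, message) pairs the switch would send."""
    out = [(primary, msg)]
    if backup is not None:
        # Active-active destination: send a flagged copy to the backup.
        out.append((backup, replace(msg, is_copy=True)))
    return out


write = IOMessage(io_address=("lun0", 0), payload=b"data")
sends = fan_out(write, primary="ctrl-A", backup="ctrl-A-backup")
```

Using an immutable message and `dataclasses.replace` guarantees that the original sent to the primary controller is never marked as a copy by accident.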
  • The switch detects the type of the first I/O message. When the first I/O message is a data message, the switch sends the first I/O packet to the storage controller through the direct path connection; when the first I/O packet is a non-data packet, the switch transparently transmits the first I/O packet.
  • By identifying the type of each packet, the switch processes different packets differently, which reduces the load on the network.
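The type-based handling can be sketched as a small dispatcher: data messages are placed on the direct path connection, and everything else is passed through unchanged. The `type` field, its `"data"` value, and the two queues standing in for the direct path and the transparent path are hypothetical.

```python
def dispatch(packet: dict, direct_path_queue: list, passthrough_queue: list) -> str:
    """Route one packet according to its type and report the route taken."""
    if packet.get("type") == "data":
        direct_path_queue.append(packet)   # sent over the direct path connection
        return "direct"
    passthrough_queue.append(packet)       # transparently transmitted
    return "transparent"


direct, passthrough = [], []
r1 = dispatch({"type": "data", "payload": b"abc"}, direct, passthrough)
r2 = dispatch({"type": "keepalive"}, direct, passthrough)
```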
  • The switch receives a first reply message from the storage controller through the direct path connection, and the first reply message is a response to the first I/O message. The switch sends the first reply packet to the client.
  • the switch receives the reply message from the storage controller through the direct path connection, and then the switch sends the reply message to the client, so that the client does not perceive the routing path of the reply message.
  • The switch receives a second reply message from the storage controller. The second reply message is a response to the first I/O message, the destination of the second reply message is the second storage controller, and the second storage controller is the storage controller initially allocated by the client for the first I/O message. The switch forwards the second reply message to the second storage controller. The switch receives a third reply message from the second storage controller, where the third reply message is generated according to the second reply message. The switch forwards the third reply message from the second storage controller.
  • the storage controller may forward the I/O reply message to the second storage controller, where the second storage controller is the storage controller initially allocated by the client to the first I/O message.
  • The second storage controller retains the network connection with the client, so the second storage controller carries the I/O reply message on that network connection and sends it directly to the client, so that the client does not perceive the routing path of the reply message.
  • The second reply packet includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply packet to the client on behalf of the storage controller.
  • The third reply packet does not include the proxy indication information.
  • the storage controller instructs the second storage controller to send the second reply message to the client through explicit proxy indication information, so that the client does not perceive the routing path of the reply message.
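The proxy reply flow can be sketched as follows: the second reply carries proxy indication information, and the second storage controller derives the third reply by removing it before sending the reply to the client. The `Reply` fields and the function name are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Reply:
    request_id: int
    status: str
    proxy: bool = False  # proxy indication information


def make_third_reply(second_reply: Reply) -> Reply:
    """Run by the second storage controller before replying to the client."""
    if not second_reply.proxy:
        raise ValueError("second reply carries no proxy indication")
    # Same response content, with the proxy indication stripped.
    return replace(second_reply, proxy=False)


second = Reply(request_id=7, status="ok", proxy=True)
third = make_third_reply(second)
```

Because the third reply leaves over the connection the client originally opened, the client sees an ordinary response and never learns that another storage controller did the work.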
  • The switch copies the first I/O message, and the switch sends the copy to the backup storage controller of the storage controller.
  • the backup storage controller and the storage controller work in the active-active mode.
  • the storage controller is a part of the active-active cluster, and the active-active cluster works in the active-active mode (also referred to as the active-active cluster performing the active-active task).
  • the feature of an active-active cluster is that both clusters are running online and can support the same application load.
  • When the client writes data to the active-active cluster, for example, when the client sends an I/O message to the active-active cluster, one of the clusters in the active-active cluster performs read and write operations based on the I/O message, copies the I/O message, and sends the copy to the other cluster, so that the other cluster stores the data in the I/O message.
  • When the client reads data from the active-active cluster, if one cluster in the active-active cluster fails and the other cluster is still working normally, the client can read data directly through the working cluster. It can be seen that the active-active mode can effectively improve the security of stored data.
  • When the storage controller and the backup storage controller work in the active-active mode, the storage controller, after receiving the I/O message, copies the I/O message and sends the copy to the backup storage controller.
  • the data in the I/O message is stored by the backup storage controller.
  • the switch when the switch receives the first I/O message, the switch sends the first I/O message to the storage controller through a direct path connection with the storage controller.
  • The switch sends first indication information to the second storage controller, where the first indication information instructs the second storage controller to update its local serial number of received messages, and the second storage controller is the storage controller initially allocated by the client for the first I/O message. For example, after receiving an I/O message, a storage controller assigns a serial number to the I/O message as the serial number of the received message.
  • the first indication information instructs the second storage controller to update the local serial number of the received message, so as to avoid the out-of-sequence problem caused by the second storage controller not receiving the first I/O message.
  • the switch sending the first indication information to the second storage controller includes: the switch sending a signal message to the second storage controller, and the message header of the signal message includes the message of the first I/O message Header information, where the header of the signal message further includes first indication information.
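The purpose of the first indication information can be illustrated with the sketch below. It assumes, hypothetically, that each I/O message carries an explicit sequence number and that the second storage controller rejects any message that is not the next one expected; the header field names `seq` and `update_seq` are invented for the example. A signal message that reuses the diverted message's header lets the controller advance its counter without ever seeing that message.

```python
class SecondController:
    """Tracks the serial number of the next I/O message it expects."""

    def __init__(self):
        self.next_seq = 0

    def on_io_message(self, seq: int) -> bool:
        # Accept only the in-order message; anything else counts as out of sequence.
        if seq != self.next_seq:
            return False
        self.next_seq += 1
        return True

    def on_signal_message(self, header: dict) -> None:
        # The header copies the diverted I/O message's header and adds the
        # first indication information (here: update_seq).
        if header.get("update_seq") and header["seq"] >= self.next_seq:
            self.next_seq = header["seq"] + 1


with_signal = SecondController()
with_signal.on_io_message(0)
with_signal.on_signal_message({"seq": 1, "update_seq": True})  # message 1 was diverted
accepted = with_signal.on_io_message(2)

without_signal = SecondController()
without_signal.on_io_message(0)
rejected = not without_signal.on_io_message(2)
```

The alternative mentioned in the text, disabling out-of-sequence detection on each storage controller, would remove the need for the signal message altogether.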
  • the switch notifies each storage controller to disable the out-of-sequence packet detection function.
  • the second storage controller is the storage controller initially allocated by the client to the first I/O message.
  • The direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
  • the second aspect of the present application provides a method for reading and writing data.
  • the storage controller sends the address information of the storage controller to the switch.
  • The address information of the storage controller includes one or more of the following: the queue pair port number of the storage controller, the Internet Protocol (IP) address of the storage controller, the Transmission Control Protocol (TCP) port number of the storage controller, or the protocol number. The storage controller establishes a direct path connection with the switch; the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • the TCP connection is used to transmit address information of the storage controller.
  • the address information of the storage controller includes: the IP address of the storage controller, the TCP port number and the protocol number of the storage controller.
  • the RDMA connection is used to transmit address information of the storage controller.
  • the address information of the storage controller includes: the queue pair port number QP of the storage controller and the IP address of the storage controller.
  • When the client goes online, the client establishes a network connection with the storage controller, and the switch establishes a network connection with the storage controller.
  • the switch receives address information of the storage controller from the storage controller, and assigns corresponding exchange address information to the storage controller.
  • The switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O packets, and an I/O packet is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • Because the direct path connection established between the switch and the storage controller can transmit I/O packets, repeated addressing during I/O processing is avoided, the forwarding path of I/O packets is shortened, the completion time of data read and write operations is reduced, IOPS performance is improved, and data read and write efficiency is improved.
  • The storage controller sends routing information of the storage controller to the switch, where the routing information includes one or more of the following: identification information of the storage controller, the input/output (I/O) address of the storage controller, whether an I/O message whose destination is the storage controller needs to be copied, or the load information of the storage controller. The I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  • The switch receives the routing information of the storage controller, so that the switch can track the status of the storage controller. After receiving an I/O message, the switch can quickly determine the corresponding storage controller for the I/O message, which improves data read and write efficiency.
  • the storage controller receives the first I/O packet from the switch through the direct path connection.
  • The storage controller generates a first reply message according to the first I/O message, and the first reply message is a response to the first I/O message. The storage controller sends the first reply packet to the switch.
  • the switch receives the reply message from the storage controller through the direct path connection, and then the switch sends the reply message to the client, so that the client does not perceive the routing path of the reply message.
  • The storage controller generates a second reply message according to the first I/O message, and the second reply message is a response to the first I/O message. The storage controller sends the second reply message to the switch, the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller initially allocated by the client for the first I/O message.
  • the storage controller may forward the I/O reply message to the second storage controller, where the second storage controller is the storage controller initially allocated by the client to the first I/O message.
  • The second storage controller retains the network connection with the client, so the second storage controller carries the I/O reply message on that network connection and sends it directly to the client, so that the client does not perceive the routing path of the reply message.
  • the second reply packet includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply packet to the client on behalf of the storage controller.
  • the storage controller instructs the second storage controller to send the second reply message to the client through explicit proxy instruction information, so that the client does not perceive the routing path of the reply message.
  • The direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
  • the third aspect of the present application provides a network device, including: a transceiver module and a processing module;
  • the transceiver module is used to receive the address information of the storage controller, where the address information of the storage controller includes one or more of the following: the queue pair port number of the storage controller, the Internet Protocol (IP) address of the storage controller, the Transmission Control Protocol (TCP) port number of the storage controller, or the protocol number of the storage controller;
  • the processing module is configured to assign corresponding exchange address information to the storage controller, where the exchange address information includes one or more of the following: the queue pair port number of the switch, the IP address of the switch, or the port number of the switch;
  • the processing module is further configured to establish a direct path connection with the storage controller, where the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • the transceiver module is further configured to receive routing information from the storage controller, where the routing information includes one or more of the following: identification information of the storage controller, the input/output (I/O) address of the storage controller, whether an I/O message whose destination is the storage controller needs to be copied, or the load information of the storage controller; the I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  • the processing module is further configured to generate a first mapping relationship, where the first mapping relationship includes a mapping relationship between the identification information of the direct path connection and the I/O address.
  • the transceiver module is also configured to receive the first I/O message from the client;
  • the processing module is further configured to determine the direct path connection according to the first I/O message and the first mapping relationship;
  • the transceiver module is further configured to send the first I/O message to the storage controller through the direct path connection.
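  • As a minimal sketch of the lookup described above, the first mapping relationship can be modeled as a table from I/O address ranges to direct path connection identifiers. The names `MappingEntry` and `find_direct_path`, and the choice of contiguous LBA ranges per LUN, are illustrative assumptions and are not specified by this application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MappingEntry:
    """One row of a hypothetical first mapping relationship."""
    lun: int            # logical unit number
    lba_start: int      # first logical block address covered (inclusive)
    lba_end: int        # last logical block address covered (inclusive)
    connection_id: str  # identification information of the direct path connection


def find_direct_path(table: List[MappingEntry], lun: int, lba: int) -> Optional[str]:
    """Return the direct path connection whose I/O address range covers (lun, lba)."""
    for entry in table:
        if entry.lun == lun and entry.lba_start <= lba <= entry.lba_end:
            return entry.connection_id
    return None  # no direct path known for this I/O address


# Example table: two controllers each own half of LUN 0 (assumed layout).
table = [
    MappingEntry(lun=0, lba_start=0, lba_end=999, connection_id="controller A-switch A"),
    MappingEntry(lun=0, lba_start=1000, lba_end=1999, connection_id="controller B-switch A"),
]
```

  • With this table, an I/O message addressed to LUN 0, LBA 1500 would be sent over the connection identified as "controller B-switch A", while an address that matches no entry yields no direct path.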
  • the processing module is further configured to detect the type of the first I/O message;
  • the transceiver module is further configured to send the first I/O message to the storage controller through the direct path connection when the first I/O message is a data message;
  • the transceiver module is further configured to transparently transmit the first I/O message when the first I/O message is a non-data message.
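  • The type-dependent handling above can be sketched as a small dispatch function. How the switch actually distinguishes data messages from non-data messages is not stated here, so the `is_data` flag and the action names below are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class IOMessage:
    is_data: bool   # assumed flag: True when the message carries a read or write
    payload: bytes


def choose_action(message: IOMessage) -> str:
    """Return the forwarding action the switch would take for this message."""
    if message.is_data:
        # Data messages are steered onto the direct path connection.
        return "send_via_direct_path"
    # Non-data messages are passed through unchanged (transparent transmission).
    return "transparent_forward"
```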
  • the transceiver module is further configured to receive a first reply message from the storage controller through the direct path connection, where the first reply message is a response to the first I/O message;
  • the transceiver module is further configured to send the first reply message to the client.
  • the transceiver module is further configured to receive a second reply message from the storage controller, where the second reply message is a response to the first I/O message, the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller initially assigned by the client to the first I/O message;
  • the transceiver module is further configured to forward the second reply message to the second storage controller;
  • the transceiver module is further configured to receive a third reply message from the second storage controller, the third reply message is generated according to the second reply message;
  • the transceiver module is further configured to forward the third reply message from the second storage controller.
  • the second reply packet includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply packet to the client on behalf of the storage controller;
  • the third reply packet does not include the proxy indication information.
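  • The proxy reply flow above can be illustrated with a short sketch of what the second storage controller might do when it receives the second reply packet. The `ReplyMessage` fields, the boolean `proxy` flag standing in for the proxy indication information, and the choice to rewrite the source field are all assumptions; the application only states that the third reply packet is generated from the second and omits the proxy indication.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReplyMessage:
    source: str
    destination: str
    proxy: bool   # hypothetical stand-in for the proxy indication information
    status: str


def build_third_reply(second_reply: ReplyMessage, client: str) -> ReplyMessage:
    """Run on the second storage controller: re-address the reply and drop the proxy flag."""
    if not second_reply.proxy:
        raise ValueError("reply does not carry proxy indication information")
    return replace(
        second_reply,
        source=second_reply.destination,  # the controller the client originally chose
        destination=client,
        proxy=False,                      # the third reply carries no proxy indication
    )
```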
  • the processing module is further configured to copy the first I/O message when the first I/O message needs to be copied;
  • the transceiver module is further configured to send the replicated first I/O message to a backup storage controller of the storage controller, and the backup storage controller and the storage controller work in a dual-active mode.
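  • A minimal sketch of the replication step follows, assuming the switch keeps two lookup tables learned from routing information: one recording whether messages for a controller need to be copied, and one naming its backup controller. The function name `fan_out` and both tables are hypothetical.

```python
from typing import Dict, List, Tuple


def fan_out(
    message: bytes,
    target: str,
    needs_copy: Dict[str, bool],
    backup_of: Dict[str, str],
) -> List[Tuple[str, bytes]]:
    """Return the (destination, message) pairs the switch would send."""
    deliveries = [(target, message)]
    if needs_copy.get(target, False) and target in backup_of:
        # bytes objects are immutable, so reusing the object is a safe copy.
        deliveries.append((backup_of[target], message))
    return deliveries
```

  • For a controller marked as needing copies, the message is delivered both to the controller and to its backup; for any other controller only one delivery is produced.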
  • the direct path connection is a Transmission Control Protocol (TCP) connection or a Remote Direct Memory Access (RDMA) connection.
  • the fourth aspect of the present application provides a storage device, including: a transceiver module and a processing module;
  • the transceiver module is configured to send the address information of the storage controller to the switch, where the address information of the storage controller includes one or more of the following: the queue pair port number of the storage controller, the Internet Protocol (IP) address of the storage controller, the Transmission Control Protocol (TCP) port number of the storage controller, or the protocol number of the storage controller;
  • the processing module is configured to establish a direct path connection with the switch, where the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
  • the transceiver module is further configured to send routing information of the storage controller to the switch, where the routing information includes one or more of the following: identification information of the storage controller, the input/output (I/O) address of the storage controller, whether an I/O message whose destination is the storage controller needs to be copied, or the load information of the storage controller; the I/O address includes a logical unit number, a namespace identifier and/or a logical block address.
  • the transceiver module is further configured to receive the first I/O message from the switch through the direct path connection.
  • the processing module is further configured to generate a first reply message according to the first I/O message, and the first reply message is a response to the first I/O message;
  • the transceiver module is further configured to send the first reply message to the switch through the direct path connection.
  • the processing module is further configured to generate a second reply message according to the first I/O message, and the second reply message is a response to the first I/O message;
  • the transceiver module is further configured to send the second reply message to the switch, where the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller initially allocated by the client for the first I/O packet.
  • the second reply packet includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply packet to the client on behalf of the storage controller.
  • the direct path connection is a Transmission Control Protocol (TCP) connection or a Remote Direct Memory Access (RDMA) connection.
  • a fifth aspect of the present application provides a network device, where the network device includes: a processor, configured to enable the network device to implement the method described in the foregoing first aspect or any possible implementation manner of the first aspect.
  • the device may further include a memory, and the memory is coupled to the processor.
  • when the processor executes the instructions stored in the memory, the network device may implement the method described in the foregoing first aspect or any possible implementation manner of the first aspect.
  • the device may further include a communication interface, which is used for the device to communicate with other devices.
  • the communication interface may be a transceiver, a circuit, a bus, a module, or other types of communication interfaces.
  • Coupling in this application is an indirect coupling or connection between devices, units or modules, which may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • a sixth aspect of the present application provides a storage device, where the storage device includes: a processor configured to enable the storage device to implement the method described in the foregoing second aspect or any possible implementation manner of the second aspect.
  • the device may further include a memory, and the memory is coupled to the processor. When the processor executes the instructions stored in the memory, the storage device may implement the method described in the foregoing second aspect or any possible implementation manner of the second aspect.
  • the device may further include a communication interface, which is used for the device to communicate with other devices.
  • the communication interface may be a transceiver, a circuit, a bus, a module, or other types of communication interfaces.
  • Coupling in this application is an indirect coupling or connection between devices, units or modules, which may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • the seventh aspect of the present application provides a computer storage medium, which may be non-volatile; computer-readable instructions are stored in the computer storage medium, and when the computer-readable instructions are executed by a processor, the method described in the first aspect or any possible implementation of the first aspect is implemented.
  • the eighth aspect of the present application provides a computer storage medium, which may be non-volatile; computer-readable instructions are stored in the computer storage medium, and when the computer-readable instructions are executed by a processor, the method described in the second aspect or any possible implementation of the second aspect is implemented.
  • a ninth aspect of the present application provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the method described in the first aspect or any possible implementation manner of the first aspect.
  • the tenth aspect of the present application provides a computer program product including instructions, which, when run on a computer, cause the computer to execute the method described in the second aspect or any possible implementation manner of the second aspect.
  • the eleventh aspect of the present application provides a storage system, where the storage system includes multiple network devices according to the above third aspect or the fifth aspect, and multiple storage devices according to the above fourth aspect or the sixth aspect.
  • FIG. 1a is a schematic diagram of a network architecture involved in an embodiment of the present application.
  • FIG. 1b is a schematic diagram of a network architecture proposed by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario proposed by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of another application scenario proposed by the embodiment of the present application.
  • FIG. 4 is a schematic flow diagram of a data reading and writing method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an embodiment of a data reading and writing method proposed in the embodiment of the present application.
  • FIG. 6 is a schematic diagram of an embodiment of another data reading and writing method proposed in the embodiment of the present application.
  • FIG. 7 is a schematic diagram of another embodiment of a data reading and writing method proposed in the embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a network device 800 provided in an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a network device 900 provided in an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a storage device 1000 provided by an embodiment of the present application.
  • the naming or numbering of the steps in this application does not mean that the steps in the method flow must be executed in the temporal or logical order indicated by the naming or numbering.
  • the execution order of named or numbered steps may be changed according to the technical purpose to be achieved, as long as the same or similar technical effect can be achieved.
  • the division of units presented in this application is a logical division; in actual applications there may be other division methods. For example, multiple units may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between units may be electrical or in other similar forms, which is not limited in this application.
  • the units or subunits described as separate components may or may not be physically separated, may or may not be physical units, or may be distributed into multiple circuit units, and some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this application.
  • An I/O packet may also be called an I/O command or an I/O request.
  • the I/O command can be divided into a read command and a write command, and refers to a command issued by an application program running on the server for instructing to read data from the storage array or write data to the storage array.
  • the server's processor can receive the I/O command and store it in memory so that the I/O command waits to be processed.
  • the I/O command may be stored in a queue of the memory.
  • a queue is a special linear list in which elements can be deleted only at the front of the list and inserted only at the rear of the list.
  • the end of the insertion operation is called the tail of the queue, and the end of the deletion operation is called the head of the queue.
  • the data elements of a queue are also called queue elements. Inserting a queue element into the queue is called enqueuing, and removing a queue element from the queue is called dequeuing.
  • the queue can also be called a first in first out (FIFO) linear list.
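  • The first-in-first-out behaviour described above can be illustrated with Python's standard `collections.deque`. The command strings are made-up placeholders, not a real I/O command format.

```python
from collections import deque

queue = deque()
queue.append("read LBA 0")    # enqueue: insert at the tail of the queue
queue.append("write LBA 8")
queue.append("read LBA 16")

first = queue.popleft()       # dequeue: remove from the head of the queue
second = queue.popleft()
```

  • The commands leave the queue in the order in which they entered it, so `first` is the read of LBA 0 and `second` is the write to LBA 8.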
  • queues include the send queue, the receive queue, the completion queue (CQ), and the event queue.
  • the sending queue of the sending end (for example: server) and the receiving queue of the receiving end (for example: storage controller) may be referred to as a queue pair (queue pair, QP).
  • the send queue can be used to store pending I/O commands; the receive queue is used to store and process received I/O commands.
  • the send queue in the server is used to store the I/O commands issued by the server for instructing to read data from or write data to the storage controller. For example, if the I/O command is a read command for reading data stored in the storage controller, the server can send the processed read command to the storage controller; the read command can include the address and length of the data to be read, so that the storage controller can send the requested data to the server according to the processed read command.
  • if the I/O command is a write command for writing data to the storage controller, the server may process the write command and then send the processed write command to the storage controller, so that the storage controller looks up memory information in the receive queue according to the processed write command and stores the data carried in the processed write command into the memory of the storage controller according to the found memory information.
  • the sending queue in the storage controller may also store the I/O command issued by the storage controller for instructing to read data from the server or write data to the server, and send the command to the server through the storage controller.
  • refer to FIG. 1a, which is a schematic diagram of a network architecture involved in the embodiment of the present application.
  • the client cluster includes multiple clients, the storage array cluster includes multiple storage controllers (such as storage controller A and storage controller N shown in FIG. 1a) and storage arrays (not shown in the figure), and the clients establish connections with the storage controllers through a switch. Since the client cluster and the storage array cluster in the distributed storage system usually belong to two independent systems, the client cannot obtain the address information of each storage controller in advance.
  • when the client initiates an I/O request (that is, the client sends an I/O packet to the storage array cluster through the switch), the client cannot know which storage controller corresponds to the I/O request, so the client uses an algorithm to determine a storage controller in the storage array cluster.
  • the determined storage controller searches for the attributable storage controller of the I/O message, and reroutes the I/O message to the attributable storage controller to complete data reading or writing.
  • the attributable storage controller refers to the destination of the I/O packet, and the attributable storage controller may also be called a target storage controller.
  • the addressing address in the I/O message is the address of the storage controller (for example, logical unit number (logical unit number, LUN), logical block address (logical block address, LBA)).
  • the above process is as follows: the I/O message from the client first arrives at the storage controller A, and the storage controller A is randomly determined by the client.
  • storage controller A decapsulates the I/O packet and searches the mapping relationship table according to the metadata in the I/O packet to determine that storage controller N processes the I/O packet; storage controller A then re-encapsulates the I/O packet and forwards it to storage controller N.
  • after receiving the I/O message, storage controller N decapsulates the I/O message again to obtain the metadata and the address information of the storage array.
  • storage controller N performs read and write processing tasks on the storage array according to the I/O message. Subsequent packets of this I/O packet also need to be fed back to the client through the above process.
  • the above process is called rerouting (also called I/O rerouting).
  • I/O packets need to be addressed and forwarded multiple times, so the forwarding path of I/O packets is long. As a result, data read and write operations take a long time to complete, and the number of I/O operations per second (input/output per second, IOPS) is reduced, which affects data read and write efficiency.
  • the present application proposes a method for reading and writing data.
  • the switch receives address information of the storage controller from the storage controller, and assigns corresponding switching address information to the storage controller.
  • the switch establishes a direct path connection with the storage controller, where the direct path connection is used to transmit I/O packets, and the I/O packets are used to write data to the storage array through the storage controller or read data from the storage array through the storage controller. Since the direct path connection established between the switch and the storage controller can transmit the I/O message, the I/O message of the client does not need to go through the aforementioned rerouting process. This shortens the forwarding path of I/O packets, reduces the completion time of data read and write operations, increases the number of I/O operations per second (input/output per second, IOPS), and improves the efficiency of data reading and writing.
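  • To make the path-length argument concrete, the following sketch compares the nodes an I/O packet visits with and without rerouting. It is a simplification under stated assumptions: the controller-to-controller forward is counted as a single hop, and the function names are hypothetical.

```python
from typing import List


def rerouted_path(client: str, switch: str, chosen: str, owner: str) -> List[str]:
    """Nodes visited when the client-chosen controller must re-route the packet."""
    path = [client, switch, chosen]
    if chosen != owner:
        # Extra step: decapsulate, look up, re-encapsulate, forward.
        path.append(owner)
    return path


def direct_path(client: str, switch: str, owner: str) -> List[str]:
    """Nodes visited when the switch forwards over a direct path connection."""
    return [client, switch, owner]


legacy_hops = len(rerouted_path("client", "switch", "controller A", "controller N")) - 1
direct_hops = len(direct_path("client", "switch", "controller N")) - 1
```

  • Under these assumptions the rerouted packet takes three hops and the direct path takes two, and the intermediate controller no longer has to decapsulate and re-encapsulate the packet.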
  • FIG. 1b is a schematic diagram of a network architecture proposed by the embodiment of the present application.
  • the network architecture includes a client cluster, a switch 110, and a storage array cluster, wherein the client cluster includes one or more clients, the storage array cluster includes one or more storage controllers 120, and each storage controller 120 manages one or more storage array (not shown).
  • the I/O request initiated by the client is forwarded to the storage controller 120 through the switch 110.
  • when the client goes online, the client establishes a network connection with one or more storage controllers through the switch, and the network connection may be a Transmission Control Protocol (TCP) connection.
  • the switch establishes a direct path connection with the storage controller.
  • the storage controller may be a storage controller that has established a network connection with the client, or the storage controller may not have established a network connection with the client.
  • the direct path connection is used for direct communication between the switch and the storage controller (that is, communication that does not need to be forwarded by other storage controllers), and the direct path connection is used for transmitting I/O packets.
  • FIG. 2 is a schematic diagram of an application scenario proposed by the embodiment of the present application. After the client goes online, establish network connections with storage controller A and storage controller B respectively. The switch establishes connections with the storage controller A and the storage controller B respectively.
  • a possible implementation manner is: after the client goes online, the client establishes network connections with storage controller A and storage controller B respectively.
  • the switch establishes a direct path connection with each storage controller (storage controller A and storage controller B) respectively.
  • another possible implementation manner is: after the client goes online, the client establishes network connections with storage controller A and storage controller B respectively.
  • the client randomly determines a storage controller for the I/O message, whereas the actual destination of the I/O message is the attributable storage controller of the I/O message.
  • when the randomly determined storage controller is the same as the attributable storage controller, no direct path connection is established between the switch and the attributable storage controller.
  • a direct path connection is established between the switch and the other storage controllers.
  • FIG. 3 is a schematic diagram of another application scenario proposed by the embodiment of the present application. After the client goes online, establish network connections with storage controller A and storage controller B respectively. The storage controller C in FIG. 3 has not established a network connection with the client. The switch establishes direct path connections with the storage controller A, storage controller B and storage controller C.
  • the attributable storage controller of the I/O request initiated by the client is storage controller C, but the client has not established a network connection with storage controller C.
  • after the switch receives the I/O request, the switch establishes a direct path connection with storage controller C.
  • the switch sends the I/O request to storage controller C through the direct path connection, so that storage controller C completes the I/O processing.
  • the switch establishes a direct path connection with storage controller C only when it needs to communicate with storage controller C.
  • alternatively, when the I/O request initiated by the client is not assigned to storage controller C, and storage controller C has not established a network connection with the client, the switch still establishes a direct path connection with storage controller C after receiving the I/O request, so that subsequent I/O requests for storage controller C can be sent to storage controller C through the direct path connection.
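  • Establishing a direct path connection only when it is first needed can be sketched as a small cache on the switch. The class name `DirectPathManager`, the identifier format, and the counter are illustrative assumptions; the real connection setup is replaced by a placeholder.

```python
from typing import Dict


class DirectPathManager:
    """Keeps at most one direct path connection per storage controller."""

    def __init__(self) -> None:
        self._connections: Dict[str, str] = {}
        self.established_count = 0  # how many connections were actually set up

    def get_or_establish(self, controller: str) -> str:
        """Return the existing connection identifier, or establish one on first use."""
        connection_id = self._connections.get(controller)
        if connection_id is None:
            # Placeholder for the real TCP or RDMA connection setup.
            connection_id = f"{controller}-switch A"
            self._connections[controller] = connection_id
            self.established_count += 1
        return connection_id
```

  • The first I/O request for storage controller C triggers the setup; later requests reuse the same connection.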
  • refer to FIG. 4, which is a schematic flowchart of a data reading and writing method provided in the embodiment of the present application; the data reading and writing method includes steps 401-408.
  • the switch receives routing information from the storage controller.
  • the switch receives the routing information from the storage controller, where the routing information includes one or more of the following: identification information of the storage controller, the I/O address of the storage controller, whether I/O packets whose destination is the storage controller need to be copied, or the load information of the storage controller.
  • the I/O address includes but is not limited to: logical unit number (LUN), namespace identifier (NSID), or logical block address (LBA). In this application, the load information of the storage array managed by the storage controller is referred to as the load information of the storage controller for short.
  • the load information of the storage controller includes but is not limited to: the total storage space of the storage controller, the remaining available storage space of the storage controller, the used storage space of the storage controller, the IOPS of the storage controller, whether the storage controller supports the active-active mode, or the temperature of the storage controller.
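  • One way to picture the routing information is as a record in which every field is optional, since the text says it includes "one or more" of the listed items. The class and field names and the units below are assumptions chosen for readability; no wire format is defined here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoadInfo:
    total_bytes: Optional[int] = None
    free_bytes: Optional[int] = None
    used_bytes: Optional[int] = None
    iops: Optional[int] = None
    supports_active_active: Optional[bool] = None
    temperature_celsius: Optional[float] = None


@dataclass
class RoutingInfo:
    controller_id: Optional[str] = None   # identification information
    lun: Optional[int] = None             # logical unit number
    nsid: Optional[int] = None            # namespace identifier
    lba: Optional[int] = None             # logical block address
    needs_copy: Optional[bool] = None     # whether I/O packets must be copied
    load: Optional[LoadInfo] = None

    def present_fields(self) -> List[str]:
        """Names of the items actually carried in this routing information."""
        return sorted(name for name, value in vars(self).items() if value is not None)


info = RoutingInfo(controller_id="storage controller A", lun=0, lba=0)
```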
  • the switch first establishes a communication connection with the storage controller.
  • the protocol types used in the communication connection include but are not limited to: Transmission Control Protocol (TCP), Remote Direct Memory Access (RDMA), or User Datagram Protocol (UDP).
  • the switch establishes a communication connection with the storage controller.
  • the client configures available storage controllers.
  • the client establishes a communication connection with the storage controller.
  • This communication connection may also be referred to as a control plane connection.
  • the specific process of establishing a control plane connection is as follows:
  • the switch initiates a monitoring task about the direct path connection.
  • after the storage controller goes online, the storage controller locally allocates the queue pair port number of the storage controller (or the IP address and port number of the storage controller).
  • after the communication connection (control plane connection) is established between the switch and the storage controller, the switch receives routing information from the storage controller through the communication connection.
  • depending on the storage protocol, the I/O address in the routing information may differ, as described below.
  • when the storage protocol is Non-Volatile Memory Express (NVMe), the I/O address includes a logical block address and a namespace identifier. That is, the switch receives the LBA of the storage controller and the NSID of the storage controller from the storage controller.
  • when the storage protocol is Small Computer System Interface (SCSI), the I/O address includes a logical unit number and a logical block address. That is, the switch receives the LUN of the storage controller and the LBA of the storage controller from the storage controller.
  • the identification information of the storage controller may include an Internet Protocol (Internet Protocol, IP) address of the storage controller and/or a name of the storage controller.
  • for example, the identification information of the storage controller is shown in Table 1:
  • Whether the I/O message whose destination is the storage controller needs to be copied refers to whether the storage controller is in an active-active mode.
  • the storage controller is a part of the active-active cluster, and the active-active cluster works in the active-active mode (also referred to as the active-active cluster performing the active-active task).
  • the feature of an active-active cluster is that both clusters are running online and can support the same application load.
  • when a client reads data from an active-active cluster, if one of the clusters fails and the other cluster is still working normally, the client can read data directly through the working cluster. It can be seen that the active-active mode can effectively improve the security of stored data.
  • when the storage controller and the backup storage controller work in the active-active mode, after receiving the I/O message, the storage controller copies the I/O message and sends the copied I/O message to the backup storage controller.
  • the data in the I/O message is stored by the backup storage controller.
  • the switch receives address information of the storage controller from the storage controller.
  • the address information of the storage controller received by the switch may be different. Each will be described below.
  • when a TCP connection is established between the switch and the storage controller, the TCP connection is used to transmit the routing information in step 401 and the address information of the storage controller in step 402.
  • the address information of the storage controller includes: the IP address of the storage controller, the TCP port number and the protocol number of the storage controller. For example, as shown in Table 2:
  | Storage controller | IP address of the storage controller | TCP port number of the storage controller | Protocol number |
  | --- | --- | --- | --- |
  | Storage controller A | 192.168.1.1 | 3260 | 6 |
  | Storage controller B | 192.168.1.2 | 3270 | 6 |
  • when an RDMA connection is established between the switch and the storage controller, the RDMA connection is used to transmit the routing information in step 401 and the address information of the storage controller in step 402.
  • the address information of the storage controller includes: the queue pair (QP) port number of the storage controller and the IP address of the storage controller. For example, as shown in Table 3:
  | Storage controller | IP address of the storage controller | Queue pair port number of the storage controller |
  | --- | --- | --- |
  | Storage controller A | 192.168.1.1 | 551 |
  | Storage controller B | 192.168.1.2 | 552 |
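  • The two kinds of address information in Table 2 and Table 3 can be sketched as two record types. The class and field names are assumptions; the values are copied from the tables, and protocol number 6 is the IANA-assigned IP protocol number for TCP.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TcpAddressInfo:
    ip: str
    tcp_port: int
    protocol_number: int = 6  # IANA-assigned IP protocol number for TCP


@dataclass(frozen=True)
class RdmaAddressInfo:
    ip: str
    queue_pair_number: int


# Address information is one of the two shapes, depending on the connection type.
AddressInfo = Union[TcpAddressInfo, RdmaAddressInfo]

# Values copied from Table 2 (TCP) and Table 3 (RDMA).
tcp_table = {
    "storage controller A": TcpAddressInfo("192.168.1.1", 3260),
    "storage controller B": TcpAddressInfo("192.168.1.2", 3270),
}
rdma_table = {
    "storage controller A": RdmaAddressInfo("192.168.1.1", 551),
    "storage controller B": RdmaAddressInfo("192.168.1.2", 552),
}
```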
  • the execution order of step 401 and step 402 is not limited; that is, step 401 may be executed first and then step 402, or step 402 may be executed first and then step 401.
  • the switch allocates corresponding switching address information for the storage controller.
  • after the switch receives the address information of the storage controller, the switch assigns corresponding switching address information to the storage controller. Similar to step 402, the switching address information differs according to the communication connection established between the switch and the storage controller, as described below:
  • the switching address information includes: the IP address of the switch, the TCP port number and the protocol number of the switch. For example, as shown in Table 4:
  • the switching address information includes: the queue pair port number QP of the switch and the IP address of the switch. For example, as shown in Table 5:
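  • A minimal sketch of assigning switching address information for the TCP case follows. The allocation policy (one new switch-side port per storage controller, counting up from a starting port) is purely an assumption for illustration; reusing a single switch port for several controllers would be an equally valid policy, and the application does not prescribe either.

```python
from typing import Dict, Tuple

Address = Tuple[str, int, int]  # (IP address, TCP port number, protocol number)


class SwitchAddressAllocator:
    """Hands out one switch-side TCP address per storage controller (hypothetical policy)."""

    def __init__(self, switch_ip: str, first_port: int) -> None:
        self._switch_ip = switch_ip
        self._next_port = first_port
        self._allocated: Dict[Address, Address] = {}

    def allocate(self, controller_address: Address) -> Address:
        """Return the switching address for a controller, allocating on first use."""
        if controller_address not in self._allocated:
            self._allocated[controller_address] = (self._switch_ip, self._next_port, 6)
            self._next_port += 1
        return self._allocated[controller_address]
```

  • Calling `allocate` twice with the same controller address returns the same switching address, so the mapping between the two sides stays stable.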
  • the switch establishes a direct path connection with the storage controller.
  • the switch sends the switching address information corresponding to the storage controller to the storage controller through the communication connection (that is, the control plane connection) with the storage controller. The switch side and the storage controller side then each store the switching address information and the address information of the storage controller, and the switch and the storage controller establish a direct path connection according to this address information.
  • the switch may assign identification information of a direct path connection to each direct path connection.
  • the specific process of establishing the direct path connection depends on the type of the direct path connection.
  • when the type of the direct path connection is TCP, the direct path connection is established through a TCP link establishment procedure.
  • when the type of the direct path connection is RDMA, the direct path connection is established through an RDMA link establishment procedure.
  • when the type of the direct path connection is UDP, the direct path connection is established through a UDP link establishment procedure. It should be noted that the type of the direct path connection includes but is not limited to TCP, UDP, or RDMA.
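  • The choice of establishment procedure by connection type can be sketched with Python's standard `socket` module. The function name `socket_kind_for` is hypothetical. TCP and UDP map to the standard stream and datagram socket kinds; RDMA has no socket type in the standard library and would need a separate verbs library, so the sketch only signals that case.

```python
import socket


def socket_kind_for(connection_type: str) -> socket.SocketKind:
    """Map a direct path connection type to the standard-library socket kind."""
    if connection_type == "TCP":
        return socket.SOCK_STREAM   # connection set up via the TCP handshake
    if connection_type == "UDP":
        return socket.SOCK_DGRAM    # datagram socket, no handshake
    if connection_type == "RDMA":
        raise NotImplementedError(
            "RDMA needs a verbs library; the standard library has no RDMA socket type"
        )
    raise ValueError(f"unknown direct path connection type: {connection_type}")
```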
  • the direct path connection between switch A and storage controller A is expressed as: "storage controller A: 192.168.1.1//3260//6; switch A: 192.168.1.3//3290//6".
  • the identification information of the direct path connection is "controller A-switch A".
  • the direct path connection is of type TCP.
  • the direct path connection between switch A and storage controller B is expressed as: "storage controller B: 192.168.1.2//3270//6; switch A: 192.168.1.3//3290//6".
  • the identification information of the direct path connection is "controller B-switch A".
  • the direct path connection is of type TCP.
  • the direct path connection between switch A and storage controller A is expressed as: "storage controller A: 192.168.1.1//551; switch A: 192.168.1.3//556".
  • the identification information of the direct path connection is "controller A-switch A".
  • the direct path connection is of type RDMA.
  • the direct path connection between switch A and storage controller B is expressed as: "storage controller B: 192.168.1.2//552; switch A: 192.168.1.3//556".
  • the identification information of the direct path connection is "controller B-switch A".
  • the direct path connection is of type RDMA.
  • the identification information of the direct path connection may be the identification information of the storage controller.
  • the direct path connection between switch A and storage controller B is expressed as: "storage controller B: 192.168.1.2//552; switch A: 192.168.1.3//556".
  • the identification information of the direct path connection is "controller B".
  • multiple direct path connections may be established between any switch and any storage controller.
  • Different direct path connections can carry I/O packets of different clients. An example is given below.
  • the direct path connection #1 between switch A and storage controller B is expressed as: "storage controller B: 192.168.1.2//3270//6; switch A: 192.168.1.3//3290//6; client A: 192.168.2.3//3310//6".
  • the identification information of the direct path connection #1 is "controller B-switch A-client A".
  • the direct path connection #1 is of type TCP.
  • the direct path connection #2 between switch A and storage controller B is expressed as: "storage controller B: 192.168.1.2//3270//6; switch A: 192.168.1.3//3290//6; client B: 192.168.2.4//3320//6".
  • the identification information of the direct path connection #2 is "controller B-switch A-client B".
  • the direct path connection #2 is of type TCP.
  • the direct path connection #1 between switch A and storage controller A is expressed as: "storage controller A: 192.168.1.1//551; switch A: 192.168.1.3//556; client A: 192.168.2.3//540".
  • the identification information of the direct path connection #1 is "controller A-switch A-client A".
  • the direct path connection #1 is of type RDMA.
  • the direct path connection #2 between switch A and storage controller B is expressed as: "storage controller B: 192.168.1.2//552; switch A: 192.168.1.3//556; client B: 192.168.2.4//541".
  • the identification information of the direct path connection #2 is "controller B-switch A-client B".
  • the direct path connection #2 is of type RDMA.
  • there are other implementation solutions for the direct path connection, which are not limited here.
  • the foregoing example of the direct path connection may also include other information, for example, the status of the direct path connection, etc., which is not limited here.
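  • As an illustrative sketch only, the textual examples above can be modeled as a small record. The names `Endpoint`, `DirectPath`, and `parse_endpoint` are hypothetical and do not appear in the application; the "//" separator and the field order (IP address, then port number or QP, then an optional protocol number) are taken from the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """One end of a direct path connection (hypothetical structure)."""
    ip: str
    port_or_qp: int                 # TCP port number, or QP number for RDMA
    protocol: Optional[int] = None  # protocol number (6 is TCP); absent for RDMA


@dataclass(frozen=True)
class DirectPath:
    """A direct path connection between a switch and a storage controller."""
    path_id: str          # e.g. "controller A-switch A"
    kind: str             # "TCP", "UDP" or "RDMA"
    controller: Endpoint
    switch: Endpoint


def parse_endpoint(text: str) -> Endpoint:
    """Parse "ip//port//protocol" (TCP style) or "ip//qp" (RDMA style)."""
    parts = [p.strip() for p in text.split("//")]
    if len(parts) == 3:
        return Endpoint(parts[0], int(parts[1]), int(parts[2]))
    if len(parts) == 2:
        return Endpoint(parts[0], int(parts[1]))
    raise ValueError(f"unrecognized endpoint: {text!r}")


# The TCP and RDMA examples from the text, expressed as records.
tcp_path = DirectPath(
    path_id="controller A-switch A",
    kind="TCP",
    controller=parse_endpoint("192.168.1.1//3260//6"),
    switch=parse_endpoint("192.168.1.3//3290//6"),
)
rdma_path = DirectPath(
    path_id="controller B-switch A",
    kind="RDMA",
    controller=parse_endpoint("192.168.1.2//552"),
    switch=parse_endpoint("192.168.1.3//556"),
)
```

  • A record of this kind could also carry the optional fields mentioned above, such as the status of the direct path connection.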
  • the switch generates a first mapping relationship.
  • the switch generates the first mapping relationship, and there are multiple possible implementations of the first mapping relationship, which will be described respectively below.
  • the identification information of the direct path connection may be the identification information of the storage controller.
  • the first mapping relationship includes identification information and I/O addresses of storage controllers.
  • the first mapping relationship includes: a mapping relationship between identification information of the direct path connection, identification information of the storage controller, and an I/O address.
  • the first mapping relationship is shown in Table 7:
  • the first mapping relationship may be a hash table of key-value pairs.
  • the key of the table is the I/O address.
  • the value of the table is the identification information of the direct path connection and the identification information of the storage controller.
  • the first mapping relationship includes: identification information of the direct path, an I/O address, identification information of the storage controller, and identification information of the client.
  • the first mapping relationship is shown in Table 8:
  • the first mapping relationship may also include other information of the direct path connection, including but not limited to: the TCP port number of the storage controller, the IP address of the storage controller, the IP address of the switch, the port number of the switch, the QP of the storage controller, QP and protocol number of the switch, or connection status information (such as packet sequence number, etc.).
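  • A minimal sketch of the hash-table form of the first mapping relationship is given below, assuming that the I/O address is a (namespace identifier, logical block address) pair. The names `IOAddress`, `MappingValue`, and `lookup` are hypothetical, and the identification information used for storage controller B is an invented placeholder.

```python
from typing import Dict, NamedTuple, Optional


class IOAddress(NamedTuple):
    """Key of the table: the I/O address carried in an I/O message."""
    nsid: str
    lba: str


class MappingValue(NamedTuple):
    """Value of the table: direct path id and storage controller id."""
    path_id: str
    controller_id: str


# Python dicts are hash tables, so a dict models the key-value form directly.
first_mapping: Dict[IOAddress, MappingValue] = {
    IOAddress("NSID-A", "LBA-A"): MappingValue(
        "controller A-switch A", "192.168.1.1/target100.com"),
    # Placeholder identification information for storage controller B.
    IOAddress("NSID-B", "LBA-B"): MappingValue(
        "controller B-switch A", "192.168.1.2/target200.example"),
}


def lookup(io_address: IOAddress) -> Optional[MappingValue]:
    """Return the entry for an I/O address, or None if it is unmapped."""
    return first_mapping.get(io_address)
```

  • Keying the table by I/O address lets the switch resolve a direct path connection with a single lookup per I/O message.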
  • the client sends the first I/O packet to the switch.
  • the client sends the first I/O packet to the switch, and the first I/O packet is used to write data to the storage array through the storage controller or read data from the storage array through the storage controller.
  • step 406 is performed before step 401, and after the client sends the first I/O packet to the switch, steps 401-405 are performed. That is, after the switch receives the I/O packet from the client, a direct path connection is established between the switch and the storage controller.
  • step 406 is performed after steps 401-405. That is, after a direct path connection is established between the switch and the storage controller, the switch receives I/O packets from the client.
  • the switch determines the direct path connection according to the first I/O packet and the first mapping relationship.
  • the switch determines, from the first mapping relationship and according to the first I/O message, the direct path connection used to forward the first I/O message.
  • when the first mapping relationship includes: identification information of the direct path connection and an I/O address.
  • After the switch receives the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. The switch then determines the identification information of the direct path connection according to the I/O address in the first I/O message, and sends the first I/O packet by using the direct path connection. In this case, there is only one direct path connection per storage controller.
  • when the first mapping relationship includes: identification information of the direct path connection, identification information of the storage controller, and an I/O address.
  • After the switch receives the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. The switch then determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Next, the switch determines the identification information of the direct path connection according to the address information of the storage controller. The switch sends the first I/O packet by using the direct path connection.
  • the first mapping relationship includes: identification information of the direct path, an I/O address, identification information of the storage controller, and identification information of the client.
  • After the switch receives the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. Second, the switch determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Third, the switch determines the source address information of the first I/O message, that is, the address information of the client corresponding to the first I/O message. Fourth, the switch determines the identification information of the client according to the address information of the client. In a possible implementation manner, the address information of the client is consistent with the identification information of the client. Finally, the switch determines the identification information of the corresponding direct path connection according to the identification information of the client, and sends the first I/O packet by using that direct path connection.
  • the first mapping relationship includes: identification information of the direct path, an I/O address, identification information of the storage controller, and identification information of the client.
  • After the switch receives the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. Second, the switch determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Third, the switch checks the number of direct path connections corresponding to the storage controller. When the storage controller corresponds to only one direct path connection, the switch directly uses that direct path connection to send the first I/O packet.
  • Otherwise, the switch determines which client the first I/O packet comes from according to the source address information in the first I/O message, that is, the address information of the client corresponding to the first I/O message. The switch then determines the identification information of the client according to the address information of the client. In a possible implementation manner, the address information of the client is consistent with the identification information of the client. Finally, the switch determines the identification information of the corresponding direct path connection according to the identification information of the client, and sends the first I/O packet by using that direct path connection.
  • for example, when the switch parses the address information in the first I/O message as "LBA-A and NSID-A", the switch determines that the destination of the first I/O message is storage controller A, whose identification information is "192.168.1.1/target100.com". The switch determines, according to the first mapping relationship, that the identification information of the corresponding direct path connection is "controller A-switch A". The switch then determines the direct path connection according to this identification information.
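  • The selection procedure described above can be sketched as follows. This is a hedged illustration, not the implementation of the application: `PathEntry` and `select_path` are hypothetical names, the I/O address is assumed to be a (namespace identifier, logical block address) tuple, and the client identifier is assumed to have already been derived from the source address of the message.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

IOAddress = Tuple[str, str]  # (namespace identifier, logical block address)


class PathEntry(NamedTuple):
    path_id: str               # identification information of the direct path
    controller_id: str         # identification information of the controller
    client_id: Optional[str]   # None when the path is not bound to a client


def select_path(table: Dict[IOAddress, List[PathEntry]],
                io_address: IOAddress,
                client_id: str) -> Optional[str]:
    """Return the id of the direct path that should carry the I/O message."""
    candidates = table.get(io_address, [])
    if not candidates:
        return None  # no direct path is known for this I/O address
    if len(candidates) == 1:
        # Only one direct path to the controller: use it directly.
        return candidates[0].path_id
    # Several direct paths: choose the one bound to the sending client.
    for entry in candidates:
        if entry.client_id == client_id:
            return entry.path_id
    return None


example_table: Dict[IOAddress, List[PathEntry]] = {
    ("NSID-A", "LBA-A"): [
        PathEntry("controller A-switch A", "controller A", None),
    ],
    ("NSID-B", "LBA-B"): [
        PathEntry("controller B-switch A-client A", "controller B", "client A"),
        PathEntry("controller B-switch A-client B", "controller B", "client B"),
    ],
}
```

  • In this sketch the client identifier is consulted only when a storage controller has more than one direct path connection, which mirrors the case distinction in the text.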
  • the switch may dynamically select an appropriate storage controller to receive the first I/O packet according to status information reported by each storage controller.
  • the suitable storage controller here can be a storage controller with larger available space, a storage controller with stronger IOPS performance, or a storage controller whose central processing unit (CPU) has a lower processing load, which is not limited here.
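  • One way such a dynamic selection could look is sketched below. The status fields, the policy names ("space", "iops", "cpu"), and the function `pick_controller` are assumptions made for illustration; the application does not specify a concrete selection rule.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ControllerStatus:
    """Status information reported by a storage controller (hypothetical)."""
    controller_id: str
    free_bytes: int    # remaining available space
    iops: int          # measured IOPS performance
    cpu_load: float    # CPU processing load, 0.0 (idle) to 1.0 (saturated)


def pick_controller(statuses: Iterable[ControllerStatus],
                    policy: str = "cpu") -> str:
    """Pick the controller that best satisfies the given policy."""
    pool = list(statuses)
    if not pool:
        raise ValueError("no storage controller has reported status")
    if policy == "space":
        return max(pool, key=lambda s: s.free_bytes).controller_id
    if policy == "iops":
        return max(pool, key=lambda s: s.iops).controller_id
    if policy == "cpu":
        return min(pool, key=lambda s: s.cpu_load).controller_id
    raise ValueError(f"unknown policy: {policy!r}")


reports = [
    ControllerStatus("controller A", free_bytes=500, iops=90_000, cpu_load=0.80),
    ControllerStatus("controller B", free_bytes=200, iops=120_000, cpu_load=0.35),
]
```

  • With the sample reports, the "space" policy selects controller A, while the "iops" and "cpu" policies select controller B.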
  • the switch inspects the first I/O message. When the destination (storage controller) of the first I/O message is a storage controller operating in active-active mode, the switch copies the first I/O packet.
  • the switch sends the copied first I/O packet to a backup storage controller of the storage controller, where the storage controller and the backup storage controller work in active-active mode.
  • the switch adds an identifier indicating that the message is a copy message to the first I/O message sent by the switch to the backup storage controller.
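  • The copy behavior for active-active storage controllers can be sketched as follows, assuming a packet is a simple record and that the switch knows the backup of each active-active controller. `IOPacket`, `fan_out`, and the `is_copy` flag are hypothetical names for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class IOPacket:
    dst: str               # destination storage controller
    payload: bytes
    is_copy: bool = False  # identifier marking a copy message


def fan_out(packet: IOPacket, backup_of: Dict[str, str]) -> List[IOPacket]:
    """Return the packets the switch sends for one incoming I/O packet.

    `backup_of` maps an active-active storage controller to its backup.
    Controllers without an entry are not in active-active mode.
    """
    out = [packet]
    backup = backup_of.get(packet.dst)
    if backup is not None:
        # Copy the packet to the backup and mark it as a copy message.
        out.append(replace(packet, dst=backup, is_copy=True))
    return out


pairs = {"controller A": "controller B"}
sent = fan_out(IOPacket("controller A", b"write"), pairs)
```

  • Marking the copy lets the backup storage controller distinguish a copy message from an I/O message addressed to it directly.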
  • the switch detects the first I/O message, and when the first I/O message is a data message, the switch executes step 408, and sends the first I/O message to the storage controller through a direct path connection.
  • the switch transparently transmits the first I/O packet.
  • the switch sends the first I/O packet to the storage controller through the direct path connection.
  • the switch after determining the direct path connection, the switch replaces the destination address in the packet header of the first I/O packet with the address information of the storage controller included in the direct path connection.
  • the type of the direct path connection is RDMA as an example.
  • the destination address in the message header of the first I/O message is replaced with "192.168.1.2//552", and the first I/O message is then sent to storage controller B through the direct path connection "controller B-switch A-client B".
  • the switch may also send the first indication information to the second storage controller.
  • the first indication information instructs the second storage controller to update the local serial number of the received message
  • the second storage controller is the storage controller initially allocated by the client for the first I/O message.
  • the first indication information instructs the second storage controller to update the local serial number of the received message, so as to avoid the out-of-sequence problem caused by the second storage controller not receiving the first I/O message.
  • the switch sending the first indication information to the second storage controller includes: the switch sending a signal message to the second storage controller, where the message header of the signal message includes the message header information of the first I/O message and further includes the first indication information.
  • the switch notifies each storage controller to disable the out-of-sequence packet detection function.
  • the second storage controller is the storage controller initially allocated by the client to the first I/O message.
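  • The effect of the first indication information on the second storage controller can be sketched as below, under the assumption that each message carries an integer sequence number and that the controller tracks the next expected number. The class `SequenceTracker` and its methods are hypothetical.

```python
class SequenceTracker:
    """Tracks the local sequence number of received messages (sketch)."""

    def __init__(self) -> None:
        self.expected = 0  # next sequence number the controller expects

    def on_data(self, seq: int) -> bool:
        """Handle a data message; return True if it arrived in order."""
        in_order = seq == self.expected
        if in_order:
            self.expected += 1
        return in_order

    def on_signal(self, seq: int) -> None:
        """Handle a signal message carrying the first indication information.

        The I/O message with this sequence number was redirected to another
        storage controller, so it is counted as if it had been received.
        """
        if seq >= self.expected:
            self.expected = seq + 1


with_signal = SequenceTracker()
with_signal.on_data(0)
with_signal.on_signal(1)        # message 1 went to another controller
ok = with_signal.on_data(2)     # accepted as in order

without_signal = SequenceTracker()
without_signal.on_data(0)
gap = without_signal.on_data(2)  # message 1 never arrived: out of sequence
```

  • Without the signal message, the gap left by the redirected I/O message would be reported as an out-of-sequence condition.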
  • the present application proposes a method for reading and writing data.
  • a switch receives address information of a storage controller from the storage controller, and assigns corresponding switching address information to the storage controller. According to the address information of the storage controller and the switching address information corresponding to the storage controller, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and an I/O message is used to write data to the storage array through the storage controller or read data from the storage array through the storage controller. Since the direct path connection established between the switch and the storage controller can transmit the I/O message, the I/O message of the client does not need to go through the aforementioned rerouting process. This avoids multiple addressing operations for I/O processing, shortens the forwarding path of I/O packets, reduces the completion time of data read and write operations, improves IOPS performance, and improves data read and write efficiency.
  • the storage controller may send a reply message to the client in various ways. Description will be made below in conjunction with the accompanying drawings.
  • FIG. 5 is a schematic diagram of an embodiment of a data reading and writing method proposed in the embodiment of the present application.
  • the data reading and writing method includes steps 501-504.
  • the client sends a first I/O packet to the switch.
  • the switch sends the first I/O packet to the storage controller through the direct path connection.
  • the storage controller writes data to the storage array or reads data from the storage array according to the first I/O message.
  • the switch receives the first reply packet from the storage controller through the direct path connection.
  • the first reply message is a response to the first I/O message.
  • the destination of the first reply message is the client.
  • the switch sends a first reply message to the client.
  • the switch sends the first reply message to the client through the communication connection established between the client and the storage controller.
  • the switch sends the first reply message to the client through the TCP connection established between the client and the storage controller.
  • a direct path connection is established between the switch and the client.
  • the specific manner of establishing the direct path connection is similar to the manner of establishing the direct path connection between the switch and the storage controller in the embodiment in FIG. 4 .
  • the switch sends the first reply message to the client through the direct path connection.
  • the storage controller sends the first reply message to the switch through the direct path connection, so that the client does not perceive the routing path of the first reply message.
  • FIG. 6 is a schematic diagram of an embodiment of another data reading and writing method proposed in the embodiment of the present application.
  • the data reading and writing method includes steps 601-606.
  • the client sends a first I/O packet to the switch.
  • the switch sends the first I/O packet to the storage controller through the direct path connection.
  • the storage controller writes data to the storage array or reads data from the storage array according to the first I/O message.
  • the switch receives the second reply packet from the storage controller through the direct path connection.
  • the destination of the second reply message is the second storage controller.
  • the second storage controller is the storage controller initially allocated by the client for the first I/O packet. That is, the second storage controller is a storage controller randomly assigned to the first I/O message by the client before sending the first I/O message.
  • the second reply packet includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply packet to the client on behalf of the storage controller.
  • the storage controller sends the second reply message to the second storage controller through the network connection.
  • the corresponding switch performs network forwarding, and forwards the second reply message from the storage controller to the second storage controller.
  • the switch forwards the second reply packet to the second storage controller.
  • After receiving the second reply message from the switch, the second storage controller generates a third reply message according to the second reply message, where the destination of the third reply message is the client, and the second storage controller sends the third reply message to the switch.
  • the second storage controller determines that the message needs to be sent to the client according to the proxy indication information included in the second reply message.
  • the second storage controller generates a third reply message according to the second reply message, the destination of the third reply message is the client, and the third reply message does not include the proxy indication information.
  • After receiving the third reply packet from the second storage controller, the switch sends the third reply packet to the client.
  • the switch sends the third reply message to the client through the communication connection established between the client and the storage controller.
  • the switch sends the third reply message to the client through the TCP connection established between the client and the storage controller.
  • a direct path connection is established between the switch and the client.
  • the specific manner of establishing the direct path connection is similar to the manner of establishing the direct path connection between the switch and the storage controller in the embodiment in FIG. 4 .
  • the switch sends the third reply message to the client through the direct path connection.
  • the storage controller sends the third reply message to the switch through the direct path connection, so that the client does not perceive the routing path of the third reply message.
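  • The proxy reply flow of FIG. 6 can be sketched as follows. This is an illustration under stated assumptions: the proxy indication information is modeled as a field that carries the address of the client, which the application does not specify, and the names `Reply`, `make_second_reply`, and `make_third_reply` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reply:
    src: str
    dst: str
    body: bytes
    proxy_for: Optional[str] = None  # proxy indication: client to reply to


def make_second_reply(handler: str, second_controller: str,
                      client: str, body: bytes) -> Reply:
    """Reply built by the storage controller that processed the I/O message.

    Its destination is the second storage controller, and it carries the
    proxy indication information.
    """
    return Reply(src=handler, dst=second_controller, body=body,
                 proxy_for=client)


def make_third_reply(second_reply: Reply) -> Reply:
    """Reply built by the second storage controller on behalf of the handler.

    Its destination is the client, and the proxy indication is removed.
    """
    if second_reply.proxy_for is None:
        raise ValueError("reply carries no proxy indication information")
    return Reply(src=second_reply.dst, dst=second_reply.proxy_for,
                 body=second_reply.body)


second = make_second_reply("controller B", "controller A", "client A", b"done")
third = make_third_reply(second)
```

  • From the point of view of the client, the third reply message appears to come from the storage controller it originally addressed.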
  • FIG. 7 is a schematic diagram of another embodiment of a data reading and writing method proposed in the embodiment of the present application.
  • the data reading and writing method includes steps 701-708.
  • the switch receives routing information from the front-end server.
  • the client accesses the storage system.
  • the client establishes a connection with the front-end server, and the connection may be a control plane connection, such as the TCP connection or the RDMA connection in the foregoing embodiments.
  • After the client establishes a connection with the front-end server, the switch receives routing information from the front-end server.
  • the routing information of the front-end server includes but is not limited to: identification information of the back-end server, I/O address of the back-end server, data storage address, or load information of the back-end server.
  • the I/O address includes, but is not limited to: logical unit number, namespace identifier, or logical block address.
  • the identification information of the backend server may include the IP address of the backend server and/or the name of the backend server.
  • the storage address of the data refers to the storage address of the data in the backend server, and the data may be business-related data. For example: video, audio, text or image, etc.
  • the load information of the back-end server includes, but is not limited to: the total storage space of the back-end server, the remaining available storage space of the back-end server, the used storage space of the back-end server, the IOPS performance of the back-end server, the available bandwidth of the back-end server, the total bandwidth of the back-end server, or the temperature of the storage controller.
  • the backend server is equivalent to the storage controller in the foregoing embodiments.
  • the switch receives address information from the front-end server, where the address information is address information of the back-end server.
  • the switch receives address information of the back-end server from the front-end server.
  • the address information of the back-end server received by the switch may be different. Each will be described below.
  • the address information of the back-end server includes: the IP address of the back-end server, the port number and the protocol number of the back-end server.
  • the address information of the back-end server includes: the queue pair port number QP of the back-end server and the IP address of the back-end server.
  • the execution order of step 701 and step 702 is not limited; that is, step 701 may be executed first and then step 702, or step 702 may be executed first and then step 701.
  • the address information of the storage controller may be sent to the switch by the storage controller itself, or may be sent to the switch through other devices, for example, the front-end server in this embodiment.
  • the switch stores switching address information corresponding to the backend server.
  • After the switch receives the address information of the backend server, the switch assigns the corresponding switching address information to the backend server. Similar to step 702, the switching address information differs according to the communication connection established between the front-end server and the back-end server, as explained separately below:
  • the switching address information includes: the IP address of the switch, the port number and the protocol number of the switch.
  • the switching address information includes: the queue pair port number QP of the switch and the IP address of the switch.
  • the switch establishes a direct path connection with the backend server.
  • the switch sends the switching address information corresponding to the back-end server to the back-end server through a communication connection (that is, a control plane connection) with the back-end server.
  • the switching address information and the address information of the back-end server are stored on both the switch side and the back-end server side, and the switch and the back-end server establish a direct path connection according to this address information.
  • the switch may assign identification information of a direct path connection to each direct path connection.
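  • The control-plane bookkeeping described above (receiving the address information of a back-end server, assigning switching address information, and labeling the resulting direct path connection) might be organized as in the following sketch. The class `SwitchControlPlane`, the one-port-per-back-end allocation, and the starting port 3290 are assumptions for illustration only; the application does not prescribe how the switch chooses its switching address information.

```python
from typing import Dict, Tuple

Address = Tuple[str, int, int]  # (IP address, port number, protocol number)


class SwitchControlPlane:
    """Hypothetical bookkeeping for direct path connections on a switch."""

    def __init__(self, switch_name: str, switch_ip: str,
                 first_port: int = 3290) -> None:
        self.switch_name = switch_name
        self.switch_ip = switch_ip
        self._next_port = first_port
        # path id -> (back-end address, switching address)
        self.paths: Dict[str, Tuple[Address, Address]] = {}

    def register_backend(self, backend_name: str,
                         backend_addr: Address) -> Tuple[str, Address]:
        """Assign switching address information and record the direct path."""
        # Assumption: reuse the protocol number announced by the back end
        # and hand out one switch port per back-end server.
        switch_addr = (self.switch_ip, self._next_port, backend_addr[2])
        self._next_port += 1
        path_id = f"{backend_name}-{self.switch_name}"
        self.paths[path_id] = (backend_addr, switch_addr)
        return path_id, switch_addr


plane = SwitchControlPlane("switch A", "192.168.1.3")
path_a, addr_a = plane.register_backend("controller A", ("192.168.1.1", 3260, 6))
path_b, addr_b = plane.register_backend("controller B", ("192.168.1.2", 3270, 6))
```

  • The returned switching address information is what the switch would send to the back-end server over the control plane connection before the direct path connection is established.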
  • the switch generates a second mapping relationship.
  • the switch generates a second mapping relationship, where the second mapping relationship includes: a mapping relationship between identification information of a direct path connection, identification information of a backend server, and an I/O address.
  • the second mapping relationship is an implementation manner of the foregoing first mapping relationship.
  • the client sends the second packet to the switch.
  • the client sends a second message to the switch, where the second message is used to read or write data to the backend server.
  • for example, when the client needs to upload a video to the storage system, the client sends a second message to the switch, and the second message carries the video data.
  • for another example, when the client needs to watch a video, the client sends a second message to the switch, and the second message carries the address or identifier of the video data to be read.
  • the second message is an implementation manner of the foregoing first message.
  • step 706 is performed before step 701, and after the client sends the second packet to the switch, steps 701-705 are performed. That is, after the switch receives the second message from the client, a direct path connection is established between the switch and the backend server.
  • step 706 is performed after steps 701-705. That is, after a direct path connection is established between the switch and the backend server, the switch receives the second packet from the client.
  • the switch can transparently transmit messages such as link establishment and authorization verification sent by the client to the front-end server.
  • the front-end server processes the above-mentioned messages such as link establishment or authorization verification.
  • the switch determines the direct path connection according to the second packet and the second mapping relationship.
  • the switch determines the direct path connection used to forward the second message from the second mapping relationship according to the second message.
  • the switch sends the second packet to the backend server through the direct path connection.
  • This application proposes a method for reading and writing data. Since the direct path connection established between the switch and the back-end server can transmit messages, the messages of the client do not need to be re-routed by the front-end server. This avoids the communication congestion caused by the limited computing power and network bandwidth of the front-end server, improves the processing efficiency of messages, improves the utilization efficiency of the back-end servers, and realizes load balancing among the back-end servers.
  • FIG. 8 is a schematic structural diagram of a network device 800 provided in an embodiment of the present application.
  • although the network device 800 shown in FIG. 8 shows some specific features, those skilled in the art will realize from the embodiments of the present application that, for the sake of brevity, various other features are not shown in FIG. 8 so as not to obscure more pertinent aspects of the implementations disclosed in the embodiments of the present application.
  • the network device 800 includes one or more processing modules (e.g., a CPU) 801, a network interface 802, a programming interface 803, a memory 804, and one or more communication buses 805 for interconnecting the various components.
  • the network device 800 may also omit or add some functional components or units based on the above examples.
  • the network interface 802 is used to connect with one or more other network devices/servers in the network system.
  • communication bus 805 includes circuitry that interconnects and controls communication between system components.
  • Memory 804 may include nonvolatile memory, for example, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • Memory 804 may also include volatile memory, which may be random access memory (RAM), which acts as an external cache.
  • the memory 804 or the non-transitory computer-readable storage medium of the memory 804 stores the following programs, modules and data structures, or a subset thereof, for example including a transceiver module 8041 and a processing module 8042 (not shown in the figure).
  • the network device 800 may have any function of the switch in the above method embodiments corresponding to FIG. 2 to FIG. 7 .
  • the network device 800 corresponds to the switch in the above-mentioned method embodiments, and each module in the network device 800, together with the above-mentioned other operations and/or functions, implements the steps and methods performed by the switch in the above-mentioned method embodiments.
  • for specific details, refer to the method embodiments corresponding to the foregoing FIGS. 2-7; details are not repeated here for the sake of brevity.
  • the network interface 802 on the network device 800 can complete the data sending and receiving operations, or the processor can call the program code in the memory and, when necessary, cooperate with the network interface 802 to realize the function of the transceiver module.
  • the network device 800 is configured to execute the data reading and writing method provided by the embodiment of the present application, for example, executing the data reading and writing method corresponding to the above-mentioned embodiments shown in FIGS. 2-7 .
  • the specific structure of the network device described in FIG. 8 of this application may be as shown in FIG. 9 .
  • FIG. 9 is a schematic structural diagram of a network device 900 provided by an embodiment of the present application.
  • the network device 900 includes: a main control board 910 and an interface board 930.
  • the main control board 910 is also called a main processing unit (MPU) or a route processor, and the main control board 910 is used for controlling and managing each component in the network device 900, including route calculation, device management, device maintenance, and protocol processing functions.
  • the main control board 910 includes: a CPU 911 and a memory 912.
  • the interface board 930 is also called a line processing unit (LPU), a line card, or a service board.
  • the interface board 930 is used to provide various service interfaces and implement forwarding of data packets.
  • Service interfaces include but are not limited to Ethernet interfaces, POS (Packet over SONET/SDH) interfaces, etc.
  • the interface board 930 includes: a central processing unit 931, a network processor 932, a forwarding entry storage 934 and a physical interface card (PIC) 933.
  • the CPU 931 on the interface board 930 is used to control and manage the interface board 930 and communicate with the CPU 911 on the main control board 910.
  • the network processor 932 is configured to implement message forwarding processing.
  • the form of the network processor 932 may be a forwarding chip.
  • the physical interface card 933 is used to implement the physical-layer interconnection function: original traffic enters the interface board 930 through the physical interface card 933, and processed packets are sent out from the physical interface card 933.
  • the physical interface card 933 includes at least one physical interface (also called a physical port), and the physical interface may be a Flexible Ethernet (FlexE) physical interface.
  • the physical interface card 933, also called a daughter card, can be installed on the interface board 930; it is responsible for converting optical or electrical signals into packets, checking the validity of the packets, and forwarding them to the network processor 932 for processing.
  • the central processing unit 931 of the interface board 930 can also execute the functions of the network processor 932 , such as implementing software forwarding based on a general-purpose CPU, so that the interface board 930 does not need the network processor 932 .
  • the network device 900 includes multiple interface boards.
  • the network device 900 further includes an interface board 940 , and the interface board 940 includes: a central processing unit 941 , a network processor 942 , a forwarding entry storage 944 and a physical interface card 943 .
  • the network device 900 further includes a switching fabric board 920.
  • the switching fabric board 920 may also be called a switch fabric unit (SFU).
  • the switching fabric board 920 is used to complete the data exchange between the interface boards.
  • the interface board 930 and the interface board 940 may communicate through the switching fabric board 920 .
  • the main control board 910 is coupled to the interface board.
  • the main control board 910, the interface board 930 and the interface board 940, and the switching fabric board 920 are connected through a system bus and/or a system backplane to implement intercommunication.
  • an inter-process communication (IPC) protocol
  • the network device 900 includes a control plane and a forwarding plane.
  • the control plane includes a main control board 910 and a central processing unit 931.
  • the forwarding plane includes the components that perform forwarding, such as the forwarding entry storage 934, the physical interface card 933, and the network processor 932.
  • the control plane performs functions such as advertising routes, generating forwarding tables, processing signaling and protocol packets, and configuring and maintaining device status, and it delivers the generated forwarding tables to the forwarding plane.
  • on the forwarding plane, the network processor 932 looks up the forwarding table delivered by the control plane and forwards the packets received by the physical interface card 933 accordingly.
  • the forwarding table delivered by the control plane may be stored in the forwarding entry storage 934. In some embodiments, the control plane and the forwarding plane may be completely separated and not on the same device.
  • the transceiver module in the network device 800 may correspond to the physical interface card 933 or the physical interface card 943 in the network device 900; the transceiver module 8041 and the processing module 8042 in the network device 800 may correspond to the central processing unit 911 or the central processing unit 931, and may also correspond to program code or instructions stored in the memory 912.
  • the operations on the interface board 940 in the embodiment of the present application are consistent with the operations on the interface board 930 , and are not repeated for brevity.
  • the network device 900 in this embodiment may correspond to the switch in each of the foregoing method embodiments, and the main control board 910, the interface board 930, and/or the interface board 940 in the network device 900 may implement the functions and/or steps implemented by the switch in those embodiments; for brevity, they are not repeated here.
  • there may be one or more main control boards; when there are multiple main control boards, they may include an active main control board and a standby main control board. There may be one or more interface boards; the stronger the data processing capability of the network device, the more interface boards it provides. There may also be one or more physical interface cards on an interface board. There may be no SFU, or there may be one or more SFUs; when there are multiple SFUs, they can jointly implement load sharing and redundancy backup. Under a centralized forwarding architecture, the network device does not need a switching fabric board, and the interface board handles the service data of the entire system.
  • the network device can have at least one SFU, and the data exchange between multiple interface boards can be realized through the SFU to provide large-capacity data exchange and processing capabilities.
  • the network device can also have only one board, that is, there is no switching fabric board, and the functions of the interface board and the main control board are integrated on this board.
  • in this case, the central processing unit on the interface board and the central processing unit on the main control board can be combined into one central processing unit on that board to perform the combined functions of the two. Which architecture to use depends on the specific networking deployment scenario and is not limited here.
  • the foregoing switch may be implemented as a virtualization device.
  • the virtualization device may be a virtual machine (VM) running a program for sending packets, a virtual router, or a virtual switch.
  • Virtualization devices are deployed on hardware devices (e.g., physical servers).
  • a switch may be implemented based on a general-purpose physical server combined with network functions virtualization (NFV) technology.
  • FIG. 10 is a schematic structural diagram of a storage device 1000 provided by an embodiment of the present application.
  • the storage device 1000 shown in FIG. 10 is a storage array.
  • the storage device 1000 may include a storage controller 1100 and a disk array 1200, where the disk array 1200 is used to provide storage space and may include a redundant array of independent disks (RAID) or a disk enclosure containing multiple disks.
  • There may be multiple disk arrays 1200, and each disk array 1200 includes multiple disks 1202.
  • Disk 1202 is used to store data.
  • the disk array 1200 communicates with the storage controller 1100 through a communication protocol such as the SCSI protocol; the protocol is not limited here.
  • the storage controller 1100 in the storage device 1000 is configured to execute relevant steps in the foregoing method embodiments.
  • the disk array 1200 is just one example of the storage in the storage device.
  • the data may also be stored in other storage, such as a tape library.
  • the disk 1202 is likewise only one example of the storage medium used to build the disk array 1200.
  • the disk array 1200 may also include other storage comprising a non-volatile storage medium, such as a solid state disk (SSD), a cabinet containing multiple disks, or a server, which is not limited here.
  • the storage controller 1100 is the “brain” of the storage device 1000 , and mainly includes a processor 1102 , a cache 1103 , a memory 1101 , a communication bus (bus for short) 1105 and a communication interface 1104 .
  • the processor 1102 , the cache memory 1103 , the memory 1101 and the communication interface 1104 communicate with each other through the communication bus 1105 .
  • the communication interface 1104 is used for communicating with switches, clients, other network devices or other storage devices.
  • the memory 1101 is used to store a program 1106 .
  • the memory 1101 may include a high-speed random access memory (RAM), and may also include a non-volatile memory, such as at least one disk memory. It can be understood that the memory 1101 can be any non-transitory machine-readable medium that can store program code, such as a RAM, a magnetic disk, a hard disk drive, an optical disk, an SSD, or another non-volatile memory.
  • Program 1106 may include program code including computer operating instructions.
  • Cache 1103 is the memory between the controller and the hard drive, which is smaller in capacity but faster than the hard drive.
  • the cache 1103 is used to temporarily store data, such as I/O transactions received from switches or other storage devices, and temporarily store data read from the disk 1202, so as to improve the performance and reliability of the array.
  • the cache 1103 can be various non-transitory machine-readable media that can store data, such as RAM, ROM, flash memory or SSD, which is not limited here.
  • the processor 1102 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • An operating system and other software programs are installed in the processor 1102, and different software programs can be regarded as different processing modules with different functions, such as processing input/output (I/O) requests to the disk 1202.
  • the storage controller 1100 can implement various data management functions such as I/O operations, snapshots, mirroring, and replication.
  • the processor 1102 is configured to execute the program 1106; specifically, it may execute the relevant steps in the foregoing method embodiments.
  • an embodiment of the present application also provides a computer program product, which, when running on a network device, causes the network device to perform the method performed by the switch in the method embodiments corresponding to FIGS. 2-7 above.
  • the embodiment of the present application also provides a chip system, including a processor and an interface circuit, and the interface circuit is configured to receive instructions and transmit them to the processor.
  • the processor is configured to implement the method in any one of the foregoing method embodiments.
  • the chip system further includes a memory, and there may be one or more processors in the chip system.
  • the processor can be implemented by hardware or by software.
  • when implemented by hardware, the processor may be a logic circuit, an integrated circuit, or the like.
  • when implemented by software, the processor may be a general-purpose processor that implements the method in any of the above method embodiments by reading software code stored in the memory.
  • the memory can be integrated with the processor, or can be set separately from the processor, which is not limited in this application.
  • the memory can be a non-transitory memory, such as a read-only memory (ROM), which can be integrated with the processor on the same chip or arranged on a different chip.
  • the manner in which the memory and the processor are arranged is not specifically limited.
  • B corresponding to A means that B is associated with A, and B can be determined according to A.
  • determining B according to A does not mean determining B only according to A, and B may also be determined according to A and/or other information.
  • the disclosed system, device and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be in electrical, mechanical, or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component shown as a unit may or may not be a physical unit, that is, it may be located in one place, or may also be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A method for reading and writing data, comprising: a switch receiving address information of a storage controller (1100) from the storage controller (1100), and allocating corresponding switching address information to the storage controller (1100). Based on the address information of the storage controller (1100) and the corresponding switching address information, the switch establishes a direct path connection to the storage controller (1100), wherein the direct path connection is used for transmitting an input/output (I/O) message, and the I/O message is used for writing data into a storage array by means of the storage controller (1100) or reading data from the storage array by means of the storage controller (1100). Because the I/O message can be transmitted over the direct path connection established between the switch and the storage controller (1100), repeated address lookups during I/O processing are avoided, the forwarding path of the I/O message is shortened, the time for completing data reading and writing operations is reduced, and data reading and writing efficiency is increased. Further disclosed is a related apparatus for reading and writing data.

Description

A data reading and writing method and related apparatus
This application claims priority to Chinese Patent Application No. 202110694337.7, filed with the China National Intellectual Property Administration on June 22, 2021 and entitled "A data reading and writing method and related apparatus", which is incorporated herein by reference in its entirety.
Technical field
The present application relates to the field of storage technologies, and in particular to a data reading and writing method and a related apparatus.
Background
An enterprise storage system or a distributed storage system typically includes a client server cluster, a network, and a storage array cluster. The client server cluster (referred to as the client for short) may carry various applications, such as a structured query language (SQL) service or a database. The data read and written by the client is actually stored in the back-end storage array cluster. A data read or write access by the client is also called an input/output (I/O) request, and each such access requires the client to go through the network to the storage array cluster, where the actual read or write operation is performed.
A storage array cluster consists of many storage controllers and storage arrays. For reasons such as high availability and data access consistency, each storage controller is usually responsible for the read and write operations on a subset of the data disks. For each I/O request, one corresponding storage controller in the storage array cluster handles the reads and writes to the data disks. That is, each I/O request has a corresponding home storage controller, and only the home storage controller has permission to access the data in the storage array. Currently, the client and the storage array are usually two independent systems, and the client cannot learn in advance which storage controller corresponds to each I/O request. The correspondence between I/O requests and storage controllers is stored in every storage controller of the storage array. The client sends an I/O request to one storage controller in the storage array according to its own algorithm (for example, load balancing or round robin). That storage controller looks up the correspondence between I/O requests and storage controllers, determines the target storage controller, and reroutes the I/O request to the target storage controller, which performs the read or write operation.
Because the above method requires rerouting, the forwarding path of an I/O request is long, so data read and write operations take longer to complete, which reduces data read and write efficiency.
Summary
The present application provides a data reading and writing method. A switch receives, from a storage controller, address information of the storage controller, and allocates corresponding switching address information to the storage controller. Based on the address information of the storage controller and the switching address information corresponding to the storage controller, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and an I/O message is used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller. Because I/O messages can be transmitted over the direct path connection established between the switch and the storage controller, repeated addressing during I/O processing is avoided, the forwarding path of I/O messages is shortened, the completion time of data read and write operations is reduced, and both input/output operations per second (IOPS) performance and data read and write efficiency are improved.
A first aspect of the present application provides a data reading and writing method. A switch receives address information of a storage controller, where the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol (IP) address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number. The switch allocates corresponding switching address information to the storage controller, where the switching address information includes one or more of the following: a queue pair port number of the switch, an IP address of the switch, or a port number of the switch. The switch establishes a direct path connection with the storage controller, where the direct path connection is used to transmit input/output (I/O) messages, and an I/O message is used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
In a possible scenario, the switch establishes a communication connection with the storage controller, and the address information of the storage controller is transmitted to the switch over this communication connection. The communication connection may also be called a control plane connection. It is established as follows: (a) The switch starts a listening task for direct path connections. (b) After the storage controller comes online, the storage controller locally allocates its queue pair port number (or its IP address and port number). (c) After the switch detects that the storage controller is online, the switch and the storage controller establish the control plane connection through a remote direct memory access (RDMA) connection establishment procedure (or a TCP connection establishment procedure).
When a TCP connection is established between the switch and the storage controller, the TCP connection is used to transmit the address information of the storage controller, which includes the IP address of the storage controller, the TCP port number of the storage controller, and the protocol number.
When an RDMA connection is established between the switch and the storage controller, the RDMA connection is used to transmit the address information of the storage controller, which includes the queue pair (QP) port number of the storage controller and the IP address of the storage controller.
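The two variants of the address information can be pictured with a small data structure. The following is only a minimal sketch: the names `ControllerAddressInfo` and `address_info_for` are hypothetical, and treating the "protocol number" as the IP protocol number (6 for TCP) is an assumption rather than something the application states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControllerAddressInfo:
    """Address information a storage controller reports to the switch (hypothetical layout)."""
    ip: str
    qp_number: Optional[int] = None  # queue pair port number, set for an RDMA connection
    tcp_port: Optional[int] = None   # TCP port number, set for a TCP connection
    protocol: Optional[int] = None   # assumed to be the IP protocol number (6 is TCP)


def address_info_for(transport: str, ip: str, *,
                     qp_number: Optional[int] = None,
                     tcp_port: Optional[int] = None) -> ControllerAddressInfo:
    """Build the address information carried over the given control plane transport."""
    if transport == "tcp":
        if tcp_port is None:
            raise ValueError("a TCP connection needs a TCP port number")
        return ControllerAddressInfo(ip=ip, tcp_port=tcp_port, protocol=6)
    if transport == "rdma":
        if qp_number is None:
            raise ValueError("an RDMA connection needs a queue pair port number")
        return ControllerAddressInfo(ip=ip, qp_number=qp_number)
    raise ValueError(f"unknown transport: {transport}")
```

For example, `address_info_for("rdma", "192.0.2.10", qp_number=17)` yields a record with only the IP address and the queue pair port number filled in; the address and numbers here are illustrative values only.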
In the present application, after a client comes online, the client establishes a network connection with the storage controller, and the switch then establishes a network connection with the storage controller. After the network connection is established, the switch receives, from the storage controller, the address information of the storage controller, and allocates corresponding switching address information to the storage controller. Based on the address information of the storage controller and the switching address information corresponding to the storage controller, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and an I/O message is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller. Because I/O messages can be transmitted over the direct path connection established between the switch and the storage controller, repeated addressing during I/O processing is avoided, the forwarding path of I/O messages is shortened, the completion time of data read and write operations is reduced, and both IOPS performance and data read and write efficiency are improved.
Optionally, the switch receives routing information from the storage controller, where the routing information includes one or more of the following: identification information of the storage controller, an input/output (I/O) address of the storage controller, whether I/O messages destined for the storage controller need to be replicated, or load information of the storage controller. The I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
In the present application, the load information of the storage array managed by a storage controller is referred to as the load information of the storage controller for short. The load information of a storage controller includes but is not limited to: the total storage space of the storage controller, the remaining available storage space of the storage controller, the used storage space of the storage controller, the IOPS of the storage controller, whether the storage controller supports active-active mode, or the temperature of the storage controller.
By receiving the routing information of the storage controller, the switch can keep track of the state of the storage controller. After the switch receives an I/O message, it can quickly determine the corresponding storage controller for that I/O message, which improves data read and write efficiency.
Optionally, the switch generates a first mapping relationship, which includes a mapping between the identification information of the direct path connection and the I/O address. The first mapping relationship may be a hash table of key-value pairs, in which the key is the I/O address and the value is the identification information of the direct path connection. After the switch receives an I/O message, it can quickly determine the corresponding storage controller based on the I/O message and the first mapping relationship, which improves data read and write efficiency.
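A hash table keyed by I/O address can be sketched with an ordinary dictionary. This is an illustrative sketch only: `IoAddress`, `learn`, and `path_for` are hypothetical names, and matching the I/O address exactly (rather than, say, by address range) is a simplifying assumption.

```python
from typing import Dict, NamedTuple, Optional


class IoAddress(NamedTuple):
    """I/O address: logical unit number, namespace identifier and/or logical block address."""
    lun: Optional[int] = None
    namespace_id: Optional[int] = None
    lba: Optional[int] = None


# First mapping relationship: key = I/O address, value = direct path connection identifier.
first_mapping: Dict[IoAddress, str] = {}


def learn(io_address: IoAddress, path_id: str) -> None:
    """Record which direct path connection serves the given I/O address."""
    first_mapping[io_address] = path_id


def path_for(io_address: IoAddress) -> Optional[str]:
    """Return the direct path connection for an I/O address, or None if unknown."""
    return first_mapping.get(io_address)
```

Because the key is a hashable tuple, the lookup is a single average constant-time dictionary access, which is the property that lets the switch resolve the destination without consulting a storage controller.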
In a possible implementation, when only one direct path connection is established between any switch and any storage controller, the identification information of the direct path connection may be the identification information of the storage controller.
Optionally, the first mapping relationship may include: the identification information of the direct path connection, the identification information of the storage controller, and the I/O address.
Optionally, the first mapping relationship may include: the identification information of the direct path connection, the I/O address, the identification information of the storage controller, and identification information of the client.
Optionally, the switch receives a first I/O message from a client, where the first I/O message is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller. The switch determines the direct path connection based on the first I/O message and the first mapping relationship, and sends the first I/O message to the storage controller over the direct path connection.
Specifically, the switch parses the address information in the first I/O message and, based on it, determines the destination of the first I/O message, that is, the storage controller corresponding to the first I/O message. The switch then determines the identification information of the corresponding direct path connection based on the identification information of that storage controller, and uses the direct path connection to send the first I/O message to the storage controller.
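The forwarding steps described here (parse the address, resolve the storage controller, resolve the direct path connection, send) can be sketched as two dictionary lookups. The class and field names below are hypothetical, the I/O address is reduced to a logical unit number for brevity, and "sending" is modelled as appending to a list; none of this is taken from the application itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class IoMessage:
    client_id: str
    lun: int              # logical unit number taken from the I/O address
    lba: int              # logical block address
    op: str               # "read" or "write"
    payload: bytes = b""


@dataclass
class Switch:
    controller_by_lun: Dict[int, str] = field(default_factory=dict)
    path_by_controller: Dict[str, str] = field(default_factory=dict)
    sent: List[Tuple[str, IoMessage]] = field(default_factory=list)

    def forward(self, msg: IoMessage) -> str:
        """Resolve the destination controller, then its direct path, then send."""
        controller_id = self.controller_by_lun[msg.lun]   # destination of the message
        path_id = self.path_by_controller[controller_id]  # direct path connection
        self.sent.append((path_id, msg))                  # stand-in for transmission
        return path_id
```

A message for an unknown logical unit raises `KeyError` in this sketch; a real switch would need a fallback, which the text above does not specify.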
Optionally, the first mapping relationship includes the identification information of the direct path connection, the identification information of the storage controller, and the I/O address. In this case, after receiving the first I/O message, the switch first determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. The switch then determines the identification information of the direct path connection according to the address information of the storage controller, and sends the first I/O message over that direct path connection.
Optionally, when one or more direct path connections are established between any switch and any storage controller, different direct path connections carry the I/O messages of different clients. The first mapping relationship then includes the identification information of the direct path connection, the I/O address, the identification information of the storage controller, and the identification information of the client.
After receiving the first I/O message, the switch first determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Second, the switch determines the source address information of the first I/O message, that is, the address information of the client that sent it. Third, the switch determines the identification information of the client according to the address information of the client; in one possible implementation, the address information of the client is the same as its identification information. Fourth, the switch determines the identification information of the corresponding direct path connection according to the identification information of the client. The switch then sends the first I/O message over that direct path connection.
Optionally, the first mapping relationship includes the identification information of the direct path connection, the I/O address, the identification information of the storage controller, and the identification information of the client.
In one possible implementation, after receiving the first I/O message, the switch first determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Second, the switch checks how many direct path connections correspond to that storage controller. When the storage controller corresponds to only one direct path connection, the switch sends the first I/O message over that direct path connection directly.
When the storage controller corresponds to multiple direct path connections, the switch determines which client the first I/O message comes from according to the source address information in the first I/O message, that is, the address information of the client corresponding to the first I/O message. The switch then determines the identification information of the client according to the address information of the client; in one possible implementation, the address information of the client is the same as its identification information. Next, the switch determines the identification information of the corresponding direct path connection according to the identification information of the client. Finally, the switch sends the first I/O message over that direct path connection.
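The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the class `FirstMapping`, its two dictionaries, and the sample identifiers (`lun-7`, `ctrl-A`, `dp-1`, and so on) are hypothetical and do not come from the application, and the client's source address is assumed to serve directly as its identification information, which the text names as one possible implementation.

```python
from dataclasses import dataclass, field


@dataclass
class FirstMapping:
    """Hypothetical in-memory form of the first mapping relationship."""

    # I/O address -> identification information of the storage controller
    controller_by_io_address: dict = field(default_factory=dict)
    # storage controller ID -> {client ID -> direct path connection ID}
    paths_by_controller: dict = field(default_factory=dict)

    def select_direct_path(self, io_address: str, source_address: str) -> str:
        # Step 1: the I/O address determines the storage controller.
        controller_id = self.controller_by_io_address[io_address]
        paths = self.paths_by_controller[controller_id]
        # Step 2: a single direct path connection is used unconditionally.
        if len(paths) == 1:
            return next(iter(paths.values()))
        # Step 3: with several connections, the client selects the path.
        # Assumption: the source address is the client's identifier.
        client_id = source_address
        return paths[client_id]


mapping = FirstMapping(
    controller_by_io_address={"lun-7": "ctrl-A", "lun-9": "ctrl-B"},
    paths_by_controller={
        "ctrl-A": {"10.0.0.1": "dp-1", "10.0.0.2": "dp-2"},
        "ctrl-B": {"10.0.0.1": "dp-3"},
    },
)
chosen = mapping.select_direct_path("lun-7", "10.0.0.2")
```

A real switch would presumably hold this table in forwarding hardware; the dictionaries here only make the order of the lookups explicit.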
Optionally, the switch may dynamically select a suitable storage controller to receive the first I/O message according to the status information reported by each storage controller. A suitable storage controller may be one with more available space, one with higher IOPS performance, or one whose central processing unit (CPU) has a lower processing load; this is not limited here. Selecting the controller in this way balances the workload across the storage controllers.
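A minimal sketch of such a selection is shown below. The record `ControllerStatus`, its field names, and the `criterion` parameter are assumptions made for illustration; the application does not specify the format of the reported status information or how the three criteria are weighed against each other.

```python
from dataclasses import dataclass


@dataclass
class ControllerStatus:
    """Hypothetical status report sent by one storage controller."""

    controller_id: str
    free_bytes: int       # available space
    iops_capacity: int    # IOPS the controller can sustain
    cpu_load: float       # CPU utilisation, 0.0 to 1.0


def pick_controller(statuses: list, criterion: str = "cpu_load") -> str:
    """Return the ID of the controller that is best under one criterion."""
    if not statuses:
        raise ValueError("no storage controller has reported its status")
    if criterion == "cpu_load":
        best = min(statuses, key=lambda s: s.cpu_load)
    elif criterion == "free_bytes":
        best = max(statuses, key=lambda s: s.free_bytes)
    elif criterion == "iops_capacity":
        best = max(statuses, key=lambda s: s.iops_capacity)
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return best.controller_id
```

Choosing one criterion at a time keeps the sketch simple; a weighted combination would be an equally plausible reading of the text.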
Optionally, the switch inspects the first I/O message. When the destination of the first I/O message is a storage controller operating in active-active mode, the switch copies the first I/O message and sends the copy to the backup storage controller of that storage controller; the storage controller and the backup storage controller work in active-active mode. The switch adds, to the copy sent to the backup storage controller, an identifier indicating that the message is a replicated message. This improves data security.
Optionally, the switch detects the type of the first I/O message. When the first I/O message is a data message, the switch sends the first I/O message to the storage controller over the direct path connection; when the first I/O message is a non-data message, the switch transparently forwards the first I/O message. Identifying the message type and handling different types differently reduces the load on the network.
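The type-based handling can be illustrated with the short sketch below. The application does not say how a data message is recognised, so the `kind` field and the set `DATA_KINDS` are purely hypothetical stand-ins for whatever header inspection the switch performs.

```python
from enum import Enum


class Handling(Enum):
    DIRECT_PATH = "direct_path"   # send over the direct path connection
    TRANSPARENT = "transparent"   # forward unchanged


# Assumption: read and write requests count as data messages.
DATA_KINDS = frozenset({"read", "write"})


def classify(message: dict) -> Handling:
    """Decide how the switch handles one I/O message."""
    if message.get("kind") in DATA_KINDS:
        return Handling.DIRECT_PATH
    return Handling.TRANSPARENT
```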
Optionally, the switch receives a first reply message from the storage controller over the direct path connection, where the first reply message is a response to the first I/O message, and the switch sends the first reply message to the client. Because the switch receives the reply from the storage controller over the direct path connection and then sends it to the client, the client is unaware of the routing path of the reply message.
Optionally, the switch receives a second reply message from the storage controller, where the second reply message is a response to the first I/O message and its destination is a second storage controller, the second storage controller being the storage controller that the client initially assigned to the first I/O message. The switch forwards the second reply message to the second storage controller. The switch then receives a third reply message from the second storage controller, where the third reply message is generated according to the second reply message, and forwards the third reply message.
The storage controller may forward the I/O reply message to the second storage controller, which is the storage controller that the client initially assigned to the first I/O message. Because the second storage controller retains its network connection with the client, it carries the I/O reply message on that network connection and sends it directly to the client. The switch only forwards the message, so in this case too the client is unaware of the routing path of the reply message.
Optionally, the second reply message includes proxy indication information, which instructs the second storage controller to send the second reply message to the client on behalf of the storage controller; the third reply message does not include the proxy indication information. By using explicit proxy indication information to instruct the second storage controller to send the second reply message to the client, the storage controller keeps the client unaware of the routing path of the reply message.
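The proxy exchange above might look like the following sketch. The dictionary layout and the field names `dst`, `client`, `payload`, and `proxy` are assumptions for illustration only; the application states that the proxy indication is present in the second reply message and absent from the third, but not how either message is encoded.

```python
def build_second_reply(payload: bytes, client: str, second_controller: str) -> dict:
    """Built by the storage controller that actually served the I/O message."""
    return {
        "dst": second_controller,  # routed to the initially assigned controller
        "client": client,
        "payload": payload,
        "proxy": True,             # proxy indication information
    }


def build_third_reply(second_reply: dict) -> dict:
    """Built by the second storage controller, which still holds the
    client's network connection and replies on the other controller's behalf."""
    if not second_reply.get("proxy"):
        raise ValueError("second reply message carries no proxy indication")
    # The third reply message is addressed to the client and omits the indication.
    return {"dst": second_reply["client"], "payload": second_reply["payload"]}
```

From the client's point of view the third reply message arrives on the connection it opened itself, which is why the detour through the direct path stays invisible.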
Optionally, when the first I/O message needs to be replicated, the switch copies the first I/O message and sends the copy to the backup storage controller of the storage controller, where the backup storage controller and the storage controller work in active-active mode.
The storage controller is part of an active-active cluster, which works in active-active mode (also described as the active-active cluster performing an active-active task). In an active-active cluster, both clusters run online and can support the same application load. When a client writes data to the active-active cluster, for example by sending an I/O message to it, one of the two clusters performs the read or write operation according to the I/O message, copies the I/O message, and sends the copy to the other cluster so that the other cluster also stores the data in the I/O message. When a client reads data from the active-active cluster and one of the clusters has failed while the other is still working, the client can read the data directly from the working cluster. Active-active mode can therefore effectively improve the security of stored data. Similarly, when the storage controller and the backup storage controller work in active-active mode, the storage controller, after receiving an I/O message, copies it and sends the copy to the backup storage controller, which stores the data in the I/O message.
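When the switch rather than the storage controller performs the replication, the fan-out can be sketched as below. The function name `fan_out`, the `backup_of` table, and the `is_replica` flag are hypothetical; the flag stands for the identifier that marks a message as a replicated message.

```python
import copy


def fan_out(message: dict, primary: str, backup_of: dict) -> list:
    """Return (destination, message) pairs the switch should send.

    backup_of maps a storage controller to its active-active partner;
    controllers without an entry are not replicated.
    """
    deliveries = [(primary, message)]
    backup = backup_of.get(primary)
    if backup is not None:
        replica = copy.deepcopy(message)  # leave the original untouched
        replica["is_replica"] = True      # mark the copy as replicated
        deliveries.append((backup, replica))
    return deliveries
```

Copying in the switch means the primary controller no longer has to send the replica itself, which is presumably the point of moving the step into the network.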
Optionally, when the switch receives the first I/O message, the switch sends the first I/O message to the storage controller over its direct path connection with the storage controller. The switch also sends first indication information to the second storage controller, where the first indication information instructs the second storage controller to update its local sequence number for received messages, and the second storage controller is the storage controller that the client initially assigned to the first I/O message. For example, after receiving an I/O message, a storage controller assigns a sequence number to it as the sequence number of the received message. Instructing the second storage controller to update its local sequence number prevents an out-of-order problem that would otherwise arise because the second storage controller never receives the first I/O message.
Optionally, the switch sending the first indication information to the second storage controller includes: the switch sends a signal message to the second storage controller, where the header of the signal message includes the header information of the first I/O message and also includes the first indication information.
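The effect of the signal message can be illustrated as follows. This sketch assumes that the copied header carries a sequence number in a field named `seq` and that the second storage controller tracks a single expected value; neither detail is stated in the application, and the names `SecondController`, `OutOfOrderError`, and `first_indication` are hypothetical.

```python
from dataclasses import dataclass


class OutOfOrderError(Exception):
    """Raised when a message arrives with an unexpected sequence number."""


@dataclass
class SecondController:
    expected_seq: int = 0

    def on_io_message(self, seq: int) -> None:
        if seq != self.expected_seq:
            raise OutOfOrderError(f"expected {self.expected_seq}, got {seq}")
        self.expected_seq = seq + 1

    def on_signal_message(self, header: dict) -> None:
        # The first indication information says: advance past this message,
        # because it was delivered elsewhere over a direct path connection.
        if header.get("first_indication"):
            self.expected_seq = header["seq"] + 1


def make_signal_message(io_header: dict) -> dict:
    """Copy the I/O message header and add the first indication information."""
    signal = dict(io_header)
    signal["first_indication"] = True
    return signal
```

Without the signal message, the next I/O message that does reach the second storage controller would look like a gap in the sequence.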
Optionally, the switch notifies each storage controller to disable its out-of-order message detection. This prevents an out-of-order problem at the second storage controller caused by its not receiving the first I/O message, where the second storage controller is the storage controller that the client initially assigned to the first I/O message.
Optionally, the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
A second aspect of the present application provides a method for reading and writing data. A storage controller sends address information of the storage controller to a switch, where the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number. The storage controller establishes a direct path connection with the switch, where the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
When a TCP connection is established between the switch and the storage controller, the TCP connection is used to transmit the address information of the storage controller. The address information of the storage controller includes the IP address of the storage controller, the TCP port number of the storage controller, and the protocol number.
When an RDMA connection is established between the switch and the storage controller, the RDMA connection is used to transmit the address information of the storage controller. The address information of the storage controller includes the queue pair (QP) port number of the storage controller and the IP address of the storage controller.
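The two forms of address information can be captured in one record, as in the sketch below. The class `ControllerAddress`, its field names, and the helper `transport_of` are assumptions for illustration; the only external fact relied on is that 6 is the IANA-assigned IP protocol number for TCP.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

TCP_PROTOCOL_NUMBER = 6  # IANA-assigned IP protocol number for TCP


@dataclass(frozen=True)
class ControllerAddress:
    """Hypothetical record of the address information a controller reports."""

    ip: str
    tcp_port: Optional[int] = None         # present for a TCP connection
    protocol_number: Optional[int] = None  # present for a TCP connection
    queue_pair: Optional[int] = None       # present for an RDMA connection

    def __post_init__(self) -> None:
        ipaddress.ip_address(self.ip)  # raises ValueError if malformed


def transport_of(address: ControllerAddress) -> str:
    """Infer which kind of direct path connection the address describes."""
    if address.queue_pair is not None:
        return "RDMA"
    if address.tcp_port is not None and address.protocol_number == TCP_PROTOCOL_NUMBER:
        return "TCP"
    raise ValueError("address information matches neither TCP nor RDMA")
```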
In the present application, after a client comes online, the client establishes a network connection with the storage controller, and the switch also establishes a network connection with the storage controller. After the network connection is established, the switch receives the address information of the storage controller from the storage controller and assigns corresponding switching address information to the storage controller. According to the address information of the storage controller and the corresponding switching address information, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller. Because the direct path connection between the switch and the storage controller can carry I/O messages, repeated addressing during I/O processing is avoided and the forwarding path of I/O messages is shortened. This reduces the completion time of data read and write operations, improves IOPS performance, and improves data read/write efficiency.
Optionally, the storage controller sends routing information of the storage controller to the switch, where the routing information includes one or more of the following: the identification information of the storage controller, the input/output (I/O) address of the storage controller, whether I/O messages destined for the storage controller need to be replicated, or the load information of the storage controller. The I/O address includes a logical unit number, a namespace identifier, and/or a logical block address. Receiving the routing information lets the switch track the state of the storage controller, so that when the switch receives an I/O message it can quickly determine the corresponding storage controller. This improves data read/write efficiency.
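One way the switch could turn reported routing information into a lookup table is sketched below. The record `RoutingInfo`, its fields, and the helper `build_first_mapping` are hypothetical; the sketch covers only the simplest case, in which each I/O address maps to the single direct path connection of the controller that reported it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RoutingInfo:
    """Hypothetical routing information reported by one storage controller."""

    controller_id: str
    io_addresses: list = field(default_factory=list)  # e.g. LUNs or namespace IDs
    needs_replication: bool = False
    load: Optional[float] = None


def build_first_mapping(reports: list, direct_path_by_controller: dict) -> dict:
    """Map each reported I/O address to a direct path connection ID."""
    mapping = {}
    for report in reports:
        path_id = direct_path_by_controller[report.controller_id]
        for io_address in report.io_addresses:
            mapping[io_address] = path_id
    return mapping
```

With such a table in place, forwarding an incoming I/O message reduces to a single lookup on its I/O address.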
Optionally, the storage controller receives the first I/O message from the switch over the direct path connection.
Optionally, the storage controller generates a first reply message according to the first I/O message, where the first reply message is a response to the first I/O message, and the storage controller sends the first reply message to the switch. The switch receives the reply message from the storage controller over the direct path connection and then sends it to the client, so that the client is unaware of the routing path of the reply message.
Optionally, the storage controller generates a second reply message according to the first I/O message, where the second reply message is a response to the first I/O message. The storage controller sends the second reply message to the switch, where the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message. In other words, the storage controller may forward the I/O reply message to the second storage controller. Because the second storage controller retains its network connection with the client, it carries the I/O reply message on that network connection and sends it directly to the client. The switch only forwards the message, so in this case too the client is unaware of the routing path of the reply message.
Optionally, the second reply message includes proxy indication information, which instructs the second storage controller to send the second reply message to the client on behalf of the storage controller. By using explicit proxy indication information to instruct the second storage controller to send the second reply message to the client, the storage controller keeps the client unaware of the routing path of the reply message.
Optionally, the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
A third aspect of the present application provides a network device, including a transceiver module and a processing module.
The transceiver module is configured to receive address information of a storage controller, where the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number.
The processing module is configured to assign corresponding switching address information to the storage controller, where the switching address information includes one or more of the following: a queue pair port number of the switch, an Internet Protocol address of the switch, or a port number of the switch.
The processing module is further configured to establish a direct path connection with the storage controller, where the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
Optionally, the transceiver module is further configured to receive routing information from the storage controller, where the routing information includes one or more of the following: the identification information of the storage controller, the input/output (I/O) address of the storage controller, whether I/O messages destined for the storage controller need to be replicated, or the load information of the storage controller. The I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
Optionally, the processing module is further configured to generate a first mapping relationship, where the first mapping relationship includes a mapping between the identification information of the direct path connection and the I/O address.
Optionally, the transceiver module is further configured to receive a first I/O message from a client.
The processing module is further configured to determine the direct path connection according to the first I/O message and the first mapping relationship.
The transceiver module is further configured to send the first I/O message to the storage controller over the direct path connection.
Optionally, the processing module is further configured to detect the type of the first I/O message.
The transceiver module is further configured to send the first I/O message to the storage controller over the direct path connection when the first I/O message is a data message.
The transceiver module is further configured to transparently forward the first I/O message when the first I/O message is a non-data message.
Optionally, the transceiver module is further configured to receive a first reply message from the storage controller over the direct path connection, where the first reply message is a response to the first I/O message.
The transceiver module is further configured to send the first reply message to the client.
Optionally, the transceiver module is further configured to receive a second reply message from the storage controller, where the second reply message is a response to the first I/O message, the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message.
The transceiver module is further configured to forward the second reply message to the second storage controller.
The transceiver module is further configured to receive a third reply message from the second storage controller, where the third reply message is generated according to the second reply message.
The transceiver module is further configured to forward the third reply message from the second storage controller.
Optionally, the second reply message includes proxy indication information, which instructs the second storage controller to send the second reply message to the client on behalf of the storage controller; the third reply message does not include the proxy indication information.
Optionally, the processing module is further configured to copy the first I/O message when the first I/O message needs to be replicated.
The transceiver module is further configured to send the copied first I/O message to a backup storage controller of the storage controller, where the backup storage controller and the storage controller work in active-active mode.
Optionally, the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
A fourth aspect of the present application provides a storage device, including a transceiver module and a processing module.
The transceiver module is configured to send address information of a storage controller to a switch, where the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number.
The processing module is configured to establish a direct path connection with the switch, where the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
Optionally, the transceiver module is further configured to send routing information of the storage controller to the switch, where the routing information includes one or more of the following: the identification information of the storage controller, the input/output (I/O) address of the storage controller, whether I/O messages destined for the storage controller need to be replicated, or the load information of the storage controller. The I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
Optionally, the transceiver module is further configured to receive the first I/O message from the switch over the direct path connection.
Optionally, the processing module is further configured to generate a first reply message according to the first I/O message, where the first reply message is a response to the first I/O message.
The transceiver module is further configured to send the first reply message to the switch over the direct path connection.
Optionally, the processing module is further configured to generate a second reply message according to the first I/O message, where the second reply message is a response to the first I/O message.
The transceiver module is further configured to send the second reply message to the switch, where the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message.
Optionally, the second reply message includes proxy indication information, which instructs the second storage controller to send the second reply message to the client on behalf of the storage controller.
Optionally, the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
A fifth aspect of the present application provides a network device. The network device includes a processor configured to enable the network device to implement the method described in the first aspect or any possible implementation of the first aspect. The device may further include a memory coupled to the processor; when the processor executes instructions stored in the memory, the network device can implement the method described in any possible implementation of the first aspect. The device may further include a communication interface through which the device communicates with other devices. For example, the communication interface may be a transceiver, a circuit, a bus, a module, or another type of communication interface.
The instructions in the memory may be stored in advance, or may be downloaded from the Internet and stored when the network device is used; the present application does not specifically limit the source of the instructions in the memory. Coupling in the present application is an indirect coupling or connection between apparatuses, units, or modules, which may be electrical, mechanical, or of another form, and is used for information exchange between the apparatuses, units, or modules.
A sixth aspect of the present application provides a storage device. The storage device includes a processor configured to enable the storage device to implement the method described in the second aspect or any possible implementation of the second aspect. The device may further include a memory coupled to the processor; when the processor executes instructions stored in the memory, the storage device can implement the method described in any possible implementation of the second aspect. The device may further include a communication interface through which the device communicates with other devices. For example, the communication interface may be a transceiver, a circuit, a bus, a module, or another type of communication interface.
The instructions in the memory may be stored in advance, or may be downloaded from the Internet and stored when the storage device is used; the present application does not specifically limit the source of the instructions in the memory. Coupling in the present application is an indirect coupling or connection between apparatuses, units, or modules, which may be electrical, mechanical, or of another form, and is used for information exchange between the apparatuses, units, or modules.
A seventh aspect of the present application provides a computer storage medium, which may be non-volatile. The computer storage medium stores computer-readable instructions that, when executed by a processor, implement the method described in the first aspect or any possible implementation of the first aspect.
An eighth aspect of the present application provides a computer storage medium, which may be non-volatile. The computer storage medium stores computer-readable instructions that, when executed by a processor, implement the method described in the second aspect or any possible implementation of the second aspect.
A ninth aspect of the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method described in the first aspect or any possible implementation of the first aspect.
A tenth aspect of the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method described in the second aspect or any possible implementation of the second aspect.
An eleventh aspect of the present application provides a storage system, which includes a plurality of network devices according to the third aspect or the fifth aspect and a plurality of storage devices according to the fourth aspect or the sixth aspect.
The solutions provided in the third aspect to the eleventh aspect are used to implement, or to cooperate in implementing, the methods provided in the first aspect or the second aspect, and can therefore achieve the same or corresponding beneficial effects as the first aspect or the second aspect. Details are not repeated here.
Description of Drawings
FIG. 1a is a schematic diagram of a network architecture involved in an embodiment of this application;
FIG. 1b is a schematic diagram of a network architecture according to an embodiment of this application;
FIG. 2 is a schematic diagram of an application scenario according to an embodiment of this application;
FIG. 3 is a schematic diagram of another application scenario according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a data read/write method according to an embodiment of this application;
FIG. 5 is a schematic diagram of an embodiment of a data read/write method according to an embodiment of this application;
FIG. 6 is a schematic diagram of an embodiment of another data read/write method according to an embodiment of this application;
FIG. 7 is a schematic diagram of still another embodiment of a data read/write method according to an embodiment of this application;
FIG. 8 is a schematic structural diagram of a network device 800 according to an embodiment of this application;
FIG. 9 is a schematic structural diagram of a network device 900 according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of a storage device 1000 according to an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. A person of ordinary skill in the art will appreciate that, as new application scenarios emerge, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments can be implemented in an order other than that illustrated or described in this application. In addition, the terms "include" and "have" and any variants thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or modules is not necessarily limited to the steps or modules expressly listed, and may include other steps or modules that are not expressly listed or that are inherent to the process, method, product, or device. The naming or numbering of steps in this application does not mean that the steps in a method flow must be performed in the temporal or logical order indicated by the naming or numbering; the execution order of named or numbered steps may be changed according to the technical objective to be achieved, provided that the same or a similar technical effect is achieved. The division into units in this application is a logical division; in an actual implementation there may be other divisions. For example, multiple units may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between units may be electrical or of another similar form; none of this is limited in this application. Furthermore, units or subunits described as separate components may or may not be physically separate, may or may not be physical units, or may be distributed among multiple circuit units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this application.
Although the embodiments of this application are not limited in this respect, discussions using terms such as "processing", "computing", "determining", "establishing", "analyzing", and "checking" may refer to operations and/or processes of a computer, computing platform, computing system, or other electronic computing device that manipulate and/or convert data represented as physical (for example, electronic) quantities in computer registers and/or memory into other data similarly represented as physical quantities in computer registers and/or memory, or into other information in a non-transitory storage medium that may store instructions for performing operations and/or processes.
For ease of understanding, the technical terms used in the embodiments of this application are introduced first.
(1) Input/output (I/O) message.
An I/O message may also be called an I/O command or an I/O request. I/O commands are divided into read commands and write commands; they are commands issued by an application running on a server to instruct that data be read from, or written to, a storage array. The processor of the server may receive an I/O command and store it in memory, where it waits to be processed. Specifically, the I/O command may be stored in a queue in the memory.
(2) Queue.
A queue is a special linear list in which deletion is performed at the front of the list and insertion at the rear. The end at which insertion is performed is called the tail of the queue, and the end at which deletion is performed is called the head. A queue with no elements is called an empty queue. The data elements of a queue are also called queue elements. Inserting a queue element is called enqueuing, and removing a queue element is called dequeuing. A queue is also called a first in first out (FIFO) linear list.
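As an illustration of the first in first out behaviour described above, the following minimal Python sketch uses `collections.deque` from the standard library as the queue. The I/O command strings are hypothetical placeholders and are not taken from this application.

```python
from collections import deque

# An empty queue: it holds no queue elements yet.
io_queue = deque()

# Enqueue: insert queue elements at the tail (rear) of the queue.
io_queue.append("read LBA 0")
io_queue.append("write LBA 8")

# Dequeue: remove the queue element at the head (front) of the queue.
first_out = io_queue.popleft()

# The element enqueued second is now the only one left.
remaining = list(io_queue)
```

The element that was enqueued first is the one that leaves first, which is the property the send and receive queues rely on.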
There are several types of queue, for example the send queue (SQ), receive queue (RQ), completion queue (CQ), and event queue (EQ).
The send queue of a sending end (for example, a server) and the receive queue of a receiving end (for example, a storage controller) may together be called a queue pair (QP).
The send queue may be used to store I/O commands awaiting processing; the receive queue is used to store and process received I/O commands.
As an example, when the sending end is a server and the receiving end is a storage controller, the send queue in the server stores I/O commands issued by the server that instruct reading data from, or writing data to, the storage controller. For example, if the I/O command is a read command for reading data stored in the storage controller, the server may send the processed read command to the storage controller. The read command may include the address and length of the data to be read, so that the storage controller can send the requested data to the server according to the processed read command.
As another example, if the I/O command is a write command for writing data to the storage controller, the server may process the write command and send the processed write command to the storage controller. The storage controller then looks up memory information in its receive queue according to the processed write command and, according to the memory information found, stores the data carried in the processed write command in the memory of the storage controller.
Similarly, the send queue in the storage controller may store I/O commands issued by the storage controller that instruct reading data from, or writing data to, the server; the storage controller sends these commands to the server. For details, refer to the description above.
Taking a distributed storage system as an example, FIG. 1a is a schematic diagram of a network architecture involved in the embodiments of this application. The client cluster includes multiple clients. The storage array cluster includes multiple storage controllers (for example, storage controller A and storage controller N shown in FIG. 1a) and storage arrays (not shown in the figure). The clients are connected to the storage controllers through a switch. Because the client cluster and the storage array cluster in a distributed storage system usually belong to two independent systems, a client cannot obtain the address information of each storage controller in advance. When a client initiates an I/O request (that is, the client sends an I/O message to the storage array cluster through the switch), the client cannot know which storage controller is the home storage controller of the I/O request, so it selects a storage controller in the storage array cluster according to an algorithm such as load balancing. The selected storage controller looks up the home storage controller of the I/O message and reroutes the I/O message to the home storage controller, which completes the data read or write. The home storage controller is the destination of the I/O message and may also be called the target storage controller. The addressing address in the I/O message is the address of the home storage controller (for example, a logical unit number (LUN) or a logical block address (LBA)).
With reference to FIG. 1a, the flow is as follows. An I/O message from the client first arrives at storage controller A, which the client selected at random. Storage controller A decapsulates the I/O message and looks up a mapping table according to the metadata in the I/O message to determine that storage controller N is the home storage controller that processes this I/O message. Storage controller A then re-encapsulates the I/O message and forwards it to storage controller N. After receiving the I/O message, storage controller N decapsulates it again and uses the metadata to obtain the address information of the storage array. Storage controller N then performs the read or write task on the storage array according to the I/O message. Subsequent messages of this I/O message must also follow this flow to be returned to the client.
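The rerouting flow of FIG. 1a can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the mapping table `HOME_BY_LUN`, the controller names, and the function `reroute_path` are hypothetical, and any network hops between the two storage controllers are omitted.

```python
# Hypothetical mapping table held by each storage controller:
# LUN -> home storage controller. The entries are illustrative only.
HOME_BY_LUN = {0: "storage controller A", 1: "storage controller N"}


def reroute_path(first_controller, lun):
    """Return the nodes an I/O message visits in the rerouting flow."""
    path = ["client", "switch", first_controller]
    home = HOME_BY_LUN[lun]  # lookup based on metadata in the I/O message
    if home != first_controller:
        # The first controller re-encapsulates the message and forwards it
        # to the home storage controller, adding one more stage to the path.
        path.append(home)
    return path


rerouted = reroute_path("storage controller A", lun=1)
not_rerouted = reroute_path("storage controller N", lun=1)
```

In this model the path only grows when the randomly selected storage controller differs from the home storage controller, which is the case that rerouting has to handle.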
The process in which the storage controller that receives an I/O message forwards the message to its home storage controller is called rerouting (also called I/O rerouting). During rerouting, the I/O message must be addressed and forwarded multiple times, so its forwarding path is long. As a result, data read and write operations take longer to complete, the number of I/O operations per second (IOPS) decreases, and data read/write efficiency suffers.
In view of this, this application proposes a data read/write method. A switch receives, from a storage controller, the address information of the storage controller and allocates corresponding switching address information to the storage controller. According to the address information of the storage controller and the corresponding switching address information, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and an I/O message is used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller. Because the direct path connection established between the switch and the storage controller can carry I/O messages, I/O messages from the client do not need to go through the rerouting flow described above. This shortens the forwarding path of I/O messages, reduces the completion time of data read and write operations, increases the number of I/O operations per second (IOPS), and improves data read/write efficiency.
The network architecture to which the embodiments of this application apply is introduced first. Refer to FIG. 1b, which is a schematic diagram of a network architecture according to an embodiment of this application. The network architecture includes a client cluster, a switch 110, and a storage array cluster. The client cluster includes one or more clients. The storage array cluster includes one or more storage controllers 120, and each storage controller 120 manages one or more storage arrays (not shown in the figure). An I/O request initiated by a client is forwarded to a storage controller 120 through the switch 110.
In the embodiments of this application, after a client comes online, the client establishes network connections with one or more storage controllers through the switch. A network connection may be a Transmission Control Protocol (TCP) connection. The switch establishes a direct path connection with a storage controller, which may or may not be a storage controller that has established a network connection with the client. In this application, a direct path connection is used for direct communication between the switch and a storage controller (that is, communication that does not need to be forwarded by another storage controller), and is used to transmit I/O messages.
This is described below with reference to the accompanying drawings:
(A) Refer to FIG. 2, which is a schematic diagram of an application scenario according to an embodiment of this application. After coming online, the client establishes network connections with storage controller A and storage controller B respectively. The switch establishes connections with storage controller A and storage controller B respectively.
In one possible implementation, after the client comes online, the client establishes network connections with storage controller A and storage controller B respectively. The switch establishes a direct path connection with each storage controller (storage controller A and storage controller B).
In another possible implementation, after the client comes online, the client establishes network connections with storage controller A and storage controller B respectively. When the client issues an I/O message, it randomly selects a storage controller for the I/O message, whereas the actual destination of the I/O message is its home storage controller. If the randomly selected storage controller is the home storage controller, no direct path connection is established between the switch and the home storage controller. Optionally, the switch establishes direct path connections with the other storage controllers.
If the randomly selected storage controller is not the home storage controller, the switch establishes a direct path connection with the home storage controller.
In another possible implementation, after the client comes online, the client establishes network connections with storage controller A and storage controller B. When the client issues an I/O message, the switch, in response to the I/O message, establishes direct path connections with all the storage controllers (storage controller A and storage controller B).
(B) Refer to FIG. 3, which is a schematic diagram of another application scenario according to an embodiment of this application. After coming online, the client establishes network connections with storage controller A and storage controller B respectively. Storage controller C in FIG. 3 has not established a network connection with the client. The switch establishes direct path connections with storage controller A, storage controller B, and storage controller C.
In one possible example, the home storage controller of the I/O request initiated by the client is storage controller C, but the client has not established a network connection with storage controller C. After receiving the I/O request, the switch establishes a direct path connection with storage controller C and sends the I/O request to storage controller C over that connection, so that storage controller C completes the I/O processing. In this scenario, the switch establishes the direct path connection with storage controller C only when it needs to communicate with storage controller C.
In another possible implementation, the home storage controller of the I/O request initiated by the client is not storage controller C, and storage controller C has not established a network connection with the client. After receiving the I/O request, the switch nevertheless establishes a direct path connection with storage controller C, so that subsequent I/O requests destined for storage controller C can be sent to it over the direct path connection.
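The variant in which the switch establishes a direct path connection with a storage controller only when an I/O request needs it can be sketched as follows. This is a minimal model, assuming a hypothetical class `OnDemandSwitch` that tracks established direct paths in a set; the actual connection establishment is not modelled.

```python
class OnDemandSwitch:
    """Toy model of a switch that sets up direct paths lazily."""

    def __init__(self, preconnected=()):
        # Storage controllers that already have a direct path connection.
        self.direct_paths = set(preconnected)

    def forward(self, home_controller):
        """Forward an I/O request to its home storage controller.

        Returns True if a direct path connection had to be established
        for this request, and False if an existing one was reused.
        """
        established_now = home_controller not in self.direct_paths
        if established_now:
            self.direct_paths.add(home_controller)
        return established_now


switch = OnDemandSwitch(preconnected={"A", "B"})
first_request = switch.forward("C")   # direct path to C is set up here
second_request = switch.forward("C")  # the existing direct path is reused
```

Establishing the path eagerly, as in the other implementations above, trades connection resources on the switch for a lower latency on the first I/O request.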
With reference to the scenarios shown in FIG. 1b, FIG. 2, and FIG. 3, an embodiment of this application provides a data read/write method. FIG. 4 is a schematic flowchart of the data read/write method, which includes steps 401 to 408.
401. The switch receives routing information from the storage controller.
In this embodiment, the switch receives routing information from the storage controller. The routing information includes one or more of the following: identification information of the storage controller, the I/O address of the storage controller, whether I/O messages destined for the storage controller need to be replicated, or load information of the storage controller. The I/O address includes but is not limited to a logical unit number (LUN), a namespace identifier (NSID), or a logical block address (LBA). In this application, the load information of the storage arrays managed by a storage controller is referred to as the load information of the storage controller for short. The load information of the storage controller includes but is not limited to: the total storage space of the storage controller, the remaining available storage space of the storage controller, the used storage space of the storage controller, the IOPS of the storage controller, whether the storage controller supports active-active mode, or the temperature of the storage controller.
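One way to picture the routing information is as a small record. The sketch below is an assumption for illustration only: the class names `LoadInfo` and `RoutingInfo`, the field names, and the units are hypothetical, and this application does not prescribe any encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadInfo:
    """Load information of a storage controller (hypothetical fields)."""
    total_bytes: int
    used_bytes: int
    iops: int
    supports_active_active: bool = False
    temperature_c: Optional[float] = None

    @property
    def free_bytes(self):
        # Remaining available space, derived here from total and used space.
        return self.total_bytes - self.used_bytes


@dataclass
class RoutingInfo:
    """Routing information a storage controller reports to the switch."""
    controller_id: str            # for example, a name or an IP address
    io_address: dict              # for example, {"lun": 0, "lba": 4096}
    replicate: bool = False       # whether I/O messages must be replicated
    load: Optional[LoadInfo] = None


info = RoutingInfo(
    controller_id="storage controller A",
    io_address={"lun": 0, "lba": 4096},
    load=LoadInfo(total_bytes=100, used_bytes=40, iops=5000),
)
```

Every field except the identifier is optional in spirit, since the routing information may contain any one or more of the listed items.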
Specifically, the switch first establishes a communication connection with the storage controller. The protocol used by the communication connection includes but is not limited to the Transmission Control Protocol (TCP), Remote Direct Memory Access (RDMA), or the User Datagram Protocol (UDP).
One possible scenario is as follows: the switch establishes a communication connection with the storage controller. After a client comes online, the client configures the storage controllers available to it and establishes a communication connection with the storage controller. The communication connection between the switch and the storage controller may also be called a control plane connection. The control plane connection is established as follows:
(a) The switch starts a listening task for direct path connections.
(b) After the storage controller comes online, it locally allocates its queue pair port number (or its IP address and port number).
(c) After the switch detects that the storage controller is online, the switch and the storage controller establish the control plane connection through the RDMA connection establishment procedure (or the TCP connection establishment procedure).
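Steps (a) to (c) can be sketched as an in-memory simulation. The classes `StorageController` and `ControlPlaneSwitch` and their methods are hypothetical names introduced for illustration; the real RDMA or TCP connection establishment procedure is not modelled here.

```python
class StorageController:
    def __init__(self, name, ip):
        self.name = name
        self.ip = ip
        self.endpoint = None  # not online yet

    def come_online(self, qp_number):
        # Step (b): the controller locally allocates its queue pair number.
        self.endpoint = (self.ip, qp_number)
        return self.endpoint


class ControlPlaneSwitch:
    def __init__(self):
        self.listening = False
        self.control_plane = {}  # controller name -> (ip, qp_number)

    def start_listening(self):
        # Step (a): start the listening task.
        self.listening = True

    def on_controller_online(self, controller):
        # Step (c): record a control plane connection once the switch is
        # listening and the controller has allocated its endpoint.
        if not self.listening or controller.endpoint is None:
            return False
        self.control_plane[controller.name] = controller.endpoint
        return True


switch = ControlPlaneSwitch()
controller_a = StorageController("storage controller A", "192.168.1.1")
before_listening = switch.on_controller_online(controller_a)
switch.start_listening()
controller_a.come_online(qp_number=551)
connected = switch.on_controller_online(controller_a)
```

The order matters in this model: a controller that comes online before the switch is listening is not connected until the switch sees it again.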
After the communication connection (control plane connection) is established between the switch and the storage controller, the switch receives the routing information from the storage controller through the communication connection. The I/O address in the routing information may differ depending on the storage protocol actually used by the storage controller, as described below.
In one possible implementation, when the storage controller uses the Non-Volatile Memory Express (NVMe) protocol, the I/O address includes a logical block address and a namespace identifier. That is, the switch receives, from the storage controller, the LBA of the storage controller and the NSID of the storage controller.
In another possible implementation, when the storage controller uses the Small Computer System Interface (SCSI) protocol, the I/O address includes a logical unit number and a logical block address. That is, the switch receives, from the storage controller, the LUN of the storage controller and the LBA of the storage controller.
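The dependence of the I/O address on the storage protocol can be captured in a small lookup. In this sketch the table `IO_ADDRESS_FIELDS`, the function `build_io_address`, and the lowercase field names are hypothetical; only the pairing of NVMe with NSID and LBA, and of SCSI with LUN and LBA, comes from the description above.

```python
# Which fields make up the I/O address for each storage protocol.
IO_ADDRESS_FIELDS = {
    "nvme": ("nsid", "lba"),
    "scsi": ("lun", "lba"),
}


def build_io_address(protocol, **values):
    """Build the I/O address for a protocol from keyword arguments."""
    fields = IO_ADDRESS_FIELDS[protocol.lower()]
    missing = [name for name in fields if name not in values]
    if missing:
        raise ValueError(f"missing I/O address fields for {protocol}: {missing}")
    # Keep only the fields that belong to this protocol's I/O address.
    return {name: values[name] for name in fields}


nvme_address = build_io_address("NVMe", nsid=1, lba=4096)
scsi_address = build_io_address("SCSI", lun=0, lba=8)
```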
The identification information of the storage controller may include the Internet Protocol (IP) address of the storage controller and/or the name of the storage controller. For example, the identification information of the storage controller is shown in Table 1:
Table 1
Figure PCTCN2022098309-appb-000001
Whether I/O messages destined for the storage controller need to be replicated refers to whether the storage controller is in active-active mode. In that case the storage controller is part of an active-active cluster, which works in active-active mode (the active-active cluster is also said to perform an active-active task). In an active-active cluster, both clusters run online and can support the same application load. When a client writes data to the active-active cluster, for example by sending an I/O message to it, one of the clusters performs the read/write operation according to the I/O message, replicates the I/O message, and sends the copy to the other cluster so that the other cluster stores the data in the I/O message. When a client reads data from the active-active cluster, if one cluster has failed and the other is still working normally, the client can read the data directly from the working cluster. Active-active mode can therefore effectively improve the security of stored data. Similarly, when the storage controller and a backup storage controller work in active-active mode, the storage controller, after receiving an I/O message, replicates the I/O message and sends the copy to the backup storage controller, which stores the data in the I/O message.
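The replication behaviour in active-active mode can be sketched as follows. This is a simplified model under stated assumptions: the function `handle_io_message` is hypothetical, the two stores are plain Python lists standing in for the storage controller and the backup storage controller, and failure handling is not modelled.

```python
import copy


def handle_io_message(message, local_store, backup_store, active_active):
    """Store an I/O message locally and replicate it in active-active mode."""
    local_store.append(message)
    if active_active:
        # Replicate the I/O message and hand the copy to the backup
        # storage controller, which stores the data it carries.
        backup_store.append(copy.deepcopy(message))
    return len(local_store), len(backup_store)


local, backup = [], []
write_message = {"op": "write", "lba": 0, "data": b"abc"}
counts = handle_io_message(write_message, local, backup, active_active=True)
```

If the local store later becomes unavailable, the same data can still be read from the backup store, which is the benefit the description attributes to active-active mode.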
402. The switch receives the address information of the storage controller from the storage controller.
In this embodiment, after the communication connection is established between the switch and the storage controller, the switch receives the address information of the storage controller from the storage controller. The address information received by the switch may differ depending on the type of communication connection established between the switch and the storage controller, as described below.
When a TCP connection is established between the switch and the storage controller, the TCP connection is used to transmit the routing information of step 401 and the address information of the storage controller of step 402. The address information of the storage controller includes the IP address of the storage controller, the TCP port number of the storage controller, and the protocol number. For example, as shown in Table 2:
Table 2
Storage controller | IP address of the storage controller | TCP port number of the storage controller | Protocol number
Storage controller A | 192.168.1.1 | 3260 | 6
Storage controller B | 192.168.1.2 | 3270 | 6
When an RDMA connection is established between the switch and the storage controller, the RDMA connection is used to transmit the I/O routing information of step 401 and the address information of the storage controller of step 402. The address information of the storage controller includes the queue pair (QP) number of the storage controller and the IP address of the storage controller, for example as shown in Table 3:
Table 3
Storage controller | IP address of the storage controller | Queue pair (QP) number of the storage controller
Storage controller A | 192.168.1.1 | 551
Storage controller B | 192.168.1.2 | 552
It should be noted that the execution order of step 401 and step 402 is not limited: step 401 may be performed before step 402, or step 402 may be performed before step 401.
403. The switch allocates corresponding switching address information to the storage controller.
In this embodiment, after the switch receives the address information of the storage controller, the switch allocates the corresponding switching address information to the storage controller. As in step 402, the switching address information differs depending on the type of communication connection established between the switch and the storage controller. The cases are described separately below:
When a TCP connection is established between the switch and the storage controller, the TCP connection is used to transmit the I/O routing information of step 401 and the switching address information of step 403. The switching address information includes the IP address of the switch, the TCP port number of the switch, and a protocol number, for example as shown in Table 4:
Table 4
Switch | IP address of the switch | TCP port number of the switch | Protocol number
Switch A | 192.168.1.3 | 3290 | 6
When an RDMA connection is established between the switch and the storage controller, the RDMA connection is used to transmit the I/O routing information of step 401 and the switching address information of step 403. The switching address information includes the queue pair (QP) number of the switch and the IP address of the switch, for example as shown in Table 5:
Table 5
Switch | IP address of the switch | Queue pair (QP) number of the switch
Switch A | 192.168.1.3 | 556
404. The switch establishes a direct path connection with the storage controller.
In this embodiment, the switch sends the switching address information corresponding to the storage controller to the storage controller over the communication connection between them (that is, the control plane connection). The switch side and the storage controller side then each store both the switching address information and the address information of the storage controller, and the switch and the storage controller establish a direct path connection according to this address information. The switch may assign identification information to each direct path connection. The specific procedure for establishing the direct path connection depends on its type. When the type is TCP, the direct path connection is established through the TCP link establishment procedure. When the type is RDMA, it is established through the RDMA link establishment procedure. When the type is UDP, it is established through the UDP link establishment procedure. It should be noted that the types of direct path connection include, but are not limited to, TCP, UDP, and RDMA.
For example, the direct path connection between switch A and storage controller A is expressed as "storage controller A: 192.168.1.1//3260//6; switch A: 192.168.1.3//3290//6". The identification information of this direct path connection is "controller A-switch A", and its type is TCP.
The direct path connection between switch A and storage controller B is expressed as "storage controller B: 192.168.1.2//3270//6; switch A: 192.168.1.3//3290//6". The identification information of this direct path connection is "controller B-switch A", and its type is TCP.
In another example, the direct path connection between switch A and storage controller A is expressed as "storage controller A: 192.168.1.1//551; switch A: 192.168.1.3//556". The identification information of this direct path connection is "controller A-switch A", and its type is RDMA.
The direct path connection between switch A and storage controller B is expressed as "storage controller B: 192.168.1.2//552; switch A: 192.168.1.3//556". The identification information of this direct path connection is "controller B-switch A", and its type is RDMA.
When only one direct path connection is established between any switch and any storage controller, there is a one-to-one correspondence between the direct path connection and the storage controller. The identification information of the storage controller can therefore serve as the identification information of the direct path connection. For example, the direct path connection between switch A and storage controller B is expressed as "storage controller B: 192.168.1.2//552; switch A: 192.168.1.3//556", and its identification information is "controller B".
In another possible implementation, multiple direct path connections may be established between any switch and any storage controller. Different direct path connections can carry the I/O messages of different clients. Examples follow.
For example, direct path connection #1 between switch A and storage controller B is expressed as "storage controller B: 192.168.1.2//3270//6; switch A: 192.168.1.3//3290//6; client A: 192.168.2.3//3310//6". The identification information of direct path connection #1 is "controller B-switch A-client A", and its type is TCP.
Direct path connection #2 between switch A and storage controller B is expressed as "storage controller B: 192.168.1.2//3270//6; switch A: 192.168.1.3//3290//6; client B: 192.168.2.4//3320//6". The identification information of direct path connection #2 is "controller B-switch A-client B", and its type is TCP.
In another example, direct path connection #1 between switch A and storage controller A is expressed as "storage controller A: 192.168.1.1//551; switch A: 192.168.1.3//556; client A: 192.168.2.3//540". The identification information of direct path connection #1 is "controller A-switch A-client A", and its type is RDMA.
Direct path connection #2 between switch A and storage controller B is expressed as "storage controller B: 192.168.1.2//552; switch A: 192.168.1.3//556; client B: 192.168.2.4//541". The identification information of direct path connection #2 is "controller B-switch A-client B", and its type is RDMA.
It should be noted that, depending on the protocol used by the direct path connection, other implementations of the direct path connection are possible, which is not limited here. The direct path connection examples above may also include other information, such as the state of the direct path connection, which is likewise not limited here.
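The "//"-separated notation used in the examples above can be reproduced with a small record type. The following is only an illustrative sketch: the class names `Endpoint` and `DirectPath` and the `render` method are hypothetical and do not come from the application; only the addresses and identifiers are taken from the examples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """One end of a direct path connection (hypothetical helper type)."""
    name: str
    ip: str
    port: int                       # TCP port number, or QP number for RDMA
    protocol: Optional[int] = None  # e.g. 6 for TCP; left unset for RDMA

    def render(self) -> str:
        parts = [self.ip, str(self.port)]
        if self.protocol is not None:
            parts.append(str(self.protocol))
        return f"{self.name}: " + "//".join(parts)


@dataclass(frozen=True)
class DirectPath:
    """A direct path connection record as the switch might store it."""
    path_id: str
    kind: str                       # "TCP", "RDMA", ...
    controller: Endpoint
    switch: Endpoint
    client: Optional[Endpoint] = None  # set only for per-client paths

    def render(self) -> str:
        ends = [self.controller, self.switch]
        if self.client is not None:
            ends.append(self.client)
        return "; ".join(e.render() for e in ends)


tcp_path = DirectPath(
    path_id="controller A-switch A",
    kind="TCP",
    controller=Endpoint("storage controller A", "192.168.1.1", 3260, 6),
    switch=Endpoint("switch A", "192.168.1.3", 3290, 6),
)
rdma_path = DirectPath(
    path_id="controller B-switch A-client B",
    kind="RDMA",
    controller=Endpoint("storage controller B", "192.168.1.2", 552),
    switch=Endpoint("switch A", "192.168.1.3", 556),
    client=Endpoint("client B", "192.168.2.4", 541),
)
print(tcp_path.render())
print(rdma_path.render())
```

Keeping the client endpoint optional lets the same record cover both the one-path-per-controller case and the per-client case.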
405. The switch generates a first mapping relationship.
In this embodiment, the switch generates the first mapping relationship. The first mapping relationship has several possible implementations, which are described separately below.
(a) When only one direct path connection is established between any switch and any storage controller, there is a one-to-one correspondence between the direct path connection and the storage controller. The identification information of the storage controller can therefore serve as the identification information of the direct path connection. The first mapping relationship includes the identification information of the storage controller and the I/O address.
For example, the first mapping relationship is shown in Table 6:
Table 6
Figure PCTCN2022098309-appb-000002
Optionally, the first mapping relationship includes a mapping among the identification information of the direct path connection, the identification information of the storage controller, and the I/O address. For example, the first mapping relationship is shown in Table 7:
Table 7
Figure PCTCN2022098309-appb-000003
Optionally, the first mapping relationship may be a hash table of key-value pairs. The key of the table is the I/O address, and the value of the table is the identification information of the direct path connection and the identification information of the storage controller.
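A key-value form of the first mapping relationship might look like the sketch below. It assumes the I/O address is a (namespace ID, logical block address) pair, following the "LBA-A and NSID-A" example used elsewhere in this description. The type `MappingValue`, the function `lookup`, and the entry for storage controller B are hypothetical illustrations, since the contents of Tables 6 and 7 are only available as figures.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class MappingValue(NamedTuple):
    path_id: str        # identification information of the direct path connection
    controller_id: str  # identification information of the storage controller


# Key: I/O address as (NSID, LBA). Value: direct path ID and controller ID.
first_mapping: Dict[Tuple[str, str], MappingValue] = {
    ("NSID-A", "LBA-A"): MappingValue("controller A-switch A",
                                      "192.168.1.1/target100.com"),
    # Hypothetical second entry, for illustration only.
    ("NSID-B", "LBA-B"): MappingValue("controller B-switch A", "controller B"),
}


def lookup(nsid: str, lba: str) -> Optional[MappingValue]:
    """Return the mapping entry for an I/O address, or None if unknown."""
    return first_mapping.get((nsid, lba))


entry = lookup("NSID-A", "LBA-A")
print(entry.path_id if entry else "no direct path")
```

A hash table gives constant-time lookup on the forwarding path, which is presumably why this form is offered as an option.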
(b) Multiple direct path connections may be established between any switch and any storage controller. Different direct path connections can carry the I/O messages of different clients.
Optionally, the first mapping relationship includes the identification information of the direct path connection, the I/O address, the identification information of the storage controller, and the identification information of the client. For example, the first mapping relationship is shown in Table 8:
Table 8
Figure PCTCN2022098309-appb-000004
The first mapping relationship may also include other information about the direct path connection, including but not limited to: the TCP port number of the storage controller, the IP address of the storage controller, the IP address of the switch, the port number of the switch, the QP of the storage controller, the QP of the switch, the protocol number, or connection state information (for example, message sequence numbers).
406. The client sends a first I/O message to the switch.
In this embodiment, the client sends the first I/O message to the switch. The first I/O message is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller.
In one possible implementation, step 406 is performed before step 401: after the client sends the first I/O message to the switch, steps 401 to 405 are performed. That is, the direct path connection between the switch and the storage controller is established after the switch receives the I/O message from the client.
In another possible implementation, step 406 is performed after steps 401 to 405. That is, the switch receives the I/O message from the client after the direct path connection between the switch and the storage controller has been established.
407. The switch determines the direct path connection according to the first I/O message and the first mapping relationship.
In this embodiment, after receiving the first I/O message from the client, the switch determines, from the first mapping relationship and according to the first I/O message, the direct path connection over which to forward the first I/O message.
Several implementations are possible, depending on how the first mapping relationship is implemented. They are described separately below.
Optionally, the first mapping relationship includes the identification information of the direct path connection and the I/O address. After receiving the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. The switch then determines the identification information of the direct path connection according to the I/O address in the first I/O message, and sends the first I/O message over that direct path connection. In this case, each storage controller corresponds to only one direct path connection.
Optionally, the first mapping relationship includes the identification information of the direct path connection, the identification information of the storage controller, and the I/O address. After receiving the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. The switch determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Next, the switch determines the identification information of the direct path connection according to the address information of the storage controller. The switch then sends the first I/O message over that direct path connection.
Optionally, when one or more direct path connections are established between any switch and any storage controller, different direct path connections are used to carry the I/O messages of different clients. The first mapping relationship includes the identification information of the direct path connection, the I/O address, the identification information of the storage controller, and the identification information of the client.
After receiving the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. Second, the switch determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Third, the switch determines the source address information of the first I/O message, that is, the address information of the client that sent it. Fourth, the switch determines the identification information of the client according to the address information of the client; in one possible implementation, the address information of the client is the same as its identification information. Fifth, the switch determines the identification information of the corresponding direct path connection according to the identification information of the client. The switch then sends the first I/O message over that direct path connection.
Optionally, the first mapping relationship includes the identification information of the direct path connection, the I/O address, the identification information of the storage controller, and the identification information of the client.
In one possible implementation, after receiving the first I/O message, the switch first parses the address information (including source address information and destination address information) in the first I/O message. Second, the switch determines the address information of the corresponding storage controller according to the I/O address in the first I/O message. Third, the switch checks the number of direct path connections corresponding to that storage controller. When the storage controller corresponds to only one direct path connection, the switch sends the first I/O message over that direct path connection directly.
When the storage controller corresponds to multiple direct path connections, the switch determines which client the first I/O message came from according to the source address information in the first I/O message, that is, the address information of the client that sent it. The switch then determines the identification information of the client according to the address information of the client; in one possible implementation, the address information of the client is the same as its identification information. Next, the switch determines the identification information of the corresponding direct path connection according to the identification information of the client, and sends the first I/O message over that direct path connection. For example, when the switch parses the address information in the first I/O message as "LBA-A and NSID-A", the switch determines from the first mapping relationship that the destination of the first I/O message is storage controller A, whose identification information is "192.168.1.1/target100.com". The switch determines from the first mapping relationship that the identification information of the corresponding direct path connection is "controller A-switch A", and uses this identification information to determine the direct path connection.
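The selection logic of step 407 can be sketched as follows, under the assumption stated above that the client's address doubles as its identification information. The names `PathEntry`, `mapping`, and `select_path`, and the "NSID-B"/"LBA-B" entries, are hypothetical; the path identifiers and client IP addresses are taken from the earlier examples.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class PathEntry(NamedTuple):
    path_id: str
    controller_id: str
    client_id: Optional[str]  # None when the path is not tied to one client


# I/O address (NSID, LBA) -> candidate direct path connections.
mapping: Dict[Tuple[str, str], List[PathEntry]] = {
    ("NSID-A", "LBA-A"): [
        PathEntry("controller A-switch A", "192.168.1.1/target100.com", None),
    ],
    ("NSID-B", "LBA-B"): [
        PathEntry("controller B-switch A-client A", "controller B", "192.168.2.3"),
        PathEntry("controller B-switch A-client B", "controller B", "192.168.2.4"),
    ],
}


def select_path(io_address: Tuple[str, str], source_ip: str) -> Optional[str]:
    """Pick the direct path connection ID for one I/O message."""
    entries = mapping.get(io_address, [])
    if not entries:
        return None                 # no direct path known for this address
    if len(entries) == 1:
        return entries[0].path_id   # single path: use it directly
    for entry in entries:           # several paths: match on the client
        if entry.client_id == source_ip:
            return entry.path_id
    return None


print(select_path(("NSID-A", "LBA-A"), "192.168.2.3"))
print(select_path(("NSID-B", "LBA-B"), "192.168.2.4"))
```

The source address is consulted only when more than one path exists, mirroring the order of checks in the text.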
Optionally, the switch may dynamically select a suitable storage controller to receive the first I/O message according to the status information reported by each storage controller. A suitable storage controller here may be one with more available space, one with stronger IOPS performance, or one whose central processing unit (CPU) has a lower processing load, which is not limited here.
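One way such a dynamic choice could be made is sketched below. The description does not specify a selection policy, so the status fields, the CPU-load threshold, and the tie-breaking order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ControllerStatus:
    """Status a storage controller might report (hypothetical fields)."""
    controller_id: str
    free_bytes: int
    iops_headroom: int
    cpu_load: float  # 0.0 (idle) .. 1.0 (saturated)


def pick_controller(statuses: Sequence[ControllerStatus],
                    max_cpu_load: float = 0.8) -> str:
    """Prefer controllers under the CPU threshold, then the most free space."""
    if not statuses:
        raise ValueError("no storage controller has reported status")
    # Fall back to all controllers if every one is above the threshold.
    candidates = [s for s in statuses if s.cpu_load <= max_cpu_load] or list(statuses)
    best = max(candidates,
               key=lambda s: (s.free_bytes, s.iops_headroom, -s.cpu_load))
    return best.controller_id


reports = [
    ControllerStatus("controller A", free_bytes=100, iops_headroom=10, cpu_load=0.9),
    ControllerStatus("controller B", free_bytes=50, iops_headroom=5, cpu_load=0.2),
]
print(pick_controller(reports))
```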
Optionally, the switch inspects the first I/O message. When the destination of the first I/O message is a storage controller operating in active-active mode, the switch copies the first I/O message and sends the copy to the backup storage controller of that storage controller, where the storage controller and the backup storage controller work in active-active mode. The switch adds, to the first I/O message sent to the backup storage controller, an identifier indicating that the message is a copy.
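The copy-and-mark behaviour can be illustrated with the short sketch below. The message type `IoMessage`, the `is_copy` flag, and the `backup_of` table are hypothetical stand-ins; the description only states that the copy carries an identifier marking it as a copy.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class IoMessage:
    dest_controller: str
    payload: bytes
    is_copy: bool = False  # identifier marking a replicated message


# Storage controller -> its active-active backup (assumed configuration).
backup_of: Dict[str, str] = {"controller A": "controller B"}


def fan_out(message: IoMessage) -> List[IoMessage]:
    """Return the messages the switch sends for one incoming I/O message."""
    out = [message]
    backup = backup_of.get(message.dest_controller)
    if backup is not None:
        # Duplicate the message for the backup controller and flag it.
        out.append(replace(message, dest_controller=backup, is_copy=True))
    return out


for m in fan_out(IoMessage("controller A", b"data")):
    print(m.dest_controller, m.is_copy)
```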
Optionally, the switch inspects the first I/O message. When the first I/O message is a data message, the switch performs step 408 and sends the first I/O message to the storage controller over the direct path connection. When the first I/O message is a non-data message (for example, a control message), the switch forwards the first I/O message transparently.
408. The switch sends the first I/O message to the storage controller over the direct path connection.
Specifically, after determining the direct path connection, the switch replaces the destination address in the header of the first I/O message with the address information of the storage controller included in the direct path connection. Take a direct path connection of type RDMA as an example: the destination address in the header of the first I/O message is replaced with "192.168.1.2//552", and the first I/O message is then sent to storage controller B over the direct path connection "controller B-switch A-client B".
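The header rewrite in step 408 can be sketched as below, using the storage controller B address from the RDMA example of direct path connection #2 (192.168.1.2//552). The `Header` type, the `paths` table layout, and the original destination shown in the example are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Header:
    src: str
    dst: str


# Direct path ID -> endpoint addresses recorded for that path.
paths: Dict[str, Dict[str, str]] = {
    "controller B-switch A-client B": {
        "controller": "192.168.1.2//552",
        "switch": "192.168.1.3//556",
        "client": "192.168.2.4//541",
    },
}


def rewrite_destination(header: Header, path_id: str) -> Header:
    """Point the message at the storage controller end of the direct path."""
    return replace(header, dst=paths[path_id]["controller"])


# Assume the client originally addressed the message to storage controller A.
original = Header(src="192.168.2.4//541", dst="192.168.1.1//551")
rewritten = rewrite_destination(original, "controller B-switch A-client B")
print(rewritten.dst)
```

Only the destination changes; the source address still identifies the client, so the storage controller can address its reply accordingly.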
Optionally, in addition to sending the first I/O message to the storage controller over the direct path connection, the switch may send first indication information to a second storage controller. The second storage controller is the storage controller that the client initially assigned to the first I/O message. The first indication information instructs the second storage controller to update its local sequence number for received messages, which prevents the second storage controller from detecting an out-of-order condition because it never received the first I/O message.
Optionally, the switch sends the first indication information to the second storage controller as follows: the switch sends a signal message to the second storage controller, where the header of the signal message includes the header information of the first I/O message and also includes the first indication information.
Optionally, the switch notifies each storage controller to disable its out-of-order message detection. This likewise prevents the second storage controller, that is, the storage controller that the client initially assigned to the first I/O message, from detecting an out-of-order condition because it never received the first I/O message.
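Why the second storage controller needs the indication can be seen in a toy receiver that enforces in-order delivery. This is a simplified model under assumed semantics: `SignalMessage`, `ControllerReceiver`, and the per-message integer sequence number are hypothetical, and a real transport would track sequence state per connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalMessage:
    seq: int          # sequence number copied from the redirected I/O message header
    update_seq: bool  # stands in for the first indication information


class ControllerReceiver:
    """Receive side of the second storage controller (toy model)."""

    def __init__(self) -> None:
        self.expected_seq = 0

    def on_io(self, seq: int) -> bool:
        """Accept an I/O message only if it arrives in order."""
        if seq != self.expected_seq:
            return False  # would be treated as out of order
        self.expected_seq += 1
        return True

    def on_signal(self, msg: SignalMessage) -> None:
        """Skip past a message that the switch redirected elsewhere."""
        if msg.update_seq:
            self.expected_seq = max(self.expected_seq, msg.seq + 1)


receiver = ControllerReceiver()
receiver.on_io(0)                             # delivered normally
receiver.on_signal(SignalMessage(1, True))    # message 1 went over a direct path
print(receiver.on_io(2))                      # accepted because of the signal
```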
The present application proposes a data reading and writing method in which a switch receives, from a storage controller, the address information of the storage controller and allocates corresponding switching address information to the storage controller. According to the address information of the storage controller and the switching address information corresponding to the storage controller, the switch establishes a direct path connection with the storage controller. The direct path connection is used to transmit I/O messages, and an I/O message is used to write data to the storage array through the storage controller or to read data from the storage array through the storage controller. Because the direct path connection established between the switch and the storage controller can carry I/O messages, the client's I/O messages do not need to go through the rerouting procedure described earlier. This avoids repeated addressing during I/O processing, shortens the forwarding path of I/O messages, reduces the completion time of data read and write operations, improves IOPS performance, and improves data read and write efficiency.
With reference to the foregoing embodiments, after the storage controller receives the first I/O message, the storage controller may send a reply message to the client in several ways, which are described below with reference to the accompanying drawings.
(1) Refer to FIG. 5, which is a schematic diagram of an embodiment of a data reading and writing method proposed in the embodiments of the present application. The data reading and writing method includes steps 501 to 504.
501. The client sends a first I/O message to the switch.
502. The switch sends the first I/O message to the storage controller over the direct path connection. The storage controller writes data to the storage array or reads data from the storage array according to the first I/O message.
503. The switch receives a first reply message from the storage controller over the direct path connection. The first reply message is the response to the first I/O message, and its destination is the client.
504. The switch sends the first reply message to the client.
In one possible implementation, the switch sends the first reply message to the client over the communication connection established between the client and the storage controller. For example, the switch sends the first reply message to the client over the TCP connection established between the client and the storage controller.
In another possible implementation, a direct path connection is established between the switch and the client, in a manner similar to the way the direct path connection between the switch and the storage controller is established in the embodiment of FIG. 4. The switch sends the first reply message to the client over this direct path connection.
In this embodiment of the present application, the storage controller sends the first reply message to the switch over the direct path connection, so the client is unaware of the routing path of the first reply message.
(2)、请参阅图6,图6为本申请实施例提出的另一种数据读写方法的实施例示意图。该数据读写方法包括步骤601-606。(2) Please refer to FIG. 6 . FIG. 6 is a schematic diagram of an embodiment of another data reading and writing method proposed in the embodiment of the present application. The data reading and writing method includes steps 601-606.
601、客户端向交换机发送第一I/O报文。601. The client sends a first I/O packet to the switch.
602、交换机通过直接路径连接,向存储控制器发送第一I/O报文。存储控制器根据该第一I/O报文向存储阵列写入数据或从存储阵列读取数据。602. The switch is connected through the direct path, and sends the first I/O packet to the storage controller. The storage controller writes data to the storage array or reads data from the storage array according to the first I/O message.
603、交换机通过直接路径连接,接收来自存储控制器的第二回复报文。603. The switch is connected through the direct path, and receives the second reply packet from the storage controller.
该第二回复报文的目的地是第二存储控制器。该第二存储控制器为客户端为第一I/O报文初始分配的存储控制器。即,该第二存储控制器为客户端在发送第一I/O报文之前,为第一I/O报文随机分配的存储控制器。The destination of the second reply message is the second storage controller. The second storage controller is the storage controller initially allocated by the client for the first I/O packet. That is, the second storage controller is a storage controller randomly assigned to the first I/O message by the client before sending the first I/O message.
该第二回复报文包括代理指示信息,该代理指示信息指示第二存储控制器代理存储控制器将第二回复报文发送至客户端。The second reply packet includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply packet to the client on behalf of the storage controller.
另一种可能的实现方式中,存储控制器之间存在网络连接。存储控制器通过网络连接向第二存储控制器发送第二回复报文。此时,相应的交换机执行网络转发,将该第二回复报文由存储控制器转发至第二存储控制器。In another possible implementation manner, there is a network connection between the storage controllers. The storage controller sends the second reply message to the second storage controller through the network connection. At this time, the corresponding switch performs network forwarding, and forwards the second reply message from the storage controller to the second storage controller.
604. The switch forwards the second reply packet to the second storage controller.
605. After receiving the second reply packet from the switch, the second storage controller generates a third reply packet according to the second reply packet, where the destination of the third reply packet is the client, and the second storage controller sends the third reply packet to the switch.
The second storage controller determines, according to the proxy indication information included in the second reply packet, that the packet needs to be sent to the client. The second storage controller then generates the third reply packet according to the second reply packet. The destination of the third reply packet is the client, and the third reply packet does not include the proxy indication information.
606. After receiving the third reply packet from the second storage controller, the switch sends the third reply packet to the client.
In a possible implementation, the switch sends the third reply packet to the client over the communication connection established between the client and the storage controller. For example, the switch sends the third reply packet to the client over the TCP connection established between the client and the storage controller.
In another possible implementation, a direct path connection is established between the switch and the client. This direct path connection is established in a manner similar to that used between the switch and the storage controller in the embodiment of FIG. 4. The switch sends the third reply packet to the client over this direct path connection.
In this embodiment of this application, the storage controller sends the third reply packet to the switch over the direct path connection, so that the client is unaware of the routing path of the third reply packet.
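The reply handling in steps 603 to 605 can be illustrated with a minimal Python sketch. The `ReplyPacket` type, the `proxy_for` field (one hypothetical way to encode the proxy indication information), and the controller and client names are all assumptions made for illustration; the application does not specify a packet layout.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ReplyPacket:
    src: str                          # sender of this packet
    dst: str                          # destination of this packet
    payload: bytes                    # result of the read or write
    proxy_for: Optional[str] = None   # hypothetical proxy indication: client to reply to


def make_second_reply(serving_ctrl: str, assigned_ctrl: str,
                      client: str, payload: bytes) -> ReplyPacket:
    """Step 603: the serving controller addresses its reply to the controller
    that the client originally assigned, and attaches the proxy indication."""
    return ReplyPacket(src=serving_ctrl, dst=assigned_ctrl,
                       payload=payload, proxy_for=client)


def make_third_reply(second: ReplyPacket) -> ReplyPacket:
    """Step 605: the second controller re-addresses the reply to the client
    and removes the proxy indication."""
    if second.proxy_for is None:
        raise ValueError("packet carries no proxy indication")
    return replace(second, src=second.dst, dst=second.proxy_for, proxy_for=None)
```

For example, `make_third_reply(make_second_reply("ctrl-A", "ctrl-B", "client-1", b"data"))` yields a packet from `ctrl-B` to `client-1` with no proxy indication, so the client sees a reply from the controller it originally addressed.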
With reference to the foregoing embodiments, the data read/write method provided in the embodiments of this application can also be applied to other systems. The following description refers to the accompanying drawings. Referring to FIG. 7, FIG. 7 is a schematic diagram of still another embodiment of the data read/write method provided in the embodiments of this application. The method includes steps 701 to 708.
701. The switch receives routing information from the front-end server.
In this embodiment, the client accesses the storage system. For example, when a user clicks a video in online video software, the client (the online video software) accesses the storage system. The client establishes a connection with the front-end server. The connection may be a control plane connection, such as the TCP connection or RDMA connection in the foregoing embodiments.
After the client establishes the connection with the front-end server, the switch receives the routing information from the front-end server. The routing information includes but is not limited to: identification information of the back-end server, an I/O address of the back-end server, a storage address of data, or load information of the back-end server. The I/O address includes but is not limited to: a logical unit number, a namespace identifier, or a logical block address.
The identification information of the back-end server may include the IP address and/or the name of the back-end server. The storage address of the data is the address at which the data is stored in the back-end server. The data may be service-related data, such as video, audio, text, or images.
The load information of the back-end server includes but is not limited to: the total storage space of the back-end server, the remaining available storage space of the back-end server, the used storage space of the back-end server, the IOPS performance of the back-end server, the available bandwidth of the back-end server, the total bandwidth of the back-end server, or the temperature of the storage controller.
Here, the back-end server corresponds to the storage controller in the foregoing embodiments.
702. The switch receives address information from the front-end server, where the address information is the address information of the back-end server.
In this embodiment, after a communication connection is established between the switch and the front-end server, the switch receives the address information of the back-end server from the front-end server. The address information received by the switch differs depending on the type of communication connection established between the front-end server and the back-end server, as described below.
When a TCP connection is established between the front-end server and the back-end server, the address information of the back-end server includes: the IP address of the back-end server, the port number of the back-end server, and the protocol number.
When an RDMA connection is established between the front-end server and the back-end server, the address information of the back-end server includes: the queue pair (QP) number of the back-end server and the IP address of the back-end server.
It should be noted that the execution order of step 701 and step 702 is not limited: step 701 may be performed before step 702, or step 702 may be performed before step 701.
In this application, the address information of the storage controller may be sent to the switch by the storage controller itself, or by another device, for example, the front-end server in this embodiment.
703. The switch stores the switching address information corresponding to the back-end server.
In this embodiment, after receiving the address information of the back-end server, the switch allocates corresponding switching address information to the back-end server. As in step 702, the switching address information differs depending on the type of communication connection established between the front-end server and the back-end server, as described below.
When a TCP connection is established between the front-end server and the back-end server, the switching address information includes: the IP address of the switch, the port number of the switch, and the protocol number.
When an RDMA connection is established between the front-end server and the back-end server, the switching address information includes: the queue pair (QP) number of the switch and the IP address of the switch.
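As a rough illustration of steps 702 and 703, the two kinds of address information and the allocation of matching switching address information could be modeled as follows. The class names, the starting port and QP values, and the sequential allocation policy are assumptions for this sketch only; the application does not prescribe how the switch chooses its port or QP number. Protocol number 6 is the IANA-assigned value for TCP.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class TcpAddress:
    ip: str
    port: int
    protocol: int = 6   # IANA protocol number for TCP


@dataclass(frozen=True)
class RdmaAddress:
    ip: str
    qp: int             # queue pair number


Address = Union[TcpAddress, RdmaAddress]


class SwitchAddressAllocator:
    """Hypothetical allocator: gives each back-end address a switch-side
    address of the same kind (TCP or RDMA) and stores the pairing."""

    def __init__(self, switch_ip: str, first_port: int = 50000, first_qp: int = 100) -> None:
        self.switch_ip = switch_ip
        self._next_port = first_port
        self._next_qp = first_qp
        self.by_backend: Dict[Address, Address] = {}

    def allocate(self, backend: Address) -> Address:
        # Reuse the stored switching address if this back-end is already known.
        if backend in self.by_backend:
            return self.by_backend[backend]
        if isinstance(backend, TcpAddress):
            switch_addr: Address = TcpAddress(self.switch_ip, self._next_port, backend.protocol)
            self._next_port += 1
        else:
            switch_addr = RdmaAddress(self.switch_ip, self._next_qp)
            self._next_qp += 1
        self.by_backend[backend] = switch_addr
        return switch_addr
```

Keeping the two address kinds as separate types makes it explicit that a TCP back-end is always paired with a TCP switching address and an RDMA back-end with an RDMA one.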
704. The switch establishes a direct path connection with the back-end server.
In this embodiment, the switch sends the switching address information corresponding to the back-end server to the back-end server over the communication connection (that is, the control plane connection) between them. The switch and the back-end server then each store both the switching address information and the address information of the back-end server, and they establish the direct path connection according to this address information. The switch may assign identification information to each direct path connection.
705. The switch generates a second mapping relationship.
In this embodiment, the switch generates a second mapping relationship, which is a mapping among the identification information of the direct path connection, the identification information of the back-end server, and the I/O address. The second mapping relationship is one implementation of the foregoing first mapping relationship.
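One way to picture the second mapping relationship of step 705 is as a small table keyed by I/O address. The sketch below is illustrative only: the `SecondMapping` class, its method names, and the textual I/O address format (such as `"lun:3"`) are assumptions, since the application does not define a concrete data structure.

```python
from typing import Dict, List, NamedTuple, Optional


class MappingEntry(NamedTuple):
    path_id: int       # identification information of the direct path connection
    backend_id: str    # identification information of the back-end server (IP or name)
    io_address: str    # hypothetical textual I/O address, e.g. "lun:3" or "ns:7"


class SecondMapping:
    """Hypothetical table relating direct path, back-end server, and I/O address."""

    def __init__(self) -> None:
        self._by_io: Dict[str, MappingEntry] = {}

    def add(self, path_id: int, backend_id: str, io_address: str) -> MappingEntry:
        entry = MappingEntry(path_id, backend_id, io_address)
        self._by_io[io_address] = entry
        return entry

    def find(self, io_address: str) -> Optional[MappingEntry]:
        # Returns None when no direct path serves this I/O address.
        return self._by_io.get(io_address)

    def paths_for_backend(self, backend_id: str) -> List[int]:
        # All distinct direct paths that lead to the given back-end server.
        return sorted({e.path_id for e in self._by_io.values()
                       if e.backend_id == backend_id})
```

Several I/O addresses may share one direct path, which is why the table is keyed by I/O address rather than by path identifier.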
706. The client sends a second packet to the switch.
In this embodiment, the client sends a second packet to the switch. The second packet is used to read data from or write data to the back-end server. For example, when the client needs to upload a video to the storage system, the client sends the second packet to the switch, and the second packet carries the video data. When the client needs to play a video, the client sends the second packet to the switch, and the second packet carries the address or identifier of the video data to be read. The second packet is one implementation of the foregoing first packet.
In a possible implementation, step 706 is performed before step 701: after the client sends the second packet to the switch, steps 701 to 705 are performed. That is, the direct path connection between the switch and the back-end server is established after the switch receives the second packet from the client.
In another possible implementation, step 706 is performed after steps 701 to 705. That is, the switch receives the second packet from the client after the direct path connection between the switch and the back-end server has been established.
Optionally, before the switch establishes the direct path connection with the back-end server, the switch may transparently forward packets sent by the client, such as connection establishment and authorization verification packets, to the front-end server. The front-end server processes these packets.
707. The switch determines the direct path connection according to the second packet and the second mapping relationship.
In this embodiment, after receiving the second packet from the client, the switch determines, from the second mapping relationship and according to the second packet, the direct path connection to be used for forwarding the second packet.
708. The switch sends the second packet to the back-end server over the direct path connection.
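Steps 707 and 708 amount to a table lookup followed by a send. The following minimal sketch shows that flow under simplifying assumptions: the mapping is reduced to a dictionary from a hypothetical textual I/O address to a direct path identifier, and "sending" is modeled by appending the packet to a per-path list. None of these names come from the application.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Dict, List


@dataclass(frozen=True)
class IoPacket:
    client: str
    io_address: str    # hypothetical textual I/O address, e.g. "lun:3"
    op: str            # "read" or "write"
    data: bytes = b""


class Switch:
    def __init__(self, mapping: Dict[str, int]) -> None:
        self.mapping = dict(mapping)   # I/O address -> direct path identifier
        # Stand-in for the direct path connections: packets sent on each path.
        self.sent: DefaultDict[int, List[IoPacket]] = defaultdict(list)

    def select_path(self, pkt: IoPacket) -> int:
        """Step 707: pick the direct path from the packet and the mapping."""
        try:
            return self.mapping[pkt.io_address]
        except KeyError:
            raise LookupError(f"no direct path for {pkt.io_address}") from None

    def forward(self, pkt: IoPacket) -> int:
        """Step 708: send the packet on the selected direct path."""
        path_id = self.select_path(pkt)
        self.sent[path_id].append(pkt)
        return path_id
```

In this sketch a packet with no matching entry raises `LookupError`; a real switch would instead have to establish the direct path first or hand the packet to another device, which is outside what the sketch models.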
This application provides a data read/write method. Because packets can be transmitted over the direct path connection established between the switch and the back-end server, client packets do not need to be rerouted by the front-end server. This avoids communication congestion caused by the limited computing capability and network bandwidth of the front-end server, improves packet processing efficiency and back-end server utilization, and enables load balancing across the back-end servers.
To implement the foregoing embodiments, this application further provides a network device. Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a network device 800 provided in an embodiment of this application.
Although the network device 800 shown in FIG. 8 shows certain specific features, a person skilled in the art will appreciate from the embodiments of this application that, for brevity, FIG. 8 omits various other features so as not to obscure more pertinent aspects of the implementations disclosed in the embodiments of this application. To this end, as an example, in some implementations the network device 800 includes one or more processing modules (for example, CPUs) 801, a network interface 802, a programming interface 803, a memory 804, and one or more communication buses 805 for interconnecting these components. In other implementations, the network device 800 may omit or add functional components or units relative to this example.
In some implementations, the network interface 802 is used to connect to one or more other network devices or servers in a network system. In some implementations, the communication bus 805 includes circuitry that interconnects system components and controls communication between them. The memory 804 may include non-volatile memory, for example, read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The memory 804 may also include volatile memory, such as random access memory (RAM), which serves as an external cache.
In some implementations, the memory 804, or a non-transitory computer-readable storage medium of the memory 804, stores the following programs, modules, and data structures, or a subset thereof, including, for example, a transceiver module (not shown in the figure), a transceiver module 8041, and a processing module 8042.
In a possible embodiment, the network device 800 may have any function of the switch in the method embodiments corresponding to FIG. 2 to FIG. 7.
It should be understood that the network device 800 corresponds to the switch in the foregoing method embodiments. The modules in the network device 800 and the other operations and/or functions described above respectively implement the steps and methods performed by the switch in the foregoing method embodiments. For details, refer to the method embodiments corresponding to FIG. 2 to FIG. 7; for brevity, they are not repeated here.
It should be understood that, in this application, data may be sent and received by the network interface 802 of the network device 800, or the processor may invoke program code in the memory and, where necessary, cooperate with the network interface 802 to implement the functions of the transceiver module.
In various implementations, the network device 800 is configured to perform the data read/write method provided in the embodiments of this application, for example, the data read/write methods corresponding to the embodiments shown in FIG. 2 to FIG. 7.
The specific structure of the network device described in FIG. 8 may be as shown in FIG. 9.
FIG. 9 is a schematic structural diagram of a network device 900 provided in an embodiment of this application. The network device 900 includes a main control board 910 and an interface board 930.
The main control board 910, also called a main processing unit (MPU) or route processor, controls and manages the components of the network device 900, including route calculation, device management, device maintenance, and protocol processing. The main control board 910 includes a central processing unit 911 and a memory 912.
The interface board 930, also called a line processing unit (LPU), line card, or service board, provides various service interfaces and forwards data packets. The service interfaces include but are not limited to Ethernet interfaces and POS (Packet over SONET/SDH) interfaces. The interface board 930 includes a central processing unit 931, a network processor 932, a forwarding entry memory 934, and a physical interface card (PIC) 933.
The central processing unit 931 on the interface board 930 controls and manages the interface board 930 and communicates with the central processing unit 911 on the main control board 910.
The network processor 932 implements packet forwarding. The network processor 932 may take the form of a forwarding chip.
The physical interface card 933 implements the physical-layer interconnection function: raw traffic enters the interface board 930 through it, and processed packets are sent out from it. The physical interface card 933 includes at least one physical interface, also called a physical port, which may be a Flexible Ethernet (FlexE) physical interface. The physical interface card 933, also called a daughter card, can be installed on the interface board 930. It converts optical or electrical signals into packets, checks the validity of the packets, and forwards them to the network processor 932 for processing. In some embodiments, the central processing unit 931 of the interface board 930 may also perform the functions of the network processor 932, for example, by implementing software forwarding on a general-purpose CPU, in which case the interface board 930 does not need the network processor 932.
Optionally, the network device 900 includes multiple interface boards. For example, the network device 900 further includes an interface board 940, which includes a central processing unit 941, a network processor 942, a forwarding entry memory 944, and a physical interface card 943.
Optionally, the network device 900 further includes a switch fabric board 920, which may also be called a switch fabric unit (SFU). When the network device has multiple interface boards, the switch fabric board 920 exchanges data between the interface boards. For example, the interface board 930 and the interface board 940 may communicate through the switch fabric board 920.
The main control board 910 is coupled to the interface boards. For example, the main control board 910, the interface board 930, the interface board 940, and the switch fabric board 920 are connected to one another through a system bus and/or a system backplane. In a possible implementation, an inter-process communication (IPC) channel is established between the main control board 910 and the interface board 930, and the two boards communicate over the IPC channel.
Logically, the network device 900 includes a control plane and a forwarding plane. The control plane includes the main control board 910 and the central processing unit 931. The forwarding plane includes the components that perform forwarding, such as the forwarding entry memory 934, the physical interface card 933, and the network processor 932. The control plane advertises routes, generates the forwarding table, processes signaling and protocol packets, and configures and maintains device state, and it delivers the generated forwarding table to the forwarding plane. On the forwarding plane, the network processor 932 looks up the forwarding table delivered by the control plane and forwards the packets received by the physical interface card 933 accordingly. The forwarding table delivered by the control plane may be stored in the forwarding entry memory 934. In some embodiments, the control plane and the forwarding plane may be completely separated and reside on different devices.
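The division of labor between the two planes can be sketched in a few lines of Python. This is only an illustration: the `ControlPlane` and `ForwardingPlane` classes, the interface names, and the use of longest-prefix matching on IPv4 destinations are assumptions chosen for the example; the application does not state how the forwarding table is organized or searched.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

Route = Tuple[ipaddress.IPv4Network, str]   # (destination prefix, outgoing interface)


class ForwardingPlane:
    """Holds the table delivered by the control plane and does per-packet lookups."""

    def __init__(self) -> None:
        self._table: List[Route] = []

    def install(self, table: List[Route]) -> None:
        # Sort so that the most specific (longest) prefix is tried first.
        self._table = sorted(table, key=lambda r: r[0].prefixlen, reverse=True)

    def lookup(self, dst: str) -> Optional[str]:
        addr = ipaddress.ip_address(dst)
        for prefix, out_if in self._table:
            if addr in prefix:
                return out_if
        return None   # no matching entry


class ControlPlane:
    """Computes routes and delivers the resulting table to the forwarding plane."""

    def __init__(self) -> None:
        self._routes: Dict[ipaddress.IPv4Network, str] = {}

    def add_route(self, prefix: str, out_if: str) -> None:
        self._routes[ipaddress.ip_network(prefix)] = out_if

    def push(self, forwarding_plane: ForwardingPlane) -> None:
        forwarding_plane.install(list(self._routes.items()))
```

The forwarding plane never computes routes itself; it only consults the table it was given, which mirrors the statement that the two planes may even reside on different devices.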
It should be understood that the transceiver module in the network device 800 may correspond to the physical interface card 933 or the physical interface card 943 in the network device 900. The transceiver module 8041 and the processing module 8042 in the network device 800 may correspond to the central processing unit 911 or the central processing unit 931 in the network device 900, or to program code or instructions stored in the memory 912.
It should be understood that, in the embodiments of this application, operations on the interface board 940 are the same as those on the interface board 930 and, for brevity, are not described again. It should also be understood that the network device 900 in this embodiment may correspond to the switch in the foregoing method embodiments. The main control board 910, the interface board 930, and/or the interface board 940 in the network device 900 may implement the functions of, and/or the steps performed by, the switch in the foregoing method embodiments. For brevity, details are not repeated here.
It should be noted that there may be one or more main control boards; when there are several, they may include an active main control board and a standby main control board. There may be one or more interface boards; the stronger the data processing capability of the network device, the more interface boards it provides. An interface board may also carry one or more physical interface cards. There may be no switch fabric board, or one or more; when there are several, they can jointly provide load sharing and redundancy. In a centralized forwarding architecture, the network device may not need a switch fabric board, and the interface board processes the service data of the entire system. In a distributed forwarding architecture, the network device may have at least one switch fabric board, which exchanges data between multiple interface boards and provides large-capacity data exchange and processing. Optionally, the network device may also consist of a single board, without a switch fabric board, on which the functions of the interface board and the main control board are integrated. In this case, the central processing unit of the interface board and the central processing unit of the main control board can be combined into one central processing unit on that board, which performs the functions of both. Which architecture is used depends on the specific networking deployment scenario and is not limited here.
In some possible embodiments, the foregoing switch may be implemented as a virtualized device. The virtualized device may be a virtual machine (VM), a virtual router, or a virtual switch that runs a program providing the packet sending function. The virtualized device is deployed on a hardware device (for example, a physical server). For example, the switch may be implemented on a general-purpose physical server in combination with network functions virtualization (NFV) technology.
It should be understood that network devices in each of the foregoing product forms have any function of the switch in the foregoing method embodiments. Details are not repeated here.
FIG. 10 is a schematic structural diagram of a storage device 1000 provided in an embodiment of this application. The storage device 1000 shown in FIG. 10 is a storage array. As shown in FIG. 10, the storage device 1000 may include a storage controller 1100 and a disk array 1200. The disk array 1200 provides storage space and may include a redundant array of independent disks (RAID) or a disk enclosure containing multiple disks. There may be multiple disk arrays 1200, and each disk array 1200 includes multiple disks 1202. The disks 1202 store data. The disk array 1200 communicates with the storage controller 1100 through a communication protocol such as SCSI. The protocol is not limited here.
The storage controller 1100 in the storage device 1000 is configured to perform the relevant steps in the foregoing method embodiments.
It can be understood that the disk array 1200 is merely one example of the storage in the storage device. In the embodiments of this application, data may also be stored in other storage, such as a tape library. It should be noted that the disk 1202 is likewise merely one example of the storage used to build the disk array 1200. In practice, other implementations are possible, for example, building a disk array across enclosures that each contain multiple disks. Therefore, in the embodiments of this application, the disk array 1200 may also include storage comprising non-volatile storage media, such as solid-state drives (SSDs), enclosures containing multiple disks, or servers, which is not limited here.
存储控制器1100是存储设备1000的“大脑”,主要包括处理器1102、缓存1103、存储器1101、通信总线(简称总线)1105和通信接口1104。处理器1102、缓存1103、存储器1101和通信接口1104通过通信总线1105相互通信。应注意,本申请实施例中,存储设备1000中可以有一个或多个控制器1100。可以理解的是,当存储设备1000包括至少两个控制器1100时,可以提高存储设备1000的稳定性。The storage controller 1100 is the “brain” of the storage device 1000 , and mainly includes a processor 1102 , a cache 1103 , a memory 1101 , a communication bus (bus for short) 1105 and a communication interface 1104 . The processor 1102 , the cache memory 1103 , the memory 1101 and the communication interface 1104 communicate with each other through the communication bus 1105 . It should be noted that, in the embodiment of the present application, there may be one or more controllers 1100 in the storage device 1000 . It can be understood that when the storage device 1000 includes at least two controllers 1100, the stability of the storage device 1000 can be improved.
通信接口1104用于与交换机、客户端、其它网络设备或其它存储设备进行通信。The communication interface 1104 is used for communicating with switches, clients, other network devices or other storage devices.
存储器1101用于存储程序1106。存储器1101可以包括高速随机存取存储器(random access memory,简称RAM),或者还可以包括非易失性存储器,例如至少一个磁盘存储器。可以理解的是,存储器1101可以是各种可以存储程序代码的非瞬时性机器可读介质,例如RAM、磁盘、硬盘驱动器、光盘、SSD或非易失性存储器。The memory 1101 is used to store a program 1106 . The memory 1101 may include a high-speed random access memory (random access memory, RAM for short), or may also include a non-volatile memory, such as at least one disk memory. It can be understood that the memory 1101 can be various non-transitory machine-readable media that can store program codes, such as RAM, magnetic disk, hard disk drive, optical disk, SSD or non-volatile memory.
程序1106可以包括程序代码,程序代码包括计算机操作指令。Program 1106 may include program code including computer operating instructions.
The cache 1103 is memory located between the controller and the hard disk drive; its capacity is smaller than that of the hard disk drive, but it is faster. The cache 1103 is used to temporarily store data, for example I/O transactions received from a switch or another storage device, and to temporarily store data read from the disk 1202, so as to improve the performance and reliability of the array. The cache 1103 may be any non-transitory machine-readable medium capable of storing data, such as a RAM, a ROM, a flash memory, or an SSD, which is not limited here.
The processor 1102 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of this application. An operating system and other software programs are installed in the processor 1102. Different software programs can be regarded as different processing modules with different functions, for example handling input/output (I/O) requests to the disk 1202, performing other processing on the data in the disk 1202, or modifying the metadata stored in the storage device 1000. The storage controller 1100 can thereby implement various data management functions such as I/O operations, snapshots, mirroring, and replication. In the embodiments of this application, the processor 1102 is configured to execute the program 1106, and specifically may perform the relevant steps in the foregoing method embodiments.
Further, an embodiment of this application also provides a computer program product which, when run on a network device, causes the network device to perform the method performed by the switch in the method embodiments corresponding to FIG. 2 to FIG. 7 above.
An embodiment of this application also provides a chip system, including a processor and an interface circuit. The interface circuit is configured to receive instructions and transmit them to the processor, and the processor is configured to implement the method in any one of the foregoing method embodiments.
Optionally, the chip system further includes a memory, and the chip system may contain one or more processors. The processor may be implemented in hardware or in software. When implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like. When implemented in software, the processor may be a general-purpose processor that implements the method in any one of the foregoing method embodiments by reading software code stored in the memory.
Optionally, the chip system may also contain one or more memories. The memory may be integrated with the processor or provided separately from it, which is not limited in this application. For example, the memory may be a non-transitory memory, such as a read-only memory (ROM), which may be integrated on the same chip as the processor or placed on a different chip. This application does not specifically limit the type of the memory or the way in which the memory and the processor are arranged.
The embodiments of this application have been described in detail above. The steps in the methods of the embodiments may be reordered, merged, or deleted according to actual needs, and the modules in the apparatuses of the embodiments may be divided, merged, or deleted according to actual needs.
It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of this application. Thus, appearances of "in one embodiment" or "in an embodiment" in various places throughout the specification do not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should also be understood that, in the various embodiments of this application, the sequence numbers of the foregoing processes do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and the numbering should not constitute any limitation on the implementation of the embodiments of this application.
The term "and/or" in this document merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.
It should be understood that, in the embodiments of this application, "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood, however, that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, and some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

Claims (35)

  1. A method for reading and writing data, comprising:
    receiving, by a switch, address information of a storage controller, wherein the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol (IP) address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number;
    allocating, by the switch, corresponding switching address information to the storage controller, wherein the switching address information includes one or more of the following: a queue pair port number of the switch, an IP address of the switch, or a port number of the switch; and
    establishing, by the switch, a direct path connection with the storage controller, wherein the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
  2. The method according to claim 1, wherein the method further comprises:
    receiving, by the switch, routing information from the storage controller, wherein the routing information includes one or more of the following: identification information of the storage controller, an I/O address of the storage controller, whether I/O messages destined for the storage controller need to be copied, or load information of the storage controller,
    wherein the I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  3. The method according to claim 2, wherein the method further comprises:
    generating, by the switch, a first mapping relationship, wherein the first mapping relationship includes a mapping between identification information of the direct path connection and the I/O address.
  4. The method according to claim 3, wherein the method further comprises:
    receiving, by the switch, a first I/O message from a client;
    determining, by the switch, the direct path connection according to the first I/O message and the first mapping relationship; and
    sending, by the switch, the first I/O message to the storage controller through the direct path connection.
  5. The method according to claim 4, wherein the method further comprises:
    detecting, by the switch, the type of the first I/O message;
    when the first I/O message is a data message, sending, by the switch, the first I/O message to the storage controller through the direct path connection; and
    when the first I/O message is a non-data message, transparently transmitting, by the switch, the first I/O message.
  6. The method according to any one of claims 4-5, wherein the method further comprises:
    receiving, by the switch through the direct path connection, a first reply message from the storage controller, wherein the first reply message is a response to the first I/O message; and
    sending, by the switch, the first reply message to the client.
  7. The method according to any one of claims 4-5, wherein the method further comprises:
    receiving, by the switch, a second reply message from the storage controller, wherein the second reply message is a response to the first I/O message, the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message;
    forwarding, by the switch, the second reply message to the second storage controller;
    receiving, by the switch, a third reply message from the second storage controller, wherein the third reply message is generated according to the second reply message; and
    forwarding, by the switch, the third reply message from the second storage controller.
  8. The method according to claim 7, wherein the second reply message includes proxy indication information, the proxy indication information instructs the second storage controller to send the second reply message to the client on behalf of the storage controller, and the third reply message does not include the proxy indication information.
  9. The method according to any one of claims 4-8, wherein the method further comprises:
    when the first I/O message needs to be copied, copying, by the switch, the first I/O message; and
    sending, by the switch, the copied first I/O message to a backup storage controller of the storage controller, wherein the backup storage controller and the storage controller operate in active-active mode.
  10. The method according to any one of claims 1-9, wherein
    the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
  11. A method for reading and writing data, comprising:
    sending, by a storage controller, address information of the storage controller to a switch, wherein the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol (IP) address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number; and
    establishing, by the storage controller, a direct path connection with the switch, wherein the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
  12. The method according to claim 11, wherein the method further comprises:
    sending, by the storage controller, routing information of the storage controller to the switch, wherein the routing information includes one or more of the following: identification information of the storage controller, an I/O address of the storage controller, whether I/O messages destined for the storage controller need to be copied, or load information of the storage controller,
    wherein the I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  13. The method according to any one of claims 11-12, wherein the method further comprises:
    receiving, by the storage controller through the direct path connection, a first I/O message from the switch.
  14. The method according to claim 13, wherein the method further comprises:
    generating, by the storage controller, a first reply message according to the first I/O message, wherein the first reply message is a response to the first I/O message; and
    sending, by the storage controller, the first reply message to the switch through the direct path connection.
  15. The method according to claim 13, wherein the method further comprises:
    generating, by the storage controller, a second reply message according to the first I/O message, wherein the second reply message is a response to the first I/O message; and
    sending, by the storage controller, the second reply message to the switch, wherein the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message.
  16. The method according to claim 15, wherein the second reply message includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply message to the client on behalf of the storage controller.
  17. The method according to any one of claims 11-16, wherein the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
  18. A switch, comprising a transceiver module and a processing module, wherein:
    the transceiver module is configured to receive address information of a storage controller, wherein the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol (IP) address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number;
    the processing module is configured to allocate corresponding switching address information to the storage controller, wherein the switching address information includes one or more of the following: a queue pair port number of the switch, an IP address of the switch, or a port number of the switch; and
    the processing module is further configured to establish a direct path connection with the storage controller, wherein the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
  19. The switch according to claim 18, wherein
    the transceiver module is further configured to receive routing information from the storage controller, wherein the routing information includes one or more of the following: identification information of the storage controller, an I/O address of the storage controller, whether I/O messages destined for the storage controller need to be copied, or load information of the storage controller, and the I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  20. The switch according to claim 19, wherein
    the processing module is further configured to generate a first mapping relationship, wherein the first mapping relationship includes a mapping between identification information of the direct path connection and the I/O address.
  21. The switch according to claim 20, wherein
    the transceiver module is further configured to receive a first I/O message from a client;
    the processing module is further configured to determine the direct path connection according to the first I/O message and the first mapping relationship; and
    the transceiver module is further configured to send the first I/O message to the storage controller through the direct path connection.
  22. The switch according to claim 21, wherein
    the processing module is further configured to detect the type of the first I/O message;
    the transceiver module is further configured to send the first I/O message to the storage controller through the direct path connection when the first I/O message is a data message; and
    the transceiver module is further configured to transparently transmit the first I/O message when the first I/O message is a non-data message.
  23. The switch according to any one of claims 21-22, wherein
    the transceiver module is further configured to receive, through the direct path connection, a first reply message from the storage controller, wherein the first reply message is a response to the first I/O message; and
    the transceiver module is further configured to send the first reply message to the client.
  24. The switch according to any one of claims 22-23, wherein
    the transceiver module is further configured to receive a second reply message from the storage controller, wherein the second reply message is a response to the first I/O message, the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message;
    the transceiver module is further configured to forward the second reply message to the second storage controller;
    the transceiver module is further configured to receive a third reply message from the second storage controller, wherein the third reply message is generated according to the second reply message; and
    the transceiver module is further configured to forward the third reply message from the second storage controller.
  25. The switch according to claim 24, wherein the second reply message includes proxy indication information, the proxy indication information instructs the second storage controller to send the second reply message to the client on behalf of the storage controller, and the third reply message does not include the proxy indication information.
  26. The switch according to any one of claims 21-25, wherein
    the processing module is further configured to copy the first I/O message when the first I/O message needs to be copied; and
    the transceiver module is further configured to send the copied first I/O message to a backup storage controller of the storage controller, wherein the backup storage controller and the storage controller operate in active-active mode.
  27. The switch according to any one of claims 21-26, wherein the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
  28. A storage device, comprising a transceiver module and a processing module, wherein:
    the transceiver module is configured to send address information of a storage controller to a switch, wherein the address information of the storage controller includes one or more of the following: a queue pair port number of the storage controller, an Internet Protocol (IP) address of the storage controller, a Transmission Control Protocol (TCP) port number of the storage controller, or a protocol number; and
    the processing module is configured to establish a direct path connection with the switch, wherein the direct path connection is used to transmit input/output (I/O) messages, and the I/O messages are used to write data to a storage array through the storage controller or to read data from the storage array through the storage controller.
  29. The storage device according to claim 28, wherein
    the transceiver module is further configured to send routing information of the storage controller to the switch, wherein the routing information includes one or more of the following: identification information of the storage controller, an I/O address of the storage controller, whether I/O messages destined for the storage controller need to be copied, or load information of the storage controller, and the I/O address includes a logical unit number, a namespace identifier, and/or a logical block address.
  30. The storage device according to any one of claims 28-29, wherein
    the transceiver module is further configured to receive, through the direct path connection, a first I/O message from the switch.
  31. The storage device according to claim 30, wherein
    the processing module is further configured to generate a first reply message according to the first I/O message, wherein the first reply message is a response to the first I/O message; and
    the transceiver module is further configured to send the first reply message to the switch through the direct path connection.
  32. The storage device according to claim 30, wherein
    the processing module is further configured to generate a second reply message according to the first I/O message, wherein the second reply message is a response to the first I/O message; and
    the transceiver module is further configured to send the second reply message to the switch, wherein the destination of the second reply message is a second storage controller, and the second storage controller is the storage controller that the client initially assigned to the first I/O message.
  33. The storage device according to claim 32, wherein the second reply message includes proxy indication information, and the proxy indication information instructs the second storage controller to send the second reply message to the client on behalf of the storage controller.
  34. The storage device according to any one of claims 28-33, wherein the direct path connection is a Transmission Control Protocol (TCP) connection or a remote direct memory access (RDMA) connection.
  35. A storage system, comprising a plurality of switches according to any one of claims 18-27 and a plurality of storage devices according to any one of claims 28-34.
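
For illustration only, the switch-side behavior recited in claims 1 to 5 and 9 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the implementation disclosed in the application: every class, field, and method name (`ControllerAddress`, `IOAddress`, `IOMessage`, `Switch`, `register_controller`, `add_route`, `forward`), the starting port value 50000, and the choice to express the I/O address as a logical unit number plus a logical block address range are hypothetical. The "needs to be copied" flag from the routing information is modeled here simply as the presence of a registered backup connection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ControllerAddress:
    """Address information reported by a storage controller (hypothetical layout)."""
    ip: str
    tcp_port: Optional[int] = None
    qp_number: Optional[int] = None  # queue pair number, for an RDMA connection


@dataclass(frozen=True)
class IOAddress:
    """I/O address from the routing information: a LUN and an inclusive LBA range."""
    lun: int
    lba_start: int
    lba_end: int


@dataclass
class IOMessage:
    """A client I/O message; is_data distinguishes data from non-data messages."""
    lun: int
    lba: int
    is_data: bool
    payload: bytes = b""


class Switch:
    def __init__(self, switch_ip: str) -> None:
        self.switch_ip = switch_ip
        self._next_port = 50000  # assumed starting port for allocated switch addresses
        self.connections: Dict[int, ControllerAddress] = {}
        self.switch_addr: Dict[int, Tuple[str, int]] = {}
        # "First mapping relationship": I/O address -> direct path connection id.
        self.mapping: List[Tuple[IOAddress, int]] = []
        # Primary connection id -> backup controller connection id.
        self.backup: Dict[int, int] = {}

    def register_controller(self, addr: ControllerAddress) -> int:
        """Record the controller address, allocate a switch address, return a connection id."""
        conn_id = len(self.connections) + 1
        self.connections[conn_id] = addr
        self.switch_addr[conn_id] = (self.switch_ip, self._next_port)
        self._next_port += 1
        return conn_id

    def add_route(self, conn_id: int, io_addr: IOAddress,
                  backup_conn: Optional[int] = None) -> None:
        """Store routing information for a connection; a backup marks messages for copying."""
        self.mapping.append((io_addr, conn_id))
        if backup_conn is not None:
            self.backup[conn_id] = backup_conn

    def forward(self, msg: IOMessage) -> List[int]:
        """Return the connection ids the message is sent on.

        An empty list means the message is passed through unchanged
        (non-data message, or no matching entry in the mapping).
        """
        if not msg.is_data:
            return []
        for io_addr, conn_id in self.mapping:
            if io_addr.lun == msg.lun and io_addr.lba_start <= msg.lba <= io_addr.lba_end:
                targets = [conn_id]
                if conn_id in self.backup:
                    targets.append(self.backup[conn_id])  # copy to the backup controller
                return targets
        return []
```

In this sketch a data message whose LUN and LBA fall inside a registered range is sent on the matching direct path connection and, where a backup is registered, also on the backup connection; anything else is left to ordinary forwarding. A real switch would perform the lookup in the data plane and would carry the actual TCP or RDMA connection state, which the sketch omits.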
PCT/CN2022/098309 2021-06-22 2022-06-13 Method for reading and writing data and related apparatus WO2022267909A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110694337.7A CN115509433A (en) 2021-06-22 2021-06-22 Data reading and writing method and related device
CN202110694337.7 2021-06-22

Publications (1)

Publication Number Publication Date
WO2022267909A1 true WO2022267909A1 (en) 2022-12-29

Family

ID=84499582

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/098309 WO2022267909A1 (en) 2021-06-22 2022-06-13 Method for reading and writing data and related apparatus

Country Status (2)

Country Link
CN (1) CN115509433A (en)
WO (1) WO2022267909A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1885817A (en) * 2006-06-08 2006-12-27 杭州华为三康技术有限公司 Internet memory area network IP SAN access method and exchanger
US20090113542A1 (en) * 2007-10-29 2009-04-30 The Boeing Company Virtual Local Area Network Switching Device And Associated Computer System And Method
US20110135303A1 (en) * 2009-12-07 2011-06-09 John Lewis Hufferd DIRECT MODE ADAPTER BASED SHORTCUT FOR FCoE DATA TRANSFER
CN103828332A (en) * 2013-12-04 2014-05-28 华为技术有限公司 Data processing method, device, storage controller, and cabinet
WO2016101287A1 (en) * 2014-12-27 2016-06-30 华为技术有限公司 Method for distributing data in storage system, distribution apparatus and storage system

Also Published As

Publication number Publication date
CN115509433A (en) 2022-12-23

Similar Documents

Publication Publication Date Title
KR102457091B1 (en) System and method for providing data replication in nvme-of ethernet ssd
US11256582B2 (en) System, and control method and program for input/output requests for storage systems
US10423332B2 (en) Fibre channel storage array having standby controller with ALUA standby mode for forwarding SCSI commands
US9369298B2 (en) Directed route load/store packets for distributed switch initialization
US8732381B2 (en) SAS expander for communication between drivers
KR20200008483A (en) METHOD OF ACCESSING A DUAL LINE SSD DEVICE THROUGH PCIe EP AND NETWORK INTERFACE SIMULTANEOUSLY
US20160216891A1 (en) Dynamic storage fabric
US20150264116A1 (en) Scalable Address Resolution
US11606429B2 (en) Direct response to IO request in storage system having an intermediary target apparatus
US11405455B2 (en) Elastic scaling in a storage network environment
US11765037B2 (en) Method and system for facilitating high availability in a multi-fabric system
US20080101236A1 (en) Storage system and communication bandwidth control method
CN110471627B (en) Method, system and device for sharing storage
WO2022267909A1 (en) Method for reading and writing data and related apparatus
US10353585B2 (en) Methods for managing array LUNs in a storage network with a multi-path configuration and devices thereof
US10642788B1 (en) Sand timer algorithm for tracking in-flight data storage requests for data replication
Dalessandro et al. iSER storage target for object-based storage devices
US20190332293A1 (en) Methods for managing group objects with different service level objectives for an application and devices thereof
WO2021179556A1 (en) Storage system and request processing method, and switch
CN117041147B (en) Intelligent network card equipment, host equipment, method and system
WO2023143103A1 (en) Message processing method, and gateway device and storage system
WO2022182917A1 (en) Efficient data transmissions between storage nodes in replication relationships

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22827405

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE