CN116962446B - Dynamic NVMe-oF link management method and system - Google Patents

Dynamic NVMe-oF link management method and system

Info

Publication number
CN116962446B
Authority
CN
China
Prior art keywords
management service
nvme
link
network interface
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310955557.XA
Other languages
Chinese (zh)
Other versions
CN116962446A (en
Inventor
苟熙
徐文豪
王弘毅
张凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhiling Haina Technology Co ltd
Original Assignee
SmartX Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SmartX Inc
Priority to CN202310955557.XA
Publication of CN116962446A
Application granted
Publication of CN116962446B
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a dynamic NVMe-oF link management system comprising a metadata management service, a network interface management service and an NVMe-oF link management service. The metadata management service updates the IP and port information of all storage nodes of the storage cluster in real time. The network interface management service checks all network interfaces of the client that are currently available for NVMe-oF connections and publishes them through a message queue. The NVMe-oF link management service queries, through the message queue, the IP and port information of all storage nodes currently available in the storage cluster from the metadata management service, and queries from the network interface management service the list of all network interfaces that meet the requirements of the current client's configuration file. It takes the Cartesian product of the two to obtain the set A of all connection links required between the current client and the storage cluster, compares set A with the currently existing links to obtain the set B of missing links, and invokes an operating system tool to establish the missing links in set B. The reliability of the storage service is thereby improved.

Description

Dynamic NVMe-oF link management method and system
Technical Field
The invention belongs to the technical field of storage, and particularly relates to a dynamic NVMe-oF link management method and system.
Background
NVMe-oF (NVMe over Fabrics) is a high performance storage access protocol based on NVMe.
The NVM Express (NVMe) protocol defines how a host communicates with non-volatile memory over a PCIe bus. The NVMe specification is designed for high-speed SSD storage media and is a more efficient interface specification than SCSI, supporting 65535 I/O queues, each of which supports 65535 commands (queue depth). Queue mapping allows CPU resources to be scheduled predictably and accommodates device drivers in interrupt or polling mode, providing higher data throughput and lower communication latency. With the development of high-speed network technologies such as RDMA (Remote Direct Memory Access), NVMe-oF defines the use of various common transport-layer protocols to provide remote NVMe connectivity; common examples are NVMe over RDMA and NVMe over TCP.
In a distributed storage system accessed through the NVMe-oF protocol, a client connects to the same subsystem through multiple network paths, and the storage system automatically selects the best available path through the ANA (Asymmetric Namespace Access) function to complete data transmission, so that storage remains usable when one or more paths fail. The ANA mechanism can only select the best link among nodes to which a connection has already been established. To take full advantage of the ANA mechanism, a client typically connects to all nodes of the distributed cluster when connecting to a subsystem. For each pair of client and subsystem, the client needs to establish as many connections as there are nodes, as shown in FIG. 1.
In a distributed storage system, the number of nodes and the connection topology can change elastically while the system is online, and the ANA mechanism cannot detect such changes:
1. After a node is added or removed, the client only tries to reconnect using the original connection information. Therefore, when the cluster changes, for example when a node is added or removed, a mechanism is needed that lets clients dynamically add and delete links. In the extreme case, if all nodes of the distributed storage system are replaced, a compute client statically connected to the original storage nodes can no longer obtain the storage service.
2. While the system is running normally, if a new network interface is added to the client, more links become available between the client and the storage service. Adding these links would increase the reliability of the storage service, but the client cannot automatically detect them and initiate connections. Reconnecting the storage service by shutting it down causes a brief interruption of the services running on the system.
Disclosure of Invention
To solve the above problems, an objective of the present invention is to provide a dynamic NVMe-oF link management method and system in which, when the nodes of a storage cluster change, a link management service computes the link set that gives the highest guaranteed accessibility and then establishes the missing links, so that the reliability of the storage service is higher.
To achieve the above objective, the technical solution of the invention is as follows. A dynamic NVMe-oF link management system is applied to a distributed storage cluster and comprises a metadata management service, a network interface management service and an NVMe-oF link management service, wherein the metadata management service is deployed on the storage cluster, and the network interface management service and the NVMe-oF link management service are deployed on the client. The metadata management service updates the IP and port information of all storage nodes of the storage cluster in real time and exposes an API. The network interface management service polls for all network interfaces of the client that are currently available for NVMe-oF connections and publishes them through a message queue. The NVMe-oF link management service queries, through a message queue, the IP and port information of all storage nodes currently available in the storage cluster from the metadata management service, and queries from the network interface management service the list of all network interfaces that meet the requirements of the current client's configuration file. It takes the Cartesian product of the two to obtain the set A of all connection links required between the current client and the storage cluster, compares set A with the currently existing links to obtain the set B of missing links, and invokes an NVMe-oF client tool provided by the operating system to establish the missing links in set B.
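The set computation described above can be written compactly. In the sketch below, only A and B are names taken from the text; S (the storage node endpoints), I (the qualifying client interfaces) and E (the links that already exist) are symbols assumed here for illustration.

```latex
\begin{align*}
S &= \{(\mathrm{ip}_k, \mathrm{port}_k)\} && \text{storage nodes currently available} \\
I &= \{i_1, \dots, i_m\} && \text{client interfaces meeting the configuration file} \\
A &= I \times S = \{(i, s) \mid i \in I,\ s \in S\}, && |A| = |I| \cdot |S| \\
B &= A \setminus E && \text{links still to be established}
\end{align*}
```

For instance, with two qualifying interfaces and three storage nodes, |A| = 6; if three of those links already exist, B contains the remaining three.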
Preferably, when nodes are added to and/or deleted from the storage cluster, the metadata management service actively reports the IP and port information of the node itself, and/or the metadata management service periodically scans the reachability of the nodes in the whole storage cluster, so that the information in the storage cluster's IP and port information list stays consistent with the nodes actually available in the cluster.
Preferably, the network interface management service checks and updates the list of currently available network interfaces by polling; the list stores all network interfaces available for NVMe-oF connections, and other services can obtain them through the message queue.
Preferably, the metadata management service publishes cluster member node change events through a public message queue, the network interface management service publishes the node's interface change information through a message queue, and the NVMe-oF link management service subscribes to these message queues and responds to the corresponding events.
Preferably, when a node of the storage cluster fails and cannot actively report information, the NVMe-oF link management service discovers the failure through active probing and updates the links.
Based on the same concept, the invention also provides a dynamic NVMe-oF link management method applied to any of the systems described above, comprising the following steps: the metadata management service in the storage cluster updates the IP and port information of all storage nodes of the storage cluster in real time and exposes an API (application programming interface); the NVMe-oF link management service obtains the IP and port information of all storage nodes through a message queue; and the NVMe-oF link management service takes the Cartesian product of the obtained list of storage node IP and port information and the list of the current client's network interfaces that meet the requirements of the configuration file, obtained from the network interface management service on the client, to obtain the set A of all connection links required between the current client and the storage cluster, compares set A with the currently existing links to obtain the set B of missing links, and then invokes an NVMe-oF client tool provided by the operating system to establish the missing links in set B.
Preferably, when nodes are added to and/or deleted from the storage cluster, the metadata management service actively reports the IP and port information of the node itself, and/or periodically scans the reachability of the nodes in the whole storage cluster, so that the information in the storage cluster's IP and port information list stays consistent with the nodes actually available in the storage cluster.
Based on the same concept, the invention also provides a dynamic NVMe-oF link management method applied to any of the systems described above, comprising the following steps: the network interface management service on the client polls for all network interfaces available for NVMe-oF connections and publishes them through a message queue; the NVMe-oF link management service on the client queries, through a message queue, the IP and port information of all storage nodes currently available in the storage cluster from the metadata management service on the storage cluster, queries from the network interface management service the list of all network interfaces that meet the requirements of the current client's configuration file, takes the Cartesian product of the two to obtain the set A of all connection links between the current client and the storage cluster, compares set A with the currently existing links to obtain the set B of missing links, and invokes an NVMe-oF client tool provided by the operating system to establish the missing links in set B.
Preferably, the network interface management service checks and updates the list of currently available network interfaces by polling; the list stores all network interfaces available for NVMe-oF connections, and other services can obtain them through the message queue.
Preferably, when a node of the storage cluster fails and cannot actively report information, the NVMe-oF link management service discovers the failure through active probing and updates the links.
By adopting the technical scheme, the invention has the following advantages and positive effects compared with the prior art:
1. In the technical solution of the invention, when the storage cluster nodes change, the link management service calculates the missing storage links from the latest node information and establishes the connections, which improves the reliability of the links between the client and the storage cluster.
2. In the technical scheme of the invention, when the client adds an additional network interface, the link management service can calculate the newly added storage link according to the information of the latest available network interface and establish connection, so that the link between the client and the storage cluster is prevented from being completely interrupted due to the damage of the network interface.
Drawings
The invention is described in further detail below with reference to the attached drawing figures, wherein:
FIG. 1 is a schematic diagram of a prior art connection between a client and a distributed storage cluster;
FIG. 2 is a schematic diagram of a service architecture according to the present invention;
FIG. 3 is a schematic diagram of a metadata management service according to the present invention;
FIG. 4 is a schematic diagram of the structural configuration of the client network interface management service of the present invention;
FIG. 5 is a schematic diagram of the structural configuration of the NVMe-oF link management service of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and the specific examples. Advantages and features of the invention will become more apparent from the following description and from the claims. It is noted that the drawings are in a very simplified form and utilize non-precise ratios, and are intended to facilitate a convenient, clear, description of the embodiments of the invention.
It should be noted that all directional indicators (such as up, down, left, right, front, and rear … …) in the embodiments of the present invention are merely used to explain the relative positional relationship, movement, etc. between the components in a particular posture (as shown in the drawings), and if the particular posture is changed, the directional indicator is changed accordingly.
Examples
The technical solution of this embodiment is mainly applied to a distributed storage system. The dynamic NVMe-oF link management system of this embodiment comprises a distributed storage cluster metadata management service, a network interface management service and an NVMe-oF link management service.
FIG. 2 shows the service architecture of this embodiment: the distributed storage cluster metadata management service is deployed on the storage cluster, and the network interface management service and the NVMe-oF link management service are deployed on the clients.
FIG. 3 shows the structural configuration of the metadata management service. The service maintains a list containing the IP addresses and ports of all storage nodes of the entire cluster. The service also exposes an API: when nodes are added to or removed from the cluster, the IP and port information of the node can be actively reported. The metadata management service can also scan periodically to track the reachability of the nodes in the whole cluster, keeping the information in the list as consistent as possible with the nodes actually available in the cluster.
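The list maintenance described above can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the names `Endpoint`, `NodeRegistry`, `report_added`, `report_removed` and `reconcile` are hypothetical, the scan result is passed in rather than measured, and port 4420 is just an example value.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    """Address of one storage node (hypothetical type for this sketch)."""
    ip: str
    port: int


@dataclass
class NodeRegistry:
    """In-memory stand-in for the list kept by the metadata management service."""
    endpoints: set = field(default_factory=set)

    def report_added(self, ep: Endpoint) -> None:
        # Active report: a node that joined the cluster is added to the list.
        self.endpoints.add(ep)

    def report_removed(self, ep: Endpoint) -> None:
        # Active report: a removed node is dropped; unknown nodes are ignored.
        self.endpoints.discard(ep)

    def reconcile(self, reachable: set) -> set:
        """Apply a periodic scan result and return the endpoints that were dropped."""
        stale = self.endpoints - reachable
        self.endpoints -= stale
        return stale


registry = NodeRegistry()
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    registry.report_added(Endpoint(ip, 4420))

# A scan finds that 10.0.0.3 no longer answers, so it is dropped from the list.
dropped = registry.reconcile({Endpoint("10.0.0.1", 4420), Endpoint("10.0.0.2", 4420)})
```

Keeping the endpoints in a set makes both the active reports and the scan-based cleanup idempotent, so a repeated report or a repeated scan does not corrupt the list.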
FIG. 4 shows the structural configuration of the client network interface management service. The service checks the currently available network interfaces by polling and maintains an internal list that stores all network interfaces available for NVMe-oF connections; other services can obtain these interfaces through a message queue. A modern commercial server generally has many network interfaces serving various functions, with different performance and functional characteristics, so different types of network cards can be selected for NVMe-oF connections according to the specific application scenario. The network interface management service scans all available network interfaces on the server according to the conditions specified by the user in the configuration file and selects the interfaces that meet the requirements.
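The screening step might look like the following sketch. The text does not say which conditions the configuration file supports, so the `InterfacePolicy` fields (`require_rdma`, `name_prefixes`) and the `Interface` attributes are assumptions made for illustration; a real service would fill the interface list from the operating system on each polling pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """One network interface as seen by a polling pass (hypothetical fields)."""
    name: str
    address: str
    is_up: bool
    supports_rdma: bool


@dataclass(frozen=True)
class InterfacePolicy:
    """User conditions from the configuration file (hypothetical fields)."""
    require_rdma: bool = False
    name_prefixes: tuple = ()


def select_interfaces(found, policy):
    """Return the sorted names of interfaces that are up and satisfy the policy."""
    selected = []
    for nic in found:
        if not nic.is_up:
            continue
        if policy.require_rdma and not nic.supports_rdma:
            continue
        if policy.name_prefixes and not nic.name.startswith(policy.name_prefixes):
            continue
        selected.append(nic.name)
    return sorted(selected)


found = [
    Interface("mgmt0", "192.168.1.10", True, False),
    Interface("rdma0", "10.0.1.10", True, True),
    Interface("rdma1", "10.0.2.10", False, True),
]
usable = select_interfaces(found, InterfacePolicy(require_rdma=True))
```

Here only `rdma0` qualifies: `mgmt0` lacks RDMA support and `rdma1` is down.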
FIG. 5 shows the structural configuration of the NVMe-oF link management service. The dynamic link management service depends on the metadata management service and the network interface management service. In a link refresh task, the link management service first queries, through a message queue, the IP addresses and ports of all storage nodes currently available in the storage cluster from the metadata management service, and then queries from the interface management service of the current host the list of all network interfaces that meet the requirements of the configuration file. It takes the Cartesian product of the two to obtain the set A of all connection links between the current host and the storage system, and compares set A with the links that already exist in the current system to obtain the set B of missing links. Finally, it calls an NVMe-oF client tool provided by the operating system to establish the missing links in set B.
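A minimal sketch of the link refresh task follows. The text does not name the client tool; this example assumes the Linux `nvme-cli` utility and only builds the `nvme connect` argument lists without running them. The function names, the representation of a link as a (host address, (node IP, node port)) pair, and the subsystem NQN are all hypothetical.

```python
from itertools import product


def missing_links(nodes, interfaces, existing):
    """Return set B = (interfaces x nodes) - existing, as a sorted list."""
    required = set(product(interfaces, nodes))   # set A: every interface to every node
    return sorted(required - set(existing))      # set B: required but not yet connected


def connect_command(link, nqn, transport="rdma"):
    """Build an nvme-cli argument list for one missing link (not executed here)."""
    host_addr, (node_ip, node_port) = link
    return [
        "nvme", "connect",
        "--transport", transport,
        "--traddr", node_ip,
        "--trsvcid", str(node_port),
        "--nqn", nqn,
        "--host-traddr", host_addr,
    ]


nodes = [("10.0.0.1", 4420), ("10.0.0.2", 4420)]
interfaces = ["10.0.1.10", "10.0.2.10"]            # host-side addresses of qualifying NICs
existing = [("10.0.1.10", ("10.0.0.1", 4420))]     # links already established

todo = missing_links(nodes, interfaces, existing)
commands = [connect_command(link, "nqn.2023-07.example:subsys1") for link in todo]
```

Each argument list could then be passed to a process launcher such as `subprocess.run`; keeping command construction separate from execution makes the set logic easy to test without touching the kernel.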
Changes to the storage cluster nodes and changes to the client host's network interfaces trigger the link refresh task. Two scenarios are distinguished: normal and fault. In the normal scenario, the metadata management service publishes cluster member node change events through a public message queue, the network interface management service publishes the node's interface change information through a message queue, and the link management service subscribes to these message queues and responds to the corresponding events. This mechanism requires both the client node and the storage node to function properly so that they report correct information. When a node itself fails, it may be unable to report actively; in that case the link management service discovers the failure through active probing and updates the links.
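The two trigger paths can be sketched with a standard-library queue standing in for the message queue. Everything here is an assumption for illustration: the function name `wait_and_refresh`, the event dictionary, and the convention that `probe()` returns True when active probing has found a failed node. The text does not specify how probing is scheduled; this sketch simply probes whenever no event arrives within the timeout.

```python
import queue


def wait_and_refresh(events, refresh, probe, timeout=1.0):
    """Handle one wake-up of the link management service.

    Returns "event" or "probe" for the path that triggered a refresh,
    or None if nothing changed.
    """
    try:
        events.get(timeout=timeout)   # normal scenario: a change event was published
    except queue.Empty:
        # Fault scenario: nothing was reported, so check reachability ourselves.
        if not probe():
            return None
        refresh()
        return "probe"
    refresh()
    return "event"


events = queue.Queue()
runs = []
events.put({"type": "node_added", "ip": "10.0.0.4"})   # hypothetical event payload

first = wait_and_refresh(events, lambda: runs.append("refresh"), lambda: True, timeout=0.01)
second = wait_and_refresh(events, lambda: runs.append("refresh"), lambda: True, timeout=0.01)
third = wait_and_refresh(events, lambda: runs.append("refresh"), lambda: False, timeout=0.01)
```

The first call consumes the published event, the second falls back to probing and finds a failure, and the third finds nothing to do, so the refresh task runs twice in total.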
The following is a specific example:
A distributed storage cluster, Cluster A, consists of three nodes, Node 1, Node 2 and Node 3. Client 1 is connected to Cluster A through the NVMe-oF protocol and uses the storage services that the cluster provides.
In the initial state, three storage links, <Client 1, Node 1>, <Client 1, Node 2> and <Client 1, Node 3>, are established between Client 1 and the cluster. IO data transmission between the client and the cluster goes through these three links, and the client can still complete IO normally when up to two of the links fail.
In the storage cluster expansion scenario, a new storage node, Node 4, is added to the storage cluster. With conventional technology, the client machine cannot detect the addition of Node 4 and therefore does not automatically establish a new network link to it. With the technical solution of this embodiment, the link management service learns through the message queue that Node 4 has joined the cluster, runs a link refresh task, and establishes a new storage link between Client 1 and Node 4, which brings the link reliability between the storage cluster and the client to the optimal level.
In the scenario where a client adds a network interface, the configuration file specifies at deployment time that network interfaces supporting RDMA are used to establish storage links. In the initial state, the client has only one such network interface, NIC 1, and Client 1 and the cluster establish three storage links: <NIC 1, Node 1>, <NIC 1, Node 2> and <NIC 1, Node 3>. When a new RDMA network card, NIC 2, is inserted into the host and properly configured, the network interface management service detects it and publishes it through the message queue. The link management service learns through the message queue that the client has a new available network interface, NIC 2. In addition to the three existing storage links through NIC 1, Client 1 and the cluster establish three new storage links: <NIC 2, Node 1>, <NIC 2, Node 2> and <NIC 2, Node 3>. Both network interfaces of the client are then used for storage links, which provides load balancing and high availability: if either network card is suddenly damaged or manually removed, the connection between the client and the storage cluster is not interrupted.
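The two scenarios in this example can be replayed as plain set arithmetic. The sketch below uses the NIC and node labels from the example; the helper name `links` and the variable names are assumptions introduced for illustration.

```python
from itertools import product


def links(nics, nodes):
    """All <NIC, Node> pairs required for the given interfaces and nodes."""
    return set(product(nics, nodes))


initial = links(["NIC 1"], ["Node 1", "Node 2", "Node 3"])

# Expansion scenario: Node 4 joins the cluster.
after_expansion = links(["NIC 1"], ["Node 1", "Node 2", "Node 3", "Node 4"])
new_for_node4 = after_expansion - initial

# New-interface scenario: NIC 2 is added to the client (three-node cluster).
after_new_nic = links(["NIC 1", "NIC 2"], ["Node 1", "Node 2", "Node 3"])
new_for_nic2 = after_new_nic - initial
```

In the expansion scenario the only missing link is <NIC 1, Node 4>; in the new-interface scenario the three missing links are exactly those through NIC 2, giving six links in total.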
In the prior art, when a client connects to all nodes of a distributed cluster, the client obtains the information of all cluster nodes at its first connection and establishes multipath connections to ensure high availability of the storage links. However, when the nodes of the cluster change, the cluster cannot send node update information to the client and no new connection is established automatically, so some links that are actually available between the client and the storage cluster remain missing, and the accessibility of the whole cluster falls short of the expected optimum. In the technical solution of this embodiment, when the nodes of the storage cluster change, the link management service is notified, computes the link set that gives the highest guaranteed accessibility, and then establishes the missing links, so that the reliability of the storage service is higher.
In addition, when a client network interface is added or replaced, for example when a network interface for NVMe-oF communication is added, more storage links could be established through the new interface, increasing reliability at the hardware level on the client side, but the client does not establish such links automatically. In the technical solution of this embodiment, when the link management service learns through the message queue that the client machine has a new available network interface, it runs the link refresh task and automatically establishes the newly available links, raising the link reliability between the client machine and the storage cluster to the optimal state. Once the links for the new network interface are established, the previous network interface can be removed without affecting IO, which makes it possible to replace a network interface without interrupting the storage service.
Preferably, the high availability of the network interface may also be achieved by network card bonding.
In the technical scheme of the invention, when the storage cluster node changes, the link management service calculates the missing storage link according to the latest node information and complements the connection, so that the link reliability of the client and the storage cluster can be improved. When the client adds an extra network interface, the link management service will calculate the newly added storage link according to the information of the latest available network interface and establish a connection, so that the link between the client and the storage cluster is prevented from being completely interrupted due to the damage of the network interface.
Based on the same inventive concept, the present invention also provides a computer apparatus comprising: a memory for storing a processing program; and a processor that implements any of the dynamic NVMe-oF link management methods described above when executing the processing program.
Based on the same inventive concept, the invention further provides a readable storage medium on which a processing program is stored; when executed by a processor, the processing program implements any of the dynamic NVMe-oF link management methods described above.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, and the foregoing program may be stored in a computer readable storage medium, where the program, when executed, performs steps including the above method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read Only Memory (ROM), a magnetic disk or an optical disk, or the like, which can store program codes.
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments. Various changes made to the present invention still fall within its scope, provided they are within the scope of the appended claims and their equivalents.

Claims (10)

1. A dynamic NVMe-oF link management system applied to a distributed storage cluster, characterized by comprising a metadata management service, a network interface management service and an NVMe-oF link management service, wherein NVMe-oF is a network-based non-volatile memory host controller interface specification; wherein the metadata management service is configured on the storage cluster, and the network interface management service and the NVMe-oF link management service are configured on a client;
the metadata management service updates IP and port information of all storage nodes of the storage cluster in real time and provides an API interface to the outside;
the network interface management service checks all network interfaces currently available for NVMe-oF connection oF the client in a polling mode and sends out the network interfaces in a message queue mode;
and the NVMe-oF link management service queries IP and port information oF all storage nodes currently available to the storage cluster from the metadata management service through a message queue, queries all network interface lists meeting the requirements oF configuration files oF the current client from the network interface management service, obtains all connection link sets A which are needed between the current client and the storage cluster by carrying out Cartesian product operation on the two, compares the connection link sets A with the current existing links to obtain a missing link set B, and invokes an NVMe-oF client tool provided by an operating system to establish the missing links belonging to the set B.
2. The dynamic NVMe-oF link management system of claim 1, wherein the metadata management service actively reports the IP and port information of the node itself when a node addition and/or deletion occurs in the storage cluster, and/or the metadata management service periodically scans the reachability status of the nodes in the entire storage cluster, so that the IP and port information list of the storage cluster stays consistent with the nodes actually available in the cluster.
3. The dynamic NVMe-oF link management system of claim 1, wherein the network interface management service checks and updates a list of currently available network interfaces by polling, the list storing all network interfaces available for NVMe-oF connections, and other services can obtain the network interfaces currently available for NVMe-oF connections through a message queue.
4. The dynamic NVMe-oF link management system of claim 1, wherein the metadata management service publishes cluster member node change events externally through a message queue, the network interface management service publishes the interface change information of its node externally through a message queue, and the NVMe-oF link management service subscribes to the message queue and responds to the corresponding events.
5. The dynamic NVMe-oF link management system of claim 1, wherein, when a node of the storage cluster fails and cannot actively report information, the NVMe-oF link management service discovers the failure by active probing and updates the links.
6. A dynamic NVMe-oF link management method, wherein NVMe-oF is a network-based non-volatile memory host controller interface specification, applied to the system of any one of claims 1 to 5, characterized by comprising the following steps:
a metadata management service in the storage cluster updates the IP and port information of all storage nodes of the storage cluster in real time and exposes an API (application programming interface) externally;
the IP and port information of all the storage nodes is acquired by the NVMe-oF link management service via a message queue; and the NVMe-oF link management service performs a Cartesian product operation on the acquired list of IP and port information of all the storage nodes and the list of network interfaces of the current client that meet the requirements of the configuration file, the latter being acquired from the network interface management service on the client, to obtain the set A of all connection links required between the current client and the storage cluster, compares set A with the currently existing links to obtain the set B of missing links, and then invokes an NVMe-oF client tool provided by the operating system to establish the missing links belonging to set B.
7. The method according to claim 6, wherein the metadata management service actively reports the IP and port information of the node itself when a node addition and/or deletion occurs in the storage cluster, and/or periodically scans the reachability status of the nodes in the entire storage cluster, so that the IP and port information list of the storage cluster stays consistent with the nodes actually available in the cluster.
8. A dynamic NVMe-oF link management method, wherein NVMe-oF is a network-based non-volatile memory host controller interface specification, applied to the system of any one of claims 1 to 5, characterized by comprising the following steps:
the network interface management service on the client checks, by polling, all network interfaces available for NVMe-oF connections and publishes them via a message queue;
the NVMe-oF link management service on the client queries, through a message queue, the metadata management service on the storage cluster for the IP and port information of all storage nodes currently available in the storage cluster, queries the network interface management service for the list of all network interfaces that meet the requirements of the current client's configuration file, performs a Cartesian product operation on the two to obtain the set A of all connection links required between the current client and the storage cluster, compares set A with the currently existing links to obtain the set B of missing links, and invokes an NVMe-oF client tool provided by the operating system to establish the missing links belonging to set B.
9. The method according to claim 8, wherein the network interface management service checks and updates a list of currently available network interfaces by polling, the list storing all network interfaces available for NVMe-oF connections, and other services can obtain the network interfaces currently available for NVMe-oF connections through a message queue.
10. The method according to claim 8, wherein, when a node of the storage cluster fails and cannot actively report information, the NVMe-oF link management service discovers the failure by active probing and updates the links.
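The event-driven behaviour recited in the claims above (interface polling published through a message queue, a link management service that subscribes and reacts, and active probing when a failed node cannot report) could be sketched roughly as follows. This is a minimal single-process illustration, assuming Python's standard `queue.Queue` stands in for the message queue and that interface listing and reachability checks are injected as callables; the class names, event names, and payload shapes are hypothetical and do not come from the patent.

```python
import queue


class InterfaceMonitor:
    """Polls for usable interfaces and publishes the list only when it changes."""

    def __init__(self, list_interfaces, events):
        self._list_interfaces = list_interfaces  # injected callable (hypothetical)
        self._events = events
        self._current = frozenset()

    def poll_once(self):
        latest = frozenset(self._list_interfaces())
        if latest == self._current:
            return False
        self._current = latest
        self._events.put(("interfaces_changed", sorted(latest)))
        return True


class LinkManager:
    """Subscribes to change events and keeps the desired link set up to date."""

    def __init__(self, events):
        self._events = events
        self.nodes = set()    # {(ip, port), ...}
        self.ifaces = set()   # {"eth0", ...}
        self.desired = set()  # {(iface, ip, port), ...}

    def _rebuild(self):
        # Desired links are every interface paired with every node endpoint.
        self.desired = {(i, ip, port) for i in self.ifaces for ip, port in self.nodes}

    def handle_pending(self):
        """Drain queued events, then rebuild the desired links once."""
        handled = 0
        while True:
            try:
                kind, payload = self._events.get_nowait()
            except queue.Empty:
                break
            if kind == "interfaces_changed":
                self.ifaces = set(payload)
            elif kind == "nodes_changed":
                self.nodes = set(payload)
            handled += 1
        if handled:
            self._rebuild()
        return handled

    def probe_nodes(self, is_reachable):
        """Active probing: drop nodes that do not answer and rebuild the links."""
        dead = {node for node in self.nodes if not is_reachable(node)}
        if dead:
            self.nodes -= dead
            self._rebuild()
        return dead
```

In a real deployment the monitor would run on a timer and the queue would be an inter-process or networked message queue; injecting `list_interfaces` and `is_reachable` simply keeps the sketch testable without touching the host's network configuration.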
CN202310955557.XA 2023-08-01 2023-08-01 Dynamic NVMe-oF link management method and system Active CN116962446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310955557.XA CN116962446B (en) 2023-08-01 2023-08-01 Dynamic NVMe-oF link management method and system

Publications (2)

Publication Number Publication Date
CN116962446A CN116962446A (en) 2023-10-27
CN116962446B CN116962446B (en) 2024-02-23

Family

ID=88446132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310955557.XA Active CN116962446B (en) 2023-08-01 2023-08-01 Dynamic NVMe-oF link management method and system

Country Status (1)

Country Link
CN (1) CN116962446B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10986174B1 (en) * 2020-09-18 2021-04-20 EMC IP Holding Company LLC Automatic discovery and configuration of server nodes
CN114827145A (en) * 2022-04-24 2022-07-29 阿里巴巴(中国)有限公司 Server cluster system, and metadata access method and device
CN114844912A (en) * 2022-04-22 2022-08-02 北京志凌海纳科技有限公司 Data link distribution method and device and distributed block storage system

Also Published As

Publication number Publication date
CN116962446A (en) 2023-10-27

Similar Documents

Publication Publication Date Title
US11886731B2 (en) Hot data migration method, apparatus, and system
CN106534328B (en) Node connection method and distributed computing system
EP2435916B1 (en) Cache data processing using cache cluster with configurable modes
CN111615066B (en) Distributed micro-service registration and calling method based on broadcast
JP6198737B2 (en) System, method and configuration for dynamic discovery of resource servers in a traffic director environment
US7225356B2 (en) System for managing operational failure occurrences in processing devices
US20200050479A1 (en) Blockchain network and task scheduling method therefor
US20130007253A1 (en) Method, system and corresponding device for load balancing
CN103581276A (en) Cluster management device and system, service client side and corresponding method
US9390156B2 (en) Distributed directory environment using clustered LDAP servers
US7836351B2 (en) System for providing an alternative communication path in a SAS cluster
CN113839862B (en) Method, system, terminal and storage medium for synchronizing ARP information between MCLAG neighbors
CN114265753A (en) Management method and management system of message queue and electronic equipment
CN108512753B (en) Method and device for transmitting messages in cluster file system
US7334038B1 (en) Broadband service control network
US9544371B1 (en) Method to discover multiple paths to disk devices cluster wide
CN116962446B (en) Dynamic NVMe-oF link management method and system
CN112491951A (en) Request processing method, server and storage medium in peer-to-peer network
CN111880932A (en) Data storage method and device based on multiple network ports
CN114928615B (en) Load balancing method, device, equipment and readable storage medium
CN112328404B (en) Load balancing method and device, electronic equipment and computer readable medium
US20230030168A1 (en) Protection of i/o paths against network partitioning and component failures in nvme-of environments
WO2021249173A1 (en) Distributed storage system, abnormality processing method therefor, and related device
US20200341968A1 (en) Differential Update of Local Cache from Central Database
CN116455963A (en) Cluster node registration method, medium, device and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 8b, building 1, No. 48, Zhichun Road, Haidian District, Beijing 100098

Patentee after: Beijing Zhiling Haina Technology Co.,Ltd.

Country or region after: China

Address before: 8b, building 1, No. 48, Zhichun Road, Haidian District, Beijing 100098

Patentee before: Beijing zhilinghaina Technology Co.,Ltd.

Country or region before: China