WO2017084618A1 - Monitoring method and monitoring node for node communication in a shared-storage cluster file system - Google Patents

Monitoring method and monitoring node for node communication in a shared-storage cluster file system

Info

Publication number
WO2017084618A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
message
storage
cluster
multicast
Prior art date
Application number
PCT/CN2016/106412
Other languages
English (en)
French (fr)
Inventor
郭旭艳
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Publication of WO2017084618A1 publication Critical patent/WO2017084618A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation

Definitions

  • the present invention relates to the field of shared storage file system cluster communication, and in particular to a monitoring method and a monitoring node for node communication of a shared storage cluster file system.
  • The inter-node communication module of a shared-storage cluster file system uses corosync. Corosync is part of a cluster management suite; a simple configuration file defines how messages are transmitted, which protocol is used, and so on. The underlying communication follows the totem protocol: node messages are sent by multicast, reliable delivery is ensured by a token passed by unicast, and the state of the nodes in the cluster is kept synchronized.
  • the Operational state is a stable working state of the cluster and has a stable ring.
  • The Gather and Commit states are the phases in which the cluster determines node membership: each node repeatedly broadcasts its own member set until the member sets of all nodes agree. For a node confirmed as having left, the cluster must also isolate the problem node. This process may take a long time, during which the cluster does not process application messages, so it increases the cluster's message-processing delay and instability.
  • In a shared-storage cluster file system, the running state of the cluster file system cannot be observed in real time, which makes it difficult to adjust that state promptly.
  • Embodiments of the invention provide a monitoring method and a monitoring node for node communication in a shared-storage cluster file system, so as to solve at least the problem that the running state of the cluster file system cannot be observed in real time and is therefore difficult to adjust promptly.
  • A monitoring method for node communication in a shared-storage cluster file system is provided, applied to a monitoring node in the shared-storage cluster file system, including: receiving a multicast message from a cluster node in the cluster file system, where the monitoring node and the cluster node are both located in the cluster file system; acquiring the message type of the multicast message and the message parameters corresponding to that message type; querying, according to the message parameters, the storage record corresponding to the multicast message in a storage table; and, when a preset time interval elapses, determining the problem node and the cluster state of the cluster file system from the storage records corresponding to the multicast messages in the storage table.
  • Acquiring the message type of the multicast message and the message parameters corresponding to the message type specifically includes: acquiring the message type of the multicast message; when the message type is the application layer message type, acquiring a first message parameter of the multicast message, where the first message parameter includes the message number of the application layer message corresponding to the multicast message, the first ring label of the first ring in which the node that multicast the message is located, and the first sender address, in the first ring, of the node that multicast the message; and, when the message type is the node join message type, acquiring a second message parameter of the multicast message, where the second message parameter includes the second ring label of the second ring in which the node that multicast the message is located, the second sender address of that node in the second ring, and the node member list recorded by that node itself.
  • Querying, according to the message parameters, the storage record corresponding to the multicast message in the storage table specifically includes: when the multicast message is of the application layer message type, determining, according to the message number and the first ring label, whether a first storage record having that message number and that first ring label exists in the application layer message table of the storage table; when the first storage record does not exist in the application layer message table, storing the first message parameter of the multicast message in the application layer message table; and, when the first storage record exists in the application layer message table, determining that the node preceding the node corresponding to the first sender address in the first ring has lost a message, marking that preceding node as a suspected problem node, and storing the parameters of the suspected problem node in the suspected problem node table of the storage table.
  • Marking the preceding node as a suspected problem node and storing its parameters in the suspected problem node table of the storage table specifically includes: acquiring, according to the first sender address, the first node address of the suspected problem node in the first ring; determining, according to the first node address, the message number, and the first ring label, whether a second storage record having that first node address, message number, and first ring label exists in the suspected problem node table; when the second storage record exists in the suspected problem node table, incrementing the message record count of the suspected problem node; and, when the second storage record does not exist in the suspected problem node table, storing a first parameter including the first node address, the message number, and the first ring label in the suspected problem node table.
  • Querying, according to the message parameters, the storage record corresponding to the multicast message in the storage table specifically includes: when the multicast message is of the node join message type, determining, according to the second ring label and the second sender address, whether a third storage record having that second ring label and that second sender address exists in the node join message table of the storage table; when the third storage record does not exist in the node join message table, storing the second message parameter of the multicast message in the node join message table; when the third storage record exists in the node join message table, determining, according to the node member list, whether the node member list of the multicast message has added or removed node members compared with the third storage record; and, when the node member list of the multicast message has a removed node member, acquiring the second node address of the removed node member and determining, according to the second node address and the second sender address, the leaving node of the
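
The member-list comparison described above can be sketched as a set difference. This is only a minimal illustration: the function name `diff_members` and the use of plain integers as node addresses are assumptions, not part of the patent text.

```python
def diff_members(stored, received):
    """Compare a stored node member list with a newly received one.

    Returns (added, removed): node addresses present only in the received
    list, and node addresses present only in the stored list.
    """
    stored_set, received_set = set(stored), set(received)
    added = sorted(received_set - stored_set)
    removed = sorted(stored_set - received_set)
    return added, removed


# Hypothetical example: node 3 has left and node 5 has joined.
added, removed = diff_members([1, 2, 3, 4], [1, 2, 4, 5])
```

Each address in `removed` would then be a candidate for the leaving node table.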
  • Determining, when the preset time interval elapses, the problem node and the cluster state of the cluster file system from the storage records in the storage table specifically includes: when the suspected problem node table contains a special suspected problem node whose message record count has reached a preset maximum value, determining whether the message record counts of the nodes other than the special suspected problem node equal the preset maximum value; when they do, determining that the cluster file system is in the cluster-service-busy state; and, when they do not, determining that the special suspected problem node is the problem node.
  • Determining, when the preset time interval elapses, the problem node and the cluster state of the cluster file system from the storage records in the storage table specifically includes: determining whether the leaving node table is empty; when the leaving node table is not empty and contains multiple storage records with the same second node address, determining that the node corresponding to that second node address is the problem node; when the leaving node table is empty, determining whether the number of storage records in the node join message table that have the smallest second sender address in the second ring reaches a preset value; and, when that number reaches the preset value, determining that the cluster file system is in the token-timeout-frequent state.
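
The two periodic checks above can be sketched as follows. This is a simplified sketch under stated assumptions: loss counts are aggregated per node, the cluster member list is passed in explicitly, and the state labels (`"normal"`, `"busy"`, `"problem"`, `"token_timeout_frequent"`) and function names are hypothetical.

```python
from collections import Counter


def analyze_suspects(counts, members, max_count):
    """counts maps node address -> number of message-loss records."""
    special = [n for n in members if counts.get(n, 0) >= max_count]
    if not special:
        return "normal", []
    others = [n for n in members if n not in special]
    if not others:
        # Every node reached the maximum: the whole cluster is busy.
        return "busy", []
    # Only some nodes reached the maximum: they are the problem nodes.
    return "problem", special


def analyze_leaves(leave_records, join_records, join_threshold):
    """leave_records: (left_node, reporter) pairs.
    join_records: (ring_id, sender) pairs from the node join message table.
    """
    if leave_records:
        seen = Counter(node for node, _ in leave_records)
        repeated = sorted(n for n, c in seen.items() if c > 1)
        return ("problem", repeated) if repeated else ("normal", [])
    if join_records:
        smallest = min(sender for _, sender in join_records)
        hits = sum(1 for _, sender in join_records if sender == smallest)
        if hits >= join_threshold:
            return "token_timeout_frequent", []
    return "normal", []
```

A non-empty leaving node table without repeated addresses is not covered by the text; the sketch reports it as `"normal"` purely as a placeholder.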
  • The monitoring method further includes: acquiring corresponding adjustment parameters according to the problem node and the cluster state; and multicasting the adjustment parameters to the cluster nodes, so that each cluster node adjusts its current configuration according to the adjustment parameters.
  • Acquiring the corresponding adjustment parameters according to the problem node and the cluster state specifically includes: when the cluster file system in which the cluster nodes are located is in the cluster-service-busy state, increasing the current message transmission window value by a first preset multiple to obtain a new message transmission window value, and reducing the current maximum transmittable message value of each cluster node by a second preset multiple to obtain a new maximum transmittable message value; and, when the cluster file system in which the cluster nodes are located is in the token-timeout-frequent state, increasing the token timeout of the cluster nodes by a third preset multiple to obtain a new token timeout.
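
As a rough illustration of the adjustment rule, the sketch below scales three configuration values depending on the detected cluster state. The class `TotemConfig`, its field names, the state labels, the default multiples, and the sample values are all hypothetical; the text only specifies the direction of each change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TotemConfig:
    window_size: int        # message transmission window value
    max_messages: int       # maximum transmittable message value per node
    token_timeout_ms: int   # token timeout


def adjust(cfg, state, k1=2, k2=2, k3=2):
    """Return a new configuration for the given cluster state.

    k1, k2, k3 stand in for the first, second and third preset multiples.
    """
    if state == "busy":
        return replace(
            cfg,
            window_size=cfg.window_size * k1,
            max_messages=max(1, cfg.max_messages // k2),
        )
    if state == "token_timeout_frequent":
        return replace(cfg, token_timeout_ms=cfg.token_timeout_ms * k3)
    return cfg


current = TotemConfig(window_size=50, max_messages=17, token_timeout_ms=1000)
busy_cfg = adjust(current, "busy")
```

The monitoring node would then multicast the new values so that each cluster node can apply them.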
  • A monitoring node in a shared-storage cluster file system is provided, including: a first receiving module, configured to receive a multicast message from a cluster node in the cluster file system, where the monitoring node and the cluster node are both located in the cluster file system; a first acquiring module, configured to acquire the message type of the multicast message and the message parameters corresponding to that message type; a query module, configured to query, according to the message parameters, the storage record corresponding to the multicast message in a storage table; and a second acquiring module, configured to determine, when a preset time interval elapses, the problem node and the cluster state of the cluster file system from the storage records corresponding to the multicast messages in the storage table.
  • The first acquiring module is specifically configured to: acquire the message type of the multicast message; acquire a first message parameter of the multicast message when the message type is the application layer message type, where the first message parameter includes the message number of the application layer message corresponding to the multicast message, the first ring label of the first ring in which the node that multicast the message is located, and the first sender address, in the first ring, of the node that multicast the message; and acquire a second message parameter of the multicast message when the message type is the node join message type, where the second message parameter includes at least the second ring label of the second ring in which the node that multicast the message is located, the second sender address of that node in the second ring, and the node member list recorded by that node itself.
  • The query module specifically includes: a first determining sub-module, configured to determine, when the multicast message is of the application layer message type and according to the message number and the first ring label, whether a first storage record having that message number and that first ring label exists in the application layer message table of the storage table; a first storage sub-module, configured to store the first message parameter of the multicast message in the application layer message table when the first storage record does not exist in the application layer message table; and a second storage sub-module, configured to determine, when the first storage record exists in the application layer message table, that the node preceding the node corresponding to the first sender address in the first ring has lost a message, to mark that preceding node as a suspected problem node, and to store the parameters of the suspected problem node in the suspected problem node table of the storage table.
  • The second storage sub-module specifically includes: an acquiring unit, configured to acquire, according to the first sender address, the first node address of the suspected problem node in the first ring; a determining unit, configured to determine, according to the first node address, the message number, and the first ring label, whether a second storage record having that first node address, message number, and first ring label exists in the suspected problem node table; a recording unit, configured to increment the message record count of the suspected problem node when the second storage record exists in the suspected problem node table; and a storage unit, configured to store a first parameter including the first node address, the message number, and the first ring label in the suspected problem node table when the second storage record does not exist in the suspected problem node table.
  • The query module specifically includes: a second determining sub-module, configured to determine, when the multicast message is of the node join message type and according to the second ring label and the second sender address, whether a third storage record having that second ring label and that second sender address exists in the node join message table of the storage table; a third storage sub-module, configured to store the second message parameter of the multicast message in the node join message table when the third storage record does not exist in the node join message table; a third determining sub-module, configured to determine, according to the node member list and when the third storage record exists in the node join message table, whether the node member list of the multicast message has added or removed node members compared with the third storage record; and a fourth determining sub-module, configured to acquire, when the node member list of the multicast message has a removed node member, the second node address of the removed node member, and to determine, according to the second node address and the second sender address, whether the leaving node table of the storage table has a
  • The second acquiring module specifically includes: a sixth determining sub-module, configured to determine, when the suspected problem node table contains a special suspected problem node whose message record count has reached a preset maximum value, whether the message record counts of the nodes other than the special suspected problem node equal the preset maximum value. When they do, the cluster file system is determined to be in the cluster-service-busy state.
  • A seventh determining sub-module is configured to determine that the special suspected problem node is the problem node when the message record counts of the nodes other than the special suspected problem node differ from the preset maximum value.
  • The second acquiring module specifically includes: an eighth determining sub-module, configured to determine whether the leaving node table is empty; a ninth determining sub-module, configured to determine, when the leaving node table is not empty and contains multiple storage records with the same second node address, that the node corresponding to that second node address is the problem node; a tenth determining sub-module, configured to determine, when the leaving node table is empty, whether the number of storage records in the node join message table that have the smallest second sender address in the second ring reaches a preset value; and an eleventh determining sub-module, configured to determine that the cluster file system is in the token-timeout-frequent state when that number reaches the preset value.
  • The monitoring node further includes: a third acquiring module, configured to acquire corresponding adjustment parameters according to the problem node and the cluster state; and a multicast module, configured to multicast the adjustment parameters to the cluster nodes, so that each cluster node adjusts its current configuration according to the adjustment parameters.
  • The third acquiring module is specifically configured to: when the cluster file system in which the cluster nodes are located is in the cluster-service-busy state, increase the current message transmission window value by the first preset multiple to obtain a new message transmission window value, and reduce the current maximum transmittable message value of the cluster nodes by the second preset multiple to obtain a new maximum transmittable message value; and, when the cluster file system in which the cluster nodes are located is in the token-timeout-frequent state, adjust the token timeout of the cluster nodes by the third preset multiple to obtain a new token timeout.
  • a storage medium is also provided.
  • the storage medium is configured to store program code for performing a monitoring method of the shared storage cluster file system node communication described above.
  • The foregoing solution monitors the running state of the cluster by collecting multicast messages, analyzes node state from the multicast message statistics of each node, and provides statistical judgments about the cluster state and problem nodes. This improves the processing capability and stability of the communication service and allows device fault notifications to be obtained promptly, so that the administrator has an intuitive view of the communication state of the whole cluster, can learn the device state in time, and can locate the faulty target, which improves work efficiency and the overall performance of the cluster file system.
  • FIG. 1 is a schematic diagram of corosync on node P1 accepting messages M1, M2 and M3 from application A1 on that node and multicasting them in the cluster;
  • FIG. 2 is a schematic diagram of the token being passed from node P1 to P2 after a node multicasts a message;
  • FIG. 3 is a schematic diagram of P2 continuing to pass the token to P3 after receiving the token acknowledgement message;
  • FIG. 4 is a schematic diagram of a node broadcasting a node join message (joinmsg) after a node is added to the cluster;
  • FIG. 5 is a schematic diagram of the other nodes of the cluster broadcasting their own member sets after receiving the joinmsg;
  • FIG. 6 is a schematic diagram of a node failing to receive the joinmsg of another node, so that consensus is not reached;
  • FIG. 7 is a schematic diagram of adding a monitoring node to a cluster;
  • Figure 8 is a flowchart of a method in the first embodiment of the present invention.
  • Figure 9 is a flow chart of a method in a second embodiment of the present invention.
  • Figure 10 is a flowchart of a method in a third embodiment of the present invention.
  • Figure 11 is a flowchart of a method in a fourth embodiment of the present invention.
  • Figure 12 is a flowchart of a method in a fifth embodiment of the present invention.
  • Figure 13 is a flowchart of a method in a sixth embodiment of the present invention.
  • Figure 14 is a second flowchart of a method according to a sixth embodiment of the present invention.
  • Figure 15 is a flowchart 1 of the method in the seventh embodiment of the present invention.
  • Figure 16 is a flowchart 2 of the method in the seventh embodiment of the present invention.
  • Figure 17 is a block diagram showing the overall structure of a ninth embodiment of the present invention.
  • Figure 18 is a block diagram showing the detailed structure of a ninth embodiment of the present invention.
  • Figure 19 is a schematic view of the overall flow of the method of the present invention.
  • Figure 20 is a second schematic diagram of the overall process of the method of the present invention.
  • the present invention provides a monitoring method for node communication of a shared storage cluster file system, which is applied to a monitoring node in a shared storage cluster file system, and the method includes:
  • Step 101 Receive a multicast message of a cluster node in the cluster file system.
  • The monitoring node can be added to the cluster multicast group of the cluster file system, so that the monitoring node and the cluster nodes are located in the same cluster file system. The monitoring node may be a host, blade server, or other server of the same kind as the cluster nodes.
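
One way a monitoring process could join the cluster multicast group is with a standard UDP socket and the `IP_ADD_MEMBERSHIP` option. The sketch below is illustrative only: the helper names are hypothetical, and the group address and port in the usage comment are placeholder values that would have to match the cluster's actual configuration.

```python
import socket


def build_membership_request(group, interface="0.0.0.0"):
    """Pack the 8-byte request for IP_ADD_MEMBERSHIP:
    4 bytes of group address followed by 4 bytes of local interface address.
    """
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_monitor_socket(group, port):
    """Open a UDP socket that receives datagrams sent to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group),
    )
    return sock


# Usage (placeholder address and port, not taken from the patent):
#   sock = open_monitor_socket("239.192.0.1", 5405)
#   data, addr = sock.recvfrom(65535)
```

The monitoring node only listens; it does not take part in token passing.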
  • Step 102 Acquire a message type of the multicast message and a message parameter corresponding to the message type.
  • the message type corresponding to the multicast message is obtained, and different message parameters corresponding to different message types are obtained according to the message type.
  • Step 103 Query, according to the message parameter, a storage record in the storage table corresponding to the multicast message.
  • The storage record of the multicast message corresponding to the message parameters is queried from the storage table. The storage table can be used to collect and count every multicast message of every node in the cluster.
  • Step 104 Acquire a problem node and a cluster state of the cluster file system according to a storage record corresponding to the multicast message in the storage table when a preset time interval arrives.
  • The storage table collects and records the multicast messages in the cluster file system over a period of time. When the preset time interval elapses, the storage records accumulated in the storage table during that period are used to determine what state the cluster file system is in and whether any problem nodes exist.
  • The monitoring method adds one node to the existing cluster; that node is configured with the cluster multicast address and can receive the cluster's multicast messages. By receiving the multicast messages of each node in the cluster file system, and querying the storage table for the storage record of each multicast message according to its type and corresponding message parameters, the monitoring node learns the problem nodes and the cluster state of the cluster file system. Using multicast communication messages to analyze node running state in this way overcomes the prior-art problem that, because the shared-storage cluster file system has a peer-to-peer architecture, monitoring of cluster processing capability and node failures is lacking.
  • Acquiring the message type of the multicast message and the message parameters corresponding to the message type in step 102 includes:
  • Step 1021 Obtain a message type of the multicast message.
  • Step 1022 When the message type is an application layer message type, obtain a first message parameter of the multicast message.
  • The first message parameter includes: the message number of the application layer message corresponding to the multicast message, the first ring label of the first ring in which the node that multicast the message is located, and the first sender address, in the first ring, of the node that multicast the message.
  • Step 1023 Acquire a second message parameter of the multicast message when the message type is a node join message type.
  • The second message parameter includes at least: the second ring label of the second ring in which the node that multicast the message is located, the second sender address of that node in the second ring, and the node member list recorded by that node itself.
  • First, the message type of the multicast message is obtained. Message types fall mainly into two kinds: the application layer message type and the node join message (joinmsg) type. The application layer message type is the type of a message sent by the application layer and multicast to the other nodes in the cluster. The node join message type is the type of a message multicast when a node joins the multicast ring. When the message type is the application layer message type, the obtained parameters include at least: the message number seq of the application layer message corresponding to the multicast message; the ring label of the ring in which the node that multicast the message is located, that is, the first ring label ring_id1 of the first ring; and the address in that ring of the node that multicast the message, that is, the first sender address sender_id1 in the first ring. When the message type is the node join message type, the obtained parameters include at least: the ring label of the ring in which the node that multicast the message is located, that is, the second ring label ring_id2 of the second ring; the address in that ring of the node that multicast the message, that is, the second sender address sender_id2 in the second ring; and the member set recorded by the node that multicast the message itself, that is, the node member list proc_list.
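
The two parameter sets can be modelled as small record types. The sketch below assumes the multicast message has already been decoded into a dictionary; the `"type"` key, its values `"app"` and `"join"`, and the class and function names are hypothetical, while `seq`, `ring_id`, `sender_id` and `proc_list` follow the names used in the text.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class AppMessage:
    seq: int          # message number of the application layer message
    ring_id: int      # ring label of the sender's ring (ring_id1)
    sender_id: int    # sender address within that ring (sender_id1)


@dataclass(frozen=True)
class JoinMessage:
    ring_id: int                 # ring label of the sender's ring (ring_id2)
    sender_id: int               # sender address within that ring (sender_id2)
    proc_list: Tuple[int, ...]   # member list recorded by the sender


def extract_params(msg: dict) -> Union[AppMessage, JoinMessage]:
    """Pick the parameter set that matches the message type."""
    if msg["type"] == "app":
        return AppMessage(msg["seq"], msg["ring_id"], msg["sender_id"])
    if msg["type"] == "join":
        return JoinMessage(msg["ring_id"], msg["sender_id"], tuple(msg["proc_list"]))
    raise ValueError(f"unknown message type: {msg['type']!r}")
```

Freezing the dataclasses makes the records hashable, which is convenient if they are used as table keys.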
  • When the multicast message is of the application layer message type, querying the storage record corresponding to the multicast message in the storage table according to the message parameters in step 103 includes:
  • Step 1031 When the multicast message is of the application layer message type, determine, according to the message number and the first ring label, whether a first storage record having that message number and that first ring label exists in the application layer message table of the storage table.
  • The storage table stores records of the multicast messages of the cluster nodes and the related parameters of those messages. The storage table includes an application layer message table, which stores the multicast messages of the application layer message type. When the multicast message is of the application layer message type, seq and ring_id1 of the multicast message are used to determine whether the application layer message table already holds a multicast record of the same application layer message in the same ring, that is, the first storage record described above.
  • Step 1032 Store the first message parameter of the multicast message to the application layer message table when the first storage record does not exist in the application layer message table.
  • If the determination in step 1031 finds that the first storage record does not exist in the application layer message table, the multicast message is stored in the application layer message table; at least the first message parameter corresponding to the multicast message is stored.
  • Step 1033 When the first storage record exists in the application layer message table, determine that the node preceding the node corresponding to the first sender address in the first ring has lost a message, mark that preceding node as a suspected problem node, and store the parameters of the suspected problem node in the suspected problem node table of the storage table.
  • The storage table further includes a suspected problem node table. If the determination in step 1031 finds that the first storage record exists in the application layer message table, the same application layer message has been multicast repeatedly on the same ring. It can therefore be concluded that a node on the ring lost the message, which triggered the repeated multicast. The node that lost the message is identified from the token's message confirmation process in the multicast ring: if the sending node is Pn, then node Pn-1 in the current member list is inferred, from the order of token passing, to be the node that lost the message and is listed as a suspected problem node. In other words, the node preceding the node corresponding to sender_id1 in the multicast message has lost a message, so that preceding node may have a problem; it is treated as a suspected problem node, and the suspected problem node and its parameters are stored in the suspected problem node table.
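
The duplicate check of steps 1031 to 1033 can be sketched with a dictionary keyed by (seq, ring_id). This is a minimal sketch, assuming an in-memory table; the function name and the choice to store the sender address as the value are assumptions.

```python
def record_app_message(app_table, seq, ring_id, sender_id):
    """Store an application layer message, or report a repeat.

    Returns True when (seq, ring_id) is already in the table, meaning the
    same message was multicast again on the same ring; otherwise stores
    the record and returns False.
    """
    key = (seq, ring_id)
    if key in app_table:
        return True
    app_table[key] = sender_id
    return False


app_table = {}
first_seen = record_app_message(app_table, seq=10, ring_id=1, sender_id=3)
repeated = record_app_message(app_table, seq=10, ring_id=1, sender_id=3)
```

A `True` result is the trigger for treating the node that precedes the sender in the ring as a suspected problem node.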
  • In step 1033, treating the previous node as a suspected problem node and storing the parameters of the suspected problem node in the suspected problem node table in the storage table specifically includes:
  • Step 10331 Acquire, according to the first sender address, the first node address of the suspected problem node in the first ring.
  • From the message confirmation process of the token in the multicast ring, it is known that the node preceding the node corresponding to sender_id1 in the multicast message has lost a message, so a problem may exist at that preceding node. The first sender address is the address, in the first ring, of the node that multicast the message, and the node addresses in the multicast ring are arranged in ascending order of sequence number.
  • Step 10332 Determine, according to the first node address, the message number, and the first ring label, whether a second storage record having the first node address, the message number, and the first ring label exists in the suspected problem node table.
  • With nodeid1 obtained in step 10331, nodeid1, seq, and ring_id1 are used to determine whether the suspected problem node table already contains a record of the same node in the same ring losing the same application layer message, that is, the second storage record.
  • Step 10333 When the second storage record exists in the suspected problem node table, increase the number of message records of the suspected problem node.
  • When the determination in step 10332 is that the second storage record already exists in the suspected problem node table, the count recorded for the suspected problem node losing the same application layer message in the same ring is increased, that is, the number of times other nodes had to repeat a multicast because this suspected problem node lost the message.
  • Step 10334 When the second storage record does not exist in the suspected problem node table, store a first parameter including the first node address, the message number, and the first ring label in the suspected problem node table.
  • When the determination in step 10332 is that no second storage record exists in the suspected problem node table, at least nodeid1, seq, and ring_id1 are stored in the suspected problem node table, recording which node in which ring lost which application layer message.
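  • The application layer branch (steps 1031 to 10334) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the in-memory dictionaries standing in for the storage tables, and the wrap-around from the first ring member to the last are all assumptions made for the example.

```python
from collections import Counter

def previous_in_ring(members, sender_id):
    """Return the node preceding sender_id in token order.

    Members are sorted ascending, as the text describes. Wrapping from the
    first member to the last is an assumption of this sketch.
    """
    ring = sorted(members)
    return ring[ring.index(sender_id) - 1]

def handle_app_message(app_table, suspect_table, members, seq, ring_id1, sender_id1):
    """Process one application layer message; return the suspected node, if any."""
    key = (seq, ring_id1)
    if key not in app_table:
        # No first storage record: remember the message (step 1032).
        app_table[key] = sender_id1
        return None
    # Repeated multicast of the same message on the same ring (step 1033).
    nodeid1 = previous_in_ring(members, sender_id1)
    # Counter covers both "store a new record" and "increase the count".
    suspect_table[(nodeid1, seq, ring_id1)] += 1
    return nodeid1

app_table, suspect_table = {}, Counter()
members = [1, 2, 3]
first = handle_app_message(app_table, suspect_table, members, 7, "ring-a", 2)
repeat = handle_app_message(app_table, suspect_table, members, 7, "ring-a", 3)
```

  • Keying the suspect table on (nodeid1, seq, ring_id1) mirrors the second storage record described above, so a repeated loss of the same message by the same node only raises its count.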
  • When the multicast message is a node join message type, querying, in step 103, the storage record corresponding to the multicast message in the storage table according to the message parameter includes:
  • Step 1034 When the multicast message is a node join message type, determine, according to the second ring label and the second sender address, whether a third storage record having the second ring label and the second sender address exists in the node join message table of the storage table.
  • The storage table further includes a node join message table.
  • When the multicast message is a node join message, ring_id2 and sender_id2 are used to determine whether the node join message table already stores a record of a node join message sent by the same node in the same ring as the multicast message, that is, the third storage record described above.
  • Step 1035 When the third storage record does not exist in the node join message table, store the second message parameter of the multicast message in the node join message table.
  • When the node join message table holds no record of a node join message multicast by the same node in the same ring, that is, no third storage record, the multicast message is stored in the node join message table; at least the second message parameter corresponding to the multicast message is stored.
  • Step 1036 When the third storage record exists in the node join message table, determine, according to the node member list, whether the node member list of the multicast message contains added or reduced node members compared with the third storage record.
  • When a third storage record exists in the node join message table, the table already stores a record of a node join message sent by the same node in the same ring as the multicast message. The proc_list in the second message parameter of the multicast message is compared with the node member list in the third storage record to determine whether the proc_list of the multicast message contains added or reduced node members, that is, to determine from the multicast message whether nodes have left or joined the current ring.
  • Step 1037 When there is a reduced node member in the node member list of the multicast message, obtain a second node address of the reduced node member, and determine, according to the second node address and the second sender address, whether a fourth storage record having the second node address and the second sender address exists in the leaving node table of the storage table.
  • The storage table further includes a leaving node table (leave table).
  • Step 1038 When the fourth storage record exists in the leaving node table, increase the number of message records of the reduced node member.
  • When the determination in step 1037 is that a fourth storage record exists in the leaving node table, the same node has already multicast the departure of this reduced node member to the other cluster nodes; the corresponding number of message records is increased, that is, the recorded number of departures of this reduced node member is increased.
  • Step 1039 When the fourth storage record does not exist in the leaving node table, store a second parameter including the second node address and the second sender address in the leaving node table.
  • When the determination in step 1037 is that no fourth storage record exists in the leaving node table, the corresponding parameters are stored; at least the second node address and the second sender address are stored in the leaving node table.
  • Step 10310 When there is an added node member in the node member list of the multicast message, obtain a third node address of the added node member, and determine, according to the third node address and the second sender address, whether a fifth storage record having the third node address and the second sender address exists in the leaving node table.
  • When the determination in step 1036 is that the node member list of the multicast message contains an added node member, that is, a node has been added to the ring, the third node address nodeid3 of the added node member in the second ring is obtained. nodeid3 and sender_id2 are then used to determine whether the leaving node table stores a record, sent by the same node, of this same node having left, that is, the fifth storage record.
  • Step 10311 Delete the fifth storage record when the fifth storage record exists in the leaving node table.
  • Steps 1034 to 10311 describe, for a multicast message of the node join message type, how the storage record corresponding to the multicast message is queried in the storage table according to the message parameter corresponding to that message type, together with the parameters used and the conditions evaluated along the way. The existing data in the node join message table and the leaving node table of the storage table is compared and evaluated, and the corresponding parameters are stored, accumulated, or deleted according to the result, achieving real-time collection and monitoring of the state information of the cluster file system.
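  • The node join branch (steps 1034 to 10311) can be sketched as follows. The function name and the dictionaries standing in for the node join message table and the leave table are hypothetical. Replacing the stored member list with the latest proc_list after each comparison is also an assumption of this sketch; the text does not say whether the third storage record is updated.

```python
def handle_join_message(join_table, leave_table, ring_id2, sender_id2, proc_list):
    """Update the join and leave tables from one node join message."""
    key = (ring_id2, sender_id2)
    current = set(proc_list)
    if key not in join_table:
        # No third storage record: store the message parameters (step 1035).
        join_table[key] = current
        return
    previous = join_table[key]
    for nodeid2 in previous - current:
        # Reduced member: create or increment its leave record (steps 1037-1039).
        leave_key = (nodeid2, sender_id2)
        leave_table[leave_key] = leave_table.get(leave_key, 0) + 1
    for nodeid3 in current - previous:
        # Added member: drop a matching leave record if present (steps 10310-10311).
        leave_table.pop((nodeid3, sender_id2), None)
    # Assumption: keep the most recent member list for the next comparison.
    join_table[key] = current

join_table, leave_table = {}, {}
handle_join_message(join_table, leave_table, "ring-b", 1, [1, 2, 3])
handle_join_message(join_table, leave_table, "ring-b", 1, [1, 3])
after_leave = dict(leave_table)
handle_join_message(join_table, leave_table, "ring-b", 1, [1, 2, 3])
after_rejoin = dict(leave_table)
```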
  • When the message type of the multicast message is an application layer message type, acquiring the problem node and the cluster state of the cluster file system according to the storage record corresponding to the multicast message in the storage table includes:
  • Step 1041 In the suspected problem node table, when the suspected problem nodes include a special suspected problem node whose number of message records reaches a preset maximum value, determine whether the preset maximum value is the same as the number of message records of the nodes other than the special suspected problem node.
  • Step 1042 When the preset maximum value is the same as the number of message records of the nodes other than the special suspected problem node, determine that the cluster file system is in a cluster service busy state.
  • Step 1043 When the preset maximum value is different from the number of message records of the nodes other than the special suspected problem node, determine that the special suspected problem node is the problem node.
  • When the message type of the multicast message is the application layer message type and the preset time interval arrives, determining whether the cluster file system contains a problem node and what the current cluster state is requires analyzing the data in the suspected problem node table. When a node's number of message records reaches the preset maximum value, that number is compared with the numbers of message records of the other suspected problem nodes recorded in the table. If they are the same, the suspected nodes in the cluster are all losing messages to the same degree, and the cluster file system is judged to be in the cluster service busy state. If they are not the same, the node with the maximum number of message records is losing messages frequently and can be judged to be a problem node.
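  • The periodic evaluation of the suspected problem node table (steps 1041 to 1043) can be sketched as follows. The function name, the result labels, and the shape of the input (a mapping from node to its number of message records) are hypothetical. The sketch compares the other nodes against the count of the node that reached the preset maximum, and it treats a lone suspected node as a problem node; the text does not cover that case, so this is an assumption.

```python
def analyse_suspects(counts, preset_max):
    """Classify the cluster from per-node message-loss record counts."""
    if not counts:
        return ("no_finding", None)
    special = max(counts, key=counts.get)
    if counts[special] < preset_max:
        # No node has reached the preset maximum value yet.
        return ("no_finding", None)
    others = [c for node, c in counts.items() if node != special]
    if others and all(c == counts[special] for c in others):
        # Every suspected node loses messages equally often: service is busy.
        return ("cluster_busy", None)
    # One node stands out: it is judged to be the problem node.
    return ("problem_node", special)

busy = analyse_suspects({2: 5, 3: 5, 4: 5}, preset_max=5)
faulty = analyse_suspects({2: 5, 3: 1}, preset_max=5)
quiet = analyse_suspects({2: 3, 3: 1}, preset_max=5)
```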
  • When the message type of the multicast message is a node join message type, on the basis of the first embodiment and the fifth embodiment, acquiring the problem node and the cluster state of the cluster file system according to the storage record corresponding to the multicast message in the storage table specifically includes:
  • Step 1044 Determine whether the leaving node table is empty.
  • Step 1045 When the leaving node table is not empty and has a plurality of storage records with the same second node address, determine that the node corresponding to that second node address is the problem node.
  • When the problem node and the cluster state are determined, the data collected and stored in the leaving node table is analyzed first. When the leaving node table is not empty and several of its storage records contain the same nodeid2, each of those records documents a departure of that node, so the node corresponding to nodeid2 has exited the multicast ring multiple times and is determined to be the problem node.
  • Step 1046 When the leaving node table is empty, determine whether, in the node join message table, the number of storage records having the smallest second sender address in the second ring reaches a preset value.
  • Step 1047 When, in the node join message table, the number of storage records having the smallest second sender address in the second ring reaches the preset value, determine that the cluster file system is in a token timeout frequent state.
  • The smallest second sender address means that, among the node addresses arranged in order in the multicast ring, the address of the node that sends the multicast message to the other nodes is the smallest in the ring.
  • The cluster state and the problem node are thus determined by accumulating counts for each node individually (vertical) and by comparing counts across nodes (horizontal), so that the status of the cluster file system is detected and judged in a timely and effective manner.
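  • The membership-based evaluation (steps 1044 to 1047) can be sketched as follows. The function name, the result labels, and the input shapes are hypothetical: leave records are given as (nodeid2, sender_id2) pairs, and the node join message table is reduced to the list of sender_id2 values of its records. The ring member list is assumed to be non-empty.

```python
from collections import Counter

def analyse_membership(leave_records, join_senders, ring_members, preset_value):
    """Evaluate the leave table first, then the join table (steps 1044-1047)."""
    if leave_records:
        # Several leave records for the same nodeid2 mean repeated departures.
        per_node = Counter(nodeid2 for nodeid2, _sender in leave_records)
        repeated = sorted(n for n, c in per_node.items() if c > 1)
        return ("problem_node", repeated) if repeated else ("no_finding", [])
    # Leave table empty: count join records sent from the smallest ring address.
    smallest = min(ring_members)
    if join_senders.count(smallest) >= preset_value:
        return ("token_timeout_frequent", [])
    return ("no_finding", [])

flapping = analyse_membership([(2, 1), (2, 3)], [], [1, 2, 3], preset_value=3)
timeouts = analyse_membership([], [1, 1, 1], [1, 2, 3], preset_value=3)
stable = analyse_membership([], [1, 2], [1, 2, 3], preset_value=3)
```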
  • the monitoring method further includes:
  • Step 105 Acquire corresponding adjustment parameters according to the problem node and the cluster state.
  • Step 106 Multicast the adjustment parameter to the cluster nodes, so that each cluster node adjusts its current configuration according to the adjustment parameter.
  • A corresponding response policy is provided according to the problem node and the cluster state: a targeted adjustment parameter is obtained and multicast to the cluster nodes, so that the other nodes can adjust their configuration in time and resolve the system problem.
  • Acquiring the corresponding adjustment parameters according to the problem node and the cluster state in step 105 specifically includes:
  • Step 1051 When the cluster file system in which the cluster nodes are located is in the cluster service busy state, increase the current message transmission window value to a new message transmission window value by a first preset magnification, and reduce the current maximum transmittable information value of each cluster node to a new maximum transmittable information value by a second preset magnification.
  • The first preset magnification is preferably 1.2, and the second preset magnification is preferably 0.9.
  • When the cluster service is busy, every node is losing multicast messages. The message transmission window value window_size is increased by a factor of 1.2, and the maximum transmittable information value max_messages used when each cluster node multicasts messages is reduced by a factor of 0.9, so as to relieve the busy state of the cluster and reduce message loss.
  • Step 1052 When the cluster file system in which the cluster nodes are located is in the token timeout frequent state, increase the token timeout of the cluster nodes to a new token timeout by a third preset magnification.
  • The third preset magnification is preferably 1.2.
  • The token timeout of each node in the cluster is increased by a factor of 1.2 to reduce the message retransmissions caused by token timeouts.
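  • Steps 1051 and 1052 amount to simple scaling of the current configuration. The sketch below is illustrative only: the function name and the state labels are hypothetical, the key name token for the token timeout is an assumption (window_size and max_messages are the names used in the text), and rounding to whole numbers with a floor of 1 for max_messages is a choice of this sketch.

```python
def adjust_parameters(state, config, window_factor=1.2, messages_factor=0.9,
                      token_factor=1.2):
    """Return a new configuration scaled by the preferred magnifications."""
    new_config = dict(config)
    if state == "cluster_busy":
        # Step 1051: scale window_size by the first magnification and
        # max_messages by the second magnification.
        new_config["window_size"] = round(config["window_size"] * window_factor)
        new_config["max_messages"] = max(1, round(config["max_messages"] * messages_factor))
    elif state == "token_timeout_frequent":
        # Step 1052: scale the token timeout by the third magnification.
        new_config["token"] = round(config["token"] * token_factor)
    return new_config

current = {"window_size": 50, "max_messages": 17, "token": 1000}
busy_config = adjust_parameters("cluster_busy", current)
timeout_config = adjust_parameters("token_timeout_frequent", current)
```

  • The function returns a copy rather than modifying the input, so the monitoring node can still report the previous values alongside the adjustment it multicasts.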
  • Each result, including the problem node, the token timeout state, the cluster service busy state, and the adjustment parameters corresponding to each result, may be output.
  • corosync, the core cluster communication component of the cluster management suite, is deployed on each node of the cluster. Once the corosync parameters are configured on each cluster node, a node can multicast its application messages to the cluster and receive the multicast messages of other nodes, synchronizing the state of the entire cluster. Token passing provides reliable message transmission and detection of cluster membership changes, which maintains the stability of the entire cluster.
  • Handling of the adjustment parameter message may be added to the corosync communication layer of the nodes other than the monitoring node. After receiving the message, the corosync in the cluster management suite on each of those nodes parses the parameter adjustment message and modifies its totem configuration accordingly.
  • Step 1 The monitoring node joins the multicast group and prepares to receive multicast messages.
  • Step 2 Receive the multicast message, parse it to obtain at least the message type and the parameter data corresponding to the message type, and save the data.
  • Step 3 According to the query result for the parameter data, store the corresponding data and determine the state of the node.
  • Step 4 Perform statistical analysis at regular intervals: query the suspected problem node table and the leaving node table, determine possible problem nodes and the cluster state, derive adjustment parameters from the current configuration, and output the statistical results and troubleshooting suggestions.
  • Step 5 Broadcast the adjustment parameters in the cluster.
  • Step 6 The cluster node receives a message about adjusting parameters and updates the configuration according to the adjustment parameters.
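  • Step 1 of this workflow, joining the multicast group as a passive listener, can be sketched with the standard socket API. The helper names are hypothetical, and the group address and port would come from the cluster's own configuration. Parsing the received datagrams into message types and parameters is not shown, since the wire format depends on the cluster communication component.

```python
import socket
import struct

def build_membership_request(group, interface="0.0.0.0"):
    """Pack the ip_mreq structure: multicast group address, then local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def open_monitor_socket(group, port):
    """Create a UDP socket that receives datagrams sent to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group))
    return sock

# Example group address only; a real deployment uses the cluster's configured one.
request = build_membership_request("239.192.0.1")
```

  • A monitoring loop would then call recvfrom on the returned socket and hand each datagram to the parsing and storage steps described above.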
  • The method disclosed by the present invention makes the state of the cluster and the processing capability of each cluster node directly observable, dynamically adjusts configuration parameters according to the cluster service requirements, and improves the reliability and stability of the cluster. It also provides a means of observing cluster communication that is convenient for developers to analyze.
  • The storage table specifically includes the following parts:
  • The application layer message table, which can contain the columns sender_id1, seq, srpaddr, ring_id1, and timestamp.
  • The suspected problem node table, which can contain the columns nodeid1, seq, ring_id1, and timestamp.
  • The node join message table, which can contain the columns sender_id2, ring_id2, proc_list, and timestamp.
  • The leave table, which can contain the columns sender_id2, timestamp, and nodeid2, where nodeid2 is a node that is no longer present in the proc_list.
  • The srpaddr column records the physical address of the node that sent the multicast message. The timestamp column records the time associated with the multicast message, so that when a preset time interval arrives the problem node and the cluster state of the cluster file system can be obtained from the storage records in the storage table; it also allows the data in the storage table to be cleared periodically when a preset time period elapses.
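  • One possible layout for the four tables is sketched below using an in-memory SQLite database. The column names follow the text; the table names, column types, and the purge helper are assumptions of this sketch, and the text does not prescribe any particular storage engine.

```python
import sqlite3

SCHEMA = """
CREATE TABLE app_message  (sender_id1 INTEGER, seq INTEGER, srpaddr TEXT,
                           ring_id1 TEXT, timestamp REAL);
CREATE TABLE suspect_node (nodeid1 INTEGER, seq INTEGER, ring_id1 TEXT,
                           timestamp REAL);
CREATE TABLE join_message (sender_id2 INTEGER, ring_id2 TEXT, proc_list TEXT,
                           timestamp REAL);
CREATE TABLE leave_node   (nodeid2 INTEGER, sender_id2 INTEGER, timestamp REAL);
"""
TABLES = ("app_message", "suspect_node", "join_message", "leave_node")

def create_storage():
    """Create the storage table set in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def purge_older_than(conn, cutoff):
    """Periodic clearing: delete rows whose timestamp is before the cutoff."""
    for table in TABLES:  # table names come from the fixed tuple above
        conn.execute(f"DELETE FROM {table} WHERE timestamp < ?", (cutoff,))
    conn.commit()

conn = create_storage()
conn.execute("INSERT INTO leave_node VALUES (2, 1, 10.0)")
conn.execute("INSERT INTO leave_node VALUES (3, 1, 99.0)")
purge_older_than(conn, 50.0)
remaining = conn.execute("SELECT nodeid2 FROM leave_node").fetchall()
```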
  • The following describes, as a whole, the process of receiving the multicast message of a cluster node in the cluster file system, obtaining the message type of the multicast message and the message parameter corresponding to the message type, and querying the storage table for the storage record corresponding to the multicast message according to the message parameter.
  • After the multicast message of a cluster node in the cluster file system is received, it is determined whether the message type of the multicast message is the application layer message type. If so, seq, sender_id1, and ring_id1 are parsed from the multicast message, and seq and ring_id1 are used as query conditions to look for a related record in the application layer message table of the storage table. If no related record exists, the message is stored in the application layer message table. If a related record exists, then, according to the sender address of the message, the node Pn-1 preceding the node Pn that corresponds to the sender address in the ring is determined to be a suspected node, and the suspected problem node table is then updated.
  • When the multicast message is not the application layer message type, it is determined whether the multicast message is the joinmsg message type. If so, the sender_id2, ring_id2, and proc_list parameters of the multicast message are obtained, and sender_id2 and ring_id2 are used as conditions to query whether the node join message table in the storage table stores a matching record. If it does, the proc_list of the multicast message is compared with the proc_list in the matching record. When the proc_list of the multicast message contains a reduced node, the node address nodeid of the reduced node is obtained, and sender_id2, nodeid, and ring_id2 are used to query whether a related record exists in the leave table. If a related record exists, the number of records for the node corresponding to nodeid is accumulated, that is, the number of departures of that node is accumulated. If no related record exists, a record of the departure of the reduced node corresponding to nodeid is added to the leave table. When the multicast message is neither the application layer message type nor the joinmsg message type, the multicast message is discarded.
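  • The type check at the start of this flow can be sketched as a small dispatcher. The dictionary representation of a parsed message, the "app" type label, and the callback parameters are hypothetical; only the joinmsg name and the parameter names come from the text.

```python
def dispatch(message, on_app, on_join):
    """Route a parsed multicast message by type; discard anything else."""
    kind = message.get("type")
    if kind == "app":
        return on_app(message["seq"], message["ring_id1"], message["sender_id1"])
    if kind == "joinmsg":
        return on_join(message["ring_id2"], message["sender_id2"], message["proc_list"])
    # Neither an application layer message nor a joinmsg: the message is discarded.
    return None

def on_app(seq, ring_id1, sender_id1):
    return ("app", seq, ring_id1, sender_id1)

def on_join(ring_id2, sender_id2, proc_list):
    return ("join", ring_id2, sender_id2, tuple(proc_list))

app_result = dispatch({"type": "app", "seq": 7, "ring_id1": "ring-a", "sender_id1": 3},
                      on_app, on_join)
join_result = dispatch({"type": "joinmsg", "ring_id2": "ring-b", "sender_id2": 1,
                        "proc_list": [1, 2, 3]}, on_app, on_join)
other_result = dispatch({"type": "token"}, on_app, on_join)
```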
  • The following describes, as a whole, the process of querying the storage record corresponding to the multicast message in the storage table according to the message parameter corresponding to the message type of the multicast message and, when a preset time interval arrives, acquiring the problem node and the cluster state of the cluster file system according to that storage record.
  • When the preset time interval arrives, the suspected problem node table is checked and the number of records of each suspected problem node, that is, the number of occurrences of each suspected problem node, is counted. It is then determined whether the number of message records of any suspected problem node has reached the preset maximum value. If so, that node's number of message records is compared with the numbers of message records of the other suspected problem nodes; when the other suspected problem nodes have the same number of message records as the node that reached the preset maximum value, the cluster service is judged to be busy.
  • The invention monitors the status of the cluster and its nodes by receiving and analyzing the multicast messages of the cluster nodes. For the cluster busy and token timeout situations, it multicasts adjustment parameters so that the parameters of the nodes match the processing capability of the cluster. A monitoring node is added to the multicast group; for parameter adjustment, handling of a parameter adjustment message needs to be added to the cluster node communication module, but the existing cluster size is not affected.
  • The monitoring node does not participate in specific services and monitors the cluster by collecting multicast messages. The node status is analyzed from the multicast message statistics of each node, statistical judgments are made about the cluster state and the problem node, and the cluster parameters are adjusted in time to improve the processing capability and stability of the communication service. The communication status of the entire cluster is observed, and device fault notifications can be obtained promptly, so that management personnel can understand the equipment status in time, locate the fault, and improve work efficiency.
  • The present invention also discloses a monitoring node in a shared storage cluster file system. The monitoring node includes: a first receiving module 2100, a first obtaining module 2200, a query module 2300, and a second obtaining module 2400.
  • The first receiving module 2100 is configured to receive the multicast message of a cluster node in the cluster file system, where the monitoring node and the cluster node are both located in the cluster file system; the first obtaining module 2200 is configured to obtain the message type of the multicast message and the message parameter corresponding to the message type; the query module 2300 is configured to query, according to the message parameter, the storage record corresponding to the multicast message in the storage table; and the second obtaining module 2400 is configured to, when a preset time interval arrives, obtain the problem node and the cluster state of the cluster file system according to the storage record corresponding to the multicast message in the storage table.
  • the above monitoring node is a node added in an existing cluster, and the node is configured with a cluster multicast address and can receive cluster broadcast messages.
  • The first obtaining module 2200 is specifically configured to: obtain the message type of the multicast message; when the message type is an application layer message type, obtain a first message parameter of the multicast message, where the first message parameter includes at least the message number of the application layer message corresponding to the multicast message, the first ring label of the first ring in which the node that multicast the message is located, and the first sender address, in the first ring, of the node that multicast the message; and when the message type is a node join message type, obtain a second message parameter of the multicast message, where the second message parameter includes at least the second ring label of the second ring in which the node that multicast the message is located, the second sender address of the node that multicast the message, and the node member list recorded by the node that multicast the message.
  • The first obtaining module 2200 obtains the message type of the multicast message. The message types here fall mainly into two kinds: the application layer message type and the node join message type.
  • When the message type is the application layer message type, the obtained parameters include at least: the message number seq of the application layer message corresponding to the multicast message; the ring number of the ring in which the node that multicast the message is located, that is, the first ring label ring_id1 of the first ring; and the address in that ring of the node that multicast the message, that is, the first sender address sender_id1 in the first ring.
  • When the message type is the node join message type, the obtained parameters include at least: the ring number of the ring in which the node that multicast the message is located, that is, the second ring label ring_id2 of the second ring; the address in that ring of the node that multicast the message, that is, the second sender address sender_id2 in the second ring; and the set of node members recorded by the node that multicast the message, that is, the node member list proc_list.
  • the query module 2300 specifically includes: a first determining submodule 2310, a first storing submodule 2320, and a second storing submodule 2330.
  • The first determining sub-module 2310 is configured to, when the multicast message is an application layer message type, determine, according to the message number and the first ring label, whether a first storage record having the message number and the first ring label exists in the application layer message table of the storage table; the first storage sub-module 2320 is configured to, when the first storage record does not exist in the application layer message table, store the first message parameter of the multicast message in the application layer message table; and the second storage sub-module 2330 is configured to, when the first storage record exists in the application layer message table, determine that the node preceding the node corresponding to the first sender address in the first ring has lost a message, treat that preceding node as a suspected problem node, and store the parameters of the suspected problem node in the suspected problem node table in the storage table.
  • Using the parameters of the obtained multicast message, these sub-modules compare and evaluate the existing data in the application layer message table and the suspected problem node table of the storage table, and store, accumulate, or delete the corresponding parameters according to the result, achieving real-time collection and monitoring of the state information of the cluster file system.
  • The second storage sub-module 2330 specifically includes: an obtaining unit 2331, a determining unit 2332, a recording unit 2333, and a storage unit 2334.
  • The obtaining unit 2331 is configured to acquire, according to the first sender address, the first node address of the suspected problem node in the first ring; the determining unit 2332 is configured to determine, according to the first node address, the message number, and the first ring label, whether a second storage record having the first node address, the message number, and the first ring label exists in the suspected problem node table; the recording unit 2333 is configured to, when the second storage record exists in the suspected problem node table, increase the number of message records of the suspected problem node; and the storage unit 2334 is configured to, when the second storage record does not exist in the suspected problem node table, store a first parameter including the first node address, the message number, and the first ring label in the suspected problem node table.
  • The query module 2300 specifically includes: a second determining sub-module 2340, a third storage sub-module 2350, a third determining sub-module 2360, a fourth determining sub-module 2370, a recording sub-module 2380, a fourth storage sub-module 2390, a fifth determining sub-module 23100, and a deleting sub-module 23110.
  • the second determining sub-module 2340 is configured to: when the multicast message is a node join message type, determine, according to the second ring tag and the second sender address, whether the node in the storage table joins the message table a third storage record having the second ring label and the second sender address; the third storage sub-module 2350 is configured to store the multicast message when the third storage record does not exist in the node join message table The second message parameter is added to the node to join the message table; the third determining sub-module 2360 is configured to, when the node joins the message table, the third storage record exists, according to the node member list, determine that compared with the third storage record Whether there is an increased or decreased node member in the node member list of the multicast message; and the fourth determining sub-module 2370 is configured to acquire the reduced node when there is a reduced node member in the node member list of the multicast message a second node address of the member, determining, according to the second node address and the
  • When the multicast message is a node join message type, the second determining sub-module 2340 through the deleting sub-module 23110 of the query module 2300 use the parameters of the obtained multicast message to compare and evaluate the existing data in the node join message table and the leaving node table of the storage table, and store, accumulate, or delete the corresponding parameters according to the result, achieving real-time collection and monitoring of the state information of the cluster file system.
  • The second obtaining module 2400 specifically includes: a sixth determining sub-module 2410 and a seventh determining sub-module 2420.
  • The sixth determining sub-module 2410 is configured to, when the suspected problem nodes in the suspected problem node table include a special suspected problem node whose number of message records reaches a preset maximum value, determine whether the preset maximum value is the same as the number of message records of the nodes other than the special suspected problem node, and, when it is the same, determine that the cluster file system is in the cluster service busy state; the seventh determining sub-module 2420 is configured to determine that the special suspected problem node is the problem node when the preset maximum value is different from the number of message records of the nodes other than the special suspected problem node.
  • when the multicast message is of the node join message type, the second obtaining module 2400 specifically includes: an eighth determining sub-module 2430, a ninth determining sub-module 2440, a tenth determining sub-module 2450, and an eleventh determining sub-module 2460.
  • the eighth determining sub-module 2430 is configured to determine whether the leaving node table is empty; the ninth determining sub-module 2440 is configured to, when the leaving node table is not empty and contains multiple storage records having the same second node address, determine that the node corresponding to that second node address is the problem node; the tenth determining sub-module 2450 is configured to, when the leaving node table is empty, determine whether the number of storage records in the node join message table that have the smallest second sender address in the second ring has reached a preset value;
  • the eleventh determining sub-module 2460 is configured to determine that the cluster file system is in the frequent-token-timeout state when the number of storage records in the node join message table that have the smallest second sender address in the second ring reaches the preset value.
  • While performing the different queries and judgments in the storage table according to the type of the multicast message, the determining modules and determining sub-modules described above determine the cluster state and the problem node through the cumulative count of each node over time (the vertical view) and the comparison between nodes (the horizontal view), and so detect and judge the state of the cluster file system in a timely and effective manner.
  • monitoring node further includes:
  • the third obtaining module 2500 is configured to acquire corresponding adjustment parameters according to the problem node and the cluster state.
  • the multicast module 2600 is configured to multicast the adjustment parameter to the cluster node, so that the cluster node adjusts the current configuration according to the adjustment parameter.
  • the third obtaining module 2500 is specifically configured to:
  • when the cluster file system in which the cluster node is located is in the cluster-service-busy state, increase the current message transmission window value by a first preset ratio to obtain a new message transmission window value, and reduce the current maximum transmittable information value of each cluster node by a second preset ratio to obtain a new maximum transmittable information value; and, when the cluster file system in which the cluster node is located is in the frequent-token-timeout state, increase the token timeout of the cluster node by a third preset ratio to obtain a new token timeout.
  • the monitoring node that the present invention adds to the cluster file system overcomes the problems and defects of the prior art in the peer-to-peer architecture of shared-storage cluster file systems, namely the lack of statistical analysis and summary of cluster processing capacity and node failures and the inability to adjust cluster parameters dynamically, and achieves timely detection, discovery and resolution of cluster file system problems.
  • from the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation.
  • on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
  • optionally, in this embodiment, the foregoing storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
  • obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices; they can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they may be implemented in program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from the one given here, or the modules or steps may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. The invention is therefore not limited to any specific combination of hardware and software.
  • as described above, the monitoring method and monitoring node for node communication in a shared-storage cluster file system provided by the embodiments of the present invention have the following beneficial effects: the running state of the cluster is monitored by collecting multicast messages; the state of each node is analyzed statistically from its multicast messages, and a statistical judgment of the cluster state and the problem nodes is given; the processing capacity and stability of the communication service are improved; and device failure notifications are obtained at the earliest moment, so that administrators have a direct view of the communication state of the whole cluster and can learn the condition of the devices promptly, locate the faulty target and work more efficiently, which in turn improves the overall performance of the cluster file system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

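Embodiments of the present invention provide a monitoring method and a monitoring node for node communication in a shared-storage cluster file system, and relate to the field of cluster communication in shared-storage file systems. The monitoring method includes: receiving a multicast message from a cluster node in the cluster file system, the monitoring node and the cluster node both being located in the cluster file system; obtaining the message type of the multicast message and the message parameters corresponding to that message type; querying, according to the message parameters, the storage record corresponding to the multicast message in a storage table; and, when a preset time interval elapses, obtaining the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table. The solution gives administrators a direct view of the communication state of the whole cluster, so that they can adjust cluster communication parameters in time and improve the overall performance of the cluster file system.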

Description

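Monitoring Method and Monitoring Node for Node Communication in a Shared-Storage Cluster File System
Technical Field
The present invention relates to the field of cluster communication in shared-storage file systems, and in particular to a monitoring method and a monitoring node for node communication in a shared-storage cluster file system.
Background
As shown in FIG. 1 to FIG. 6, the inter-node communication module of a shared-storage cluster file system uses corosync (Corosync is part of a cluster management suite; a simple configuration file defines how, and over which protocol, it passes messages). Its underlying communication is based on the totem protocol: node information is delivered by multicast, and a unicast token ensures that messages are received reliably, so that changes of the nodes in the cluster are synchronized.
When a multicast message is lost, the loss is detected through token rotation and the message is retransmitted until the node that missed it receives it or the number of token rotations reaches its maximum. Safe Order requires that a broadcast message be received by every node before it is forwarded to the application for processing, so a failure to receive a message that requires Safe Order triggers repeated retransmission and delays message processing. Moreover, the Operational state is the stable working state of the cluster, with a stable ring, whereas the Gather and Commit states are the process in which the cluster determines node status: each node repeatedly broadcasts its own member set until all nodes agree on the membership. For a node confirmed as having left, the cluster must also isolate that problem node. This process may take a long time, and the cluster does not process application messages while it runs. Such states delay message processing and make the cluster less stable. As a result, the running state of the cluster file system cannot be known in real time, and it is difficult to regulate that state promptly.
As the cluster in a shared-storage cluster file system grows, latency increases. The parameters of the cluster nodes cannot be adjusted dynamically to match the traffic, so it is difficult for the system to adapt to actual conditions and exploit its strengths.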
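Summary
Embodiments of the present invention provide a monitoring method and a monitoring node for node communication in a shared-storage cluster file system, so as to at least solve the problem that the running state of a shared-storage cluster file system cannot be known in real time and is difficult to regulate promptly.
According to one embodiment of the present invention, a monitoring method for node communication in a shared-storage cluster file system is provided, applied to a monitoring node in the shared-storage cluster file system, and including: receiving a multicast message from a cluster node in the cluster file system, the monitoring node and the cluster node both being located in the cluster file system; obtaining the message type of the multicast message and the message parameters corresponding to the message type; querying, according to the message parameters, the storage record corresponding to the multicast message in a storage table; and, when a preset time interval elapses, obtaining the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table.
Optionally, obtaining the message type of the multicast message and the message parameters corresponding to the message type specifically includes: obtaining the message type of the multicast message; when the message type is the application-layer message type, obtaining first message parameters of the multicast message, the first message parameters including at least: the message number of the application-layer message corresponding to the multicast message, the first ring label of the first ring in which the node multicasting the multicast message is located, and the first sender address, in the first ring, of the node multicasting the multicast message; and, when the message type is the node join message type, obtaining second message parameters of the multicast message, the second message parameters including at least: the second ring label of the second ring in which the node multicasting the multicast message is located, the second sender address, in the second ring, of that node, and the node member list recorded by that node itself.
Optionally, querying, according to the message parameters, the storage record corresponding to the multicast message in the storage table specifically includes: when the multicast message is of the application-layer message type, determining, according to the message number and the first ring label, whether the application-layer message table of the storage table contains a first storage record having the message number and the first ring label; when the first storage record does not exist in the application-layer message table, storing the first message parameters of the multicast message in the application-layer message table; and, when the first storage record exists in the application-layer message table, concluding that the node preceding, in the first ring, the node corresponding to the first sender address has lost a message, taking that preceding node as a suspect problem node, and storing the parameters of the suspect problem node in the suspect problem node table of the storage table.
Optionally, taking the preceding node as a suspect problem node and storing its parameters in the suspect problem node table of the storage table specifically includes: obtaining, according to the first sender address, the first node address of the suspect problem node in the first ring; determining, according to the first node address, the message number and the first ring label, whether the suspect problem node table contains a second storage record having the first node address, the message number and the first ring label; when the second storage record exists in the suspect problem node table, increasing the message record count of the suspect problem node; and, when the second storage record does not exist in the suspect problem node table, storing a first parameter, including the first node address, the message number and the first ring label, in the suspect problem node table.
Optionally, querying, according to the message parameters, the storage record corresponding to the multicast message in the storage table specifically includes: when the multicast message is of the node join message type, determining, according to the second ring label and the second sender address, whether the node join message table in the storage table contains a third storage record having the second ring label and the second sender address; when the third storage record does not exist in the node join message table, storing the second message parameters of the multicast message in the node join message table; when the third storage record exists in the node join message table, determining, according to the node member list, whether the node member list of the multicast message contains added or removed node members compared with the third storage record; when a node member has been removed from the node member list of the multicast message, obtaining the second node address of the removed node member and determining, according to the second node address and the second sender address, whether the leaving node table of the storage table contains a fourth storage record having the second node address and the second sender address; when the fourth storage record exists in the leaving node table, increasing the message record count of the removed node member; when the fourth storage record does not exist in the leaving node table, storing a second parameter, including the second node address and the second sender address, in the leaving node table; when a node member has been added to the node member list of the multicast message, obtaining the third node address of the added node member and determining, according to the third node address and the second sender address, whether the leaving node table contains a fifth storage record having the third node address and the second sender address; and, when the fifth storage record exists in the leaving node table, deleting the fifth storage record.
Optionally, obtaining, when a preset time interval elapses, the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table specifically includes: when the suspect problem node table contains a special suspected problem node whose message record count has reached a preset maximum value, determining whether the preset maximum value is the same as the message record counts of the nodes other than the special suspected problem node; when they are the same, determining that the cluster file system is in the cluster-service-busy state; and, when they differ, determining that the special suspected problem node is the problem node.
Optionally, obtaining, when a preset time interval elapses, the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table specifically includes: determining whether the leaving node table is empty; when the leaving node table is not empty and contains multiple storage records having the same second node address, determining that the node corresponding to that second node address is the problem node; when the leaving node table is empty, determining whether the number of storage records in the node join message table that have the smallest second sender address in the second ring has reached a preset value; and, when that number reaches the preset value, determining that the cluster file system is in the frequent-token-timeout state.
Optionally, the monitoring method further includes: obtaining corresponding adjustment parameters according to the problem node and the cluster state; and multicasting the adjustment parameters to the cluster nodes, so that the cluster nodes adjust their current configuration according to the adjustment parameters.
Optionally, obtaining corresponding adjustment parameters according to the problem node and the cluster state specifically includes: when the cluster file system in which the cluster nodes are located is in the cluster-service-busy state, increasing the current message transmission window value by a first preset ratio to obtain a new message transmission window value, and reducing the current maximum transmittable information value of each cluster node by a second preset ratio to obtain a new maximum transmittable information value; and, when the cluster file system in which the cluster nodes are located is in the frequent-token-timeout state, increasing the token timeout of the cluster nodes by a third preset ratio to obtain a new token timeout.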
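According to another embodiment of the present invention, a monitoring node in a shared-storage cluster file system is provided, including:
a first receiving module, configured to receive a multicast message from a cluster node in the cluster file system, the monitoring node and the cluster node both being located in the cluster file system; a first obtaining module, configured to obtain the message type of the multicast message and the message parameters corresponding to the message type; a query module, configured to query, according to the message parameters, the storage record corresponding to the multicast message in a storage table; and a second obtaining module, configured to obtain, when a preset time interval elapses, the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table.
Optionally, the first obtaining module is specifically configured to: obtain the message type of the multicast message; when the message type is the application-layer message type, obtain first message parameters of the multicast message, the first message parameters including at least: the message number of the application-layer message corresponding to the multicast message, the first ring label of the first ring in which the node multicasting the multicast message is located, and the first sender address, in the first ring, of the node multicasting the multicast message; and, when the message type is the node join message type, obtain second message parameters of the multicast message, the second message parameters including at least: the second ring label of the second ring in which the node multicasting the multicast message is located, the second sender address, in the second ring, of that node, and the node member list recorded by that node itself.
Optionally, the query module specifically includes:
a first determining sub-module, configured to determine, when the multicast message is of the application-layer message type and according to the message number and the first ring label, whether the application-layer message table of the storage table contains a first storage record having the message number and the first ring label; a first storage sub-module, configured to store the first message parameters of the multicast message in the application-layer message table when the first storage record does not exist there; and a second storage sub-module, configured to, when the first storage record exists in the application-layer message table, conclude that the node preceding, in the first ring, the node corresponding to the first sender address has lost a message, take that preceding node as a suspect problem node, and store the parameters of the suspect problem node in the suspect problem node table of the storage table.
Optionally, the second storage sub-module specifically includes: an obtaining unit, configured to obtain, according to the first sender address, the first node address of the suspect problem node in the first ring; a determining unit, configured to determine, according to the first node address, the message number and the first ring label, whether the suspect problem node table contains a second storage record having the first node address, the message number and the first ring label; a recording unit, configured to increase the message record count of the suspect problem node when the second storage record exists in the suspect problem node table; and a storage unit, configured to store a first parameter, including the first node address, the message number and the first ring label, in the suspect problem node table when the second storage record does not exist there.
Optionally, the query module specifically includes:
a second determining sub-module, configured to determine, when the multicast message is of the node join message type and according to the second ring label and the second sender address, whether the node join message table in the storage table contains a third storage record having the second ring label and the second sender address; a third storage sub-module, configured to store the second message parameters of the multicast message in the node join message table when the third storage record does not exist there; a third determining sub-module, configured to determine, when the third storage record exists in the node join message table and according to the node member list, whether the node member list of the multicast message contains added or removed node members compared with the third storage record; a fourth determining sub-module, configured to, when a node member has been removed from the node member list of the multicast message, obtain the second node address of the removed node member and determine, according to the second node address and the second sender address, whether the leaving node table of the storage table contains a fourth storage record having the second node address and the second sender address; a recording sub-module, configured to increase the message record count of the removed node member when the fourth storage record exists in the leaving node table; a fourth storage sub-module, configured to store a second parameter, including the second node address and the second sender address, in the leaving node table when the fourth storage record does not exist there; a fifth determining sub-module, configured to, when a node member has been added to the node member list of the multicast message, obtain the third node address of the added node member and determine, according to the third node address and the second sender address, whether the leaving node table contains a fifth storage record having the third node address and the second sender address; and a deleting sub-module, configured to delete the fifth storage record when it exists in the leaving node table.
Optionally, the second obtaining module specifically includes:
a sixth determining sub-module, configured to, when the suspect problem node table contains a special suspected problem node whose message record count has reached a preset maximum value, determine whether the preset maximum value is the same as the message record counts of the nodes other than the special suspected problem node, and, when they are the same, determine that the cluster file system is in the cluster-service-busy state; and a seventh determining sub-module, configured to determine that the special suspected problem node is the problem node when the preset maximum value differs from the message record counts of the nodes other than the special suspected problem node.
Optionally, the second obtaining module specifically includes:
an eighth determining sub-module, configured to determine whether the leaving node table is empty; a ninth determining sub-module, configured to, when the leaving node table is not empty and contains multiple storage records having the same second node address, determine that the node corresponding to that second node address is the problem node; a tenth determining sub-module, configured to, when the leaving node table is empty, determine whether the number of storage records in the node join message table that have the smallest second sender address in the second ring has reached a preset value; and an eleventh determining sub-module, configured to determine that the cluster file system is in the frequent-token-timeout state when that number reaches the preset value.
Optionally, the monitoring node further includes:
a third obtaining module, configured to obtain corresponding adjustment parameters according to the problem node and the cluster state; and a multicast module, configured to multicast the adjustment parameters to the cluster nodes, so that the cluster nodes adjust their current configuration according to the adjustment parameters.
Optionally, the third obtaining module is specifically configured to:
when the cluster file system in which the cluster nodes are located is in the cluster-service-busy state, increase the current message transmission window value by a first preset ratio to obtain a new message transmission window value, and reduce the current maximum transmittable information value of each cluster node by a second preset ratio to obtain a new maximum transmittable information value; and, when the cluster file system in which the cluster nodes are located is in the frequent-token-timeout state, increase the token timeout of the cluster nodes by a third preset ratio to obtain a new token timeout.
According to yet another embodiment of the present invention, a storage medium is also provided. The storage medium is configured to store program code for performing the above monitoring method for node communication in a shared-storage cluster file system.
The beneficial effects of the embodiments of the present invention are as follows:
The above solution monitors the running state of the cluster by collecting multicast messages, analyzes the state of each node statistically from its multicast messages, and gives a statistical judgment of the cluster state and the problem nodes. This improves the processing capacity and stability of the communication service and makes device failure notifications available at the earliest moment, so that administrators have a direct view of the communication state of the whole cluster and can learn the condition of the devices promptly, locate the faulty target and work more efficiently, which in turn improves the overall performance of the cluster file system.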
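Brief Description of the Drawings
The drawings described here are provided for a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of corosync on node P1 accepting messages M1, M2 and M3 from application A1 on that node and multicasting them in the cluster;
FIG. 2 is a schematic diagram of the token being passed from node P1 to P2 after the node has multicast the messages;
FIG. 3 is a schematic diagram of P2, after receiving the token and acknowledging receipt of the messages, passing the token on to P3;
FIG. 4 is a schematic diagram of a node broadcasting the node join message joinmsg after it joins the cluster;
FIG. 5 is a schematic diagram of the other cluster nodes broadcasting their own member sets after receiving joinmsg;
FIG. 6 is a schematic diagram of a node failing to reach consensus because it did not receive the joinmsg of other nodes;
FIG. 7 is a schematic diagram of a monitoring node added to the cluster;
FIG. 8 is a flowchart of the method in the first embodiment of the present invention;
FIG. 9 is a flowchart of the method in the second embodiment of the present invention;
FIG. 10 is a flowchart of the method in the third embodiment of the present invention;
FIG. 11 is a flowchart of the method in the fourth embodiment of the present invention;
FIG. 12 is a flowchart of the method in the fifth embodiment of the present invention;
FIG. 13 is a first flowchart of the method in the sixth embodiment of the present invention;
FIG. 14 is a second flowchart of the method in the sixth embodiment of the present invention;
FIG. 15 is a first flowchart of the method in the seventh embodiment of the present invention;
FIG. 16 is a second flowchart of the method in the seventh embodiment of the present invention;
FIG. 17 is an overall structural block diagram of the ninth embodiment of the present invention;
FIG. 18 is a detailed structural block diagram of the ninth embodiment of the present invention;
FIG. 19 is a first schematic diagram of the overall flow of the method of the present invention;
FIG. 20 is a second schematic diagram of the overall flow of the method of the present invention.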
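Detailed Description
The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, provided they do not conflict, the embodiments of this application and the features in them may be combined with one another. Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set out here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
First Embodiment
As shown in FIG. 7 and FIG. 8, the present invention provides a monitoring method for node communication in a shared-storage cluster file system, applied to a monitoring node in the shared-storage cluster file system. The method includes:
Step 101: receive a multicast message from a cluster node in the cluster file system.
Here the monitoring node may be added to the cluster multicast group of the cluster file system, so that the monitoring node and the cluster nodes are all located in the cluster file system. The monitoring node may be a host of the same kind as the cluster nodes, a blade server, or another server.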
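Joining the cluster multicast group in step 101 can be sketched with a plain UDP socket. This is only an illustration under stated assumptions: the function names, the group address and the port are hypothetical, and decoding or authenticating the totem frames that arrive on the socket is outside the sketch.

```python
import socket


def build_membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq value: the multicast group address, then the local interface."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def open_monitor_socket(group: str, port: int, iface: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and join the multicast group (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # IP_ADD_MEMBERSHIP asks the kernel to deliver datagrams sent to the group.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group, iface))
    return sock
```

A monitoring process would then call something like `open_monitor_socket("239.192.0.1", 5405)` (example values only) and read datagrams with `recvfrom`.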
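Step 102: obtain the message type of the multicast message and the message parameters corresponding to the message type.
After the multicast message is received in step 101, its message type is obtained, and the message parameters that correspond to that particular message type are then obtained.
Step 103: query, according to the message parameters, the storage record corresponding to the multicast message in the storage table.
According to the message parameters obtained in step 102, the storage table is queried for the stored records of multicast messages that correspond to those parameters. Corresponding to step 101, the storage table may collect and count every multicast message of every node in the cluster.
Step 104: when a preset time interval elapses, obtain the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table.
The storage table collects and records the multicast messages in the cluster file system over a period of time. When the preset time interval elapses, the storage records accumulated in the storage table during that period are used to determine the cluster state of the cluster file system and whether any problem node has appeared.
The monitoring method adds one node to the existing cluster. This node is configured with the cluster multicast address and can receive the cluster's broadcast messages. By receiving the multicast messages of the nodes in the cluster file system and, according to the type of each multicast message and its corresponding message parameters, querying the storage records of multicast messages in the storage table, the method learns the problem node and the cluster state of the cluster file system. It uses multicast communication messages to analyze the running state of the nodes in a shared-storage cluster file system, and overcomes the problems and defects of the prior art in the peer-to-peer architecture of shared-storage cluster file systems: the lack of statistical analysis and summary of cluster processing capacity and node failures, and the inability to adjust cluster parameters dynamically.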
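Second Embodiment
Specifically, as shown in FIG. 9 and on the basis of the first embodiment, obtaining the message type of the multicast message and the message parameters corresponding to the message type in step 102 specifically includes:
Step 1021: obtain the message type of the multicast message.
Step 1022: when the message type is the application-layer message type, obtain the first message parameters of the multicast message.
The first message parameters include at least: the message number of the application-layer message corresponding to the multicast message, the first ring label of the first ring in which the node multicasting the multicast message is located, and the first sender address, in the first ring, of the node multicasting the multicast message.
Step 1023: when the message type is the node join message type, obtain the second message parameters of the multicast message.
The second message parameters include at least: the second ring label of the second ring in which the node multicasting the multicast message is located, the second sender address, in the second ring, of that node, and the node member list recorded by that node itself.
When a multicast message is received, its message type is obtained. There are two main message types: the application-layer message type and the node join message (joinmsg) type. The application-layer message type is the type of message used when a message arriving from the application layer in the multicast ring is multicast to the other nodes of the cluster. The node join message type is the type of message that the nodes multicast to one another, when a node joins the multicast ring, about that join and about the node member set each of them records. When the multicast message is of the application-layer message type, the parameters obtained include at least: the message number seq of the corresponding application-layer message; the ring label of the ring in which the multicasting node is located, that is, the first ring label ring_id1 of the first ring; and the address of the multicasting node in the ring, that is, the first sender address sender_id1 in the first ring. When the message type is the node join message type, the parameters obtained include at least: the ring label of the ring in which the multicasting node is located, that is, the second ring label ring_id2 of the second ring; the address of the multicasting node in the ring, that is, the second sender address sender_id2 in the second ring; and the member set recorded by the multicasting node itself, that is, the node member list proc_list. These parameters are obtained so that the cluster state expressed by the multicast message can be judged.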
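The two parameter sets of steps 1021 to 1023 can be sketched as small records. The sketch assumes the multicast message has already been decoded into a dictionary; the dictionary keys, the type tags and the class names are hypothetical, since the text does not specify a wire format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

APP_MSG = "app"       # hypothetical tag for the application-layer message type
JOIN_MSG = "joinmsg"  # tag for the node join message type


@dataclass(frozen=True)
class AppParams:
    """First message parameters (step 1022)."""
    seq: int
    ring_id1: int
    sender_id1: int


@dataclass(frozen=True)
class JoinParams:
    """Second message parameters (step 1023)."""
    ring_id2: int
    sender_id2: int
    proc_list: Tuple[int, ...]


def extract_params(msg: dict) -> Optional[Union[AppParams, JoinParams]]:
    """Return the parameters for the message type, or None for any other type."""
    kind = msg.get("type")
    if kind == APP_MSG:
        return AppParams(msg["seq"], msg["ring_id"], msg["sender_id"])
    if kind == JOIN_MSG:
        return JoinParams(msg["ring_id"], msg["sender_id"], tuple(msg["proc_list"]))
    return None  # neither type: the message is ignored
```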
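Third Embodiment
Further, as shown in FIG. 10 and on the basis of the first and second embodiments, when the multicast message is of the application-layer message type, querying the storage record corresponding to the multicast message in the storage table according to the message parameters in step 103 specifically includes:
Step 1031: when the multicast message is of the application-layer message type, determine, according to the message number and the first ring label, whether the application-layer message table of the storage table contains a first storage record having the message number and the first ring label.
Here the storage table holds records of the multicast messages of the cluster nodes and of their parameters. The storage table includes an application-layer message table, which stores the records for multicast messages of the application-layer message type. When the multicast message is of the application-layer message type, the seq and ring_id1 of the message are used to determine whether the application-layer message table already contains a record of the same application-layer message being multicast in the same ring, that is, the first storage record.
Step 1032: when the first storage record does not exist in the application-layer message table, store the first message parameters of the multicast message in the application-layer message table.
If the judgment in step 1031 finds that the first storage record does not exist in the application-layer message table, the multicast message is stored in the application-layer message table; at least its first message parameters are stored.
Step 1033: when the first storage record exists in the application-layer message table, conclude that the node preceding, in the first ring, the node corresponding to the first sender address has lost a message, take that preceding node as a suspect problem node, and store the parameters of the suspect problem node in the suspect problem node table of the storage table.
The storage table also includes a suspect problem node table. If the judgment in step 1031 finds that the first storage record exists in the application-layer message table, the same application-layer message has been multicast repeatedly on the same ring, from which it follows that some node on the ring lost the message and caused the repeated multicast. Given the way the token acknowledges messages within the multicast ring during node communication in a shared-storage cluster file system, the node that resent the message is identified as Pn, and from the order in which the token is passed, node Pn-1 in the current member list is inferred to be the node that lost the message and is listed as a suspect problem node. In other words, the node preceding the node that corresponds to sender_id1 in the multicast message has lost a message, so a problem may have occurred at that preceding node, which is therefore a suspect problem node. The suspect problem node and its parameters are stored in the suspect problem node table.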
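A minimal sketch of steps 1031 to 1033 follows, using an in-memory dictionary as the application-layer message table. The function names are hypothetical. Treating the largest address as the predecessor of the smallest one (wrap-around) is an assumption of this sketch; the text only says that the predecessor follows from the token order.

```python
from typing import Dict, Iterable, Optional, Tuple


def previous_node(sender_id: int, members: Iterable[int]) -> int:
    """Return the ring predecessor Pn-1 of the sender Pn.

    Assumption: ring members are ordered by address and the ring wraps around.
    """
    ordered = sorted(members)
    return ordered[ordered.index(sender_id) - 1]


def record_app_message(app_table: Dict[Tuple[int, int], int], seq: int,
                       ring_id1: int, sender_id1: int,
                       members: Iterable[int]) -> Optional[int]:
    """Store a first-seen message, or return the suspect node for a repeat."""
    key = (seq, ring_id1)
    if key not in app_table:
        app_table[key] = sender_id1  # step 1032: first time this message is seen
        return None
    # Step 1033: the same message was multicast again on the same ring.
    return previous_node(sender_id1, members)
```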
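Fourth Embodiment
Further, as shown in FIG. 11 and on the basis of the third embodiment, taking the preceding node as a suspect problem node and storing its parameters in the suspect problem node table of the storage table in step 1033 specifically includes:
Step 10331: obtain, according to the first sender address, the first node address of the suspect problem node in the first ring.
From the way the token acknowledges messages within the multicast ring during node communication in a shared-storage cluster file system, it is known that the node preceding the node corresponding to sender_id1 in the multicast message has lost a message, so a problem may have occurred at that preceding node. The first sender address is the address, in the first ring, of the node that multicast the message, and the node addresses in the multicast ring are arranged in ascending order. The address of the preceding node in the first ring, that is, the first node address nodeid1, can therefore be derived from the first sender address sender_id1.
Step 10332: determine, according to the first node address, the message number and the first ring label, whether the suspect problem node table contains a second storage record having the first node address, the message number and the first ring label.
With nodeid1 obtained in step 10331, nodeid1, seq and ring_id1 are used to determine whether the suspect problem node table already contains a record of the same node in the same ring losing the same application-layer message, that is, the second storage record.
Step 10333: when the second storage record exists in the suspect problem node table, increase the message record count of the suspect problem node.
If the judgment in step 10332 finds that the second storage record already exists in the suspect problem node table, the count of times this suspect problem node has lost the same application-layer message in the same ring is increased; that is, the count of repeated multicasts by other nodes in the ring caused by message loss at this suspect problem node is increased.
Step 10334: when the second storage record does not exist in the suspect problem node table, store a first parameter, including the first node address, the message number and the first ring label, in the suspect problem node table.
If the judgment in step 10332 finds that the second storage record does not exist in the suspect problem node table, at least nodeid1, seq and ring_id1 are stored in the suspect problem node table, recording which node in which ring lost which application-layer message.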
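Steps 10332 to 10334 amount to an insert-or-increment on a table keyed by (nodeid1, seq, ring_id1). The sketch below assumes a new record starts with a count of 1; the function names and the per-node aggregation helper are hypothetical additions for illustration.

```python
from typing import Dict, Tuple

SuspectKey = Tuple[int, int, int]  # (nodeid1, seq, ring_id1)


def note_suspect(suspect_table: Dict[SuspectKey, int], nodeid1: int,
                 seq: int, ring_id1: int) -> int:
    """Record one more loss of message `seq` by node `nodeid1` on ring `ring_id1`."""
    key = (nodeid1, seq, ring_id1)
    if key in suspect_table:
        suspect_table[key] += 1  # step 10333: the record exists, raise its count
    else:
        suspect_table[key] = 1   # step 10334: store a new record (initial count assumed)
    return suspect_table[key]


def counts_per_node(suspect_table: Dict[SuspectKey, int]) -> Dict[int, int]:
    """Sum the record counts of each suspect node across messages and rings."""
    totals: Dict[int, int] = {}
    for (nodeid1, _seq, _ring), count in suspect_table.items():
        totals[nodeid1] = totals.get(nodeid1, 0) + count
    return totals
```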
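Fifth Embodiment
As shown in FIG. 12 and on the basis of the first and second embodiments, when the multicast message is of the node join message type, querying the storage record corresponding to the multicast message in the storage table according to the message parameters in step 103 specifically includes:
Step 1034: when the multicast message is of the node join message type, determine, according to the second ring label and the second sender address, whether the node join message table in the storage table contains a third storage record having the second ring label and the second sender address.
The storage table also includes a node join message table. When the multicast message is a node join message, ring_id2 and sender_id2 are used to determine whether the node join message table holds a record of a node join message sent by the same node in the same ring as this multicast message, that is, the third storage record.
Step 1035: when the third storage record does not exist in the node join message table, store the second message parameters of the multicast message in the node join message table.
When the node join message table holds no record of a node join message multicast by the same node in the same ring, that is, no third storage record, the multicast message is stored in the node join message table; at least its second message parameters are stored in that table.
Step 1036: when the third storage record exists in the node join message table, determine, according to the node member list, whether the node member list of the multicast message contains added or removed node members compared with the third storage record.
When the third storage record exists, the node join message table holds a record of a node join message sent by the same node in the same ring as this multicast message. The proc_list in the second message parameters of the multicast message is compared with the node member list in the third storage record to determine whether the proc_list of the multicast message contains added or removed node members, that is, what the multicast message reports about nodes leaving or joining the current ring.
Step 1037: when a node member has been removed from the node member list of the multicast message, obtain the second node address of the removed node member and determine, according to the second node address and the second sender address, whether the leaving node table of the storage table contains a fourth storage record having the second node address and the second sender address.
The storage table also includes a leaving node table (the leave table). When the judgment in step 1036 finds a removed node member in the node member list of the multicast message, that is, a node corresponding to that member has left the ring, the second node address nodeid2 of the removed member in the second ring is obtained, and nodeid2 and sender_id2 are used to determine whether the leaving node table holds a record, sent by the same node, of the same removed node, that is, the fourth storage record.
Step 1038: when the fourth storage record exists in the leaving node table, increase the message record count of the removed node member.
When the judgment in step 1037 finds that the fourth storage record exists in the leaving node table, the same node has already multicast a corresponding message about this removed member to the other cluster nodes. The corresponding message record count is increased, that is, the recorded number of times the removed member has left is increased.
Step 1039: when the fourth storage record does not exist in the leaving node table, store a second parameter, including the second node address and the second sender address, in the leaving node table.
When the judgment in step 1037 finds that the fourth storage record does not exist in the leaving node table, the corresponding parameters are stored; at least the second node address and the second sender address are stored in the leaving node table.
Step 10310: when a node member has been added to the node member list of the multicast message, obtain the third node address of the added node member and determine, according to the third node address and the second sender address, whether the leaving node table contains a fifth storage record having the third node address and the second sender address.
When the judgment in step 1036 finds an added node member in the node member list of the multicast message, that is, a node corresponding to that member has joined the ring, the third node address nodeid3 of the added member in the second ring is obtained, and nodeid3 and sender_id2 are used to determine whether the leaving node table holds a record, sent by the same node, of this member having left earlier, that is, the fifth storage record.
Step 10311: when the fifth storage record exists in the leaving node table, delete the fifth storage record.
When the leaving node table holds a record, sent by the same sending node, of the newly added member having left earlier, that record is deleted.
Steps 1034 to 10311 above describe, for a multicast message of the node join message type, how the storage table is queried for the corresponding storage records according to the message parameters of that type, which parameters are used along the way, and which conditions are tested. The parameters obtained from the multicast message are compared with the existing data in the node join message table and the leaving node table of the storage table, and the corresponding parameters are then stored, accumulated or deleted according to the result, so that the state information of the cluster file system is collected and monitored in real time.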
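The join-message path of steps 1034 to 10311 can be sketched with two dictionaries. The function name is hypothetical. Following the text, the stored third record is written only when it does not yet exist; whether it should later be refreshed with the newest member list is not stated, so this sketch leaves it unchanged.

```python
from typing import Dict, FrozenSet, Iterable, Tuple


def handle_join(join_table: Dict[Tuple[int, int], FrozenSet[int]],
                leave_table: Dict[Tuple[int, int], int],
                ring_id2: int, sender_id2: int,
                proc_list: Iterable[int]) -> None:
    """Update the node join message table and the leave table for one joinmsg."""
    key = (ring_id2, sender_id2)
    current = frozenset(proc_list)
    stored = join_table.get(key)
    if stored is None:
        join_table[key] = current            # step 1035: no third record yet
        return
    for nodeid2 in stored - current:         # steps 1037-1039: removed members
        leave_key = (nodeid2, sender_id2)
        leave_table[leave_key] = leave_table.get(leave_key, 0) + 1
    for nodeid3 in current - stored:         # steps 10310-10311: added members
        leave_table.pop((nodeid3, sender_id2), None)
```

In this reading, a leave record for a node is cleared when the same sender later reports that node as newly added, for example after a new ring has formed without it.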
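Sixth Embodiment
This embodiment describes, for each message type of the multicast message, how the problem node and the cluster state of the cluster file system are obtained.
As shown in FIG. 13, on the one hand, when the message type of the multicast message is the application-layer message type, and on the basis of the first and third embodiments, obtaining the problem node and the cluster state of the cluster file system in step 104, when a preset time interval elapses and according to the storage records corresponding to the multicast message in the storage table, specifically includes:
Step 1041: when the suspect problem node table contains a special suspected problem node whose message record count has reached a preset maximum value, determine whether the preset maximum value is the same as the message record counts of the nodes other than the special suspected problem node.
Step 1042: when the preset maximum value is the same as the message record counts of the nodes other than the special suspected problem node, determine that the cluster file system is in the cluster-service-busy state.
Step 1043: when the preset maximum value differs from the message record counts of the nodes other than the special suspected problem node, determine that the special suspected problem node is the problem node.
When the message type of the multicast message is the application-layer message type and the preset time interval has elapsed, the data in the suspect problem node table must be examined to decide whether the cluster file system contains a problem node and what the current cluster state is. When one of the suspect problem nodes stored in the table has an accumulated message record count that reaches the preset maximum value, that is, when a special suspected problem node appears, the maximum value reached by that node within the preset time interval is compared with the message record counts of the other suspect problem nodes recorded in the table. If they are the same, all the suspect problem nodes among the cluster nodes can be taken to have lost equally many messages, and the cluster file system is judged to be in the cluster-service-busy state. If they differ, the special suspected problem node that reached the maximum message record count can be taken to lose messages frequently, and that node is judged to be the problem node.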
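The periodic judgment of steps 1041 to 1043 can be sketched as a function over per-node record counts. The function name and result labels are hypothetical. Two details are assumptions of this sketch: "reaching" the maximum is read as count >= preset_max, and a table with a single node is reported as a problem node rather than as a busy cluster.

```python
from typing import Dict, List, Tuple


def judge_suspects(counts: Dict[int, int], preset_max: int) -> Tuple[str, List[int]]:
    """Classify the suspect problem node table when the preset interval elapses."""
    flagged = sorted(node for node, c in counts.items() if c >= preset_max)
    if not flagged:
        return ("no_finding", [])       # no special suspected problem node
    if len(counts) > 1 and len(flagged) == len(counts):
        return ("cluster_busy", [])     # step 1042: every node lost equally many
    return ("problem_node", flagged)    # step 1043: only some nodes stand out
```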
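As shown in FIG. 14, on the other hand, when the message type of the multicast message is the node join message type, and on the basis of the first and fifth embodiments, obtaining the problem node and the cluster state of the cluster file system in step 104, when a preset time interval elapses and according to the storage records corresponding to the multicast message in the storage table, specifically includes:
Step 1044: determine whether the leaving node table is empty.
Step 1045: when the leaving node table is not empty and contains multiple storage records having the same second node address, determine that the node corresponding to that second node address is the problem node.
When the message type of the multicast message is the node join message type, the data collected in the leaving node table is examined first. When the leaving node table contains records, that is, is not empty, and several of its records contain the same nodeid2, different storage records refer to the same removed node. The node corresponding to that nodeid2 has therefore dropped out of, that is, left, the multicast ring several times, and it is judged to be the problem node.
Step 1046: when the leaving node table is empty, determine whether the number of storage records in the node join message table that have the smallest second sender address in the second ring has reached a preset value.
Step 1047: when the number of storage records in the node join message table that have the smallest second sender address in the second ring reaches the preset value, determine that the cluster file system is in the frequent-token-timeout state.
Here the smallest second sender address means that, among the node addresses arranged in order in the multicast ring, the address of the node sending multicast messages to the other nodes is the smallest in that ring. When the leaving node table holds no records, that is, is empty, the node join message table is checked to see whether the number of its storage records carrying this smallest second sender address has reached a preset value, that is, whether the number of multicast messages sent by the node with the smallest second sender address has reached the preset value. The preset value is generally a preset maximum within the permitted range. When it is reached, the node with the smallest second sender address has been sending multicast messages frequently within the preset time interval. The usual cause is that, after the multicast message was sent to the other nodes, the token was not received and answered by them in time, which triggered frequent retransmission. The cluster file system is then judged to be in the frequent-token-timeout state.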
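Steps 1044 to 1047 can be sketched as follows. The names and result labels are hypothetical. The sketch approximates "the smallest second sender address in the second ring" by the smallest sender address found among the recorded join messages, which is an assumption.

```python
from collections import Counter
from typing import List, Tuple


def judge_membership(leave_records: List[Tuple[int, int]],
                     join_records: List[Tuple[int, int]],
                     preset_count: int) -> Tuple[str, List[int]]:
    """Classify the leave table and the node join message table.

    leave_records holds (nodeid2, sender_id2) pairs;
    join_records holds (ring_id2, sender_id2) pairs.
    """
    if leave_records:
        # Step 1045: the same node reported as having left in several records.
        per_node = Counter(nodeid2 for nodeid2, _sender in leave_records)
        repeated = sorted(node for node, n in per_node.items() if n > 1)
        return ("problem_node", repeated) if repeated else ("no_finding", [])
    if not join_records:
        return ("no_finding", [])
    # Steps 1046-1047: count the join records sent by the smallest address.
    smallest = min(sender for _ring, sender in join_records)
    hits = sum(1 for _ring, sender in join_records if sender == smallest)
    if hits >= preset_count:
        return ("token_timeout_frequent", [])
    return ("no_finding", [])
```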
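In the different queries and judgments performed in the storage table according to the type of the multicast message, the cluster state and the problem node are determined through the cumulative count of each node over time (the vertical view) and the comparison between nodes (the horizontal view), so that the state of the cluster file system is detected and judged in a timely and effective manner.
Seventh Embodiment
The first embodiment described how the problem node and the cluster state of the cluster file system are monitored and obtained in real time. As shown in FIG. 15, this embodiment describes how the problems in the cluster file system are resolved once the problem node and the cluster state have been obtained. Accordingly, the monitoring method further includes:
Step 105: obtain corresponding adjustment parameters according to the problem node and the cluster state.
Step 106: multicast the adjustment parameters to the cluster nodes, so that the cluster nodes adjust their current configuration according to the adjustment parameters.
After step 104 has obtained the problem node and the cluster state of the cluster file system, a matching response is provided for them: targeted adjustment parameters are obtained and multicast to the cluster nodes, so that the other nodes adjust their own configuration promptly according to the adjustment parameters and the system problem is resolved.
Further, as shown in FIG. 16, obtaining corresponding adjustment parameters according to the problem node and the cluster state in step 105 specifically includes:
Step 1051: when the cluster file system in which the cluster nodes are located is in the cluster-service-busy state, increase the current message transmission window value by a first preset ratio to obtain a new message transmission window value, and reduce the current maximum transmittable information value of each cluster node by a second preset ratio to obtain a new maximum transmittable information value.
The first preset ratio is preferably 1.2 and the second preset ratio is preferably 0.9. The specific problems of the cluster are handled case by case. When the cluster service is busy, every node loses multicast messages; the message transmission window value window_size is then increased by a factor of 1.2, and the maximum transmittable information value max_messages used by each cluster node when multicasting is reduced by a factor of 0.9, so as to relieve the busy state of the cluster and reduce message loss.
Step 1052: when the cluster file system in which the cluster nodes are located is in the frequent-token-timeout state, increase the token timeout of the cluster nodes by a third preset ratio to obtain a new token timeout.
The third preset ratio is preferably 1.2. For the problem of frequent token timeouts, the token timeout of every node in the cluster is increased by a factor of 1.2, so as to reduce the message retransmissions caused by token timeouts in the cluster system.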
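The scaling in steps 1051 and 1052 can be sketched as a pure function over a configuration dictionary. The dictionary keys, the state labels and the rounding to whole numbers are assumptions of this sketch; only the preferred ratios 1.2, 0.9 and 1.2 come from the text.

```python
from typing import Dict


def adjust(config: Dict[str, int], state: str,
           window_ratio: float = 1.2, messages_ratio: float = 0.9,
           token_ratio: float = 1.2) -> Dict[str, int]:
    """Return a new configuration scaled for the detected cluster state."""
    new = dict(config)  # leave the caller's configuration untouched
    if state == "cluster_busy":
        # Step 1051: widen the window, shrink the per-node message burst.
        new["window_size"] = round(config["window_size"] * window_ratio)
        new["max_messages"] = round(config["max_messages"] * messages_ratio)
    elif state == "token_timeout_frequent":
        # Step 1052: give the token more time before it is declared lost.
        new["token_timeout"] = round(config["token_timeout"] * token_ratio)
    return new
```

For example, a window of 50 and a burst of 17 messages become 60 and 15 in the busy state, and a token timeout of 1000 becomes 1200 in the frequent-token-timeout state.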
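Correspondingly, after the problem node and the cluster state of the cluster file system are obtained, each kind of result, including the appearance of a problem node, frequent token timeouts and a busy cluster service, may be output as a prompt together with the adjustment parameters that correspond to it.
Further, corosync in the cluster management suite is the core component of cluster communication and runs on every node of the cluster, with the same corosync parameter configuration on each cluster node. It broadcasts the application messages of its own node to the cluster and receives the multicast messages of other nodes, thereby synchronizing the state of the whole cluster; by passing the token it achieves reliable message delivery and detection of changes in cluster membership, and so keeps the whole cluster stable. In the method of the present invention, a parameter-adjustment message may be added at the communication layer of corosync on the nodes of the cluster other than the monitoring node. After receiving this message, corosync modifies its totem configuration; that is, corosync in the cluster management suite on the other cluster nodes receives and parses the parameter-adjustment message and modifies the relevant configuration.
Eighth Embodiment
To make the purpose, technical solution and advantages of the present invention clearer, this embodiment gives a further description in a concrete implementation scenario.
Step 1: the monitoring node joins the multicast group and prepares to receive multicast messages.
Step 2: a multicast message is received and parsed; at least the message type and the parameter data corresponding to that type are obtained and saved.
Step 3: according to the result of querying with the parameter data, the corresponding data is stored and the node state is judged.
Step 4: statistical analysis is run periodically; the problem node table and the leaving node table are queried, the possible problem nodes and the cluster state are judged, the parameters are adjusted from the current configuration, and the statistical results and troubleshooting suggestions are output.
Step 5: the adjustment parameters are broadcast in the cluster.
Step 6: the cluster nodes receive the message carrying the adjustment parameters and update their configuration accordingly.
Compared with the prior art, the method disclosed in the present invention gives a direct view of the processing capacity of the cluster and of its individual nodes, adjusts configuration parameters dynamically according to the service needs of the cluster, improves the reliability and stability of the cluster, and improves the means of observing cluster communication, which makes analysis easier for developers.
The storage table specifically includes the following parts:
The application-layer message table, which may contain the columns sender_id1, seq, srpaddr, ring_id1, timestamp.
The suspect problem node table, which may contain nodeid1, seq, ring_id1, timestamp.
The node join message table, which may contain the columns sender_id2, ring_id2, proc_list, timestamp.
The leave table, which may contain sender_id2, timestamp, and the nodeid2 that is no longer present in proc_list.
The srpaddr column records the physical address of the node that sent the multicast message. The timestamp column records the time associated with the multicast message. It makes it possible, when a preset time interval elapses, to obtain the problem node and the cluster state of the cluster file system from the storage records in the storage table, and timing can also be used to purge the data in the storage table periodically when a preset period elapses.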
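The four tables and the periodic purge can be sketched with an in-memory SQLite database. The table names, column types, and the extra `hits` counter columns are assumptions of this sketch; the text lists only the column names and does not say how record counts are stored.

```python
import sqlite3

SCHEMA = """
CREATE TABLE app_msg    (sender_id1 INTEGER, seq INTEGER, srpaddr TEXT,
                         ring_id1 INTEGER, timestamp REAL);
CREATE TABLE suspect    (nodeid1 INTEGER, seq INTEGER, ring_id1 INTEGER,
                         timestamp REAL, hits INTEGER DEFAULT 1);
CREATE TABLE join_msg   (sender_id2 INTEGER, ring_id2 INTEGER, proc_list TEXT,
                         timestamp REAL);
CREATE TABLE leave_node (sender_id2 INTEGER, nodeid2 INTEGER, timestamp REAL,
                         hits INTEGER DEFAULT 1);
"""

TABLES = ("app_msg", "suspect", "join_msg", "leave_node")


def open_store() -> sqlite3.Connection:
    """Create an empty in-memory storage table set."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def purge_older_than(db: sqlite3.Connection, cutoff: float) -> None:
    """Delete every record whose timestamp is older than `cutoff`."""
    for table in TABLES:  # fixed table names, so the f-string is safe here
        db.execute(f"DELETE FROM {table} WHERE timestamp < ?", (cutoff,))
    db.commit()
```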
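Further, with reference to FIG. 19, the following gives an overall description of the part of the monitoring method in which a multicast message of a cluster node is received, its message type and the corresponding message parameters are obtained, and the storage table is queried for the corresponding storage record according to those parameters.
When the process starts, a multicast message from a cluster node in the cluster file system is received and checked to see whether its message type is the application-layer message type. If it is, the message is parsed to obtain its seq, sender_id1 and ring_id1, and the application-layer message table of the storage table is queried with seq and ring_id1 as the query condition. If no matching record exists, the message is stored in the application-layer message table. If a matching record exists, then according to the message sender address in that record, the node Pn-1 that precedes, in the ring, the node Pn corresponding to that sender address is judged to be a suspect problem node. The suspect problem node table is then checked for a record having the node address nodeid of Pn-1 and the same seq and ring_id1. If such a record exists, its record count is updated, that is, the number of messages not received because of message loss at node Pn-1 is updated; if not, the nodeid of Pn-1, seq and ring_id1 are saved to the suspect problem node table. If the multicast message is not of the application-layer message type, it is checked to see whether it is of the joinmsg message type. If it is, its sender_id2, ring_id2 and proc_list parameters are obtained, and the node join message table is queried with sender_id2 and ring_id2 as the condition. If a matching record exists, the proc_list of the multicast message is compared with that of the record. When the proc_list of the multicast message has lost a node, the node address nodeid of the lost node is obtained, and the leave table is queried with sender_id2, that nodeid and ring_id2. If a matching record exists in the leave table, the record count of the node corresponding to that nodeid is incremented, that is, its number of leaves is accumulated; if not, a record for the lost node is added to the leave table. When the multicast message is of neither the application-layer message type nor the joinmsg message type, it is discarded.
Correspondingly, with reference to FIG. 20, the following gives an overall description of the part of the monitoring method in which the storage table is queried for the storage records corresponding to the multicast message according to the message parameters of its message type and, when a preset time interval elapses, the problem node and the cluster state of the cluster file system are obtained from those storage records.
When the preset time interval elapses, the suspect problem node table is checked and the record count of each suspect problem node, that is, the number of errors of each suspect problem node, is tallied. It is determined whether any suspect problem node has a message record count that has reached the preset maximum value. If so, that node's message record count is compared with those of the other suspect problem nodes: when the other suspect problem nodes have the same count as the node that reached the preset maximum, the cluster service is judged to be busy; when they differ, the node that reached the preset maximum is judged to have a problem. When no suspect problem node has a message record count that has reached the preset maximum value, the leave table is queried to see whether it contains any records. If it does, it is determined whether several of its records have the same node address, that is, whether they record the departure of the same node; if so, the node corresponding to that address has a problem and must wait for the cluster to adjudicate. When the leave table contains no records, it is determined whether the number of message records in the node join message table for the node with the smallest address in the ring has reached the preset value. If so, the cluster is judged to have frequent token timeouts; if not, no action is taken.
The present invention monitors the state of the cluster and of its nodes by receiving and analyzing the multicast messages of the cluster nodes and, when the cluster is busy or tokens time out, adjusts the parameters of the nodes by multicast to match the processing capacity of the cluster. The monitoring node that joins the multicast group has no corosync communication module. To allow parameter adjustment, handling of a parameter-adjustment message must be added to the communication module of the cluster nodes, but this does not affect the size of the existing cluster. The monitoring node does not take part in actual services. It monitors the running state of the cluster by collecting multicast messages, analyzes the state of each node statistically from its multicast messages, gives a statistical judgment of the cluster state and the problem nodes, and adjusts the cluster parameters when appropriate. This improves the processing capacity and stability of the communication service, gives a direct view of the communication state of the whole cluster, and makes device failure notifications available at the earliest moment, so that administrators can learn the condition of the devices promptly, locate the faulty target and work more efficiently.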
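Ninth Embodiment
As shown in FIG. 7 and FIG. 17, the present invention also discloses a monitoring node in a shared-storage cluster file system. The monitoring node includes: a first receiving module 2100, a first obtaining module 2200, a query module 2300 and a second obtaining module 2400.
The first receiving module 2100 is configured to receive a multicast message from a cluster node in the cluster file system, the monitoring node and the cluster node both being located in the cluster file system; the first obtaining module 2200 is configured to obtain the message type of the multicast message and the message parameters corresponding to the message type; the query module 2300 is configured to query, according to the message parameters, the storage record corresponding to the multicast message in the storage table; and the second obtaining module 2400 is configured to obtain, when a preset time interval elapses, the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table.
The monitoring node is a node added to the existing cluster. It is configured with the cluster multicast address and can receive the cluster's broadcast messages. By receiving the multicast messages of the nodes in the cluster file system and, according to the type of each multicast message and its corresponding message parameters, querying the storage records of multicast messages in the storage table, it learns the problem node and the cluster state of the cluster file system. It uses multicast communication messages to analyze the running state of the nodes in a shared-storage cluster file system, and overcomes the problems and defects of the prior art in the peer-to-peer architecture of shared-storage cluster file systems: the lack of statistical analysis and summary of cluster processing capacity and node failures, and the inability to adjust cluster parameters dynamically.
Specifically, the first obtaining module 2200 is configured to: obtain the message type of the multicast message; when the message type is the application-layer message type, obtain the first message parameters of the multicast message, which include at least: the message number of the application-layer message corresponding to the multicast message, the first ring label of the first ring in which the node multicasting the multicast message is located, and the first sender address, in the first ring, of the node multicasting the multicast message; and, when the message type is the node join message type, obtain the second message parameters of the multicast message, which include at least: the second ring label of the second ring in which the node multicasting the multicast message is located, the second sender address, in the second ring, of that node, and the node member list recorded by that node itself.
After the first receiving module 2100 receives a multicast message, the first obtaining module 2200 obtains its message type. There are two main message types: the application-layer message type and the node join message type.
When the multicast message is of the application-layer message type, the parameters obtained include at least: the message number seq of the corresponding application-layer message; the ring label of the ring in which the multicasting node is located, that is, the first ring label ring_id1 of the first ring; and the address of the multicasting node in the ring, that is, the first sender address sender_id1 in the first ring. When the message type is the node join message type, the parameters obtained include at least: the ring label of the ring in which the multicasting node is located, that is, the second ring label ring_id2 of the second ring; the address of the multicasting node in the ring, that is, the second sender address sender_id2 in the second ring; and the member set recorded by the multicasting node itself, that is, the node member list proc_list. These parameters are obtained so that the cluster state expressed by the multicast message can be judged.
As shown in FIG. 18, the query module 2300 specifically includes: a first determining sub-module 2310, a first storage sub-module 2320 and a second storage sub-module 2330.
Specifically, the first determining sub-module 2310 is configured to determine, when the multicast message is of the application-layer message type and according to the message number and the first ring label, whether the application-layer message table of the storage table contains a first storage record having the message number and the first ring label; the first storage sub-module 2320 is configured to store the first message parameters of the multicast message in the application-layer message table when the first storage record does not exist there; and the second storage sub-module 2330 is configured to, when the first storage record exists in the application-layer message table, conclude that the node preceding, in the first ring, the node corresponding to the first sender address has lost a message, take that preceding node as a suspect problem node, and store the parameters of the suspect problem node in the suspect problem node table of the storage table.
The first determining sub-module 2310, the first storage sub-module 2320 and the second storage sub-module 2330 included in the query module 2300 handle the case in which the multicast message is of the application-layer message type: the parameters obtained from the multicast message are compared with the existing data in the application-layer message table and the suspect problem node table of the storage table, and the corresponding parameters are then stored, accumulated or deleted according to the result, so that the state information of the cluster file system is collected and monitored in real time.
Further, the second storage sub-module 2330 specifically includes: an obtaining unit 2331, a determining unit 2332, a recording unit 2333 and a storage unit 2334.
The obtaining unit 2331 is configured to obtain, according to the first sender address, the first node address of the suspect problem node in the first ring; the determining unit 2332 is configured to determine, according to the first node address, the message number and the first ring label, whether the suspect problem node table contains a second storage record having the first node address, the message number and the first ring label; the recording unit 2333 is configured to increase the message record count of the suspect problem node when the second storage record exists in the suspect problem node table; and the storage unit 2334 is configured to store a first parameter, including the first node address, the message number and the first ring label, in the suspect problem node table when the second storage record does not exist there.
Further, the query module 2300 specifically includes: a second determining sub-module 2340, a third storage sub-module 2350, a third determining sub-module 2360, a fourth determining sub-module 2370, a recording sub-module 2380, a fourth storage sub-module 2390, a fifth determining sub-module 23100 and a deleting sub-module 23110.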
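The second determining sub-module 2340 is configured to determine, when the multicast message is of the node join message type and according to the second ring label and the second sender address, whether the node join message table in the storage table contains a third storage record having the second ring label and the second sender address; the third storage sub-module 2350 is configured to store the second message parameters of the multicast message in the node join message table when the third storage record does not exist there; the third determining sub-module 2360 is configured to determine, when the third storage record exists in the node join message table and according to the node member list, whether the node member list of the multicast message contains added or removed node members compared with the third storage record; the fourth determining sub-module 2370 is configured to, when a node member has been removed from the node member list of the multicast message, obtain the second node address of the removed node member and determine, according to the second node address and the second sender address, whether the leaving node table of the storage table contains a fourth storage record having the second node address and the second sender address; the recording sub-module 2380 is configured to increase the message record count of the removed node member when the fourth storage record exists in the leaving node table; the fourth storage sub-module 2390 is configured to store a second parameter, including the second node address and the second sender address, in the leaving node table when the fourth storage record does not exist there; the fifth determining sub-module 23100 is configured to, when a node member has been added to the node member list of the multicast message, obtain the third node address of the added node member and determine, according to the third node address and the second sender address, whether the leaving node table contains a fifth storage record having the third node address and the second sender address; and the deleting sub-module 23110 is configured to delete the fifth storage record when it exists in the leaving node table.
The second determining sub-module 2340 through the deleting sub-module 23110 included in the query module 2300 handle the case in which the multicast message is of the node join message type: the parameters obtained from the multicast message are compared with the existing data in the node join message table and the leaving node table of the storage table, and the corresponding parameters are then stored, accumulated or deleted according to the result, so that the state information of the cluster file system is collected and monitored in real time.
Specifically, when the multicast message is of the application-layer message type, the second obtaining module 2400 specifically includes: a sixth determining sub-module 2410 and a seventh determining sub-module 2420.
The sixth determining sub-module 2410 is configured to, when the suspect problem node table contains a special suspected problem node whose message record count has reached a preset maximum value, determine whether the preset maximum value is the same as the message record counts of the nodes other than the special suspected problem node, and, when they are the same, determine that the cluster file system is in the cluster-service-busy state; the seventh determining sub-module 2420 is configured to determine that the special suspected problem node is the problem node when the preset maximum value differs from the message record counts of the nodes other than the special suspected problem node.
Correspondingly, when the multicast message is of the node join message type, the second obtaining module 2400 specifically includes: an eighth determining sub-module 2430, a ninth determining sub-module 2440, a tenth determining sub-module 2450 and an eleventh determining sub-module 2460.
The eighth determining sub-module 2430 is configured to determine whether the leaving node table is empty; the ninth determining sub-module 2440 is configured to, when the leaving node table is not empty and contains multiple storage records having the same second node address, determine that the node corresponding to that second node address is the problem node; the tenth determining sub-module 2450 is configured to, when the leaving node table is empty, determine whether the number of storage records in the node join message table that have the smallest second sender address in the second ring has reached a preset value; and the eleventh determining sub-module 2460 is configured to determine that the cluster file system is in the frequent-token-timeout state when that number reaches the preset value.
While performing the different queries and judgments in the storage table according to the type of the multicast message, the determining modules and determining sub-modules described above determine the cluster state and the problem node through the cumulative count of each node over time (the vertical view) and the comparison between nodes (the horizontal view), and so detect and judge the state of the cluster file system in a timely and effective manner.
Still further, the monitoring node also includes:
a third obtaining module 2500, configured to obtain corresponding adjustment parameters according to the problem node and the cluster state.
a multicast module 2600, configured to multicast the adjustment parameters to the cluster nodes, so that the cluster nodes adjust their current configuration according to the adjustment parameters.
Correspondingly, the third obtaining module 2500 is specifically configured to:
when the cluster file system in which the cluster nodes are located is in the cluster-service-busy state, increase the current message transmission window value by a first preset ratio to obtain a new message transmission window value, and reduce the current maximum transmittable information value of each cluster node by a second preset ratio to obtain a new maximum transmittable information value; and, when the cluster file system in which the cluster nodes are located is in the frequent-token-timeout state, increase the token timeout of the cluster nodes by a third preset ratio to obtain a new token timeout.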
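The monitoring node that the present invention adds to the cluster file system overcomes the problems and defects of the prior art in the peer-to-peer architecture of shared-storage cluster file systems, namely the lack of statistical analysis and summary of cluster processing capacity and node failures and the inability to adjust cluster parameters dynamically, and achieves timely detection, discovery and resolution of cluster file system problems.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another.
Although preferred embodiments of the present invention have been described, those skilled in the art may make further changes and modifications to these embodiments once they learn the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that relational terms such as first and second are used here only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variant of them are intended to cover a non-exclusive inclusion, so that a process, method, article or terminal device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article or terminal device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device that includes it.
From the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments and optional implementations, which are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices; they can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they may be implemented in program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from the one given here, or the modules or steps may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. Those skilled in the art may make various modifications and changes to the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Industrial Applicability
As described above, the monitoring method and monitoring node for node communication in a shared-storage cluster file system provided by the embodiments of the present invention have the following beneficial effects: the running state of the cluster is monitored by collecting multicast messages; the state of each node is analyzed statistically from its multicast messages, and a statistical judgment of the cluster state and the problem nodes is given; the processing capacity and stability of the communication service are improved; and device failure notifications are obtained at the earliest moment, so that administrators have a direct view of the communication state of the whole cluster and can learn the condition of the devices promptly, locate the faulty target and work more efficiently, which in turn improves the overall performance of the cluster file system.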

Claims (19)

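  1. A monitoring method for node communication in a shared-storage cluster file system, applied to a monitoring node in the shared-storage cluster file system, comprising:
    receiving a multicast message from a cluster node in the cluster file system, the monitoring node and the cluster node both being located in the cluster file system;
    obtaining the message type of the multicast message and the message parameters corresponding to the message type;
    querying, according to the message parameters, the storage record corresponding to the multicast message in a storage table;
    when a preset time interval elapses, obtaining the problem node and the cluster state of the cluster file system according to the storage records corresponding to the multicast message in the storage table.
  2. The monitoring method for node communication in a shared-storage cluster file system according to claim 1, wherein obtaining the message type of the multicast message and the message parameters corresponding to the message type specifically comprises:
    obtaining the message type of the multicast message;
    when the message type is the application-layer message type, obtaining first message parameters of the multicast message, the first message parameters comprising at least: the message number of the application-layer message corresponding to the multicast message, the first ring label of the first ring in which the node multicasting the multicast message is located, and the first sender address, in the first ring, of the node multicasting the multicast message;
    when the message type is the node join message type, obtaining second message parameters of the multicast message, the second message parameters comprising at least: the second ring label of the second ring in which the node multicasting the multicast message is located, the second sender address, in the second ring, of the node multicasting the multicast message, and the node member list recorded by the node multicasting the multicast message itself.
  3. The monitoring method for node communication in a shared-storage cluster file system according to claim 2, wherein querying, according to the message parameters, the storage record corresponding to the multicast message in the storage table specifically comprises:
    when the multicast message is of the application-layer message type, determining, according to the message number and the first ring label, whether the application-layer message table of the storage table contains a first storage record having the message number and the first ring label;
    when the first storage record does not exist in the application-layer message table, storing the first message parameters of the multicast message in the application-layer message table;
    when the first storage record exists in the application-layer message table, concluding that the node preceding, in the first ring, the node corresponding to the first sender address has lost a message, taking the preceding node as a suspect problem node, and storing the parameters of the suspect problem node in the suspect problem node table of the storage table.
  4. The monitoring method for node communication in a shared-storage cluster file system according to claim 3, wherein taking the preceding node as a suspect problem node and storing the parameters of the suspect problem node in the suspect problem node table of the storage table specifically comprises:
    obtaining, according to the first sender address, the first node address of the suspect problem node in the first ring;
    determining, according to the first node address, the message number and the first ring label, whether the suspect problem node table contains a second storage record having the first node address, the message number and the first ring label;
    when the second storage record exists in the suspect problem node table, increasing the message record count of the suspect problem node;
    when the second storage record does not exist in the suspect problem node table, storing a first parameter, comprising the first node address, the message number and the first ring label, in the suspect problem node table.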
  5. 根据权利要求2所述的共享存储式集群文件系统节点通信的监控方法,其中,所述根据所述消息参数,查询存储表中与所述组播消息相对应的存储记录,具体包括:
    当所述组播消息为节点加入消息类型时,根据所述第二环标记与所述第二发送者地址,判断所述存储表中的节点加入消息表中是否存 在具有所述第二环标号及所述第二发送者地址的第三存储记录;
    当所述节点加入消息表中不存在所述第三存储记录时,存储所述组播消息的第二消息参数至所述节点加入消息表;
    当所述节点加入消息表中存在所述第三存储记录时,根据所述节点成员列表,判断与所述第三存储记录相比,所述组播消息的节点成员列表中是否存在增加或减少的节点成员;
    当所述组播消息的节点成员列表中存在减少的节点成员时,获取所述减少的节点成员的第二节点地址,根据所述第二节点地址及所述第二发送者地址,判断所述存储表的离开节点表中是否存在具有所述第二节点地址及所述第二发送者地址的第四存储记录;
    当所述离开节点表中存在所述第四存储记录时,增加所述减少的节点成员的消息记录次数;
    当所述离开节点表中不存在所述第四存储信息时,存储包括所述第二节点地址、所述第二发送者地址的第二参数至所述离开节点表;
    当所述组播消息的节点成员列表中存在增加的节点成员时,获取所述增加的节点成员的第三节点地址,根据所述第三节点地址及所述第二发送者地址,判断所述离开节点表中是否存在具有所述第三节点地址及所述第二发送者地址的第五存储记录;
    当所述离开节点表中存在所述第五存储记录时,删除所述第五存储记录。
  6. 根据权利要求3所述的共享存储式集群文件系统节点通信的监控方法,其中,所述在一预设时间间隔到达时,根据所述存储表中与所述组播消息相对应的存储记录,获取所述集群文件系统的问题节点和集群状态,具体包括:
    所述疑似问题节点表中,当所述疑似问题节点中存在有消息记录次数达到预设最大值的特别疑似问题节点时,判断所述预设最大值与除所述特别疑似问题节点外的其他节点的消息记录次数是否相同;
    当预设最大值与除所述特别疑似问题节点外的其他节点的消息记录次数相同时,判断所述集群文件系统为集群业务繁忙状态;
    当预设最大值与除所述特别疑似问题节点外的其他节点的消息记录次数不同时,判断所述特别疑似问题节点为所述问题节点。
  7. 根据权利要求5所述的共享存储式集群文件系统节点通信的监控方法,其中,所述在一预设时间间隔到达时,根据所述存储表中与所述组播消息相对应的存储记录,获取所述集群文件系统的问题节点和集群状态,具体包括:
    判断所述离开节点表是否为空;
    当所述离开节点表不为空,且所述离开节点表中,存在多个具有相同的所述第二节点地址的存储记录时,判断与所述相同的所述第二节点地址相对应的节点为所述问题节点;
    当所述离开节点表为空,判断所述节点加入消息表中,具有所述第二环中最小的所述第二发送者地址的存储记录的数量是否达到预设值;
    当所述节点加入消息表中,具有所述第二环中最小的所述第二发送者地址的存储记录的数量达到预设值时,判断所述集群文件系统为令牌超时频繁状态。
  8. 根据权利要求1所述的共享存储式集群文件系统节点通信的监控方法,其中,所述监控方法还包括:
    根据所述问题节点和集群状态获取对应的调整参数;
    将所述调整参数组播至所述集群节点,以使所述集群节点根据所述调整参数调整当前自身配置。
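  9. The method for monitoring node communication in a shared-storage cluster file system according to claim 8, wherein obtaining corresponding adjustment parameters according to the problem node and the cluster state specifically comprises:
    when the cluster file system in which the cluster nodes are located is in the cluster-busy state, increasing the current message transmission window value by a first preset factor to obtain a new message transmission window value, and decreasing the current maximum transmittable message value of each cluster node by a second preset factor to obtain a new maximum transmittable message value;
    when the cluster file system in which the cluster nodes are located is in the frequent-token-timeout state, increasing the token timeout of the cluster nodes by a third preset factor to obtain a new token timeout.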
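The parameter adjustment of claim 9 amounts to scaling three configuration values. The sketch below is illustrative only: the `RingConfig` field names, the state labels, and the default factors (2.0, 0.5, 2.0) are assumptions, since the claims speak only of "preset factors".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RingConfig:
    window_size: int        # message transmission window value
    max_messages: int       # maximum transmittable messages per node
    token_timeout_ms: int   # token timeout


def adjust(config, state, window_factor=2.0, max_msg_factor=0.5,
           timeout_factor=2.0):
    """Return a new configuration for the given cluster state."""
    if state == "busy":
        # Widen the window, but let each node send less per token visit.
        return replace(
            config,
            window_size=int(config.window_size * window_factor),
            max_messages=max(1, int(config.max_messages * max_msg_factor)),
        )
    if state == "token_timeout_frequent":
        # Give the token more time before the ring is declared broken.
        return replace(
            config,
            token_timeout_ms=int(config.token_timeout_ms * timeout_factor),
        )
    return config
```

The monitoring node would then multicast the returned values to the cluster nodes, as described in claim 8; the transport itself is outside this sketch.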
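  10. A monitoring node in a shared-storage cluster file system, comprising:
    a first receiving module, configured to receive a multicast message from a cluster node in the cluster file system, the monitoring node and the cluster node both being located in the cluster file system;
    a first obtaining module, configured to obtain a message type of the multicast message and message parameters corresponding to the message type;
    a query module, configured to query, according to the message parameters, a storage record in a storage table corresponding to the multicast message;
    a second obtaining module, configured to obtain, when a preset time interval elapses, a problem node and a cluster state of the cluster file system according to the storage records in the storage table corresponding to the multicast message.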
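  11. The monitoring node in a shared-storage cluster file system according to claim 10, wherein the first obtaining module is specifically configured to:
    obtain the message type of the multicast message;
    when the message type is the application-layer message type, obtain first message parameters of the multicast message, the first message parameters comprising at least: the message number of the application-layer message corresponding to the multicast message, a first ring identifier of a first ring in which the node multicasting the multicast message is located, and a first sender address, in the first ring, of the node multicasting the multicast message;
    when the message type is the node-join message type, obtain second message parameters of the multicast message, the second message parameters comprising at least: a second ring identifier of a second ring in which the node multicasting the multicast message is located, a second sender address, in the second ring, of the node multicasting the multicast message, and a node member list recorded by the node multicasting the multicast message itself.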
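  12. The monitoring node in a shared-storage cluster file system according to claim 11, wherein the query module specifically comprises:
    a first determining submodule, configured to determine, when the multicast message is of the application-layer message type and according to the message number and the first ring identifier, whether an application-layer message table of the storage table contains a first storage record having the message number and the first ring identifier;
    a first storing submodule, configured to store, when the first storage record does not exist in the application-layer message table, the first message parameters of the multicast message in the application-layer message table;
    a second storing submodule, configured to determine, when the first storage record exists in the application-layer message table, that message loss has occurred at the node that precedes, in the first ring, the node corresponding to the first sender address, conclude that the preceding node is a suspected problem node, and store parameters of the suspected problem node in a suspected-problem-node table of the storage table.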
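  13. The monitoring node in a shared-storage cluster file system according to claim 12, wherein the second storing submodule specifically comprises:
    an obtaining unit, configured to obtain, according to the first sender address, a first node address of the suspected problem node in the first ring;
    a determining unit, configured to determine, according to the first node address, the message number, and the first ring identifier, whether the suspected-problem-node table contains a second storage record having the first node address, the message number, and the first ring identifier;
    a recording unit, configured to increment, when the second storage record exists in the suspected-problem-node table, the message record count of the suspected problem node;
    a storage unit, configured to store, when the second storage record does not exist in the suspected-problem-node table, a first parameter comprising the first node address, the message number, and the first ring identifier in the suspected-problem-node table.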
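  14. The monitoring node in a shared-storage cluster file system according to claim 11, wherein the query module specifically comprises:
    a second determining submodule, configured to determine, when the multicast message is of the node-join message type and according to the second ring identifier and the second sender address, whether a node-join message table of the storage table contains a third storage record having the second ring identifier and the second sender address;
    a third storing submodule, configured to store, when the third storage record does not exist in the node-join message table, the second message parameters of the multicast message in the node-join message table;
    a third determining submodule, configured to determine, when the third storage record exists in the node-join message table and according to the node member list, whether the node member list of the multicast message contains added or removed node members compared with the third storage record;
    a fourth determining submodule, configured to obtain, when the node member list of the multicast message contains a removed node member, a second node address of the removed node member, and determine, according to the second node address and the second sender address, whether a departed-node table of the storage table contains a fourth storage record having the second node address and the second sender address;
    a recording submodule, configured to increment, when the fourth storage record exists in the departed-node table, the message record count of the removed node member;
    a fourth storing submodule, configured to store, when the fourth storage record does not exist in the departed-node table, a second parameter comprising the second node address and the second sender address in the departed-node table;
    a fifth determining submodule, configured to obtain, when the node member list of the multicast message contains an added node member, a third node address of the added node member, and determine, according to the third node address and the second sender address, whether the departed-node table contains a fifth storage record having the third node address and the second sender address;
    a deleting submodule, configured to delete the fifth storage record when the fifth storage record exists in the departed-node table.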
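  15. The monitoring node in a shared-storage cluster file system according to claim 12, wherein the second obtaining module specifically comprises:
    a sixth determining submodule, configured to determine, in the suspected-problem-node table, when the suspected problem nodes include a highly suspected problem node whose message record count has reached a preset maximum, whether the preset maximum is the same as the message record counts of the nodes other than the highly suspected problem node, and to determine, when the preset maximum is the same as the message record counts of the nodes other than the highly suspected problem node, that the cluster file system is in a cluster-busy state;
    a seventh determining submodule, configured to determine, when the preset maximum differs from the message record counts of the nodes other than the highly suspected problem node, that the highly suspected problem node is the problem node.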
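  16. The monitoring node in a shared-storage cluster file system according to claim 14, wherein the second obtaining module specifically comprises:
    an eighth determining submodule, configured to determine whether the departed-node table is empty;
    a ninth determining submodule, configured to determine, when the departed-node table is not empty and contains multiple storage records having the same second node address, that the node corresponding to that same second node address is the problem node;
    a tenth determining submodule, configured to determine, when the departed-node table is empty, whether the number of storage records in the node-join message table that have the smallest second sender address in the second ring reaches a preset value;
    an eleventh determining submodule, configured to determine, when the number of storage records in the node-join message table that have the smallest second sender address in the second ring reaches the preset value, that the cluster file system is in a frequent-token-timeout state.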
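  17. The monitoring node in a shared-storage cluster file system according to claim 10, wherein the monitoring node further comprises:
    a third obtaining module, configured to obtain corresponding adjustment parameters according to the problem node and the cluster state;
    a multicast module, configured to multicast the adjustment parameters to the cluster nodes, so that the cluster nodes adjust their own current configuration according to the adjustment parameters.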
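  18. The monitoring node in a shared-storage cluster file system according to claim 17, wherein the third obtaining module is specifically configured to:
    when the cluster file system in which the cluster nodes are located is in the cluster-busy state, increase the current message transmission window value by a first preset factor to obtain a new message transmission window value, and decrease the current maximum transmittable message value of each cluster node by a second preset factor to obtain a new maximum transmittable message value;
    when the cluster file system in which the cluster nodes are located is in the frequent-token-timeout state, increase the token timeout of the cluster nodes by a third preset factor to obtain a new token timeout.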
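  19. A computer storage medium, configured to store a computer program for executing the method for monitoring node communication in a shared-storage cluster file system according to any one of claims 1 to 9.
PCT/CN2016/106412 2015-11-18 2016-11-18 Method for monitoring node communication in a shared-storage cluster file system, and monitoring node WO2017084618A1 (zh)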

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510793859.7A CN106713398A (zh) 2015-11-18 2015-11-18 Method for monitoring node communication in a shared-storage cluster file system, and monitoring node
CN201510793859.7 2015-11-18

Publications (1)

Publication Number Publication Date
WO2017084618A1 true WO2017084618A1 (zh) 2017-05-26

Family

ID=58717365

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/106412 WO2017084618A1 (zh) 2015-11-18 2016-11-18 Method for monitoring node communication in a shared-storage cluster file system, and monitoring node

Country Status (2)

Country Link
CN (1) CN106713398A (zh)
WO (1) WO2017084618A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
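CN111090502A (zh) * 2018-10-24 2020-05-01 阿里巴巴集团控股有限公司 Stream data task scheduling method and apparatus
CN112104567A (zh) * 2020-09-03 2020-12-18 中国银联股份有限公司 Flow control method, apparatus, device, and medium
CN114338612A (zh) * 2021-12-22 2022-04-12 威创集团股份有限公司 Method, system, device, and storage medium for dynamic allocation of multicast addresses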

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107171900A (zh) * 2017-07-25 2017-09-15 郑州云海信息技术有限公司 Method and system for obtaining node running status
CN109104299B (zh) * 2018-07-11 2021-12-07 新华三技术有限公司成都分公司 Method and apparatus for reducing cluster oscillation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
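CN1945539A (zh) * 2006-10-19 2007-04-11 华为技术有限公司 Method for allocating shared resource locks in a computer cluster system, computer, and cluster system
CN102880506A (zh) * 2012-09-10 2013-01-16 曙光信息产业(北京)有限公司 Application job control system based on a job scheduling system, and control method thereof
US20130139219A1 (en) * 2011-11-28 2013-05-30 Hangzhou H3C Technologies Co., Ltd. Method of fencing in a cluster system
CN103440160A (zh) * 2013-08-15 2013-12-11 华为技术有限公司 Virtual machine recovery method, virtual machine migration method, apparatus, and system
CN104065741A (zh) * 2014-07-04 2014-09-24 用友软件股份有限公司 Data acquisition system and data acquisition method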

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
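CN1945539A (zh) * 2006-10-19 2007-04-11 华为技术有限公司 Method for allocating shared resource locks in a computer cluster system, computer, and cluster system
US20130139219A1 (en) * 2011-11-28 2013-05-30 Hangzhou H3C Technologies Co., Ltd. Method of fencing in a cluster system
CN102880506A (zh) * 2012-09-10 2013-01-16 曙光信息产业(北京)有限公司 Application job control system based on a job scheduling system, and control method thereof
CN103440160A (zh) * 2013-08-15 2013-12-11 华为技术有限公司 Virtual machine recovery method, virtual machine migration method, apparatus, and system
CN104065741A (zh) * 2014-07-04 2014-09-24 用友软件股份有限公司 Data acquisition system and data acquisition method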

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
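CN111090502A (zh) * 2018-10-24 2020-05-01 阿里巴巴集团控股有限公司 Stream data task scheduling method and apparatus
CN111090502B (zh) * 2018-10-24 2024-05-17 阿里巴巴集团控股有限公司 Stream data task scheduling method and apparatus
CN112104567A (zh) * 2020-09-03 2020-12-18 中国银联股份有限公司 Flow control method, apparatus, device, and medium
CN112104567B (zh) * 2020-09-03 2022-11-18 中国银联股份有限公司 Flow control method, apparatus, device, and medium
CN114338612A (zh) * 2021-12-22 2022-04-12 威创集团股份有限公司 Method, system, device, and storage medium for dynamic allocation of multicast addresses
CN114338612B (zh) * 2021-12-22 2023-03-24 威创集团股份有限公司 Method, system, device, and storage medium for dynamic allocation of multicast addresses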

Also Published As

Publication number Publication date
CN106713398A (zh) 2017-05-24

Similar Documents

Publication Publication Date Title
WO2017084618A1 (zh) Method for monitoring node communication in a shared-storage cluster file system, and monitoring node
US8539088B2 (en) Session monitoring method, apparatus, and system based on multicast technologies
US10305746B2 (en) Network insights
US8135979B2 (en) Collecting network-level packets into a data structure in response to an abnormal condition
US10333724B2 (en) Method and system for low-overhead latency profiling
WO2016127884A1 (zh) Message push method and apparatus
US10516545B2 (en) Congestion management in a multicast communication network
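CN113438129B (zh) Data acquisition method and apparatus
WO2022062407A1 (zh) Link monitoring method and apparatus, storage medium, and electronic apparatus
CN103312593B (zh) Message distribution system and method
CN109257335B (zh) Method for maintaining a back-to-source link, back-to-source method, related apparatus, and storage medium
JP2008059114A (ja) Automatic network monitoring system using SNMP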
US10931529B2 (en) Terminal device management method, server, and terminal device for managing terminal devices in local area network
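WO2012171168A1 (zh) Method, device, and system for monitoring an indoor coverage network
JP2019525292A (ja) Hot live video determination method and apparatus
CN101442474A (zh) Bootstrap router and method and system for timeout management
JP2008306435A (ja) Packet relay apparatus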
US11914495B1 (en) Evaluating machine and process performance in distributed system
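CN112583659A (zh) Method, apparatus, terminal device, and storage medium for detecting the network status of a video networking system
KR20200007912A (ko) Method, apparatus, and system for monitoring data traffic
CN112000544A (zh) Real-time large-screen monitoring method for Internet of Things devices
CN116723081A (zh) Packet loss optimization method and apparatus
WO2013189421A2 (zh) Distributed call detail record statistics method, apparatus, and system
CN106230658A (zh) Method and apparatus for monitoring network devices
CN105592485A (zh) Method for collecting and processing messages in real time based on the SNMP network management protocol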

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16865795

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16865795

Country of ref document: EP

Kind code of ref document: A1