
CN115314361B - Server cluster management method and related components thereof

Info

Publication number: CN115314361B
Application number: CN202210939359.XA
Authority: CN
Inventors: 耿元
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2022-08-05
Filing date: 2022-08-05
Publication date: 2023-08-22
Anticipated expiration: 2042-08-05
Also published as: CN115314361A

Abstract

The application discloses a server cluster management method and related components thereof, in the field of cluster management. The method is applied to a target server, which may be any server in a server cluster. The target server acquires working parameter information of an NFS server, the working parameter information comprising one or both of network state parameters and hardware parameters; judges from that information whether the NFS server has failed; and, if the NFS server has failed, takes over as the new current NFS server. Because the target server replaces the failed NFS server and works as the current NFS server, loss of data files can be effectively avoided without deploying a plurality of servers dedicated to the NFS function, which reduces economic cost and avoids wasting server resources.

Description

Server cluster management method and related components thereof

Technical Field

The present application relates to the field of cluster management, and in particular, to a server cluster management method and related components thereof.

Background

An NFS (Network File System) server is an important component of a server cluster. A server management system manages all common servers in the server cluster through the NFS server, and the NFS server stores the files generated by the server management system and by each common server during management, so as to keep file data consistent within the same local area network. When the NFS server fails, however, data files exchanged between the server management system and the other servers may be lost, which may in turn cause economic loss and risk. To avoid file loss, the prior art usually deploys a plurality of servers dedicated to the NFS function in a server cluster, but this approach has a high economic cost and wastes server resources.

Disclosure of Invention

The application aims to provide a server cluster management method and related components thereof, which can effectively avoid the situation of data file loss, and a plurality of NFS servers specially used for the NFS function are not required to be arranged, so that the economic cost is reduced, and the waste of server resources is avoided.

In order to solve the above technical problems, the present application provides a server cluster management method, which is applied to a target server in a server cluster, where the target server is any one server in the server cluster, and the server cluster management method includes:

acquiring working parameter information of an NFS server, wherein the working parameter information comprises one or two of network state parameters and hardware parameters;

judging whether the NFS server fails according to the working parameter information of the NFS server;

and if the NFS server fails, taking the target server as a new current NFS server.

Preferably, after the target server is used as a new current NFS server, the method further includes:

when an executable file is received, determining an execution mode of the executable file, wherein the execution mode comprises an out-of-band mode and an in-band mode;

when the execution mode of the executable file is an out-of-band mode, generating a first result file corresponding to the executable file;

and when the execution mode of the executable file is an in-band mode, the executable file is sent to N servers in the server cluster, so that N servers in the server cluster generate N second result files according to the executable file, and N is a positive integer.

Preferably, after generating the first result file corresponding to the executable file, the method further includes:

storing the first result file into a storage space of the target server;

after the N servers in the server cluster generate N second result files according to the executable file, the method further includes:

and acquiring N second result files, and storing the N second result files in the storage space of the target server.

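Preferably, after the target server is used as a new current NFS server, the method further includes:

when the NFS server is detected to resume normal operation, taking the NFS server as the new current NFS server;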

and sending all the first result files and all the second result files stored by the target server to the NFS server.

Preferably, sending all the first result files and all the second result files stored by the target server to the NFS server includes:

determining a fault starting time when the NFS server fails;

determining the fault end time when the NFS server is detected to recover normal operation;

and sending all the first result files and all the second result files acquired by the target server in a time period from the fault starting time to the fault ending time to the NFS server.

Preferably, before acquiring the working parameter information of the NFS server, the method further includes:

judging whether the target server and the NFS server are in the same network segment;

and if they are in the same network segment, proceeding to the step of acquiring the working parameter information of the NFS server.

Preferably, if the NFS server does not fail, the method further includes:

when the NFS server sends or receives a data file, acquiring the data file;

and storing the data file into a storage space of the target server.

The application also provides a server cluster management system, which comprises:

a parameter obtaining unit, configured to obtain working parameter information of the NFS server, where the working parameter information includes one or two of a network state parameter and a hardware parameter;

the judging unit is used for judging whether the NFS server fails according to the working parameter information of the NFS server, and triggering the executing unit if the NFS server fails;

the execution unit is configured to use the target server as a new current NFS server.

The application also provides a server cluster management device, which comprises:

a memory for storing a computer program;

and the processor is used for realizing the steps of the server cluster management method when executing the computer program.

The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of a server cluster management method as described above.

The application provides a server cluster management method and related components thereof, in the field of cluster management. The method is applied to a target server, which may be any server in a server cluster. The target server acquires working parameter information of an NFS (Network File System) server, the working parameter information comprising one or both of network state parameters and hardware parameters; judges from that information whether the NFS server has failed; and, if the NFS server has failed, takes over as the new current NFS server. Because the target server replaces the failed NFS server and works as the current NFS server, loss of data files can be effectively avoided without deploying a plurality of servers dedicated to the NFS function, which reduces economic cost and avoids wasting server resources.

Drawings

To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed to describe the prior art and the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.

FIG. 1 is a flowchart of a server cluster management method provided by the present application;

FIG. 2 is a schematic structural diagram of a server cluster according to the present application;

FIG. 3 is a schematic structural diagram of a server cluster management system according to the present application;

FIG. 4 is a schematic structural diagram of a server cluster management device provided by the present application.

Detailed Description

The core of the application is to provide a server cluster management method and related components thereof, which can effectively avoid the situation of data file loss, and a plurality of NFS servers specially used for the NFS function are not required to be arranged, so that the economic cost is reduced, and the waste of server resources is avoided.

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Referring to fig. 1, fig. 1 is a flowchart of a server cluster management method provided by the present application, which is applied to a target server in a server cluster, wherein the target server is any server in the server cluster, and the server cluster management method includes:

S1: acquiring working parameter information of the NFS server, wherein the working parameter information comprises one or both of network state parameters and hardware parameters;

When a server cluster is managed (see FIG. 2, a schematic structural diagram of a server cluster provided by the present application), the server management system is generally connected to the NFS server, and the NFS server is connected to all other common servers in the server cluster, so that the server management system can manage all the common servers in the cluster. Specifically, the server management system may itself be built from a common server and stores various executable files. A user sends executable files to the server management system and then operates it so that it sends the executable files to the NFS server and the common servers, thereby managing the common servers. The server management system also has a file transfer function, through which the server cluster communicates with external devices. The NFS server may likewise be built from a server, and its main function is to ensure data consistency in the server cluster. After the server management system sends executable files to the NFS server and the common servers, the result files generated by these servers are stored in the NFS server. Consequently, when the NFS server fails, the server cluster loses the medium that stores the result files, which may make the data of the server cluster inconsistent in subsequent use.

Based on this, a certain server in the server cluster may be defined as a target server in advance, where the server is a client server of the NFS server, and when the NFS server works normally, the target server may acquire the working parameter information of the NFS server in real time so that the target server determines whether the NFS server fails.

S2: judging whether the NFS server fails according to the working parameter information of the NFS server;

To detect in real time whether the NFS server has failed, the target server makes the judgment from the working parameter information it acquires. The network state parameters include the IP address and the communication interface of the NFS server; a change in the IP address or the interface may indicate that the connection between the NFS server and the server management system, or between the NFS server and the server cluster, has been broken. The hardware parameters may include state information of hardware inside the NFS server, such as the CPU (Central Processing Unit), memory, cache, and hard disk; this state information shows whether any of the hardware has failed or been damaged, and hence whether the NFS server has failed.
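The judgment in S2 can be illustrated with a minimal Python sketch. The `WorkingParameters` schema, its field names, and the rule that any change from a recorded baseline counts as a failure are assumptions made for illustration; the application does not specify a data format or a threshold.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WorkingParameters:
    """Snapshot of the NFS server's working parameters (hypothetical schema)."""
    ip_address: str
    interface: str
    # Maps a hardware component name (CPU, memory, ...) to True if healthy.
    hardware_ok: Dict[str, bool] = field(default_factory=dict)


def nfs_server_failed(baseline: WorkingParameters, current: WorkingParameters) -> bool:
    """Return True if the current snapshot suggests the NFS server has failed."""
    # A changed IP address or interface is treated as a broken connection.
    if current.ip_address != baseline.ip_address:
        return True
    if current.interface != baseline.interface:
        return True
    # Any component reporting an unhealthy state counts as a hardware failure.
    return not all(current.hardware_ok.values())
```

A caller would record a baseline while the NFS server is known to be healthy and compare each later snapshot against it.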

S3: if the NFS server fails, the target server is taken as a new current NFS server.

After the NFS server fails, the servers in the server cluster and the server management system are disconnected from it. To keep the NFS function available, the target server acts as the new NFS server and connects to the servers in the server cluster and to the server management system. The target server thus takes over all the original functions of the failed NFS server, which maintains continuity of the NFS function and avoids loss of data files.

In summary, the target server acquires the working parameter information of the NFS server, where the working parameter information includes one or two of a network state parameter and a hardware parameter, and then determines whether the NFS server fails according to the working parameter information of the NFS server, and if the NFS server fails, the target server is used as a new current NFS server. When the NFS server fails, the target server is used for replacing the NFS server to work as the current NFS server, so that the situation that the data file is lost can be effectively avoided, a plurality of NFS servers specially used for the NFS function are not required to be arranged, the economic cost is reduced, and the waste of server resources is avoided.
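Steps S1 to S3 can be sketched as one monitoring pass. This is only an illustrative sketch: the `TargetServer` class, the `acquire` and `has_failed` callables, and the server names are hypothetical, and how the role change is announced to the rest of the cluster is left out.

```python
from typing import Callable


class TargetServer:
    """Hypothetical target server that can stand in for the NFS server."""

    def __init__(self, name: str, nfs_name: str) -> None:
        self.name = name
        self.nfs_name = nfs_name
        self.current_nfs = nfs_name  # which server currently provides the NFS function

    def check_once(
        self,
        acquire: Callable[[], dict],
        has_failed: Callable[[dict], bool],
    ) -> str:
        """Run S1 to S3 once and return the name of the current NFS server."""
        params = acquire()                # S1: acquire working parameter information
        if has_failed(params):            # S2: judge whether the NFS server has failed
            self.current_nfs = self.name  # S3: the target becomes the current NFS server
        return self.current_nfs
```

In practice `check_once` would be called periodically, with `acquire` wrapping whatever mechanism collects the parameters.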

Based on the above embodiments:

as a preferred embodiment, after the target server is the new current NFS server, the method further includes:

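when an executable file is received, determining an execution mode of the executable file, wherein the execution mode comprises an out-of-band mode and an in-band mode;

when the execution mode of the executable file is the out-of-band mode, generating a first result file corresponding to the executable file;

when the execution mode of the executable file is the in-band mode, sending the executable file to N servers in the server cluster, so that the N servers in the server cluster generate N second result files according to the executable file, N being a positive integer.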

To improve cluster management efficiency, the present application considers that management is generally divided into an out-of-band mode and an in-band mode. The mode is determined by factors such as whether the corresponding executable file occupies network bandwidth and whether it occupies a server interface when a server executes it: an executable file that occupies network bandwidth and a server interface is executed in the in-band mode, whereas an executable file that does not use network bandwidth and occupies a server interface is executed in the out-of-band mode. An in-band executable file therefore occupies some resources, and when it contains a large amount of data it may degrade the performance of the server that executes it. If the current NFS server executed in-band executable files itself, its performance could suffer, because it must also communicate with the server management system and with every server, and the performance of the whole server cluster could suffer in turn. Accordingly, the current NFS server executes an executable file itself when the file is in the out-of-band mode, and other servers execute the file when it is in the in-band mode, which improves cluster management efficiency.
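The dispatch rule described above can be sketched as follows. The `ExecutionMode` enum, the `handle_executable` function, and the use of plain callables to stand in for local execution and for the N other servers are illustrative assumptions; the application does not say how the mode is encoded or how files are transported.

```python
from enum import Enum
from typing import Callable, List


class ExecutionMode(Enum):
    OUT_OF_BAND = "out-of-band"
    IN_BAND = "in-band"


def handle_executable(
    executable: str,
    mode: ExecutionMode,
    run_locally: Callable[[str], str],
    workers: List[Callable[[str], str]],
) -> List[str]:
    """Return the result files produced for one executable file."""
    if mode is ExecutionMode.OUT_OF_BAND:
        # The current NFS server executes the file itself: one first result file.
        return [run_locally(executable)]
    # In-band: forward to N other servers, each producing a second result file.
    return [worker(executable) for worker in workers]
```

Keeping in-band work off the current NFS server mirrors the stated goal of not degrading the server that every other node depends on.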

As a preferred embodiment, after generating the first result file corresponding to the executable file, the method further includes:

storing the first result file into a storage space of the target server;

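after the N servers in the server cluster generate N second result files according to the executable file, the method further includes:

acquiring the N second result files, and storing the N second result files in the storage space of the target server.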

To ensure data consistency in the server cluster, the present application considers that in practice two servers may communicate with each other, either by means of an executable file, an instruction file, or a task, or by means of a previously generated result file; in the latter case, the result file used in the communication must be the same file as the one that was generated. Accordingly, when the current NFS server receives an executable file in the out-of-band mode, it generates the first result file of the executable file, returns the first result file to the server management system, and also stores it. When other servers in the server cluster generate second result files for an executable file in the in-band mode, the current NFS server returns the second result files sent by those servers to the server management system and also stores them. Because the first and second result files are stored, other servers can later retrieve them through the current NFS server, which ensures data consistency in the server cluster.
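A minimal sketch of this "return and also store" behaviour is given below. The function name `relay_and_store`, the in-memory dictionary used as the target server's storage space, and the `return_to_management` callback are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict


def relay_and_store(
    name: str,
    content: bytes,
    storage: Dict[str, bytes],
    return_to_management: Callable[[str, bytes], None],
) -> None:
    """Return a result file to the management system and keep a local copy."""
    return_to_management(name, content)  # the management system still receives the file
    storage[name] = content              # the stored copy can later be served to other servers
```

The same function would be used for a first result file generated locally and for each second result file received from another server.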

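As a preferred embodiment, after the target server is taken as the new current NFS server, the method further includes:

when the NFS server is detected to resume normal operation, taking the NFS server as the new current NFS server;

sending all the first result files and all the second result files stored by the target server to the NFS server.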

To ensure data synchronization, the present application considers that the NFS server is dedicated to implementing the NFS function, whereas the target server can be regarded as a client of the NFS server or as its stand-in. While the NFS server is normal, it should be the server that primarily implements the NFS function. Therefore, once the failure of the NFS server is repaired, the NFS server again becomes the server that implements the NFS function, that is, it resumes its original work, and the target server that had been implementing the NFS function returns to its role as a client or stand-in. Because the NFS server must be repaired after a failure and cannot communicate with other servers during the repair, it cannot obtain data files such as executable files and result files in that period. Meanwhile, the target server performs the original function of the NFS server and therefore receives and stores the result files. After the NFS server recovers from the failure, the target server sends the result files it has stored to the NFS server, which supplies the result files that the NFS server could not receive because of the failure and keeps the data of the NFS server and the target server synchronized.
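The hand-back described above can be sketched as follows. The `FailoverRole` class, its attribute names, and the `send` callback are assumptions for illustration only; the transfer mechanism between the target server and the NFS server is not specified in the application.

```python
from typing import Callable, Dict


class FailoverRole:
    """Tracks which server currently provides the NFS function (hypothetical)."""

    def __init__(self, target_name: str, nfs_name: str) -> None:
        self.target_name = target_name
        self.nfs_name = nfs_name
        self.current_nfs = target_name  # the target took over after the failure
        self.stored_results: Dict[str, bytes] = {}

    def on_nfs_recovered(self, send: Callable[[str, bytes], None]) -> int:
        """Hand the NFS role back and push every stored result file."""
        self.current_nfs = self.nfs_name
        for name, content in self.stored_results.items():
            send(name, content)
        return len(self.stored_results)
```

The return value is simply the number of files pushed, which a caller could log or check.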

As a preferred embodiment, sending all the first result files and all the second result files stored by the target server itself to the NFS server includes:

determining a fault starting time when the NFS server fails;

determining a fault end time when the NFS server is detected to resume normal operation;

sending all the first result files and all the second result files acquired by the target server in the time period from the fault start time to the fault end time to the NFS server.

To improve efficiency, the present application considers that the result files stored in the target server may include files stored before the current failure, for example result files stored during a previous failure or when the user made a file backup. Result files stored before the failure are usually already stored in the NFS server. If the target server also sent these earlier result files to the NFS server, the transfer would take longer and could mistakenly overwrite files of the same name stored in the NFS server. Accordingly, the target server may record the fault start time and the fault end time of the current failure of the NFS server to determine the fault period, and send to the NFS server only the first result files and second result files stored during that period. This still supplies the result files that the NFS server could not receive because of the failure, while shortening the transfer time, improving efficiency, and avoiding overwriting same-name files in the NFS server.
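Selecting only the result files from the fault period amounts to a timestamp filter, sketched below. The `StoredResult` record, the `stored_at` field, and the choice of an inclusive window at both ends are assumptions; the application only states that files acquired between the fault start time and the fault end time are sent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class StoredResult:
    """A result file held by the target server (hypothetical record)."""
    name: str
    stored_at: datetime


def results_in_fault_window(
    results: Iterable[StoredResult],
    fault_start: datetime,
    fault_end: datetime,
) -> List[StoredResult]:
    """Keep only the result files stored during the fault period (inclusive)."""
    return [r for r in results if fault_start <= r.stored_at <= fault_end]
```

Files stored before `fault_start` are skipped, so older same-name files on the NFS server are never touched by the transfer.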

As a preferred embodiment, before acquiring the operating parameter information of the NFS server, the method further includes:

judging whether the target server and the NFS server are in the same network segment;

if they are in the same network segment, proceeding to the step of acquiring the working parameter information of the NFS server.

To ensure communication efficiency, the present application considers that servers usually communicate over a network, which may be a wide area network or a local area network. Over a wide area network, the communication efficiency and quality between servers are affected by factors such as network fluctuation and network speed; large fluctuations or a low speed reduce them, which lowers the efficiency with which the server management system manages the server cluster. Over a local area network, servers are less affected by external factors, so higher communication efficiency and quality can be ensured. Accordingly, it can be judged whether the target server and the NFS server are in the same network segment. If they are, the NFS server and the target server are in the same local area network, and when the NFS server fails and the target server replaces it, the effect is equivalent to using an NFS server in the same local area network. If they are not, the local area network corresponding to the server management system may be determined from the IP address of the server management system, and idle IP addresses in that local area network may then be allocated to the NFS server and the server cluster, so that the NFS server, the server cluster, and the server management system are placed in the same network segment. Judging whether the target server and the NFS server are in the same network segment therefore ensures communication efficiency.
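One way to perform the same-segment check is with Python's standard `ipaddress` module, as in the sketch below. The function name `in_same_segment` and the default /24 prefix length are assumptions; the application does not state how the network segment is defined or compared.

```python
import ipaddress


def in_same_segment(target_ip: str, nfs_ip: str, prefix_len: int = 24) -> bool:
    """Return True if both addresses fall in the same network segment.

    The segment is derived from the target server's address and an assumed
    prefix length (a /24 by default).
    """
    target_net = ipaddress.ip_interface(f"{target_ip}/{prefix_len}").network
    return ipaddress.ip_address(nfs_ip) in target_net
```

For example, `in_same_segment("192.168.1.20", "192.168.1.10")` is true, whereas an NFS server at `192.168.2.10` would fail the check under the default /24 assumption.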

As a preferred embodiment, if the NFS server does not fail, further comprising:

when the NFS server sends or receives the data file, acquiring the data file;

the data file is stored to the storage space of the target server itself.

To ensure data synchronization, the present application considers that the NFS server is dedicated to implementing the NFS function, whereas the target server can be regarded as a client of the NFS server or as its stand-in. Although the NFS server is the server that primarily implements the NFS function while it is normal, the target server must replace it when it fails. The NFS server stores the executable files that the server management system sends to the server cluster and the result files that the server cluster or the NFS server sends to the server management system, and it achieves data consistency in the server cluster by storing these files. Accordingly, while the NFS server has not failed, the target server acquires each data file that the NFS server sends or receives and stores it in its own storage space, so that the target server already holds the same data if it later has to replace the NFS server. In this way, data synchronization can be ensured.
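The mirroring behaviour while the NFS server is healthy can be sketched as follows. The `MirroringTarget` class, the in-memory `storage` dictionary, and the `nfs_failed` flag passed by the caller are hypothetical; how the target server observes the NFS server's transfers is not specified in the application.

```python
from typing import Dict


class MirroringTarget:
    """Target server that copies data files while the NFS server is healthy."""

    def __init__(self) -> None:
        self.storage: Dict[str, bytes] = {}

    def on_nfs_transfer(self, name: str, content: bytes, nfs_failed: bool) -> bool:
        """Store a copy of a file that the NFS server sent or received.

        Returns True if a copy was stored, False if mirroring was skipped.
        """
        if nfs_failed:
            # After a failure the target acts as the NFS server instead.
            return False
        self.storage[name] = content
        return True
```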

Referring to fig. 3, fig. 3 is a schematic structural diagram of a server cluster management system according to the present application, including:

a parameter obtaining unit 21, configured to obtain operating parameter information of the NFS server, where the operating parameter information includes one or two of a network state parameter and a hardware parameter;

a judging unit 22, configured to judge whether the NFS server fails according to the working parameter information of the NFS server, and if the NFS server fails, trigger the executing unit;

and the executing unit 23 is configured to take the target server as a new current NFS server.

For a detailed description of the server cluster management system provided by the present application, refer to the embodiments of the server cluster management method; the details are not repeated here.

Based on the above embodiments:

as a preferred embodiment, further comprising:

an execution mode determining unit, configured to determine, when receiving the executable file after taking the target server as a new current NFS server, an execution mode of the executable file, where the execution mode includes an out-of-band mode and an in-band mode;

the first result file generating unit is used for generating a first result file corresponding to the executable file when the execution mode of the executable file is an out-of-band mode;

and the first sending unit is used for sending the executable file to N servers in the server cluster when the execution mode of the executable file is an in-band mode, so that the N servers in the server cluster generate N second result files according to the executable file, and N is a positive integer.

As a preferred embodiment, further comprising:

the first storage unit is used for storing a first result file corresponding to the executable file into a storage space of the target server after the first result file is generated;

the second storage unit is used for acquiring N second result files after N servers in the server cluster generate N second result files according to the executable files, and storing the N second result files into the storage space of the target server.

As a preferred embodiment, further comprising:

a replacing unit, configured to, after taking the target server as a new current NFS server, take the NFS server as the new current NFS server after detecting that the NFS server resumes normal operation;

and the second sending unit is used for sending all the first result files and all the second result files stored by the target server to the NFS server.

As a preferred embodiment, the second transmitting unit includes:

a first time determining unit, configured to determine a failure start time when the NFS server fails;

a second time determining unit, configured to determine a fault end time when the NFS server is detected to resume normal operation;

and the third sending unit is used for sending all the first result files and all the second result files acquired by the target server in the time period from the fault starting time to the fault ending time to the NFS server.

As a preferred embodiment, further comprising:

the network segment judging unit is used for judging whether the target server and the NFS server are in the same network segment before the working parameter information of the NFS server is acquired; if the network segment is the same, the parameter acquisition unit 21 is triggered.

As a preferred embodiment, further comprising:

a file acquisition unit, configured to acquire a data file when the NFS server sends or receives the data file, in the case where the judging unit 22 determines that the NFS server has not failed;

and the third storage unit is used for storing the data file into the storage space of the target server.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a server cluster management device according to the present application, including:

a memory 31 for storing a computer program;

a processor 32 for implementing the steps of the server cluster management method as described above when executing a computer program.

For a detailed description of the server cluster management device provided by the present application, refer to the embodiments of the server cluster management method; the details are not repeated here.

The present application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the server cluster management method as described above.

For a detailed description of the computer readable storage medium provided by the present application, refer to the embodiments of the server cluster management method; the details are not repeated here.

In this specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments may be understood by reference to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for relevant details, refer to the description of the method.

It should also be noted that, in this specification, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A server cluster management method, applied to a target server in a server cluster, where the target server is any one server in the server cluster, the server cluster management method comprising:

if the NFS server fails, the target server is used as a new current NFS server;

after the target server is used as the new current NFS server, the method further comprises:

2. The server cluster management method according to claim 1, further comprising, after generating the first result file corresponding to the executable file:

storing the first result file into a storage space of the target server;

3. The server cluster management method according to claim 2, further comprising, after taking the target server as a new current NFS server:

4. The server cluster management method of claim 3, wherein transmitting all the first result files and all the second result files stored by the target server itself to the NFS server includes:

determining a fault starting time when the NFS server fails;

5. The server cluster management method according to claim 1, further comprising, before acquiring the operating parameter information of the NFS server:

6. The server cluster management method according to any one of claims 1 to 5, wherein if the NFS server is not failed, further comprising:

when the NFS server sends or receives a data file, acquiring the data file;

and storing the data file into a storage space of the target server.

7. A server cluster management system, comprising:

a parameter obtaining unit, configured to obtain working parameter information of an NFS server, where the working parameter information includes one or two of a network state parameter and a hardware parameter;

the execution unit is used for taking the target server as a new current NFS server;

the server cluster management system further includes:

8. A server cluster management apparatus, comprising:

a memory for storing a computer program;

a processor for implementing the steps of the server cluster management method according to any one of claims 1 to 6 when executing the computer program.

9. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the server cluster management method according to any of claims 1 to 6.