CN107919980B - Evaluation method and device for clustered system - Google Patents

Info

Publication number
CN107919980B
Authority
CN
China
Prior art keywords
calling
disaster recovery
information table
cluster
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711037523.3A
Other languages
Chinese (zh)
Other versions
CN107919980A (en)
Inventor
符立佳
苗辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Baishancloud Technology Co Ltd
Original Assignee
Guizhou Baishancloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Baishancloud Technology Co Ltd
Priority to CN201711037523.3A
Publication of CN107919980A
Application granted
Publication of CN107919980B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/30 Profiles

Abstract

The invention discloses an evaluation method and device for a clustered system. The method comprises the following steps: step 1, obtaining a cluster calling program information table of a central server, wherein the cluster calling program information table comprises a calling target address, a configuration file path, and a disaster recovery switching time limit; step 2, (A) reading the configuration file according to the configuration file path, and judging from the content of the configuration file whether the calling target address is reasonably configured; when the configuration is judged unreasonable, generating first alarm information and generating an evaluation report containing the first alarm information; and/or (B) judging through a simulation test whether the clustered system completes disaster recovery switching within the disaster recovery switching time limit; when the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit, generating second alarm information and generating an evaluation report containing the second alarm information; and step 3, outputting the evaluation report.

Description

Evaluation method and device for clustered system
Technical Field
The invention relates to the technical field of computer networks, in particular to an evaluation method and device for a clustered system.
Background
With the development of the internet, users demand ever higher quality of network access and have zero tolerance for network access faults or service faults. At present, to achieve high availability, many network service systems are built with a clustered structure, so that system services remain available when a single server or a single node fails. In many cases, however, such a clustered system cannot fail over because the cluster calling program is improperly configured or calls the cluster in the wrong way; when a single server or node fails, the clustered nature of the system cannot be exploited, and the service fails.
For example, system A has two servers, server A and server B, which provide equivalent services, and system B calls the data of system A as the data source of its service. Under normal conditions, when server A of system A fails, server B can still provide the service. In practice, however, system B's calling program may be configured with only server A; or both servers A and B may be configured, but because of a problem in system B's switching program, system B cannot switch to server B to acquire data when server A fails. A service failure then occurs, reducing the reliability of the service system.
Therefore, it is necessary to evaluate the clustered system so as to find the above abnormal conditions existing in the clustered system in time.
Disclosure of Invention
To solve the above technical problem, the invention provides an evaluation method and an evaluation device for a clustered system, which can evaluate the calling strategy and the performance of the clustered system.
The invention provides an evaluation method for a clustered system, which comprises the following steps:
step 1, obtaining a cluster calling program information table of a central server, wherein the cluster calling program information table comprises: a calling target address, a configuration file path and a disaster recovery switching time limit;
step 2, (A) reading the configuration file according to the configuration file path, and judging whether the calling target address is reasonably configured according to the content of the configuration file; when the configuration is judged unreasonable, generating first alarm information and generating an evaluation report containing the first alarm information; and/or (B) judging through a simulation test whether the clustered system completes disaster recovery switching within the disaster recovery switching time limit; when the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit, generating second alarm information and generating an evaluation report containing the second alarm information;
step 3, outputting an evaluation report.
Further, in the foregoing scheme, the method further includes:
judging whether the cluster calling program information table is complete or not according to the access log of the clustered system;
when the information table of the cluster calling program is complete, executing the step 2;
and when the information table of the cluster calling program is incomplete, generating third alarm information, generating an evaluation report containing the third alarm information, and executing the step 3.
Further, in the above solution, the information table of the cluster calling program further includes: a first calling source address and a first application service name; the judging whether the cluster calling program information table is complete according to the access log of the clustered system comprises:
extracting a second calling source address and a second application service name in an access log of the clustered system within preset time;
when the first calling source address is consistent with the second calling source address and the first application service name is consistent with the second application service name, judging that the cluster calling program information table is complete;
otherwise, judging that the information table of the cluster calling program is incomplete.
Further, in the foregoing scheme, the determining whether the calling target address is configured reasonably according to the content of the configuration file includes:
acquiring the address of the called server and a preset calling configuration strategy according to the configuration file;
when the calling target address is different from the called server address, judging that the calling target address is unreasonable; and/or
and when the calling target address does not conform to the preset calling configuration strategy, judging that the calling target address is unreasonable.
Further, in the foregoing scheme, the determining, by the simulation test, whether the clustered system completes the disaster recovery switching within the disaster recovery switching time limit includes:
designating, one at a time, the server pointed to by each address among the calling target addresses to actively block the requests sent from the calling source address;
observing whether the clustered system switches the requests to a server pointed to by another address among the calling target addresses;
when the clustered system switches the requests to a server pointed to by another address among the calling target addresses, recording the switching time used in the switching process;
when the servers pointed to by all addresses among the calling target addresses have each been designated, the requests were switched every time, and every corresponding switching time is less than or equal to the disaster recovery switching time limit, judging that the clustered system completes disaster recovery switching within the disaster recovery switching time limit;
otherwise, judging that the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit.
The invention also provides an evaluation device for a clustered system, which comprises: an information table acquisition module, a configuration judgment module and/or a disaster tolerance test module, and a report output module; wherein,
an information table obtaining module, configured to obtain a cluster calling program information table of the central server, where the cluster calling program information table includes: calling a target address, a configuration file path and disaster recovery switching time limit;
the configuration judging module is used for reading a configuration file according to the path of the configuration file and judging whether the calling target address is reasonably configured according to the content of the configuration file; when the judgment result is unreasonable, generating first alarm information;
the disaster tolerance test module is used for judging whether the clustered system completes disaster tolerance switching within the disaster tolerance switching time limit through simulation test; when the clustered system does not finish disaster recovery switching within the disaster recovery switching time limit, generating second alarm information;
and the report output module is used for outputting an evaluation report, and the evaluation report comprises the first alarm information and/or the second alarm information.
Further, in the foregoing solution, the apparatus further includes an information table checking module, where the information table checking module includes:
a complete judgment unit, configured to judge whether a cluster calling program information table of a central server is complete according to an access log of the clustered system after the cluster calling program information table of the central server is obtained;
the first jump unit is used for jumping to the configuration judgment module and/or the disaster tolerance test module when the cluster calling program information table is complete;
and the second jump unit is used for generating third alarm information when the cluster calling program information table is incomplete, generating an evaluation report containing the third alarm information, and jumping to the report output module.
Further, in the above solution, the cluster calling program information table further includes: a first calling source address and a first application service name; the complete judgment unit includes:
the extraction subunit is used for extracting a second calling source address and a second application service name in an access log of the clustered system within preset time;
a determining subunit, configured to determine that the cluster calling program information table is complete when the first calling source address is consistent with the second calling source address and the first application service name is consistent with the second application service name; otherwise, judging that the information table of the cluster calling program is incomplete.
Further, in the foregoing solution, the configuration determining module includes:
the obtaining unit is used for obtaining the called server address and a preset calling configuration strategy according to the configuration file;
the server judging unit is used for judging that the calling target address is unreasonable when the calling target address is different from the called server address; and/or
and the strategy judgment unit is used for judging that the calling target address is unreasonable when the calling target address does not conform to the preset calling configuration strategy.
Further, in the above scheme, the disaster recovery testing module includes:
the shielding unit is used for designating, one at a time, the server pointed to by each address among the calling target addresses to actively block the requests sent from the calling source address;
the observation unit is used for observing whether the clustered system switches the requests to a server pointed to by another address among the calling target addresses;
the recording unit is used for recording the switching time used in the switching process when the clustered system switches the requests to a server pointed to by another address among the calling target addresses;
a disaster recovery determining unit, configured to determine that the clustered system completes disaster recovery switching within the disaster recovery switching time limit when the servers pointed to by all addresses among the calling target addresses have each been designated, the requests were switched every time, and every corresponding switching time is less than or equal to the disaster recovery switching time limit; and otherwise to determine that the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
fig. 1 is a schematic flow chart illustrating an implementation of an evaluation method of a clustered system according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a composition of an evaluation apparatus of a clustered system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The invention provides an evaluating device which is used for implementing the evaluating method of the clustered system.
Fig. 1 is a schematic flow chart of an implementation of an evaluation method for a clustered system according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step 1, obtaining a cluster calling program information table of a central server, wherein the cluster calling program information table comprises: calling a target address, a configuration file path and disaster recovery switching time limit;
specifically, the evaluation device obtains the cluster calling program information table of the central server; the table usually includes information such as a first calling source address, a first application service name, a clustered system name, a first access resource, a calling target address, a configuration file path, and a disaster recovery switching time limit;
here, the cluster calling program is a general term for the systems, modules, and interfaces that call the cluster service.
Table 1 is an example of a cluster calling program information table in one embodiment.
TABLE 1
[Table 1 appears only as an image in the original publication: Figure GDA0002309191070000061]
The evaluating device obtains a cluster calling program information table, and can obtain a plurality of key information for evaluating whether the clustered system has high reliability, including: a first calling source address, a first application service name, a calling target address, a configuration file path, a disaster recovery switching time limit and the like.
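As an illustration only, one row of such an information table could be modelled as a small record. The sketch below is not taken from the patent: the class name, the field names, the access resource value, and the 5-second time limit are all hypothetical, and only the addresses, the application name, the cluster name, and Config1 echo the examples used in this description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CallerEntry:
    """One row of a cluster calling program information table (illustrative field names)."""
    source_address: str                 # first calling source address
    service_name: str                   # first application service name
    cluster_name: str                   # clustered system name
    access_resource: str                # first access resource
    target_addresses: Tuple[str, ...]   # calling target addresses
    config_path: str                    # configuration file path
    failover_limit_s: float             # disaster recovery switching time limit, in seconds


entry = CallerEntry(
    source_address="1.1.1.1",
    service_name="application 1",
    cluster_name="cluster 1",
    access_resource="/data",            # hypothetical value
    target_addresses=("4.4.4.1", "4.4.4.2"),
    config_path="Config1",
    failover_limit_s=5.0,               # hypothetical value
)
```

Keeping the record immutable (frozen) is a design choice of this sketch, so that later checks cannot accidentally alter the table they are evaluating.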
Whether the content of the cluster calling program information table is complete and correct therefore directly affects the reliability of the evaluation device's result for the clustered system. Accordingly, in some embodiments, the above evaluation method further includes:
after a cluster calling program information table of a central server is obtained, whether the cluster calling program information table is complete or not is judged according to an access log of the clustered system;
when the information table of the cluster calling program is complete, executing the following step 2;
and when the information table of the cluster calling program is incomplete, generating third alarm information, generating an evaluation report containing the third alarm information, and executing the following step 3.
Specifically, the determining, according to the access log of the clustered system, whether the information table of the cluster calling program is complete includes:
extracting a second calling source address and a second application service name in an access log of the clustered system within preset time;
when the first calling source address is consistent with the second calling source address and the first application service name is consistent with the second application service name, judging that the cluster calling program information table is complete;
otherwise, judging that the information table of the cluster calling program is incomplete.
When the cluster calling program accesses the clustered system, it places the IP address of its server (i.e., the first calling source address) and the first application service name in the request information; on receiving the calling request, the clustered system records information such as the calling time, the accessed resource, the calling source IP address, and the calling source application service in its access log. The access log of the clustered system therefore usually contains: a calling time, a second access resource, a second calling source address, a second application service name, and the like.
For example, the cluster calling program information table of the central server acquired by the evaluation device is shown in table 1. When the calling programs access cluster 1, the pairs (1.1.1.1, application 1) and (2.2.2.1, application 2) are carried to cluster 1; cluster 1 writes an access log containing two entries, (1.1.1.1, application 1) and (2.2.2.1, application 2). These entries are two pairs of second calling source address and second application service name. They are compared with the first calling source addresses and first application service names acquired from the cluster calling program information table (table 1), and if the comparison is consistent, the cluster calling program information table is judged complete. If the log instead contains an access from (3.3.3.3, application 3), or contains only (1.1.1.1, application 1), or only (2.2.2.1, application 2), the cluster calling program information table is judged incomplete.
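The comparison in this example can be sketched as a set comparison. This is one possible reading rather than the patent's stated implementation: it assumes that "consistent" means the set of (calling source address, application service name) pairs in the table equals the set extracted from the access log within the preset time. The function name is hypothetical.

```python
def table_is_complete(table_pairs, log_pairs):
    """Return True when the table and the access log list exactly the same callers.

    Each pair is (calling source address, application service name).
    Duplicate log entries from repeated calls are ignored by the set comparison.
    """
    return set(table_pairs) == set(log_pairs)


table = [("1.1.1.1", "application 1"), ("2.2.2.1", "application 2")]

# Log matches the table: judged complete.
matching_log = [("2.2.2.1", "application 2"), ("1.1.1.1", "application 1"),
                ("1.1.1.1", "application 1")]
# Log contains a caller missing from the table: judged incomplete.
extra_caller_log = matching_log + [("3.3.3.3", "application 3")]
# Log lacks a caller that the table lists: judged incomplete.
missing_caller_log = [("1.1.1.1", "application 1")]
```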
Further, in some embodiments, when the evaluation device finds that the cluster calling program information table is incomplete, it generates the third alarm information, generates an evaluation report containing the third alarm information, jumps to the following step 3, and exits the current evaluation process of the clustered system.
Step 2A, reading the configuration file according to the configuration file path, and judging whether the calling target address is reasonably configured according to the content of the configuration file; when the configuration is judged unreasonable, generating first alarm information and generating an evaluation report containing the first alarm information;
specifically, the evaluation device reads a configuration file according to the path of the configuration file, and judges whether the calling target address is reasonably configured according to the content of the configuration file; when the judgment result is unreasonable, generating first alarm information and generating an evaluation report containing the first alarm information;
in the foregoing solution, the determining whether the calling target address is configured reasonably according to the content of the configuration file includes:
acquiring the address of the called server and a preset calling configuration strategy according to the configuration file;
when the calling target address is different from the called server address, judging that the calling target address is unreasonable; and/or
and when the calling target address does not conform to the preset calling configuration strategy, judging that the calling target address is unreasonable.
Here, the preset calling configuration policy includes, but is not limited to, "minimum number of configured clustered server IPs," "servers in different machine rooms," "servers in different network areas," "servers of different ISPs," and the like. Generally, the preset calling configuration policy is contained in the configuration file, with a unique identifier as its key; if no calling configuration policy is specifically specified in the configuration file, the default is that the minimum number of configured clustered server IPs is 0. For example: "server_ip: 1.1.1.1, 2.2.2.2", where server_ip is the unique identifier of a calling configuration policy preset in the configuration file.
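A minimal sketch of reading such keyed entries follows. The patent does not define the configuration file syntax beyond the single server_ip example, so the "key: value, value" line format and the function name are assumptions made for illustration.

```python
def parse_config(text):
    """Parse lines of the assumed form 'key: value1, value2' into a dict of lists."""
    policies = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a key
        key, _, value = line.partition(":")
        policies[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return policies


config_text = "server_ip: 1.1.1.1, 2.2.2.2\n"
policies = parse_config(config_text)
called_servers = policies.get("server_ip", [])
```

Here called_servers holds the addresses listed under the server_ip key, which later checks can compare with the calling target addresses.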
For example: the cluster calling program information table of the central server acquired by the evaluation device is shown in table 1, and whether the calling target address is reasonably configured is judged for the first item in table 1. The evaluation device reads the cluster server IP data in Config1 and judges whether it is consistent with the calling target addresses "4.4.4.1, 4.4.4.2"; if consistent, the configuration is judged reasonable, and if inconsistent, unreasonable. If consistent, the calling target address configuration is determined to be 4.4.4.1, 4.4.4.2;
suppose the preset evaluation strategy is that the minimum number of IPs is N and that the servers must be in different machine rooms. The actual calling target addresses comprise 2 IPs, so if N is greater than 2, the number of actual calling target addresses does not meet the preset evaluation strategy and the configuration is judged unreasonable; if N is less than or equal to 2, the number meets the preset evaluation strategy and the configuration is judged reasonable on this point. The evaluation device then judges whether the IP addresses 4.4.4.1 and 4.4.4.2 are in the same machine room; if so, the configuration is judged unreasonable, otherwise reasonable.
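The three checks in this example (address consistency, minimum IP count N, and different machine rooms) can be sketched together as follows. The function name, the room_of lookup table, and the room labels are hypothetical; the patent does not say how the evaluation device learns which machine room a server is in. Generalising the two-address room check to "no two targets share a room" is also an assumption of this sketch.

```python
def check_target_addresses(targets, called_servers, min_count, room_of):
    """Return a list of problems; an empty list means the configuration is judged reasonable."""
    problems = []
    # Check 1: calling target addresses must match the called server addresses.
    if set(targets) != set(called_servers):
        problems.append("target addresses differ from called server addresses")
    # Check 2: at least N (min_count) target addresses must be configured.
    if len(targets) < min_count:
        problems.append("fewer target addresses than the required minimum")
    # Check 3: no two target addresses may be in the same machine room.
    rooms = [room_of[ip] for ip in targets]
    if len(set(rooms)) < len(rooms):
        problems.append("target addresses share a machine room")
    return problems


targets = ["4.4.4.1", "4.4.4.2"]
separate_rooms = {"4.4.4.1": "room-a", "4.4.4.2": "room-b"}   # hypothetical mapping
same_room = {"4.4.4.1": "room-a", "4.4.4.2": "room-a"}        # hypothetical mapping

ok = check_target_addresses(targets, targets, 2, separate_rooms)
too_few = check_target_addresses(targets, targets, 3, separate_rooms)
shared = check_target_addresses(targets, targets, 2, same_room)
```

A non-empty result would correspond to generating the first alarm information.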
Therefore, the evaluating device can complete the check of the cluster calling program configuration rationality of the clustered system.
Further, in order to evaluate the reliability of the clustered system more thoroughly, the evaluation method may further include:
Step 2B, judging through a simulation test whether the clustered system completes disaster recovery switching within the disaster recovery switching time limit; when the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit, generating second alarm information and generating an evaluation report containing the second alarm information;
specifically, the evaluation device can judge whether the clustered system completes disaster recovery switching within the disaster recovery switching time limit through an actual test process; when the clustered system does not finish disaster recovery switching within the disaster recovery switching time limit, generating second alarm information and generating an evaluation report containing the second alarm information;
wherein the determining, through the simulation test, whether the clustered system completes the disaster recovery switching within the disaster recovery switching time limit includes:
designating, one at a time, the server pointed to by each address among the calling target addresses to actively block the requests sent from the calling source address;
observing whether the clustered system switches the requests to a server pointed to by another address among the calling target addresses;
when the clustered system switches the requests to a server pointed to by another address among the calling target addresses, recording the switching time used in the switching process;
when the servers pointed to by all addresses among the calling target addresses have each been designated, the requests were switched every time, and every corresponding switching time is less than or equal to the disaster recovery switching time limit, judging that the clustered system completes disaster recovery switching within the disaster recovery switching time limit;
otherwise, judging that the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit.
For example, the cluster calling program information table of the central server acquired by the evaluation device is shown in table 1, and the test is performed for the first item in table 1. On the server with IP address 4.4.4.1, the evaluation device blocks call requests from IP address 1.1.1.1 with application service name application 1, and records the time; it monitors the access log on the server with IP address 4.4.4.2 and confirms the time at which the access request reaches the backup server 4.4.4.2; it then calculates the switching time and compares it with the disaster recovery switching time limit in the cluster calling program information table. If the switching time is less than or equal to the disaster recovery switching time limit, the disaster recovery switching is determined to have completed within the set time limit; if the time limit is exceeded and no switching has occurred, the switching is determined to have failed and the disaster recovery switching is judged abnormal. The evaluation device then repeats the test in the other direction: on the server with IP address 4.4.4.2, it blocks call requests from IP address 1.1.1.1 with application service name application 1, and records the time; it monitors the access log on the server with IP address 4.4.4.1 and confirms the time at which the access request reaches the backup server 4.4.4.1; it calculates the switching time and compares it with the disaster recovery switching time limit in the same way. If both switches complete within the set time limit, the disaster tolerance capability of the clustered system is judged normal.
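The test loop in this example can be sketched with the blocking and log-monitoring actions passed in as callables, so that the timing logic can be exercised without touching real servers. Everything named here (evaluate_failover, FakeCluster, the delay values, the 3-second limit) is hypothetical; a real evaluation device would block requests on the designated server and watch the access logs of the other servers instead.

```python
def evaluate_failover(targets, limit_s, block, unblock, first_arrival_elsewhere):
    """Block each target in turn and measure how long requests take to reach another target.

    block(ip) returns the time at which requests were blocked;
    first_arrival_elsewhere(ip) returns the time a request reached another
    target, or None if no switch was observed.
    Returns (ok, results) with results mapping blocked address -> switch time or None.
    """
    results = {}
    for ip in targets:
        blocked_at = block(ip)
        try:
            arrived_at = first_arrival_elsewhere(ip)
        finally:
            unblock(ip)  # always restore the server before testing the next one
        results[ip] = None if arrived_at is None else arrived_at - blocked_at
    ok = bool(results) and all(t is not None and t <= limit_s for t in results.values())
    return ok, results


class FakeCluster:
    """Stand-in for a real cluster: switches after a fixed per-server delay."""
    def __init__(self, delays):
        self.delays = delays
        self.now = 0.0

    def block(self, ip):
        self.now += 100.0
        return self.now

    def unblock(self, ip):
        pass

    def first_arrival_elsewhere(self, ip):
        delay = self.delays[ip]
        return None if delay is None else self.now + delay


healthy = FakeCluster({"4.4.4.1": 1.5, "4.4.4.2": 2.0})
ok, results = evaluate_failover(["4.4.4.1", "4.4.4.2"], 3.0,
                                healthy.block, healthy.unblock,
                                healthy.first_arrival_elsewhere)

broken = FakeCluster({"4.4.4.1": 1.5, "4.4.4.2": None})  # no switch when 4.4.4.2 is blocked
bad_ok, bad_results = evaluate_failover(["4.4.4.1", "4.4.4.2"], 3.0,
                                        broken.block, broken.unblock,
                                        broken.first_arrival_elsewhere)
```

A False result would correspond to generating the second alarm information.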
In particular, step 2B can evaluate the disaster tolerance capability of the clustered system on its own and does not depend on step 2A, so in some embodiments the evaluation device can skip step 2A and directly execute step 2B; likewise, in some embodiments, the evaluation device may execute only step 2A and skip step 2B.
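The independence of steps 2A and 2B can be sketched as a driver in which each check is optional. This is an illustrative outline under assumptions: the function name and alarm texts are hypothetical, and the three callables stand in for the completeness check, the configuration check, and the switching test described in this embodiment.

```python
def run_evaluation(entry, is_complete, check_config=None, check_failover=None):
    """Return the alarm messages for one table entry.

    Passing None for check_config skips step 2A; passing None for
    check_failover skips step 2B.
    """
    alarms = []
    if not is_complete(entry):
        # Incomplete table: raise the third alarm and go straight to reporting.
        alarms.append("third alarm: information table incomplete")
        return alarms
    if check_config is not None and not check_config(entry):
        alarms.append("first alarm: calling target address configured unreasonably")
    if check_failover is not None and not check_failover(entry):
        alarms.append("second alarm: switching not completed within time limit")
    return alarms


# Only step 2B is run, and the switching test fails.
only_2b = run_evaluation({}, lambda e: True, check_failover=lambda e: False)
# Only step 2A is run, and the configuration is reasonable.
only_2a = run_evaluation({}, lambda e: True, check_config=lambda e: True)
# The table is incomplete, so neither step 2A nor step 2B is run.
incomplete = run_evaluation({}, lambda e: False,
                            check_config=lambda e: False,
                            check_failover=lambda e: False)
```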
And 3, outputting an evaluation report.
Specifically, the evaluation device outputs the generated evaluation report according to the evaluation condition.
For example, when the evaluation device detects an abnormality according to the above scheme, it outputs the alarm information; when no abnormality is detected, the evaluation report may contain the normal evaluation result and the data from each stage of the evaluation process.
With the evaluation method of the clustered system provided by this embodiment, configuration faults and omissions in the clustered system relating to the cluster calling program can be found in time, and the disaster tolerance capability of the clustered system can be tested in practice so that deficiencies in that capability are found in time. Faults are thus prevented before they occur, the risk of failure is reduced, and the reliability of the clustered system is improved.
Fig. 2 is a schematic structural diagram of an evaluation device of a clustered system according to an embodiment of the present invention. As shown in Fig. 2, the evaluation device includes: an information table acquisition module 201, a configuration judgment module 202 and/or a disaster tolerance test module 203, and a report output module 204; wherein:
an information table obtaining module 201, configured to obtain a cluster calling program information table of a central server, where the cluster calling program information table includes: a first calling source address, a first application service name, a calling target address, a configuration file path and a disaster recovery switching time limit;
a configuration determining module 202, configured to read a configuration file according to the configuration file path, and determine whether the calling target address is configured reasonably according to the configuration file content; when the judgment result is unreasonable, generating first alarm information and generating an evaluation report containing the first alarm information;
the disaster tolerance testing module 203 is configured to determine, through a simulation test, whether the clustered system completes disaster tolerance switching within the disaster tolerance switching time limit; when the clustered system does not finish disaster recovery switching within the disaster recovery switching time limit, generating second alarm information and generating an evaluation report containing the second alarm information;
a report output module 204, configured to output an evaluation report, where the evaluation report includes the first warning information and/or the second warning information.
Further, the above evaluation apparatus further includes an information table checking module, where the information table checking module includes:
a complete judgment unit, configured to judge whether a cluster calling program information table of a central server is complete according to an access log of the clustered system after the cluster calling program information table of the central server is obtained;
a first jumping unit, configured to jump to the configuration determining module 202 and/or the disaster tolerance testing module 203 when the information table of the cluster calling program is complete;
a second jumping unit, configured to generate third alarm information when the cluster calling program information table is incomplete, generate an evaluation report containing the third alarm information, and jump to the report output module 204.
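The control flow between these modules can be sketched as below. This is an illustrative sketch under stated assumptions, not the patented implementation: the function `evaluate`, its callable parameters and the dictionary report format are hypothetical, and the sketch runs both the configuration judgment and the disaster tolerance test, although the embodiment allows either one alone.

```python
from typing import Callable, Dict, List


def evaluate(table_complete: Callable[[], bool],
             config_reasonable: Callable[[], bool],
             failover_in_time: Callable[[], bool]) -> Dict[str, object]:
    """Run the checks in order and return a report (hypothetical format)."""
    alarms: List[str] = []

    if not table_complete():
        # Second jumping unit: raise the third alarm and go straight to
        # report output, skipping modules 202 and 203.
        alarms.append("third alarm: cluster calling program information table incomplete")
        return {"result": "abnormal", "alarms": alarms}

    # First jumping unit: the table is complete, so continue with the checks.
    if not config_reasonable():
        alarms.append("first alarm: calling target address configured unreasonably")
    if not failover_in_time():
        alarms.append("second alarm: disaster tolerance switching not completed in time")

    return {"result": "abnormal" if alarms else "normal", "alarms": alarms}
```

Passing the checks in as callables keeps the sketch independent of how each module actually inspects configuration files or access logs.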
Furthermore, in the foregoing solution, the complete judgment unit includes:
the extraction subunit is used for extracting a second calling source address and a second application service name in an access log of the clustered system within preset time;
a determining subunit, configured to determine that the cluster calling program information table is complete when the first calling source address is consistent with the second calling source address and the first application service name is consistent with the second application service name; otherwise, judging that the information table of the cluster calling program is incomplete.
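A minimal sketch of this completeness check follows. It assumes one possible reading of "consistent": the table is treated as complete when every (calling source address, application service name) pair seen in the access log also appears in the table. The function name `missing_callers` and the second caller `2.2.2.2` / `application 2` are hypothetical.

```python
from typing import Iterable, Set, Tuple

# (calling source address, application service name)
Pair = Tuple[str, str]


def missing_callers(table_pairs: Iterable[Pair],
                    log_pairs: Iterable[Pair]) -> Set[Pair]:
    """Return callers seen in the access log but absent from the table."""
    return set(log_pairs) - set(table_pairs)


# First addresses/names come from the information table, second ones from
# the access log within the preset time (sample data is made up).
table = [("1.1.1.1", "application 1")]
log = [("1.1.1.1", "application 1"), ("2.2.2.2", "application 2")]

missing = missing_callers(table, log)
table_complete = not missing   # incomplete if any logged caller is missing
```

Returning the missing pairs, rather than only a boolean, lets the third alarm information name the callers that the table omits.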
In the above solution, the configuration determining module 202 includes:
the obtaining unit is used for obtaining the called server address and a preset calling configuration strategy according to the configuration file;
the server judging unit is used for judging that the calling target address is unreasonable when the calling target address is different from the called server address; and/or
And the strategy judgment unit is used for judging that the calling target address is unreasonable when the calling target address does not accord with a preset calling configuration strategy.
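The two judgments made by these units can be sketched as follows. The function `config_problems` and the example policy `at_least_two_servers` are assumptions for illustration; the actual preset calling configuration strategy is whatever the configuration file specifies.

```python
from typing import Callable, List, Sequence


def config_problems(target_addresses: Sequence[str],
                    called_server_addresses: Sequence[str],
                    policy: Callable[[Sequence[str]], bool]) -> List[str]:
    """List the reasons why the calling target address is unreasonable."""
    problems: List[str] = []
    # Server judgment: table entry must match the configuration file.
    if set(target_addresses) != set(called_server_addresses):
        problems.append("calling target address differs from called server address")
    # Strategy judgment: table entry must satisfy the preset strategy.
    if not policy(target_addresses):
        problems.append("calling target address violates the calling configuration strategy")
    return problems


def at_least_two_servers(addresses: Sequence[str]) -> bool:
    """Hypothetical strategy: a caller must have at least two distinct targets."""
    return len(set(addresses)) >= 2


ok = config_problems(["4.4.4.1", "4.4.4.2"], ["4.4.4.1", "4.4.4.2"], at_least_two_servers)
bad = config_problems(["4.4.4.1"], ["4.4.4.1", "4.4.4.2"], at_least_two_servers)
```

An empty list means the configuration is judged reasonable; any entry would feed into the first alarm information.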
In the above solution, the disaster recovery testing module 203 includes:
the shielding unit is used for respectively appointing a server pointed by any address in the calling target addresses to actively shield the request sent by the calling source address;
the observation unit is used for observing whether the clustered system switches the request to a server pointed by other addresses in the calling target address;
the recording unit is used for recording the switching time used in the switching process when the clustered system switches the request to the server pointed by other addresses in the calling target address;
a disaster recovery determining unit, configured to determine that the clustered system completes disaster recovery switching within the disaster recovery switching time limit when the servers pointed to by all addresses in the calling target address have each been designated, every request has been switched, and each corresponding switching time is less than or equal to the disaster recovery switching time limit; and otherwise to determine that the clustered system does not complete the disaster recovery switching within the disaster recovery switching time limit.
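The cooperation of the four units can be sketched as a loop over every address in the calling target address. This is a sketch under assumptions: `shield_and_measure` is a hypothetical callback standing in for the shielding, observation and recording units, and the measurements and 5-second limit are simulated values.

```python
from typing import Callable, Dict, Optional, Sequence


def disaster_tolerance_ok(targets: Sequence[str],
                          shield_and_measure: Callable[[str], Optional[float]],
                          limit_seconds: float) -> bool:
    """Shield each target in turn; every switch must finish within the limit.

    shield_and_measure(address) shields the given server and returns the
    observed switching time in seconds, or None if no switch was observed.
    """
    for address in targets:
        switch_time = shield_and_measure(address)
        if switch_time is None or switch_time > limit_seconds:
            return False   # one failed or late switch makes the result abnormal
    return True


# Simulated measurements standing in for real shielding and log monitoring.
measured: Dict[str, Optional[float]] = {"4.4.4.1": 2.0, "4.4.4.2": 3.5}
result = disaster_tolerance_ok(list(measured), measured.get, limit_seconds=5.0)
```

The loop stops at the first failure for brevity; a fuller version might test every address and record each switching time for the evaluation report.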
In practical applications, each module and each unit can be implemented by a Central Processing Unit (CPU), a microprocessor unit (MPU), a Digital Signal Processor (DSP), or a Field Programmable Gate Array (FPGA) in the evaluation device.
The above-described aspects may be implemented individually or in various combinations, and such variations are within the scope of the present invention.
It will be understood by those skilled in the art that all or part of the steps of the above methods may be implemented by instructing the relevant hardware through a program, and the program may be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, and the like. Alternatively, all or part of the steps of the foregoing embodiments may also be implemented by using one or more integrated circuits, and accordingly, each module/unit in the foregoing embodiments may be implemented in the form of hardware, and may also be implemented in the form of a software functional module. The present invention is not limited to any specific form of combination of hardware and software.
It is to be noted that, in this document, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that an article or apparatus including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such article or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of additional like elements in the article or device comprising the element.
The above embodiments are merely to illustrate the technical solutions of the present invention and not to limit the present invention, and the present invention has been described in detail with reference to the preferred embodiments. It will be understood by those skilled in the art that various modifications and equivalent arrangements may be made without departing from the spirit and scope of the present invention and it should be understood that the present invention is to be covered by the appended claims.

Claims (6)

1. A method for evaluating a clustered system, the method comprising:
step 1, obtaining a cluster calling program information table of a central server, wherein the cluster calling program information table comprises: a calling target address, a configuration file path and a disaster recovery switching time limit;
step 2: A, reading the configuration file according to the configuration file path, and judging whether the calling target address is reasonably configured according to the content of the configuration file; when the judgment result is unreasonable, generating first alarm information and generating an evaluation report containing the first alarm information; and
B, judging whether the clustered system completes disaster recovery switching within the disaster recovery switching time limit through a simulation test; when the clustered system does not complete disaster recovery switching within the disaster recovery switching time limit, generating second alarm information and generating an evaluation report containing the second alarm information;
step 3, outputting an evaluation report;
after step 1, before step 2, the method further comprises:
judging whether the cluster calling program information table is complete or not according to the access log of the clustered system;
when the information table of the cluster calling program is complete, executing the step 2;
when the information table of the cluster calling program is incomplete, generating third alarm information, generating an evaluation report containing the third alarm information, and executing the step 3;
in step 2, the determining whether the calling target address is configured reasonably according to the content of the configuration file includes:
acquiring the address of the called server and a preset calling configuration strategy according to the configuration file;
when the calling target address is different from the called server address, judging that the calling target address is unreasonable; and/or
And when the calling target address does not accord with a preset calling configuration strategy, judging that the calling target address is unreasonable.
2. An evaluating method according to claim 1, wherein the cluster caller information table further comprises: a first calling source address and a first application service name; the judging whether the cluster calling program information table is complete according to the access log of the clustered system comprises:
extracting a second calling source address and a second application service name in an access log of the clustered system within preset time;
when the first calling source address is consistent with the second calling source address and the first application service name is consistent with the second application service name, judging that the cluster calling program information table is complete;
otherwise, judging that the information table of the cluster calling program is incomplete.
3. The evaluating method according to claim 1, wherein the determining whether the clustered system completes the disaster recovery switching within the disaster recovery switching time limit through the simulation test comprises:
respectively appointing a server pointed by any address in the calling target addresses to actively shield the request sent by the calling source address;
observing whether the clustered system switches the request to a server pointed to by other addresses in the calling target address;
when the cluster system switches the request to a server pointed by other addresses in the calling target address, recording the switching time used in the switching process;
when all the servers pointed by all the addresses in the calling target address are designated, each request is switched, and the corresponding switching time is less than or equal to the disaster recovery switching time limit, judging that the clustered system completes disaster recovery switching within the disaster recovery switching time limit;
otherwise, judging that the clustered system does not complete the disaster recovery switching within the disaster recovery switching time limit.
4. An evaluation apparatus of a clustered system, the apparatus comprising: an information table acquisition module, a configuration judgment module, a disaster tolerance test module and a report output module; wherein:
the information table acquisition module is configured to obtain a cluster calling program information table of the central server, where the cluster calling program information table includes: a calling target address, a configuration file path and a disaster recovery switching time limit;
the configuration judging module is used for reading a configuration file according to the path of the configuration file and judging whether the calling target address is reasonably configured according to the content of the configuration file; when the judgment result is unreasonable, generating first alarm information;
wherein the configuration determination module comprises:
the obtaining unit is used for obtaining the called server address and a preset calling configuration strategy according to the configuration file;
the server judging unit is used for judging that the calling target address is unreasonable when the calling target address is different from the called server address; and/or
The strategy judgment unit is used for judging that the calling target address is unreasonable when the calling target address does not accord with a preset calling configuration strategy;
the disaster tolerance test module is used for judging whether the clustered system completes disaster tolerance switching within the disaster tolerance switching time limit through simulation test; when the clustered system does not finish disaster recovery switching within the disaster recovery switching time limit, generating second alarm information;
the report output module is used for outputting an evaluation report, and the evaluation report comprises the first alarm information and the second alarm information;
the apparatus further comprises an information table checking module, the information table checking module comprising:
a complete judgment unit, configured to judge whether a cluster calling program information table of a central server is complete according to an access log of the clustered system after the cluster calling program information table of the central server is obtained;
the first skipping unit is used for skipping to the configuration judging module and the disaster tolerance testing module when the information table of the cluster calling program is complete;
and the second skipping unit is used for generating third alarm information when the information table of the cluster calling program is incomplete, generating an evaluation report containing the third alarm information and skipping to the report output module.
5. The evaluation apparatus according to claim 4, wherein the cluster calling program information table further comprises: a first calling source address and a first application service name; the complete judgment unit includes:
the extraction subunit is used for extracting a second calling source address and a second application service name in an access log of the clustered system within preset time;
a determining subunit, configured to determine that the cluster calling program information table is complete when the first calling source address is consistent with the second calling source address and the first application service name is consistent with the second application service name; otherwise, judging that the information table of the cluster calling program is incomplete.
6. The evaluation device according to claim 4, wherein the disaster recovery test module comprises:
the shielding unit is used for respectively appointing a server pointed by any address in the calling target addresses to actively shield the request sent by the calling source address;
the observation unit is used for observing whether the clustered system switches the request to a server pointed by other addresses in the calling target address;
the recording unit is used for recording the switching time used in the switching process when the clustered system switches the request to the server pointed by other addresses in the calling target address;
a disaster recovery determining unit, configured to determine that the clustered system completes disaster recovery switching within the disaster recovery switching time limit when the servers pointed to by all addresses in the calling target address have each been designated, every request has been switched, and each corresponding switching time is less than or equal to the disaster recovery switching time limit; and otherwise to determine that the clustered system does not complete the disaster recovery switching within the disaster recovery switching time limit.
CN201711037523.3A 2017-10-30 2017-10-30 Evaluation method and device for clustered system Active CN107919980B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711037523.3A CN107919980B (en) 2017-10-30 2017-10-30 Evaluation method and device for clustered system

Publications (2)

Publication Number Publication Date
CN107919980A CN107919980A (en) 2018-04-17
CN107919980B true CN107919980B (en) 2020-02-21

Family

ID=61895893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711037523.3A Active CN107919980B (en) 2017-10-30 2017-10-30 Evaluation method and device for clustered system

Country Status (1)

Country Link
CN (1) CN107919980B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109039817B (en) * 2018-08-03 2020-09-01 京东数字科技控股有限公司 Information processing method, device, equipment and medium for flow monitoring

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035872A (en) * 2014-06-27 2014-09-10 浪潮(北京)电子信息产业有限公司 Method and device for testing clustering software
CN105429826A (en) * 2015-12-25 2016-03-23 北京奇虎科技有限公司 Fault detection method and device for database cluster
CN106034037A (en) * 2015-03-13 2016-10-19 腾讯科技(深圳)有限公司 Disaster recovery switching method and device based on virtual machine

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8266474B2 (en) * 2009-12-30 2012-09-11 Symantec Corporation Fast cluster failure detection
US9128862B2 (en) * 2012-02-23 2015-09-08 International Business Machines Corporation Efficient checksums for shared nothing clustered filesystems

Also Published As

Publication number Publication date
CN107919980A (en) 2018-04-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 550003 Building No. 12 in the Southern Park of Gui'an High-end Equipment Industrial Park, Guizhou Province
Applicant after: Guizhou Baishan cloud Polytron Technologies Inc
Address before: 100015 5 floor, block E, 201 IT tower, electronic city, 10 Jiuxianqiao Road, Chaoyang District, Beijing.
Applicant before: Guizhou white cloud Technology Co., Ltd.
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 1254304
Country of ref document: HK
GR01 Patent grant