CN114172829A - Server health monitoring method and system and computing equipment - Google Patents

Info

Publication number
CN114172829A
Authority
CN
China
Prior art keywords
server
health
monitored
check
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210123500.9A
Other languages
Chinese (zh)
Other versions
CN114172829B (en)
Inventor
廖世伟
汤雄飞
江林伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Uniontech Software Technology Co Ltd
Original Assignee
Uniontech Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Uniontech Software Technology Co Ltd filed Critical Uniontech Software Technology Co Ltd
Priority to CN202210123500.9A priority Critical patent/CN114172829B/en
Priority to CN202210731202.8A priority patent/CN115190047B/en
Publication of CN114172829A publication Critical patent/CN114172829A/en
Application granted granted Critical
Publication of CN114172829B publication Critical patent/CN114172829B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a server health monitoring method, a system and a computing device, wherein the method is executed in a health monitoring server, the health monitoring server is connected with one or more check servers, each check server is respectively connected with a server to be monitored, and the method comprises the following steps: regularly sending health check requests to one or more check servers, so that the check servers send requests to corresponding servers to be monitored based on the health check requests, obtain request results returned by the servers to be monitored, and store the request results to a memory queue of the check servers; and acquiring the request result in the memory queue of each check server at regular time, and analyzing the request result to determine the health state of the corresponding server to be monitored at the time point of the return of the request result. According to the technical scheme of the invention, the current health state of the server to be monitored can be accurately determined.

Description

Server health monitoring method and system and computing equipment
Technical Field
The invention relates to the technical field of network communication, in particular to a server health monitoring method, a server health monitoring system and computing equipment.
Background
With the rapid growth of enterprises, service types have become increasingly varied and the number of accompanying projects keeps rising, so single-machine deployment can no longer meet service requirements. To meet them, a micro-service architecture and clustered deployment are generally adopted, which multiplies the number of servers in use and makes it difficult to monitor each server's health state (whether it can respond to requests and whether it is running normally).
In the prior art, there are two main methods for monitoring the health state of a server. The first is a heartbeat detection mechanism based on the registration center of a micro-service architecture: the server periodically sends a heartbeat message (for example, over TCP or UDP) to the registration center, and if the registration center replies to the heartbeat within a configured time range, the server is considered healthy. The second is a periodic API check: the monitored server exposes an API, the server health monitoring system periodically sends an HTTP request to that interface, and it judges whether the server is normal from the result the interface returns.
However, these schemes cannot adapt to a complex network environment. For example, some of an enterprise's projects are deployed in different machine rooms of an internal network whose networks are isolated from one another, while other projects are deployed on an external network; if a health check server is deployed in only one machine room, it may be unable to monitor the servers of all projects. In addition, the parameters configured for the health check server are fixed, so the period of the timed check is the same for every monitored server and cannot be customized. Moreover, these schemes determine the health state of a server by comparing the returned result with the expected result in the health parameters; as the data changes, the returned result may be similar but not equal to the expected result, which can lead to a misjudgment. Finally, although the network may fluctuate, these schemes judge whether a server is healthy from a single returned result.
Therefore, a server health monitoring method and system are needed to solve the problems in the above technical solutions.
Disclosure of Invention
To this end, the present invention provides a server health monitoring method, a server health monitoring system and a computing device to solve or at least alleviate the above-existing problems.
According to one aspect of the present invention, there is provided a server health monitoring method, executed in a health monitoring server, the health monitoring server being connected to one or more check servers, each check server being connected to a server to be monitored, the method comprising the steps of: regularly sending health check requests to one or more check servers, so that the check servers send requests to corresponding servers to be monitored based on the health check requests, obtain request results returned by the servers to be monitored, and store the request results to a memory queue of the check servers; and acquiring the request result in the memory queue of each check server at regular time, and analyzing the request result to determine the health state of the corresponding server to be monitored at the time point of the return of the request result.
Optionally, in the server health monitoring method according to the present invention, the step of periodically sending health check requests to one or more check servers and periodically obtaining the request results in the memory queue of each check server includes: starting a separate thread for each server to be monitored to execute that server's timing task, so as to periodically send a health check request to the check server connected to the server to be monitored and periodically obtain the request results in the memory queue of that check server.
Optionally, in the server health monitoring method according to the present invention, the step in which the check server sends a request to the corresponding server to be monitored based on the health check request and obtains the request result returned by the server to be monitored includes: the check server obtains health check parameters from the health check request, constructs an HTTP request based on the health check parameters, and sends the HTTP request to the corresponding server to be monitored; and the check server obtains the HTTP request result returned by the server to be monitored.
Optionally, in the server health monitoring method according to the present invention, the health check parameter includes one or more of a request protocol, a request address, a request parameter, an allowed timeout time, and a maximum retry number of the server to be monitored.
Optionally, in the server health monitoring method according to the present invention, the step of analyzing the request result includes: obtaining a status code and a response body from the request result, and determining, according to the status code and the response body, the health state of the corresponding server to be monitored at the time point when the request result was returned.
Optionally, in the server health monitoring method according to the present invention, the health state includes one or more of healthy, under maintenance, and abnormal.
Optionally, in the server health monitoring method according to the present invention, after determining the health state of the server to be monitored at the time point when the request result was returned, the method further includes the step of: judging whether the health state of the server to be monitored has changed and, if so, sending a corresponding server health state change notification to the client.
Optionally, in the server health monitoring method according to the present invention, the health monitoring server is connected to a state machine module, and the step of judging whether the health state of the server to be monitored has changed includes: sending a signal corresponding to the health state of the server to be monitored to the state machine module, so that the state machine module judges from that signal whether the health state of the server to be monitored has changed and, when a change is determined, sends a corresponding server health state change notification to the client.
Optionally, in the server health monitoring method according to the present invention, the check server is an Agent proxy server.
According to an aspect of the present invention, there is provided a server health monitoring system, comprising: one or more check servers, each of which is connected to one server to be monitored; and a health monitoring server, which is communicatively connected to the one or more check servers and is adapted to perform the server health monitoring method described above.
Optionally, the server health monitoring system according to the present invention further comprises: a state machine module connected to the health monitoring server, adapted to receive a signal sent by the health monitoring server that corresponds to the health state of the server to be monitored, to judge from that signal whether the health state of the server to be monitored has changed, and to send a corresponding server health state change notification to the client when a change is determined.
According to an aspect of the invention, there is provided a computing device comprising: at least one processor; a memory storing program instructions configured to be executed by the at least one processor, the program instructions comprising instructions for performing the server health monitoring method as described above.
According to one aspect of the present invention, there is provided a readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform a server health monitoring method as described above.
According to the technical solution of the present invention, the health monitoring server periodically sends health check requests to one or more check servers, so that each check server sends HTTP requests to its corresponding server to be monitored and obtains the request results returned by that server. The health monitoring server periodically obtains, from the memory queue of each check server, the request results returned at various time points, and by analyzing them can determine the health state of the server to be monitored at different time points. In this way, the current health state of the server to be monitored can be accurately determined. It is also possible to judge whether the health state of the server to be monitored has changed and, when it has, to send a health state change notification to the user, so that the user can know the current health state of the server to be monitored in real time.
Furthermore, different timing tasks are set for each server to be monitored, a thread is started for each server to be monitored to execute the timing tasks, and the health state of each server to be monitored is checked at regular time, so that the health state monitoring requirements of different servers to be monitored can be met.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which are indicative of various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, like reference numerals generally refer to like parts or elements.
FIG. 1 shows a schematic diagram of a server health monitoring system 100, in accordance with one embodiment of the present invention;
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention; and
FIG. 3 shows a flow diagram of a server health monitoring method 300 according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a schematic diagram of a server health monitoring system 100, in accordance with one embodiment of the present invention.
As shown in FIG. 1, the server health monitoring system 100 includes a client 110, a health monitoring server 120, a state machine module 125, and one or more check servers 130. Each of the check servers 130 may be communicatively connected to one server 150 to be monitored, and the health monitoring server 120 may be communicatively connected to the client 110 and the one or more check servers 130, for example, through a wired or wireless network.
The health monitoring server 120 is used for managing and controlling the one or more checking servers 130 to check the health status of the servers 150 to be monitored at regular time, so as to realize the health monitoring of the one or more servers 150 to be monitored. The present invention is not limited to the specific type of the health monitoring server 120, for example, the health monitoring server 120 may be implemented as a computing device such as a desktop computer, a notebook computer, a processor chip, a mobile phone, a tablet computer, etc., but is not limited thereto, and may also be an application program residing on the computing device.
The client 110 is a terminal device used by a user, and may specifically be a personal computer such as a desktop computer and a notebook computer, or may also be a mobile phone, a tablet computer, a multimedia device, an intelligent wearable device, and the like, but is not limited thereto.
In the embodiment of the present invention, the health monitoring server 120 periodically sends health check requests to one or more checking servers 130, so that the checking servers 130 send requests to corresponding servers 150 to be monitored based on the health check requests, and obtain request results returned by the servers 150 to be monitored. After acquiring the request result returned by the server 150 to be monitored each time, the check server 130 stores the request result in the memory queue of the check server 130. Thus, the memory queue of the checking server 130 may contain one or more request results returned by the server 150 to be monitored connected thereto.
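By way of illustration only, the memory queue of a check server might be sketched as follows. The class and field names (`CheckServer`, `RequestResult`, `drain`) and the injected `probe` callable are assumptions introduced for this sketch and do not appear in the original disclosure; the probe stands in for the actual request to the server to be monitored.

```python
import queue
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RequestResult:
    server_name: str
    status_code: int
    body: str
    returned_at: float  # time point at which the result was returned


class CheckServer:
    """Hypothetical check server that buffers request results in memory."""

    def __init__(self, probe: Callable[[str], Tuple[int, str]]):
        # probe(server_name) -> (status_code, body); injected so the
        # sketch does not depend on a real network.
        self._probe = probe
        self._results: "queue.Queue[RequestResult]" = queue.Queue()

    def handle_health_check(self, server_name: str) -> None:
        # Called when a health check request arrives from the monitor.
        status_code, body = self._probe(server_name)
        self._results.put(
            RequestResult(server_name, status_code, body, time.time())
        )

    def drain(self) -> List[RequestResult]:
        # Called when the monitor periodically collects buffered results.
        drained: List[RequestResult] = []
        while True:
            try:
                drained.append(self._results.get_nowait())
            except queue.Empty:
                return drained
```

Because each result carries its own timestamp, the health monitoring server can collect several results in one pass and still attribute each one to the time point at which it was returned.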
In addition, the health monitoring server 120 periodically obtains the request results in the memory queue of each check server 130 and, by analyzing each request result, determines the health state of the corresponding server 150 to be monitored at the time point when that request result was returned.
In one embodiment, the check server 130 may be implemented as an Agent proxy server. The user may register information for one or more check servers 130 and for one or more servers 150 to be monitored at the client 110, which sends the registered information to the health monitoring server 120. Here, the check server information includes, for example, a name (Agent name) of the check server, an address, an access key, a description, and the like. The access key is, for example, a Token key. The information of a server to be monitored includes, for example, one or more of a project name, a service name, a request protocol, a health check request address, a request parameter, a request time interval, an allowed timeout time, a maximum number of retries, a check server name, and an alarm mailbox.
The health monitoring server 120, after receiving the information of one or more checking servers registered by the user at the client 110 and the information of one or more servers to be monitored, may bind the server 150 to be monitored with a corresponding one of the checking servers 130 according to the information of the server to be monitored (checking server name).
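The binding by check server name could, for instance, be sketched as below. The dictionary keys (`name`, `address`, `service_name`, `check_server_name`) and the function name `bind_servers` are hypothetical; the original only states that the binding uses the check server name carried in the information of the server to be monitored.

```python
from typing import Dict, List


def bind_servers(
    check_servers: List[Dict[str, str]],
    monitored_servers: List[Dict[str, str]],
) -> Dict[str, Dict[str, str]]:
    """Map each monitored service name to its registered check server."""
    by_name = {cs["name"]: cs for cs in check_servers}
    bindings: Dict[str, Dict[str, str]] = {}
    for server in monitored_servers:
        agent_name = server["check_server_name"]
        if agent_name not in by_name:
            # Assumption: an unknown check server name is a registration error.
            raise ValueError(f"check server {agent_name!r} is not registered")
        bindings[server["service_name"]] = by_name[agent_name]
    return bindings
```

Rejecting an unknown name at registration time is a design choice of this sketch, not something the original specifies.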
In one embodiment, as shown in FIG. 1, the health monitoring server 120 includes a management module 121 and a monitoring module 122 connected to the client 110. The monitoring module 122 may establish a communication connection with one or more check servers 130, so as to periodically send health check requests to the one or more check servers 130 and periodically obtain the request results in the memory queue of each check server 130. The monitoring module 122 analyzes each request result to determine the health state of the corresponding server 150 to be monitored at the time point when the request result was returned.
The management module 121 may acquire information of one or more check servers registered by the user at the client 110 and manage the check server information. The monitoring module 122 may obtain information of one or more servers to be monitored registered by the user at the client 110, and store the registered server information to be monitored in the structured data storage device, so as to manage the server information to be monitored. Also, the monitoring module 122 may bind the server 150 to be monitored with a corresponding one of the check servers 130 according to the server information to be monitored (check server name).
In one embodiment, after acquiring the information of the one or more servers to be monitored registered at the client 110, the health monitoring server 120 may set a corresponding timing task for each server 150 to be monitored, and configure a thread for each server 150 to be monitored to execute the corresponding timing task.
In this way, the health monitoring server 120 starts a thread for each server 150 to be monitored in the memory to execute the timing task of the server 150 to be monitored, so as to send the health check request of the server 150 to be monitored to the check server 130 connected to the server 150 to be monitored at regular time, and obtain the request result returned by the server 150 to be monitored stored in the memory queue of the check server 130 at regular time, and analyze the health status of the corresponding server 150 to be monitored according to each request result.
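A minimal sketch of one thread per server to be monitored, each running its own timing task at its own interval, is given below. The helper name `start_timing_task` and the use of a stop event are assumptions of this sketch; the original does not specify how the threads are scheduled or stopped.

```python
import threading
from typing import Callable, Tuple


def start_timing_task(
    interval_s: float, task: Callable[[], None]
) -> Tuple[threading.Event, threading.Thread]:
    """Run `task` every `interval_s` seconds on a dedicated thread."""
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop runs the task after each interval until stopped.
        while not stop.wait(interval_s):
            task()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread
```

Calling the helper once per registered server, with that server's own request time interval, gives each server an independently configurable check period.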
In one embodiment, the health monitoring server 120, after determining the health status of the server to be monitored 150 (i.e., the health status of the server to be monitored 150 at the time point when the request result is returned) by analyzing the request result of the server to be monitored 150, also determines whether the health status of the server to be monitored 150 has changed, i.e., determines whether the health status of the server to be monitored 150 has changed from the health status at the time point when the last request result was returned. If a change occurs, a corresponding server health status change notification is sent to the client 110. For example, a server health status change notification may be sent to an alarm mailbox corresponding to the server 150 to be monitored, so as to inform the user of the current server health status.
In one embodiment, the monitoring module 122 of the health monitoring server 120 is connected to a State Machine module 125, and the State Machine module 125 may be implemented as, for example, a FSM (Finite State Machine). The monitoring module 122 of the health monitoring server 120 sends a signal (e.g., a health signal, a maintenance signal, an abnormal signal) corresponding to the health status of the server 150 to be monitored to the state machine module 125, and the state machine module 125 can determine whether the health status of the server 150 to be monitored changes according to the signal corresponding to the health status. The state machine module 125 may send a corresponding server health status change notification to the client when it is determined that the health status of the server 150 to be monitored changes.
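The change detection performed by the state machine module might look like the following sketch. The names `HealthState`, `HealthStateMachine`, and `signal` are illustrative, and the choice not to notify on the first observation of a server is an assumption; the original does not describe the initial state.

```python
from enum import Enum
from typing import Callable, Dict


class HealthState(Enum):
    HEALTHY = "healthy"
    MAINTENANCE = "maintenance"
    ABNORMAL = "abnormal"


class HealthStateMachine:
    """Tracks the last known state per server and reports transitions."""

    def __init__(
        self, notify: Callable[[str, HealthState, HealthState], None]
    ):
        self._notify = notify
        self._current: Dict[str, HealthState] = {}

    def signal(self, server: str, state: HealthState) -> bool:
        previous = self._current.get(server)
        self._current[server] = state
        # Assumption: the first signal for a server only records the state.
        if previous is not None and previous != state:
            self._notify(server, previous, state)
            return True
        return False
```

In such a design the `notify` callback would be the place to send the health state change notification, for example to the alarm mailbox of the affected server.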
In one embodiment, the check server 130 is adapted to send an HTTP request to the server 150 to be monitored connected thereto and to obtain the HTTP request result returned by the server 150 to be monitored. After obtaining the HTTP request result from the memory queue of the check server 130, the health monitoring server 120 may extract the status code and the response body from the HTTP request result and determine, according to the status code and the response body, the health state of the server 150 to be monitored corresponding to the request result. Here, the health state includes one or more of healthy, under maintenance, and abnormal.
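One possible way to derive a health state from the status code and the response body is sketched below. The mapping itself is purely an assumption made for illustration (2xx means healthy, 503 with a body mentioning maintenance means under maintenance, anything else means abnormal); the original does not disclose the concrete rules.

```python
from enum import Enum


class HealthState(Enum):
    HEALTHY = "healthy"
    MAINTENANCE = "maintenance"
    ABNORMAL = "abnormal"


def classify(status_code: int, body: str) -> HealthState:
    """Hypothetical mapping from an HTTP result to a health state."""
    # Assumed convention: a 503 whose body mentions maintenance signals
    # planned downtime rather than a fault.
    if status_code == 503 and "maintenance" in body.lower():
        return HealthState.MAINTENANCE
    # Assumed convention: any 2xx status code counts as healthy.
    if 200 <= status_code < 300:
        return HealthState.HEALTHY
    return HealthState.ABNORMAL
```

Judging by status code and a loose body match, rather than by strict equality with an expected response, avoids the misjudgment that arises when the returned data is similar but not identical to the expected result.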
In one embodiment, the health monitoring server 120 is configured to perform the server health monitoring method 300 of the present invention, the server health monitoring method 300 of the present invention being described in more detail below.
In one embodiment, the health monitoring server 120, the client 110, the checking server 130, and the server to be monitored 150 of the present invention may each be implemented as a computing device, such that the server health monitoring method 300 of the present invention may be executed in the computing device.
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention.
As shown in FIG. 2, in a basic configuration 202, a computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processor, including but not limited to: a microprocessor (µP), a microcontroller (µC), a digital signal processor (DSP), or any combination thereof. The processor 204 may include one or more levels of cache, such as a level one cache 210 and a level two cache 212, a processor core 214, and registers 216. Example processor cores 214 may include Arithmetic Logic Units (ALUs), Floating Point Units (FPUs), digital signal processing cores (DSP cores), or any combination thereof. The example memory controller 218 may be used with the processor 204, or in some implementations the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, system memory 206 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 206 may include an operating system 220, one or more applications 222, and program data 224. The application 222 is actually a plurality of program instructions that direct the processor 204 to perform corresponding operations. In some embodiments, application 222 may be arranged to cause processor 204 to operate with program data 224 on an operating system.
Computing device 200 also includes storage device 232, storage device 232 including removable storage 236 and non-removable storage 238.
Computing device 200 may also include a storage interface bus 234. The storage interface bus 234 enables communication from the storage devices 232 (e.g., removable storage 236 and non-removable storage 238) to the basic configuration 202 via the bus/interface controller 230. At least a portion of the operating system 220, applications 222, and data 224 may be stored on removable storage 236 and/or non-removable storage 238, and loaded into system memory 206 via storage interface bus 234 and executed by the one or more processors 204 when the computing device 200 is powered on or the applications 222 are to be executed.
Computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to the basic configuration 202 via the bus/interface controller 230. The exemplary output device 242 includes an image processing unit 248 and an audio processing unit 250. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 252. Example peripheral interfaces 244 can include a serial interface controller 254 and a parallel interface controller 256, which can be configured to facilitate communications with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 258. An example communication device 246 may include a network controller 260, which may be arranged to facilitate communications with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or dedicated-line network, and various wireless media such as acoustic, Radio Frequency (RF), microwave, Infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
In a computing device 200 according to the present invention, an application 222 in the computing device includes a plurality of program instructions that implement the server health monitoring method 300, which can be read and executed by the processor 204 such that the computing device 200 implements monitoring the health status of a plurality of servers by executing the server health monitoring method 300 of the present invention.
FIG. 3 shows a flow diagram of a server health monitoring method 300 according to one embodiment of the invention. The method 300 is suitable for execution in a health monitoring server 120, such as the aforementioned computing device 200. The health monitoring server 120 is communicatively coupled to the client 110, the state machine module 125, and one or more check servers 130, respectively. Each check server 130 may be communicatively connected to one server 150 to be monitored.
As shown in FIG. 3, the method 300 includes steps S310 to S320.
In step S310, health check requests are periodically sent to one or more check servers 130, so that each check server 130 sends a request to its corresponding server to be monitored 150 based on the health check request and obtains the request result returned by the server to be monitored 150. Each time it obtains a request result returned by the server to be monitored 150, the check server 130 stores the request result in its own memory queue. Thus, the memory queue of a check server 130 may contain one or more request results returned by the server to be monitored 150 connected to it.
In one embodiment, the check server 130 may be implemented as an Agent (proxy) server. Each check server 130 may be communicatively coupled to one server to be monitored 150.
In one embodiment, a user may register information for one or more check servers 130 and one or more servers to be monitored 150 at client 110 prior to performing method 300. Health monitoring server 120 may receive the information for the one or more check servers (Agents) that the user has registered at client 110, as well as the information for the one or more servers to be monitored.
Specifically, the check server information includes, for example, the name of the check server (Agent name), an address, an access key, a description, and the like. Here, the access key is, for example, a Token key. The server information to be monitored includes, for example, one or more of an item name, a service name, a request protocol, a health check request address, request parameters, a request time interval, an allowed timeout time, a maximum number of retries, a check server name (Agent name), and an alarm mailbox.
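As an illustration only, the registration information listed above could be modeled as two record types plus a binding step keyed on the Agent name. This is a minimal sketch, not the patented implementation: the class names, field names, default values, and the `bind` helper are all assumptions introduced here, and the time units (seconds) are likewise assumed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CheckServerInfo:
    """Hypothetical registration record for one check server (Agent)."""
    name: str                 # Agent name
    address: str
    access_key: str           # for example, a Token key
    description: str = ""


@dataclass
class MonitoredServerInfo:
    """Hypothetical registration record for one server to be monitored."""
    item_name: str
    service_name: str
    protocol: str                       # request protocol
    health_check_url: str               # health check request address
    params: Dict[str, str] = field(default_factory=dict)
    interval_s: float = 30.0            # request time interval (assumed seconds)
    timeout_s: float = 5.0              # allowed timeout time (assumed seconds)
    max_retries: int = 3                # maximum number of retries
    check_server_name: str = ""         # Agent name used for binding
    alarm_mailbox: Optional[str] = None


def bind(servers, agents):
    """Map each service name to the check server whose Agent name it registered."""
    by_name = {agent.name: agent for agent in agents}
    return {s.service_name: by_name[s.check_server_name] for s in servers}
```

In this sketch, a server to be monitored that names an unregistered Agent raises `KeyError` at binding time, which is one simple way to surface a registration mistake early.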
In one embodiment, as shown in FIG. 1, health monitoring server 120 comprises a management module 121 and a monitoring module 122 connected to client 110, wherein monitoring module 122 is communicatively connected to one or more check servers 130 and periodically sends health check requests to the one or more check servers 130.
The management module 121 may acquire the information of one or more check servers (Agents) registered by the user at the client 110 and manage the check server information. The monitoring module 122 may obtain the information of one or more servers to be monitored registered by the user at the client 110, and store the registered server information in the structured data storage device, so as to manage the server information to be monitored. Also, the monitoring module 122 of the health monitoring server 120 may bind each server to be monitored 150 to its corresponding check server 130 according to the check server name in the server information to be monitored.
In step S320, the request results in the memory queue of each check server are obtained periodically, and each request result is analyzed to determine the health status of the corresponding server to be monitored 150 at the time point when that request result was returned.
According to an embodiment of the present invention, after acquiring the information of one or more servers to be monitored registered at the client 110, the health monitoring server 120 may set a corresponding timing task for each server to be monitored 150, and configure a thread for each server to be monitored 150 to execute the corresponding timing task.
In this way, when steps S310 and S320 are executed, the health monitoring server 120 starts, in memory, one thread for each server to be monitored 150 to execute the timing task of that server. The thread periodically sends the health check request for the server to be monitored 150 to the check server 130 connected to it, and periodically obtains the request results returned by the server to be monitored 150 that are stored in the memory queue of the check server 130. Here, in one implementation, after the request results in the memory queue of the check server 130 are obtained, they may be stored in the structured data storage device. The request results in the structured data storage device are then traversed, and each request result is analyzed to determine the health status of the corresponding server to be monitored 150 at the time point when that request result was returned.
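The one-thread-per-server timing task described above can be sketched as follows. This is a hedged illustration under stated assumptions: `start_timed_task`, `start_monitoring`, and the `send_check` and `fetch_results` callables are hypothetical names, and a real deployment would replace the callables with calls to the check server.

```python
import threading


def start_timed_task(interval_s, task, stop_event):
    """Run task() every interval_s seconds on its own thread until stop_event is set."""
    def loop():
        # Event.wait returns False on timeout and True once the event is set,
        # so the loop ends promptly when monitoring is stopped.
        while not stop_event.wait(interval_s):
            task()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


def start_monitoring(intervals, send_check, fetch_results, stop_event):
    """Start one timed thread per server; intervals maps server name -> seconds."""
    threads = []
    for name, interval_s in intervals.items():
        def task(name=name):          # bind the current name to this task
            send_check(name)          # step S310: send the health check request
            fetch_results(name)       # step S320: collect queued request results
        threads.append(start_timed_task(interval_s, task, stop_event))
    return threads
```

Because each server has its own interval and its own thread, a slow or frequently checked server does not delay the checks of the others, which matches the stated goal of meeting different monitoring requirements per server.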
It should be noted that the present invention can satisfy the health status monitoring requirements of different servers to be monitored by setting different timing tasks for each server 150 to be monitored, and starting a thread for each server to be monitored to execute the timing task to periodically check the health status of each server to be monitored.
It should be noted that the health check request that the health monitoring server 120 periodically sends to the check server 130 for a server to be monitored includes the access key of the check server 130 and the health check parameters. Here, the access key is, for example, a Token key. In one embodiment, the health check parameters include the request protocol, request address, request parameters, allowed timeout time, maximum number of retries, and the like of the server to be monitored. Here, the allowed timeout time is the response timeout permitted when the check server 130 sends a request to the corresponding server to be monitored 150, and the maximum number of retries is the maximum number of times the check server 130 resends the request after a request to the corresponding server to be monitored 150 fails.
In one embodiment, after receiving the health check request periodically sent by the health monitoring server 120, the check server 130 sends a request to the corresponding server to be monitored as follows: it obtains the health check parameters from the health check request, constructs an HTTP request based on the health check parameters, and sends the HTTP request to the corresponding server to be monitored 150. The check server 130 may then obtain the HTTP request result returned by the server to be monitored 150.
In an embodiment, when the health status of the corresponding server to be monitored 150 is determined by analyzing the request result, the status code and the response body may be obtained from the HTTP request result, and the health status of the corresponding server to be monitored 150 at the time point when the request result was returned is determined according to the status code and the response body. Here, the health status includes one or more of healthy, under maintenance, and abnormal. That is, according to the status code and the response body in the HTTP request result, it can be determined whether the server to be monitored 150 was in a healthy state, a maintenance state, or an abnormal state at the time point when the HTTP request result was returned to the check server 130.
It should be noted that different health states correspond to different status codes and response bodies. For example, the healthy state corresponds to a status code greater than or equal to 200 and less than 400. The maintenance state corresponds to the custom status code 503 together with a response body that includes a "maintaining" field. Accordingly, if the status code in the request result is less than 200, or is greater than or equal to 400 and not equal to 503, it may be determined that the corresponding server to be monitored is in an abnormal state.
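The check server's handling of one health check request can be sketched as below. This is an assumption-laden illustration rather than the patented code: the `CheckServer` class, the request dictionary layout (`access_key`, `params`, `url`, `query`, `timeout_s`, `max_retries`), and the choice of a GET request are all hypothetical. The `send` function is injectable so the retry and queue logic can be exercised without a network.

```python
import queue
import time
import urllib.error
import urllib.parse
import urllib.request


def http_get(url, query, timeout_s):
    """Send one GET request and return (status_code, body_text)."""
    if query:
        url = url + "?" + urllib.parse.urlencode(query)
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and body worth recording.
        return err.code, err.read().decode("utf-8", "replace")


class CheckServer:
    """Hypothetical Agent: validates the key, probes the target, queues the result."""

    def __init__(self, access_key, send=http_get):
        self.access_key = access_key
        self.send = send
        self.results = queue.Queue()      # the in-memory result queue

    def handle_health_check(self, request):
        if request["access_key"] != self.access_key:
            raise PermissionError("invalid access key")
        p = request["params"]
        status, body = None, ""
        # One initial attempt plus at most max_retries further attempts.
        for _ in range(1 + p["max_retries"]):
            try:
                status, body = self.send(p["url"], p.get("query", {}), p["timeout_s"])
                break
            except OSError:               # URLError and socket timeouts are OSError subclasses
                continue
        self.results.put({"url": p["url"], "status": status,
                          "body": body, "time": time.time()})

    def drain(self):
        """Return and remove every queued result (what the monitor would fetch)."""
        out = []
        while True:
            try:
                out.append(self.results.get_nowait())
            except queue.Empty:
                return out
```

In this sketch a probe that fails on every attempt is still queued, with `status` set to `None`, so the health monitoring server can see that the server to be monitored was unreachable at that time point.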
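The mapping from status code and response body to health state can be written as a small function. This is a sketch under assumptions: the function name `classify` and the state labels are hypothetical, the "maintaining" field is detected with a plain substring check, and a 503 response without that field is treated as abnormal, a case the description does not spell out. A missing status code (no response at all) is also treated as abnormal here.

```python
def classify(status_code, body):
    """Return 'healthy', 'maintenance', or 'abnormal' for one request result."""
    # Maintenance: custom status code 503 plus a "maintaining" field in the body.
    if status_code == 503 and "maintaining" in body:
        return "maintenance"
    # Healthy: status code in the range [200, 400).
    if status_code is not None and 200 <= status_code < 400:
        return "healthy"
    # Everything else, including no response at all, counts as abnormal.
    return "abnormal"
```

For example, `classify(302, "")` yields `"healthy"`, while `classify(404, "")` and `classify(None, "")` both yield `"abnormal"`.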
It should be noted that the present invention is not limited to the above-mentioned classification of health status and corresponding status code, and the type of server health status and corresponding status code can be defined and adjusted by those skilled in the art according to the actual situation.
According to an embodiment of the present invention, after determining the health status of the server to be monitored 150 by analyzing its request result (i.e., the health status of the server to be monitored 150 at the time point when the request result was returned), the health monitoring server 120 also determines whether the health status of the server to be monitored 150 has changed, i.e., whether it differs from the health status at the time point when the previous request result was returned. If a change has occurred, a corresponding server health status change notification is sent to the client 110. For example, the notification may be sent to the alarm mailbox corresponding to the server to be monitored 150, so as to inform the user of the current server health status.
In one embodiment, health monitoring server 120 is coupled to state machine module 125, and state machine module 125 may be implemented as an FSM (Finite State Machine). The health monitoring server 120 may determine whether the health status of the server to be monitored 150 changes by:
the signals (e.g., health signals, maintenance signals, abnormal signals) corresponding to the health status of the server 150 to be monitored are sent to the state machine module 125, so that the state machine module 125 determines whether the health status of the server 150 to be monitored changes according to the signals corresponding to the health status. When determining that the health status of the server 150 to be monitored changes, the state machine module 125 sends a corresponding server health status change notification to the client, so that the user can know the current health status of the server 150 to be monitored in real time.
Specifically, by creating a state machine (FSM instance) for each server 150 to be monitored in the state machine module 125, after the health monitoring server 120 sends a signal (e.g., a health signal, a maintenance signal, or an abnormal signal) corresponding to the health status of any server 150 to be monitored to the state machine module 125, the state machine corresponding to the server 150 to be monitored is used to determine whether the health status of the server 150 to be monitored changes.
For example, if the current state of the state machine is "healthy", the current health status of the corresponding server to be monitored 150 is healthy. If the state machine next receives an abnormal signal, it enters the abnormal state, determines that the health status of the server to be monitored 150 has changed, and sends a "server abnormal" notification to the client 110. If the current state of the state machine is "healthy" and it next receives a maintenance signal, it enters the maintenance state, determines that the health status of the server to be monitored 150 has changed, and sends a "server maintenance" notification to the client 110. In addition, if the current state of the state machine is "healthy" and it next receives a health signal, that signal is a self-transition for the current state: the state of the state machine does not change, and it is determined that the health status of the server to be monitored 150 has not changed.
According to the server health monitoring method 300, health check requests are periodically sent to one or more check servers, so that the check servers send HTTP requests to the corresponding servers to be monitored and obtain the request results they return. The request results returned by the servers to be monitored at various time points are periodically obtained from the memory queues of the check servers, and the health status of each server to be monitored at different time points can be determined by analyzing these request results. Therefore, the current health status of the server to be monitored can be accurately analyzed. In addition, whether the health status of the server to be monitored has changed can be determined, and when it has changed, a health status change notification is sent to the user, so that the user can know the current health status of the server to be monitored in real time.
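The per-server state machine behavior described above can be sketched as follows. This is an illustrative assumption, not the patented design: the class names `HealthStateMachine` and `StateMachineModule`, the `notify` callback, and the choice of "healthy" as the initial state are all hypothetical.

```python
class HealthStateMachine:
    """One FSM instance per server to be monitored."""

    STATES = ("healthy", "maintenance", "abnormal")

    def __init__(self, server_name, notify, initial="healthy"):
        self.server_name = server_name
        self.notify = notify              # callback(server_name, new_state)
        self.state = initial              # assumed starting state

    def signal(self, new_state):
        """Apply a health signal; return True if the state changed."""
        if new_state not in self.STATES:
            raise ValueError("unknown health state: %r" % (new_state,))
        if new_state == self.state:
            return False                  # self-transition: no change, no notification
        self.state = new_state
        self.notify(self.server_name, new_state)
        return True


class StateMachineModule:
    """Keeps one state machine per server and routes signals to it."""

    def __init__(self, notify):
        self.notify = notify
        self.machines = {}

    def signal(self, server_name, state):
        fsm = self.machines.get(server_name)
        if fsm is None:
            fsm = self.machines[server_name] = HealthStateMachine(server_name, self.notify)
        return fsm.signal(state)
```

With this structure, a repeated "abnormal" signal produces only one notification, so the user is alerted on the transition rather than on every failed check.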
Furthermore, different timing tasks are set for each server to be monitored, a thread is started for each server to be monitored to execute the timing tasks, and the health state of each server to be monitored is checked at regular time, so that the health state monitoring requirements of different servers to be monitored can be met.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy disks, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code; the processor is configured to perform the server health monitoring method of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media may comprise readable storage media and communication media. Readable storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with examples of this invention. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (13)

1. A server health monitoring method, implemented in a health monitoring server, the health monitoring server being connected to one or more check servers, each check server being connected to a server to be monitored, the method comprising the steps of:
periodically sending health check requests to one or more check servers, so that the check servers send requests to corresponding servers to be monitored based on the health check requests, obtain request results returned by the servers to be monitored, and store the request results in a memory queue of the check servers; and
periodically acquiring the request results in the memory queue of each check server, and analyzing the request results to determine the health state of the corresponding server to be monitored at the time point when the request result was returned.
2. The method as claimed in claim 1, wherein the steps of periodically sending health check requests to one or more check servers and periodically acquiring the request results in the memory queue of each check server comprise:
starting a thread for each server to be monitored to execute the timing task of that server, so as to periodically send a health check request to the check server connected to the server to be monitored and periodically acquire the request results in the memory queue of the check server.
3. The method as claimed in claim 1, wherein the step of the check server sending a request to the corresponding server to be monitored based on the health check request and obtaining the request result returned by the server to be monitored comprises:
the check server acquiring health check parameters from the health check request, constructing an HTTP request based on the health check parameters, and sending the HTTP request to the corresponding server to be monitored; and
acquiring the HTTP request result returned by the server to be monitored.
4. The method of claim 3, wherein,
the health check parameters comprise one or more of request protocol, request address, request parameters, allowed timeout time and maximum retry number of the server to be monitored.
5. The method of any of claims 1-4, wherein analyzing the request results comprises:
acquiring a status code and a response body from the request result, and determining, according to the status code and the response body, the health state of the corresponding server to be monitored at the time point when the request result was returned.
6. The method of any one of claims 1-4, wherein the health state comprises one or more of healthy, under maintenance, and abnormal.
7. The method according to any of claims 1-4, further comprising, after determining the health state of the server to be monitored at the time point when the request result was returned, the step of:
judging whether the health state of the server to be monitored has changed, and if so, sending a corresponding server health state change notification to the client.
8. The method of claim 7, wherein the health monitoring server is coupled to a state machine module, and the step of judging whether the health state of the server to be monitored has changed comprises:
sending a signal corresponding to the health state of the server to be monitored to the state machine module, so that the state machine module judges, according to the signal, whether the health state of the server to be monitored has changed and, when a change is determined, sends a corresponding server health state change notification to the client.
9. The method of any one of claims 1-4, wherein the check server is an Agent (proxy) server.
10. A server health monitoring system comprising:
one or more check servers, each of which is connected to one server to be monitored; and
a health monitoring server, communicatively connected to the one or more check servers and adapted to perform the method of any one of claims 1-9 to monitor the health of the servers to be monitored.
11. The system of claim 10, further comprising:
a state machine module connected to the health monitoring server, adapted to receive a signal sent by the health monitoring server that corresponds to the health state of the server to be monitored, to judge according to the signal whether the health state of the server to be monitored has changed, and to send a corresponding server health state change notification to the client when a change is determined.
12. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor, the program instructions comprising instructions for performing the server health monitoring method of any of claims 1-9.
13. A readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the server health monitoring method of any of claims 1-9.
CN202210123500.9A 2022-02-10 2022-02-10 Server health monitoring method and system and computing equipment Active CN114172829B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210123500.9A CN114172829B (en) 2022-02-10 2022-02-10 Server health monitoring method and system and computing equipment
CN202210731202.8A CN115190047B (en) 2022-02-10 2022-02-10 Method, system and computing device for monitoring server health

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210123500.9A CN114172829B (en) 2022-02-10 2022-02-10 Server health monitoring method and system and computing equipment

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202210731202.8A Division CN115190047B (en) 2022-02-10 2022-02-10 Method, system and computing device for monitoring server health

Publications (2)

Publication Number Publication Date
CN114172829A true CN114172829A (en) 2022-03-11
CN114172829B CN114172829B (en) 2022-08-12

Family

ID=80489591

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210123500.9A Active CN114172829B (en) 2022-02-10 2022-02-10 Server health monitoring method and system and computing equipment
CN202210731202.8A Active CN115190047B (en) 2022-02-10 2022-02-10 Method, system and computing device for monitoring server health

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210731202.8A Active CN115190047B (en) 2022-02-10 2022-02-10 Method, system and computing device for monitoring server health

Country Status (1)

Country Link
CN (2) CN114172829B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114785861A (en) * 2022-06-22 2022-07-22 飞狐信息技术(天津)有限公司 Service request forwarding system, method, computer equipment and storage medium
CN114826982A (en) * 2022-04-08 2022-07-29 浙江大学 Self-adaptive heartbeat packet adjusting method in micro-service scene
CN114938377A (en) * 2022-04-20 2022-08-23 京东科技信息技术有限公司 Back-end server management method and device, readable medium and electronic equipment
CN115665009A (en) * 2022-12-29 2023-01-31 鹏城实验室 DNS root server state monitoring method and device, electronic equipment and medium
CN114938377B (en) * 2022-04-20 2024-05-17 京东科技信息技术有限公司 Back-end server management method and device, readable medium and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101383735A (en) * 2008-10-15 2009-03-11 阿里巴巴集团控股有限公司 Server checking method, equipment and system
CN102375772A (en) * 2011-09-27 2012-03-14 云智慧(北京)科技有限公司 Server monitoring method and device
CN107241240A (en) * 2017-06-30 2017-10-10 广州君海网络科技有限公司 Game server state monitoring method, apparatus and system
CN110290019A (en) * 2019-05-27 2019-09-27 网宿科技股份有限公司 Monitoring method and system
CN112882895A (en) * 2021-02-22 2021-06-01 中国工商银行股份有限公司 Health examination method, device, computer system and readable storage medium
CN112882901A (en) * 2021-03-04 2021-06-01 中国航空工业集团公司西安航空计算技术研究所 Intelligent health state monitor of distributed processing system
CN113032223A (en) * 2021-04-20 2021-06-25 上海哔哩哔哩科技有限公司 Server state detection method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017000296A1 (en) * 2015-07-02 2017-01-05 深圳市谷玛鹤健康科技有限公司 Temperature measurement device, central monitoring station, and temperature monitoring system and method therefor
CN107070742B (en) * 2017-03-14 2020-05-08 北京三快在线科技有限公司 Service server health state checking method and system
US11108673B2 (en) * 2017-09-18 2021-08-31 Citrix Systems, Inc. Extensible, decentralized health checking of cloud service components and capabilities
CN108199914A (en) * 2017-12-27 2018-06-22 杭州迪普科技股份有限公司 Server-side condition detection method and device
CN108228452B (en) * 2017-12-28 2021-04-06 微梦创科网络科技(中国)有限公司 Test method and test device based on simple factory mode
JP2021060221A (en) * 2019-10-03 2021-04-15 サイマックス株式会社 Health monitoring system, health monitoring method and health monitoring program

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114826982A (en) * 2022-04-08 2022-07-29 浙江大学 Self-adaptive heartbeat packet adjusting method in micro-service scene
CN114826982B (en) * 2022-04-08 2023-08-18 浙江大学 Self-adaptive heartbeat packet adjusting method in micro-service scene
CN114938377A (en) * 2022-04-20 2022-08-23 京东科技信息技术有限公司 Back-end server management method and device, readable medium and electronic equipment
CN114938377B (en) * 2022-04-20 2024-05-17 京东科技信息技术有限公司 Back-end server management method and device, readable medium and electronic equipment
CN114785861A (en) * 2022-06-22 2022-07-22 飞狐信息技术(天津)有限公司 Service request forwarding system, method, computer equipment and storage medium
CN114785861B (en) * 2022-06-22 2022-12-13 飞狐信息技术(天津)有限公司 Service request forwarding system, method, computer equipment and storage medium
CN115665009A (en) * 2022-12-29 2023-01-31 鹏城实验室 DNS root server state monitoring method and device, electronic equipment and medium

Also Published As

Publication number Publication date
CN115190047A (en) 2022-10-14
CN115190047B (en) 2023-07-07
CN114172829B (en) 2022-08-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant