CN115190047A - Server health monitoring method and system and computing equipment
- Publication number
- CN115190047A (CN202210731202.8A)
- Authority
- CN
- China
- Prior art keywords
- server
- health
- monitored
- request
- check
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3055—Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Environmental & Geological Engineering (AREA)
- Mathematical Physics (AREA)
- Computer And Data Communications (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a server health monitoring method, a system and a computing device, wherein the method is executed in a health monitoring server, the health monitoring server is connected with one or more check servers, each check server is respectively connected with a server to be monitored, and the method comprises the following steps: regularly sending health check requests to one or more check servers, so that the check servers send requests to corresponding servers to be monitored based on the health check requests, obtain request results returned by the servers to be monitored, and store the request results to a memory queue of the check servers; and acquiring the request result in the memory queue of each check server at regular time, and analyzing the request result to determine the health state of the corresponding server to be monitored at the time point of the return of the request result. According to the technical scheme of the invention, the current health state of the server to be monitored can be accurately determined.
Description
This application is a divisional application of the invention patent application 2022101235009 filed on 10/2/2022.
Technical Field
The invention relates to the technical field of network communication, in particular to a server health monitoring method, a server health monitoring system and computing equipment.
Background
At present, with the rapid development of enterprises, the types of services are increasingly diverse and the number of associated projects keeps growing, so a single-machine deployment mode can hardly meet service requirements. To meet these requirements, a micro-service architecture and a clustered deployment mode are generally adopted, and the number of servers in use multiplies, which makes it difficult to monitor the health status of the servers (whether a server can respond to requests and whether it is running normally).
In the prior art, there are two main methods for monitoring the health status of a server. One is a heartbeat detection mechanism based on the registration center in a micro-service architecture: the server sends a heartbeat message (for example, using the TCP or UDP protocol) to the registration center at fixed intervals, and if the registration center replies with a heartbeat within a configured time range, the server is considered healthy. The other is a timed API check: the monitored server provides an API interface, the server health monitoring system periodically sends an HTTP request to that interface, and whether the server is normal is judged from the result returned by the interface.
However, the above solutions cannot adapt to a complex network environment. For example, some projects of an enterprise are deployed in different machine rooms of an internal network whose networks are isolated from each other, while other projects are deployed on an external network; if a health check server is deployed in only one machine room, it may be impossible to monitor the health of the servers of all projects. In addition, the above solutions fix the parameters configured for the health check server, and the period of the timed check is the same for every monitored server, so customization is not possible. Moreover, the above solutions determine the health status of a server by comparing the returned result with the expected returned result in the health parameters; as the data changes, however, the returned result may be similar but not equal to the expected result, in which case a misjudgment may be made. Finally, because the network may fluctuate, judging whether the server is healthy from a single returned result, as the above solutions do, is unreliable.
Therefore, a server health monitoring method and system are needed to solve the problems in the above technical solutions.
Disclosure of Invention
To this end, the present invention provides a server health monitoring method, a server health monitoring system and a computing device to solve or at least alleviate the above-existing problems.
According to one aspect of the present invention, there is provided a server health monitoring method, performed in a health monitoring server, the health monitoring server being connected to one or more check servers, each check server being connected to a server to be monitored, respectively, the method comprising the steps of: regularly sending health check requests to one or more check servers so that the check servers can send requests to corresponding servers to be monitored based on the health check requests, obtain request results returned by the servers to be monitored and store the request results to a memory queue of the check servers; and acquiring the request result in the memory queue of each check server at regular time, and analyzing the request result to determine the health state of the corresponding server to be monitored at the time point when the request result is returned.
Optionally, in the server health monitoring method according to the present invention, the step of periodically sending health check requests to one or more check servers and periodically obtaining the request results in the memory queue of each check server includes: starting a separate thread for each server to be monitored to execute that server's timing task, so as to periodically send a health check request to the check server connected to the server to be monitored and periodically obtain the request results from the memory queue of that check server.
Optionally, in the server health monitoring method according to the present invention, the step of the check server sending a request to the corresponding server to be monitored based on the health check request and obtaining the request result returned by the server to be monitored includes: the check server obtains health check parameters from the health check request, constructs an HTTP request based on the health check parameters, and sends the HTTP request to the corresponding server to be monitored; and the check server obtains the HTTP request result returned by the server to be monitored.
Optionally, in the server health monitoring method according to the present invention, the health check parameter includes one or more of a request protocol, a request address, a request parameter, an allowed timeout time, and a maximum retry number of the server to be monitored.
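The health check parameters listed above can be pictured as a small record from which the check server builds its HTTP request. The following is a minimal sketch only: the class name `HealthCheckParams`, its field names, the default values, and the example address are assumptions made for illustration and do not come from the original text. The request is constructed but not sent.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode
from urllib.request import Request


@dataclass
class HealthCheckParams:
    """Health check parameters carried in a health check request (field names are assumed)."""
    protocol: str                               # request protocol, e.g. "http" or "https"
    address: str                                # request address of the server to be monitored
    params: dict = field(default_factory=dict)  # request parameters
    timeout_s: float = 3.0                      # allowed timeout time (assumed default)
    max_retries: int = 2                        # maximum number of retries (assumed default)


def build_request(p: HealthCheckParams) -> Request:
    """Construct, without sending, the HTTP request for the server to be monitored."""
    url = f"{p.protocol}://{p.address}"
    if p.params:
        url += "?" + urlencode(p.params)
    return Request(url, method="GET")


req = build_request(
    HealthCheckParams("http", "app.example.internal:8080/health", {"probe": "1"})
)
print(req.full_url)  # http://app.example.internal:8080/health?probe=1
```

Keeping the timeout and retry count in the same record as the address lets each server to be monitored carry its own settings.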
Optionally, in the server health monitoring method according to the present invention, analyzing the request result includes: obtaining a status code and a response body from the request result, and determining, according to the status code and the response body, the health status of the corresponding server to be monitored at the time point when the request result was returned.
Optionally, in the server health monitoring method according to the present invention, the health status includes one or more of healthy, under maintenance, and abnormal.
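One possible way to derive the three health states from a status code and a response body is sketched below. The text does not specify the mapping rules, so the rules here are assumptions chosen for illustration: a 2xx code means healthy, and a 503 whose body mentions maintenance means under maintenance. The names `HealthState` and `classify` are likewise hypothetical.

```python
from enum import Enum


class HealthState(Enum):
    HEALTHY = "healthy"
    MAINTENANCE = "under maintenance"
    ABNORMAL = "abnormal"


def classify(status_code: int, body: str) -> HealthState:
    """Map a status code and response body to a health state (rules are assumed)."""
    if 200 <= status_code < 300:
        return HealthState.HEALTHY
    # Assumed convention: a 503 whose body mentions maintenance is a planned outage.
    if status_code == 503 and "maintenance" in body.lower():
        return HealthState.MAINTENANCE
    return HealthState.ABNORMAL


print(classify(200, '{"status": "UP"}'))          # HealthState.HEALTHY
print(classify(503, "Scheduled maintenance"))     # HealthState.MAINTENANCE
print(classify(500, "Internal Server Error"))     # HealthState.ABNORMAL
```

Because the decision looks at the status code and at a keyword in the body rather than requiring an exact match with an expected response, a body whose data has changed does not by itself cause a misjudgment.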
Optionally, in the server health monitoring method according to the present invention, after determining the health status of the server to be monitored at the time point when the request result is returned, the method further includes the steps of: and judging whether the health state of the server to be monitored changes or not, and if so, sending a corresponding server health state change notification to the client.
Optionally, in the server health monitoring method according to the present invention, the health monitoring server is connected to a state machine module, and the step of determining whether the health status of the server to be monitored has changed includes: sending a signal corresponding to the health status of the server to be monitored to the state machine module, so that the state machine module determines, according to that signal, whether the health status of the server to be monitored has changed, and sends a corresponding server health status change notification to the client when a change is determined.
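The change detection performed by the state machine module can be sketched as follows. This is an illustrative simplification, not the patented implementation: the class name `StateMachineModule`, the `signal` method, and the callback signature are assumptions, and the sketch assumes that the first signal for a server only records a baseline and produces no notification.

```python
from typing import Callable, Dict


class StateMachineModule:
    """Tracks the last known state per server and reports transitions."""

    def __init__(self, notify: Callable[[str, str, str], None]) -> None:
        self._last: Dict[str, str] = {}
        self._notify = notify  # called as notify(server, previous_state, new_state)

    def signal(self, server: str, state: str) -> bool:
        """Record a health signal; return True if it changed the server's state."""
        previous = self._last.get(server)
        self._last[server] = state
        changed = previous is not None and previous != state
        if changed:
            self._notify(server, previous, state)
        return changed


events = []
fsm = StateMachineModule(lambda server, old, new: events.append((server, old, new)))
fsm.signal("orders", "healthy")   # baseline, no notification
fsm.signal("orders", "healthy")   # unchanged, no notification
fsm.signal("orders", "abnormal")  # transition, notification sent
print(events)  # [('orders', 'healthy', 'abnormal')]
```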
Optionally, in the server health monitoring method according to the present invention, the check server is an Agent proxy server.
According to an aspect of the present invention, there is provided a server health monitoring system, comprising: one or more check servers, each of which is connected to a server to be monitored; and a health monitoring server, which is communicatively connected to the one or more check servers and is adapted to execute the method described above so as to perform health monitoring on the servers to be monitored.
Optionally, in the server health monitoring system according to the present invention, further comprising: and the state machine module is connected with the health monitoring server, is suitable for receiving a signal which is sent by the health monitoring server and corresponds to the health state of the server to be monitored, is suitable for judging whether the health state of the server to be monitored changes according to the signal which corresponds to the health state, and sends a corresponding server health state change notice to the client when the change is determined.
According to an aspect of the invention, there is provided a computing device comprising: at least one processor; a memory storing program instructions configured to be executed by the at least one processor, the program instructions comprising instructions for performing the server health monitoring method as described above.
According to one aspect of the present invention, there is provided a readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform a server health monitoring method as described above.
According to the technical solution of the present invention, the health monitoring server periodically sends health check requests to one or more check servers, so that each check server sends an HTTP request to the corresponding server to be monitored and obtains the request result it returns. The health monitoring server periodically obtains, from the memory queue of each check server, the request results returned by the server to be monitored at various time points, and by analyzing these results can determine the health status of the server to be monitored at each of those time points. In this way, the current health status of the server to be monitored can be accurately determined. In addition, whether the health status of the server to be monitored has changed can be determined, and when it has, a health status change notification for that server is sent to the user, so that the user can learn the current health status of the server to be monitored in real time.
Furthermore, different timing tasks are set for each server to be monitored, a thread is started for each server to be monitored to execute the timing tasks, and the health state of each server to be monitored is checked at regular time, so that the health state monitoring requirements of different servers to be monitored can be met.
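The per-server timing tasks can be sketched with one thread per server to be monitored, each running at its own interval. The function name `start_timing_task`, the server names, and the very short intervals are assumptions for illustration only; `fake_check` stands in for sending a health check request and collecting the results.

```python
import threading
import time
from typing import Callable, Dict


def start_timing_task(name: str, interval_s: float,
                      check: Callable[[str], None],
                      stop: threading.Event) -> threading.Thread:
    """Start one thread that runs check(name) every interval_s seconds."""
    def run() -> None:
        # Event.wait returns False on timeout, so the loop body runs once per
        # interval and the thread exits promptly once stop is set.
        while not stop.wait(interval_s):
            check(name)

    thread = threading.Thread(target=run, name=f"monitor-{name}", daemon=True)
    thread.start()
    return thread


counts: Dict[str, int] = {"orders": 0, "billing": 0}
lock = threading.Lock()


def fake_check(name: str) -> None:
    # Placeholder for "send health check request and fetch queued results".
    with lock:
        counts[name] += 1


stop = threading.Event()
threads = [
    start_timing_task("orders", 0.01, fake_check, stop),   # checked frequently
    start_timing_task("billing", 0.05, fake_check, stop),  # checked less often
]
time.sleep(0.3)
stop.set()
for t in threads:
    t.join()
print(counts)
```

Giving each server its own thread and interval is what allows the check period to be customized per server instead of being fixed for all of them.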
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which are indicative of various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, like reference numerals generally refer to like parts or elements.
FIG. 1 shows a schematic diagram of a server health monitoring system 100, in accordance with one embodiment of the present invention;
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention; and
FIG. 3 shows a flow diagram of a server health monitoring method 300 according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a schematic diagram of a server health monitoring system 100, in accordance with one embodiment of the present invention.
As shown in FIG. 1, the server health monitoring system 100 includes a client 110, a health monitoring server 120, a state machine module 125, and one or more check servers 130. Each of the check servers 130 may be communicatively connected to one server 150 to be monitored, and the health monitoring server 120 may be communicatively connected to the client 110 and to the one or more check servers 130, for example, through a wired or wireless network.
The health monitoring server 120 is used for managing and controlling the one or more checking servers 130 to check the health status of the servers 150 to be monitored at regular time, so as to realize the health monitoring of the one or more servers 150 to be monitored. The present invention is not limited to the specific type of the health monitoring server 120, for example, the health monitoring server 120 may be implemented as a computing device such as a desktop computer, a notebook computer, a processor chip, a mobile phone, a tablet computer, etc., but is not limited thereto, and may also be an application program residing on the computing device.
The client 110 is a terminal device used by a user, and may specifically be a personal computer such as a desktop computer and a notebook computer, or may also be a mobile phone, a tablet computer, a multimedia device, an intelligent wearable device, and the like, but is not limited thereto.
In the embodiment of the present invention, the health monitoring server 120 periodically sends health check requests to one or more checking servers 130, so that the checking servers 130 send requests to corresponding servers 150 to be monitored based on the health check requests, and obtain request results returned by the servers 150 to be monitored. After acquiring the request result returned by the server to be monitored 150 each time, the check server 130 stores the request result in the memory queue of the check server 130. In this way, the memory queue of the checking server 130 may include the request results returned by one or more servers 150 to be monitored connected thereto.
And, the health monitoring server 120 regularly obtains the request result in the memory queue of each checking server 130, and determines the health status of the server 150 to be monitored corresponding to the request result at the time point when the request result is returned by analyzing the request result.
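The interaction between the check server's memory queue and the health monitoring server can be sketched as below. This is a single-process illustration under stated assumptions: the class `CheckServer`, its methods `handle_health_check` and `drain`, and the result fields are hypothetical names, and the HTTP call is replaced by an injected `fetch` function so the sketch runs without a network.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

# A fetcher takes a URL and returns (status_code, response_body).
Fetcher = Callable[[str], Tuple[int, str]]


class CheckServer:
    """Stores each request result in an in-memory queue until it is collected."""

    def __init__(self, fetch: Fetcher) -> None:
        self._fetch = fetch
        self._queue: Deque[Dict] = deque()
        self._lock = threading.Lock()

    def handle_health_check(self, server: str, url: str) -> None:
        """Send the request to the server to be monitored and queue the result."""
        status, body = self._fetch(url)
        result = {"server": server, "status": status, "body": body,
                  "returned_at": time.time()}
        with self._lock:
            self._queue.append(result)

    def drain(self) -> List[Dict]:
        """Return all queued results and empty the queue."""
        with self._lock:
            results = list(self._queue)
            self._queue.clear()
        return results


check_server = CheckServer(fetch=lambda url: (200, "ok"))  # stand-in for a real HTTP call
check_server.handle_health_check("orders", "http://orders.example.internal/health")
check_server.handle_health_check("orders", "http://orders.example.internal/health")
collected = check_server.drain()
print(len(collected), collected[0]["status"])  # 2 200
```

Recording `returned_at` with each result is what lets the health monitoring server attribute a health status to the time point at which that result was returned.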
In one embodiment, the check server 130 may be implemented as an Agent proxy server. The user may register information for one or more check servers 130 and information for one or more servers to be monitored 150 at the client 110, and this information is then sent to the health monitoring server 120. Here, the check server information includes, for example, the name (Agent name), address, access key, and description of the check server. The access key is, for example, a Token key. The information of a server to be monitored includes, for example, one or more of a project name, a service name, a request protocol, a health check request address, request parameters, a request time interval, an allowed timeout time, a maximum number of retries, a check server name, and an alarm mailbox.
The health monitoring server 120, after receiving information of one or more checking servers registered by the user at the client 110 and information of one or more servers to be monitored, may bind the server 150 to be monitored with a corresponding one of the checking servers 130 according to the information of the server to be monitored (checking server name).
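Binding by check server name can be sketched as a simple lookup. The dictionary keys (`name`, `service_name`, `check_server_name`), the function name `bind_servers`, and the decision to reject an unknown check server name are all assumptions made for this illustration.

```python
from typing import Dict, List


def bind_servers(check_servers: List[Dict[str, str]],
                 monitored: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group servers to be monitored under the check server named in their record."""
    bindings: Dict[str, List[str]] = {cs["name"]: [] for cs in check_servers}
    for server in monitored:
        agent = server["check_server_name"]
        if agent not in bindings:
            # Assumed policy: refuse a registration that names an unknown check server.
            raise KeyError(f"unknown check server: {agent}")
        bindings[agent].append(server["service_name"])
    return bindings


check_servers = [{"name": "agent-room-a"}, {"name": "agent-external"}]
monitored = [
    {"service_name": "orders", "check_server_name": "agent-room-a"},
    {"service_name": "billing", "check_server_name": "agent-room-a"},
    {"service_name": "website", "check_server_name": "agent-external"},
]
bindings = bind_servers(check_servers, monitored)
print(bindings)
# {'agent-room-a': ['orders', 'billing'], 'agent-external': ['website']}
```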
In one embodiment, as shown in FIG. 1, the health monitoring server 120 includes a management module 121 and a monitoring module 122 connected to the client 110. The monitoring module 122 may establish a communication connection with one or more check servers 130, so as to periodically send health check requests to the one or more check servers 130 and periodically obtain the request results in the memory queue of each check server 130. The monitoring module 122 analyzes each request result to determine the health status of the corresponding server 150 to be monitored at the time point when the request result was returned.
The management module 121 may acquire information of one or more check servers registered by the user at the client 110 and manage the check server information. The monitoring module 122 may obtain information of one or more servers to be monitored registered by the user at the client 110, and store the registered server information to be monitored in the structured data storage device, so as to manage the server information to be monitored. Also, the monitoring module 122 may bind the server 150 to be monitored with a corresponding one of the check servers 130 according to the server information to be monitored (check server name).
In one embodiment, after acquiring the information of the one or more servers to be monitored registered at the client 110, the health monitoring server 120 may set a corresponding timing task for each server 150 to be monitored, and configure a thread for each server 150 to be monitored to execute the corresponding timing task.
In this way, the health monitoring server 120 starts a thread for each server 150 to be monitored in the memory to execute the timing task of the server 150 to be monitored, so as to send the health check request of the server 150 to be monitored to the check server 130 connected to the server 150 to be monitored at regular time, and obtain the request result returned by the server 150 to be monitored stored in the memory queue of the check server 130 at regular time, and analyze the health status of the corresponding server 150 to be monitored according to each request result.
In one embodiment, the health monitoring server 120, after determining the health status of the server to be monitored 150 (i.e., the health status of the server to be monitored 150 at the time point when the request result is returned) by analyzing the request result of the server to be monitored 150, also determines whether the health status of the server to be monitored 150 has changed, i.e., determines whether the health status of the server to be monitored 150 has changed from the health status at the time point when the last request result was returned. If a change occurs, a corresponding server health status change notification is sent to the client 110. For example, a server health status change notification may be sent to an alarm mailbox corresponding to the server 150 to be monitored, so as to inform the user of the current server health status.
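A notification addressed to the alarm mailbox could be composed as in the sketch below. The message format, the sender address, and the function name `build_change_notification` are assumptions for illustration; the sketch only builds the message and leaves delivery (for example over SMTP) out of scope.

```python
from email.message import EmailMessage


def build_change_notification(server: str, previous: str, current: str,
                              alarm_mailbox: str,
                              sender: str = "monitor@example.com") -> EmailMessage:
    """Compose, without sending, a health status change notification."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = alarm_mailbox
    msg["Subject"] = f"[health] {server}: {previous} -> {current}"
    msg.set_content(
        f"The health status of {server} changed from {previous} to {current}."
    )
    return msg


msg = build_change_notification("orders-service", "healthy", "abnormal",
                                "ops@example.com")
print(msg["Subject"])  # [health] orders-service: healthy -> abnormal
```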
In one embodiment, the monitoring module 122 of the health monitoring server 120 is connected to a State Machine module 125, and the State Machine module 125 may be implemented as, for example, a FSM (Finite State Machine). The monitoring module 122 of the health monitoring server 120 sends a signal (e.g., a health signal, a maintenance signal, an abnormal signal) corresponding to the health status of the server 150 to be monitored to the state machine module 125, and the state machine module 125 can determine whether the health status of the server 150 to be monitored changes according to the signal corresponding to the health status. The state machine module 125 may send a corresponding server health status change notification to the client when it is determined that the health status of the server 150 to be monitored changes.
In one embodiment, the check server 130 is adapted to send an HTTP request to the server to be monitored 150 connected thereto, and obtain the HTTP request result returned by the server to be monitored 150. After obtaining the HTTP request result from the memory queue of the check server 130, the health monitoring server 120 may obtain the status code and the response body from the HTTP request result and determine, according to the status code and the response body, the health status of the server 150 to be monitored corresponding to the request result. Here, the health status includes one or more of healthy, under maintenance, and abnormal.
In one embodiment, the health monitoring server 120 is configured to perform the server health monitoring method 300 of the present invention, the server health monitoring method 300 of the present invention being described in more detail below.
In one embodiment, the health monitoring server 120, the client 110, the checking server 130, and the server to be monitored 150 of the present invention may each be implemented as a computing device, such that the server health monitoring method 300 of the present invention may be executed in the computing device.
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention.
As shown in FIG. 2, in a basic configuration 202, a computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processor, including but not limited to: a microprocessor (µP), a microcontroller (µC), a digital signal processor (DSP), or any combination thereof. The processor 204 may include one or more levels of cache, such as a level one cache 210 and a level two cache 212, a processor core 214, and registers 216. Example processor cores 214 may include arithmetic logic units (ALUs), floating-point units (FPUs), digital signal processing cores (DSP cores), or any combination thereof. An example memory controller 218 may be used with the processor 204, or in some implementations the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, system memory 206 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 206 may include an operating system 220, one or more applications 222, and program data 224. The application 222 is actually a plurality of program instructions that direct the processor 204 to perform corresponding operations. In some embodiments, application 222 may be arranged to cause processor 204 to operate with program data 224 on an operating system.
Computing device 200 also includes storage device 232, storage device 232 including removable storage 236 and non-removable storage 238.
Computing device 200 may also include a storage interface bus 234. A storage interface bus 234 enables communication from storage devices 232 (e.g., removable storage 236 and non-removable storage 238) to basic configuration 202 via bus/interface controller 230. At least a portion of the operating system 220, applications 222, and data 224 may be stored on removable storage 236 and/or non-removable storage 238, and loaded into system memory 206 via storage interface bus 234 and executed by the one or more processors 204 when the computing device 200 is powered on or the applications 222 are to be executed.
Computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to the basic configuration 202 via the bus/interface controller 230. Example output devices 242 include an image processing unit 248 and an audio processing unit 250, which may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 252. Example peripheral interfaces 244 can include a serial interface controller 254 and a parallel interface controller 256, which can be configured to facilitate communications with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 258. An example communication device 246 may include a network controller 260, which may be arranged to facilitate communications with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
A network communication link may be one example of a communication medium. Communication media may typically be embodied as computer-readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal in which one or more of its characteristics are set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media, such as a wired network or a dedicated-line network, and various wireless media, such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer-readable media as used herein may include both storage media and communication media.
In a computing device 200 according to the present invention, an application 222 in the computing device includes a plurality of program instructions that implement the server health monitoring method 300, which can be read and executed by the processor 204 such that the computing device 200 implements monitoring the health status of a plurality of servers by executing the server health monitoring method 300 of the present invention.
FIG. 3 shows a flow diagram of a server health monitoring method 300 according to one embodiment of the invention. The method 300 is suitable for execution in a health monitoring server 120, such as the aforementioned computing device 200. The health monitoring server 120 is communicatively coupled to the client 110, the state machine module 125, and one or more check servers 130, respectively. Each check server 130 may be communicatively connected to one server 150 to be monitored.
As shown in FIG. 3, the method 300 includes steps S310-S320.
In step S310, health check requests are sent periodically to one or more check servers 130, so that each check server 130 sends a request to its corresponding server to be monitored 150 based on the health check request and obtains the request result returned by the server to be monitored 150. Each time it obtains a request result returned by the server to be monitored 150, the check server 130 stores the request result in its memory queue. In this way, the memory queue of the check server 130 may hold the request results returned by the one or more servers to be monitored 150 connected to it.
In one embodiment, the check server 130 may be implemented as an Agent proxy server. Each check server 130 may be communicatively coupled to a server to be monitored 150.
In one embodiment, a user may register information for one or more inspection servers 130 and one or more servers to be monitored 150 at client 110 prior to performing method 300. Health monitoring server 120 may receive information for one or more check servers (agents) that the user has registered with client 110, as well as information for one or more servers to be monitored.
Specifically, the check server information includes, for example, the name of the check server (Agent name), an address, an access key, a description, and the like. Here, the access key is, for example, a Token key. The information on a server to be monitored includes, for example, one or more of an item name, a service name, a request protocol, a health check request address, request parameters, a request time interval, an allowed timeout time, a maximum number of retries, a check server name (Agent name), and an alarm mailbox.
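By way of illustration only, the registration information listed above could be held in records such as the following. This is a minimal sketch, not the claimed implementation: the class names, field names, units (seconds), and sample values are all assumptions chosen to mirror the listed items.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class CheckServerInfo:
    """Registration record for one check server (Agent). Names are hypothetical."""
    name: str             # Agent name
    address: str
    access_key: str       # for example, a Token key
    description: str = ""


@dataclass(frozen=True)
class MonitoredServerInfo:
    """Registration record for one server to be monitored. Names are hypothetical."""
    item_name: str
    service_name: str
    request_protocol: str
    health_check_address: str
    request_interval_s: float     # request time interval, assumed to be in seconds
    allowed_timeout_s: float      # allowed timeout time, assumed to be in seconds
    max_retries: int
    check_server_name: str        # Agent name of the check server bound to this server
    alarm_mailbox: str
    request_parameters: Dict[str, str] = field(default_factory=dict)


# Sample values are invented for illustration.
agent = CheckServerInfo("agent-a", "agent-a.example.internal:9000", "token-123")
server = MonitoredServerInfo(
    item_name="shop",
    service_name="orders-api",
    request_protocol="http",
    health_check_address="orders.example.internal/health",
    request_interval_s=30.0,
    allowed_timeout_s=2.0,
    max_retries=2,
    check_server_name="agent-a",
    alarm_mailbox="ops@example.com",
)
```

The check server name stored with each server to be monitored is what later allows that server to be bound to its check server.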
In one embodiment, as shown in FIG. 1, health monitoring server 120 comprises a management module 121 and a monitoring module 122 connected to client 110, wherein monitoring module 122 is communicatively connected to one or more check servers 130 and is configured to periodically send health check requests to one or more check servers 130.
The management module 121 may acquire the information of one or more check servers (Agents) registered by the user at the client 110 and manage the check server information. The monitoring module 122 may obtain the information of one or more servers to be monitored registered by the user at the client 110 and store it in the structured data storage device, so as to manage the information on the servers to be monitored. In addition, the monitoring module 122 of the health monitoring server 120 may bind each server to be monitored 150 to its corresponding check server 130 according to the check server name in the information on that server to be monitored.
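The binding by check server name described above might be sketched as follows. The function name, the tuple layout, and the handling of servers whose check server name is not registered are assumptions; the description does not say what happens in that case.

```python
from typing import Dict, Iterable, List, Tuple


def bind_servers(
    check_server_names: Iterable[str],
    monitored: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group servers to be monitored under the check server (Agent) they name.

    `monitored` yields (service_name, check_server_name) pairs. Servers that
    name an unregistered check server are returned separately (an assumption).
    """
    bindings: Dict[str, List[str]] = {name: [] for name in check_server_names}
    unbound: List[str] = []
    for service_name, agent_name in monitored:
        if agent_name in bindings:
            bindings[agent_name].append(service_name)
        else:
            unbound.append(service_name)
    return bindings, unbound


bindings, unbound = bind_servers(
    ["agent-a", "agent-b"],
    [("orders-api", "agent-a"), ("billing-api", "agent-b"), ("search-api", "agent-x")],
)
```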
In step S320, the request results in the memory queue of each check server 130 are obtained periodically, and each request result is analyzed to determine the health status of the corresponding server to be monitored 150 at the time point when that request result was returned.
According to an embodiment of the present invention, after acquiring the information of the one or more servers to be monitored registered at the client 110, the health monitoring server 120 may set a corresponding timing task for each server to be monitored 150, and configure a thread for each server to be monitored 150 to execute the corresponding timing task.
In this way, when steps S310 and S320 are executed, the health monitoring server 120 starts a thread in memory for each server to be monitored 150 to execute the timing task of that server, so as to periodically send the health check request for the server to be monitored 150 to the check server 130 connected to it, and to periodically obtain the request results returned by the server to be monitored 150 that are stored in the memory queue of the check server 130. Here, in one implementation, after the request results in the memory queue of the check server 130 are obtained periodically, they may be stored in the structured data storage device; the request results in the structured data storage device are then traversed, and each one is analyzed to determine the health status of the corresponding server to be monitored 150 at the time point when that request result was returned.
It should be noted that the present invention can satisfy the health status monitoring requirements of different servers to be monitored by setting different timing tasks for each server 150 to be monitored, and starting a thread for each server to be monitored to execute the timing task to periodically check the health status of each server to be monitored.
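One possible way to realize a separate timing task and thread per server to be monitored is sketched below. The class name `PeriodicTask`, the use of `threading.Event` for stopping, and the intervals are assumptions for illustration; the description does not prescribe a particular scheduling mechanism.

```python
import threading
import time
from typing import Callable


class PeriodicTask:
    """Runs `action(name)` every `interval_s` seconds on its own thread."""

    def __init__(self, name: str, interval_s: float, action: Callable[[str], None]) -> None:
        self.name = name
        self.interval_s = interval_s
        self.action = action
        self._stop_event = threading.Event()
        self._thread = threading.Thread(
            target=self._run, name=f"health-{name}", daemon=True
        )

    def _run(self) -> None:
        # Event.wait() returns False when the interval elapses (run the task)
        # and True as soon as stop() has been called (leave the loop).
        while not self._stop_event.wait(self.interval_s):
            self.action(self.name)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop_event.set()
        self._thread.join()

    def is_running(self) -> bool:
        return self._thread.is_alive()


if __name__ == "__main__":
    # Two servers with different intervals, each checked by its own thread.
    tasks = [
        PeriodicTask("orders-api", 0.05, lambda n: print("check", n)),
        PeriodicTask("billing-api", 0.10, lambda n: print("check", n)),
    ]
    for task in tasks:
        task.start()
    time.sleep(0.3)
    for task in tasks:
        task.stop()
```

In a fuller sketch, `action` would send the health check request for that server to its check server and then fetch the queued request results. Giving each server its own interval is what lets servers with different monitoring requirements be checked at different rates.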
It should be noted that the health check request for a server to be monitored, which the health monitoring server 120 periodically sends to the check server 130, includes the access key of the check server 130 and the health check parameters. Here, the access key is, for example, a Token key. In one embodiment, the health check parameters include the request protocol, request address, request parameters, allowed timeout time, maximum number of retries, etc. of the server to be monitored. Here, the allowed timeout time is the response timeout permitted when the check server 130 sends a request to the corresponding server to be monitored 150, and the maximum number of retries is the maximum number of times the check server 130 resends a request after failing to send a request to the corresponding server to be monitored 150.
In one embodiment, after receiving the health check request periodically sent by the health monitoring server 120, the check server 130 sends a request to the corresponding server to be monitored based on the health check request as follows: it obtains the health check parameters from the health check request, constructs an HTTP request based on the health check parameters, and sends the HTTP request to the corresponding server to be monitored 150. The check server 130 may then obtain the HTTP request result returned by the server to be monitored 150.
In an embodiment, when the health status of the corresponding server to be monitored 150 is determined by analyzing the request result, the status code and the response body may be obtained from the HTTP request result, and the health status of the server to be monitored 150 corresponding to the request result at the time point when the request result was returned is determined according to the status code and the response body. Here, the health status includes one or more of healthy, in maintenance, and abnormal. That is, it can be determined from the status code and the response body in the HTTP request result whether the server to be monitored 150 was in a healthy, maintenance, or abnormal state at the time point when the HTTP request result was returned to the check server 130.
It should be noted that different health states correspond to different status codes and response bodies. For example, the healthy state corresponds to a status code that is greater than or equal to 200 and less than 400. The maintenance state corresponds to the custom status code 503 together with a response body that includes a "maintaining" field. Based on this, if the status code in the request result is less than 200, or is greater than or equal to 400 and not equal to 503, it may be determined that the server to be monitored corresponding to the request result is in an abnormal state.
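The check server behavior described above (validate the access key, build the request from the health check parameters, apply the allowed timeout and maximum number of retries, and queue the result) might look roughly like the following sketch. The class, the dictionary keys, and the injected `send` callable are all hypothetical; `send` stands in for a real HTTP client so that the sketch stays self-contained, and recording a failed result after all retries are exhausted is an assumption.

```python
import queue
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical transport: (url, request_params, timeout_s) -> (status_code, body).
Sender = Callable[[str, Dict[str, str], float], Tuple[int, str]]


class CheckServer:
    def __init__(self, access_key: str, send: Sender) -> None:
        self._access_key = access_key
        self._send = send
        self._results: "queue.Queue[dict]" = queue.Queue()  # the memory queue

    def handle_health_check(self, request: dict) -> bool:
        """Process one health check request; return False if the key is wrong."""
        if request.get("access_key") != self._access_key:
            return False
        params = request["params"]
        url = f'{params["protocol"]}://{params["address"]}'
        result = {"server": params["address"], "status_code": None, "body": "", "error": None}
        # One initial attempt plus up to max_retries resends after a failure.
        for _ in range(1 + params["max_retries"]):
            try:
                status, body = self._send(
                    url, params.get("request_params", {}), params["timeout_s"]
                )
                result.update(status_code=status, body=body, error=None)
                break
            except OSError as exc:  # covers TimeoutError and ConnectionError
                result["error"] = str(exc)
        result["returned_at"] = time.time()
        self._results.put(result)
        return True

    def drain(self) -> List[dict]:
        """Return and remove every queued request result."""
        items: List[dict] = []
        while True:
            try:
                items.append(self._results.get_nowait())
            except queue.Empty:
                return items
```

The health monitoring server would call something like `drain()` on each timing tick to collect the request results accumulated since the previous tick.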
It should be noted that the present invention is not limited to the above-mentioned classification of health status and the corresponding status code, and the type of health status of the server and the corresponding status code can be defined and adjusted by those skilled in the art according to the actual situation.
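The example mapping from status code and response body to health status can be written as a small function. This is a sketch under stated assumptions: the "maintaining" field is detected by a plain substring test, a 503 response without that field is treated as abnormal, and a missing status code (no response at all) is treated as abnormal. None of these three choices is specified in the description.

```python
from enum import Enum
from typing import Optional


class Health(Enum):
    HEALTHY = "healthy"
    MAINTENANCE = "maintenance"
    ABNORMAL = "abnormal"


def classify(status_code: Optional[int], body: str) -> Health:
    """Map one request result to a health status using the example thresholds."""
    if status_code is not None and 200 <= status_code < 400:
        return Health.HEALTHY
    # Assumption: the "maintaining" field is found by substring match.
    if status_code == 503 and "maintaining" in body:
        return Health.MAINTENANCE
    # Everything else: below 200, 400 and above, or no response at all.
    return Health.ABNORMAL
```

Because the classification is isolated in one function, the categories and thresholds can be adjusted without touching the polling or notification logic.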
According to an embodiment of the present invention, after determining the health status of the server to be monitored 150 by analyzing its request result (i.e., the health status of the server to be monitored 150 at the time point when the request result was returned), the health monitoring server 120 also determines whether the health status of the server to be monitored 150 has changed, i.e., whether it differs from the health status at the time point when the previous request result was returned. If a change has occurred, a corresponding server health status change notification is sent to the client 110. For example, the server health status change notification may be sent to the alarm mailbox corresponding to the server to be monitored 150, so as to inform the user of the current health status of the server.
In one embodiment, health monitoring server 120 is coupled to State Machine module 125, and State Machine module 125 may be implemented as a FSM (Finite State Machine). The health monitoring server 120 may determine whether the health status of the server 150 to be monitored changes by:
the signals (e.g., health signals, maintenance signals, abnormal signals) corresponding to the health status of the server 150 to be monitored are sent to the state machine module 125, so that the state machine module 125 determines whether the health status of the server 150 to be monitored changes according to the signals corresponding to the health status. When determining that the health status of the server 150 to be monitored changes, the state machine module 125 sends a corresponding server health status change notification to the client, so that the user can know the current health status of the server 150 to be monitored in real time.
Specifically, by respectively creating a state machine (FSM instance) for each server 150 to be monitored in the state machine module 125, after the health monitoring server 120 sends a signal (e.g., a health signal, a maintenance signal, an abnormal signal) corresponding to the health state of any server 150 to be monitored to the state machine module 125, the state machine corresponding to the server 150 to be monitored is used to determine whether the health state of the server 150 to be monitored changes.
For example, if the current state of the state machine is "healthy", the current health status of the server to be monitored 150 corresponding to the state machine is healthy. If the state machine next receives an abnormal signal, it enters the abnormal state, determines that the health status of the server to be monitored 150 has changed, and sends a "server abnormal" notification to the client 110. If the current state of the state machine is "healthy" and the state machine next receives a maintenance signal, it enters the maintenance state, determines that the health status of the server to be monitored 150 has changed, and sends a "server maintenance" notification to the client 110. In addition, if the current state of the state machine is "healthy" and the state machine next receives a health signal, the health signal is a self-transition signal for the state machine: the state of the state machine does not change, and it is determined that the health status of the server to be monitored 150 has not changed.
According to the server health monitoring method 300, health check requests are sent periodically to one or more check servers, so that the check servers send HTTP requests to the corresponding servers to be monitored and obtain the request results they return. The request results returned by the servers to be monitored at various time points are obtained periodically from the memory queues of the check servers, and the health status of each server to be monitored at different time points can be determined by analyzing those request results. Therefore, the current health status of the server to be monitored can be accurately analyzed. Furthermore, whether the health status of a server to be monitored has changed can be determined, and when it has changed, a health status change notification for that server is sent to the user, so that the user can know the current health status of the server to be monitored in real time.
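The per-server state machine and its change notifications could be sketched as below. The class names, the string signal values, the notifier signature, and the choice of "healthy" as the initial state of a newly created state machine are assumptions made for illustration.

```python
from typing import Callable, Dict

STATES = ("healthy", "maintenance", "abnormal")
# Hypothetical notifier: (server, previous_state, new_state) -> None.
Notifier = Callable[[str, str, str], None]


class HealthStateMachine:
    """One FSM instance per server to be monitored."""

    def __init__(self, server: str, notify: Notifier, initial: str = "healthy") -> None:
        self.server = server
        self.state = initial  # assumption: a new machine starts as healthy
        self._notify = notify

    def signal(self, signal: str) -> bool:
        """Apply a signal; return True only if the state changed."""
        if signal not in STATES:
            raise ValueError(f"unknown signal: {signal!r}")
        if signal == self.state:
            return False  # self-transition: no change, no notification
        previous, self.state = self.state, signal
        self._notify(self.server, previous, signal)
        return True


class StateMachineModule:
    """Creates a state machine per server on first use and routes signals to it."""

    def __init__(self, notify: Notifier) -> None:
        self._notify = notify
        self._machines: Dict[str, HealthStateMachine] = {}

    def signal(self, server: str, signal: str) -> bool:
        machine = self._machines.get(server)
        if machine is None:
            machine = HealthStateMachine(server, self._notify)
            self._machines[server] = machine
        return machine.signal(signal)
```

With this shape, a notification is produced only on an actual transition, so repeated identical signals from successive health checks do not generate repeated alerts.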
Furthermore, different timing tasks are set for each server to be monitored, a thread is started for each server to be monitored to execute the timing tasks, and the health state of each server to be monitored is checked at regular time, so that the health state monitoring requirements of different servers to be monitored can be met.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard drives, USB flash drives, floppy disks, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code; the processor is configured to perform the server health monitoring method of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media may comprise readable storage media and communication media. Readable storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with examples of this invention. The required structure for constructing such a system is apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components in the embodiments may be combined into one module or unit or component, and furthermore, may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments.
Additionally, some of the embodiments are described herein as a method or combination of method elements that can be implemented by a processor of a computer system or by other means of performing the described functions. A processor with the necessary instructions for carrying out the method or the method elements thus forms a device for carrying out the method or the method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the means for performing the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the present invention is intended to be illustrative and not restrictive of the scope of the invention, which is defined by the appended claims.
Claims (11)
1. A server health monitoring method, implemented in a health monitoring server, said health monitoring server being connected to a plurality of check servers, each check server being connected to a respective server to be monitored, said method comprising the steps of:
respectively setting a timing task for each server to be monitored, and respectively starting a thread for each server to be monitored to execute the corresponding timing task;
periodically sending health check requests to the plurality of check servers based on the timing tasks, so that each check server sends a request to the corresponding server to be monitored based on the health check request, obtains the request result returned by the server to be monitored, and stores the request result in a memory queue of the check server; and
periodically obtaining, based on the timing tasks, the request results in the memory queue of each check server, obtaining a status code and a response body from each request result, and analyzing the request result according to the status code and the response body so as to determine the health status of the corresponding server to be monitored at the time point when the request result was returned.
2. The method as claimed in claim 1, wherein the step of the checking server sending a request to the corresponding server to be monitored based on the health check request and obtaining the request result returned by the server to be monitored comprises:
the checking server acquires health checking parameters from the health checking request, constructs an HTTP request based on the health checking parameters and sends the HTTP request to a corresponding server to be monitored;
and acquiring an HTTP request result returned by the server to be monitored.
3. The method of claim 2, wherein,
the health check parameters comprise one or more of request protocol, request address, request parameters, allowed timeout time and maximum retry number of the server to be monitored.
4. The method of any one of claims 1-3, wherein the health status comprises one or more of healthy, in maintenance, and abnormal.
5. The method according to any of claims 1-4, further comprising, after determining the health status of the server to be monitored at the point in time when the request result is returned, the steps of:
and judging whether the health state of the server to be monitored changes or not, and if so, sending a corresponding server health state change notification to the client.
6. The method of claim 5, wherein the health monitoring server is coupled to a state machine module, and wherein determining whether the health status of the server to be monitored has changed comprises:
and sending the signal corresponding to the health state of the server to be monitored to a state machine module so that the state machine module judges whether the health state of the server to be monitored changes or not according to the signal corresponding to the health state, and sending a corresponding server health state change notification to the client when the change is determined.
7. The method of any one of claims 1-6, wherein the check server is an Agent proxy server.
8. A server health monitoring system, comprising:
a plurality of check servers, each check server being connected to a respective server to be monitored; and
a health monitoring server, communicatively connected to the plurality of check servers, adapted to perform the method according to any of claims 1-7 for health monitoring of the servers to be monitored.
9. The system of claim 8, further comprising:
and the state machine module is connected with the health monitoring server, is suitable for receiving a signal which is sent by the health monitoring server and corresponds to the health state of the server to be monitored, is suitable for judging whether the health state of the server to be monitored changes according to the signal which corresponds to the health state, and sends a corresponding server health state change notice to the client when the change is determined.
10. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor, the program instructions comprising instructions for performing the server health monitoring method of any of claims 1-7.
11. A readable storage medium storing program instructions that, when read and executed by a computing device, cause the computing device to perform the server health monitoring method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210731202.8A CN115190047B (en) | 2022-02-10 | 2022-02-10 | Method, system and computing device for monitoring server health |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210123500.9A CN114172829B (en) | 2022-02-10 | 2022-02-10 | Server health monitoring method and system and computing equipment |
CN202210731202.8A CN115190047B (en) | 2022-02-10 | 2022-02-10 | Method, system and computing device for monitoring server health |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210123500.9A Division CN114172829B (en) | 2022-02-10 | 2022-02-10 | Server health monitoring method and system and computing equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115190047A true CN115190047A (en) | 2022-10-14 |
CN115190047B CN115190047B (en) | 2023-07-07 |
Family
ID=80489591
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210123500.9A Active CN114172829B (en) | 2022-02-10 | 2022-02-10 | Server health monitoring method and system and computing equipment |
CN202210731202.8A Active CN115190047B (en) | 2022-02-10 | 2022-02-10 | Method, system and computing device for monitoring server health |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210123500.9A Active CN114172829B (en) | 2022-02-10 | 2022-02-10 | Server health monitoring method and system and computing equipment |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN114172829B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114826982B (en) * | 2022-04-08 | 2023-08-18 | 浙江大学 | Self-adaptive heartbeat packet adjusting method in micro-service scene |
CN114938377B (en) * | 2022-04-20 | 2024-05-17 | 京东科技信息技术有限公司 | Back-end server management method and device, readable medium and electronic equipment |
CN114785861B (en) * | 2022-06-22 | 2022-12-13 | 飞狐信息技术(天津)有限公司 | Service request forwarding system, method, computer equipment and storage medium |
CN115665009B (en) * | 2022-12-29 | 2023-05-09 | 鹏城实验室 | DNS root server state monitoring method and device, electronic equipment and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017000296A1 (en) * | 2015-07-02 | 2017-01-05 | 深圳市谷玛鹤健康科技有限公司 | Temperature measurement device, central monitoring station, and temperature monitoring system and method therefor |
CN107070742A (en) * | 2017-03-14 | 2017-08-18 | 北京三快在线科技有限公司 | Service server health status inspection method and system |
CN108199914A (en) * | 2017-12-27 | 2018-06-22 | 杭州迪普科技股份有限公司 | Server-side condition detection method and device |
CN108228452A (en) * | 2017-12-28 | 2018-06-29 | 微梦创科网络科技(中国)有限公司 | A kind of test method and test device based on simple factory mode |
US20190089618A1 (en) * | 2017-09-18 | 2019-03-21 | Citrix Systems, Inc. | Extensible, Decentralized Health Checking of Cloud Service Components and Capabilities |
CN110290019A (en) * | 2019-05-27 | 2019-09-27 | 网宿科技股份有限公司 | Monitoring method and system |
JP2021060221A (en) * | 2019-10-03 | 2021-04-15 | サイマックス株式会社 | Health monitoring system, health monitoring method and health monitoring program |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101383735A (en) * | 2008-10-15 | 2009-03-11 | 阿里巴巴集团控股有限公司 | Server checking method, equipment and system |
CN102375772B (en) * | 2011-09-27 | 2015-05-06 | 云智慧(北京)科技有限公司 | Server monitoring method and device |
CN107241240B (en) * | 2017-06-30 | 2020-04-03 | 广州君海网络科技有限公司 | Game server state monitoring method, device and system |
CN112882895B (en) * | 2021-02-22 | 2024-06-21 | 中国工商银行股份有限公司 | Health check method, device, computer system and readable storage medium |
CN112882901B (en) * | 2021-03-04 | 2024-06-18 | 中国航空工业集团公司西安航空计算技术研究所 | Intelligent health state monitor of distributed processing system |
CN113032223B (en) * | 2021-04-20 | 2023-04-11 | 上海哔哩哔哩科技有限公司 | Server state detection method and device |
Non-Patent Citations (1)
Title |
---|
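林国; 刘学锋; 毛建华; 张亚平: "Design and Implementation of a Remote Health Monitoring System Client Supported by Sensor Web", 电子测量技术 (Electronic Measurement Technology), no. 13 * |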
Also Published As
Publication number | Publication date |
---|---|
CN115190047B (en) | 2023-07-07 |
CN114172829B (en) | 2022-08-12 |
CN114172829A (en) | 2022-03-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114172829B (en) | Server health monitoring method and system and computing equipment | |
CN107995030B (en) | Network detection method, network fault detection method and system | |
CN107835098B (en) | Network fault detection method and system | |
CN109040209A (en) | Intercept method, apparatus, computer equipment and the storage medium of repetitive requests | |
US20200314093A1 (en) | System and method for selectively hibernating and restarting a node of an application instance | |
CN113259428A (en) | Data access request processing method and device, computer equipment and medium | |
US9448827B1 (en) | Stub domain for request servicing | |
CN109039803A (en) | A kind of method, system and the computer equipment of processing readjustment notification message | |
CN103618590A (en) | Overtime control method and device of business processing process | |
CN115604144B (en) | Test method and device, electronic equipment and storage medium | |
US20030055951A1 (en) | Products, apparatus and methods for handling computer software/hardware messages | |
CN110795343A (en) | Test system, test method and computing device | |
CN107783844A (en) | A kind of computer program operation exception detection method, device and medium | |
CN114090623A (en) | Method and device for creating cache resources, electronic equipment and storage medium | |
CN110912990B (en) | Method and related equipment for updating consensus period | |
CN104468594A (en) | Data request method, device and system | |
CN112465599B (en) | Order processing method, order processing system and computing equipment | |
US20160309005A1 (en) | Method of automatically setting protocol in programmable logic controller system | |
CN109165147A (en) | Log print control program, device, system, back-end server and headend equipment | |
CN111209333B (en) | Data updating method, device, terminal and storage medium | |
CN111447273A (en) | Cloud processing system and data processing method based on cloud processing system | |
CN110825592A (en) | Method and computing device for generating alarm content | |
CN112148783A (en) | Data exchange method, device and equipment | |
CN114880194B (en) | Service abnormity monitoring method and device, electronic equipment and computer storage medium | |
CN114500327B (en) | Detection method and detection device for server cluster and computing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||