CN112131073A - Server monitoring method and system - Google Patents

Publication number
CN112131073A
Authority
CN
China
Prior art keywords
alarm
monitoring data
monitoring
module
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010860648.1A
Other languages
Chinese (zh)
Inventor
马涛
邱春武
李国平
李其轩
陈艺超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sina Technology China Co Ltd
Original Assignee
Sina Technology China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sina Technology China Co Ltd
Priority to CN202010860648.1A
Publication of CN112131073A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables


Abstract

The embodiment of the invention provides a monitoring method and a system of a server, wherein the method comprises the following steps: the monitoring data acquisition module acquires monitoring data of a server to be monitored and reports the monitoring data to the proxy server; the proxy server distributes the monitoring data to a graph storage module and an alarm module; the graph storage module stores the monitoring data into a database; the alarm module acquires an alarm rule from the database, determines whether to alarm or not according to the alarm rule and the monitoring data, generates an alarm event if the alarm is determined, and writes the alarm event into a storage queue module; and the sending module acquires the alarm event from the storage queue module and sends the alarm event. The method can ensure high availability and faster query and operation efficiency, and reduce the complexity of operation.

Description

Server monitoring method and system
Technical Field
The invention relates to Content Delivery Network (CDN) monitoring in cloud computing, in particular to a monitoring method and a monitoring system for a server.
Background
In the prior-art tool zabbix, the mysql database accumulates too much data after a period of use, so that queries become very slow and operation is cumbersome. In the prior art, monitoring is performed by having an agent report the collected data to a central mysql database for storage; the front-end page then queries the central mysql database for the collected data and plots it.
In the process of implementing the invention, the inventor found that at least the following problems exist in the prior art: as the number of collected metrics gradually increases, mysql query efficiency becomes very low, so that manual operations are abnormally slow; in addition, system-level alarms, alarm mails and custom item alarms must each be configured by the user, which makes the process cumbersome.
Disclosure of Invention
The embodiment of the invention provides a monitoring method and a monitoring system for a server, which are used for improving the database query efficiency or reducing the complexity of operation.
In a first aspect, an embodiment of the present invention provides a server monitoring method, which includes:
the monitoring data acquisition module acquires monitoring data of a server to be monitored and reports the monitoring data to the proxy server;
the proxy server distributes the monitoring data to a graph storage module and an alarm module;
the graph storage module stores the monitoring data into a database;
the alarm module acquires an alarm rule from the database, determines whether to alarm or not according to the alarm rule and the monitoring data, generates an alarm event if the alarm is determined, and writes the alarm event into a storage queue module;
and the sending module acquires the alarm event from the storage queue module and sends the alarm event.
In a second aspect, an embodiment of the present invention provides a monitoring system for a server, including:
the monitoring data acquisition module is used for acquiring monitoring data of a server to be monitored and reporting the monitoring data to the proxy server;
the proxy server is used for distributing the monitoring data to a graph storage module and an alarm module;
the graph storage module is used for storing the monitoring data into a database;
the alarm module is used for acquiring an alarm rule from the database, determining whether to alarm or not according to the alarm rule and the monitoring data, generating an alarm event if the alarm is determined, and writing the alarm event into a storage queue module;
and the sending module is used for acquiring the alarm event from the storage queue module and sending the alarm event.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the monitoring method for the server.
The technical scheme has the following beneficial effects: in the embodiment of the invention, the monitoring data acquisition module acquires the monitoring data of the server to be monitored and reports the monitoring data to the proxy server; the proxy server distributes the monitoring data to the graph storage module and the alarm module; the graph storage module stores the monitoring data into a database; the alarm module acquires an alarm rule from the database, determines whether to alarm according to the alarm rule and the monitoring data, generates an alarm event if an alarm is determined, and writes the alarm event into the storage queue module; and the sending module acquires the alarm event from the storage queue module and sends the alarm event. As a result, the logic is simpler, data storage is changed from a conventional database to an rrd time-series database, and each link can be deployed in a distributed manner, which ensures high availability and faster query and operation efficiency and reduces operational complexity.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1A is a functional block diagram of a monitoring system of a server of an embodiment of the present invention;
FIG. 1B is another functional block diagram of a monitoring system of a server in accordance with an embodiment of the present invention;
FIG. 2 is an example architecture diagram of a monitoring system and its monitoring method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a monitoring method of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention adopts a new technical approach and a new scheme design, so that the logic is simpler, data storage is changed from a conventional database to an rrd time-series database, and each link can be deployed in a distributed manner, thereby ensuring high availability and faster query and operation efficiency. Here, rrd (Round Robin Database) is a database that recycles its storage space and is suitable for storing time-series data.
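By way of non-limiting illustration, the idea of a database that recycles its storage space can be sketched as a fixed-capacity ring buffer. This is only a sketch of the principle and not the rrd file format or the storage used by the embodiment; the class name RoundRobinSeries and its methods are hypothetical.

```python
from typing import List, Optional, Tuple


class RoundRobinSeries:
    """Fixed-capacity time series; the oldest slot is overwritten when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots: List[Optional[Tuple[int, float]]] = [None] * capacity
        self._next = 0    # index of the slot that will be written next
        self._count = 0   # number of slots currently holding data

    def append(self, timestamp: int, value: float) -> None:
        """Store one point, reusing the oldest slot once capacity is reached."""
        self._slots[self._next] = (timestamp, value)
        self._next = (self._next + 1) % len(self._slots)
        self._count = min(self._count + 1, len(self._slots))

    def points(self) -> List[Optional[Tuple[int, float]]]:
        """Return the stored points ordered from oldest to newest."""
        capacity = len(self._slots)
        start = (self._next - self._count) % capacity
        return [self._slots[(start + i) % capacity] for i in range(self._count)]


series = RoundRobinSeries(capacity=3)
for i in range(5):
    series.append(i * 60, float(i))   # five writes into three slots
```

After the five writes only the three most recent points remain, so the storage footprint stays constant no matter how long collection runs.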
Fig. 1A is a functional block diagram of a monitoring system of a server according to an embodiment of the present invention. As shown in fig. 1A, the monitoring system of the server includes:
the monitoring data acquisition module 10, used for acquiring monitoring data of a server to be monitored and reporting the monitoring data to the proxy server 20; the monitoring data includes, but is not limited to, the cpu utilization of the server, system load, hard disk io, and the like;
the proxy server 20, used for distributing the monitoring data to the graph storage module 30 and the alarm module 40;
the graph storage module 30, used for storing the monitoring data into a database;
the alarm module 40, used for acquiring an alarm rule from the database 50, determining whether to alarm according to the alarm rule and the monitoring data, generating an alarm event if an alarm is determined, and writing the alarm event into the storage queue module 60;
and the sending module 70, used for acquiring the alarm event from the storage queue module 60 and sending the alarm event.
In some embodiments, the monitoring data acquisition module 10 may be specifically configured to:
the monitoring data acquisition module 10 actively reads basic information of a server to be monitored as monitoring data, and sends the basic information to the proxy server 20 through a hypertext transfer protocol (http) request; or,
the monitoring data acquisition module 10 periodically requests the application program interface API 80 to obtain an acquisition rule, obtains from the acquisition rule the acquisition script to be executed, executes the acquisition script periodically according to the acquisition rule, collects the information printed by the script, and reports the collected information as monitoring data to the proxy server 20; or,
the monitoring data acquisition module 10 receives, through its listening port, monitoring data reported periodically by other programs, and then sends a request carrying the monitoring data to the proxy server 20.
In some embodiments, the proxy server 20 may be specifically configured to: take the combination of the monitoring device unique mark, the monitoring item name and the monitoring item mark (device-metric-tags) as the key and the collected value of the monitoring item as the value, and forward the monitoring data in this key-value structure to a plurality of graph storage modules 30 according to a hash rule. The device mark uniquely identifies a device to be monitored. The collected value of a monitoring item is the currently collected data; for example, if the current cpu idle rate is 90%, the reported value is 90%. Because the data is stored in a key-value (k/v) structure under the hash rule, a value can be found quickly from its key, so queries are fast.
In some embodiments, the fields of the monitoring data include: monitoring item name, monitoring item mark, monitoring item collection period, monitoring device unique mark, monitoring item collected value and collection time. Of these, the collection time (timestamp) field and the monitoring item collection period (step) field are used in alarming and plotting.
The fields of the alarm rule include: monitoring item name, monitoring item mark, threshold, threshold times and contact information of the alarm recipients to be notified;
the alarm event comprises all fields of the monitoring data and all fields of the alarm rule;
the alarm module 40 may also be used to: write the alarm event into the database 50 for display by the front-end module; and/or, when the alarm module 40 determines from the reported monitoring data that the monitoring item corresponding to the alarm event has recovered, write a recovery alarm event and then update the alarm event state in the database to recovered.
Fig. 1B is another functional block diagram of a monitoring system of a server according to an embodiment of the present invention. As shown in fig. 1B, in some embodiments, the monitoring system may further include: the front-end module 90, used for calling the application program interface API 80 to acquire monitoring data from the graph storage module 30 and generate a visual graph according to the monitoring data, and for calling the application program interface API 80 to configure acquisition rules and alarm rules in the database 50. The server to be monitored includes a content delivery network (CDN) server, an lvs (Linux Virtual Server) server, and the like; other linux servers may also be monitored, for example, as long as the corresponding collection scripts are provided. The database 50 includes, but is not limited to, mysql.
The technical solution of the embodiments of the present invention is described in more detail below by way of examples:
Fig. 2 is an example architecture diagram of a monitoring system and its monitoring method according to an embodiment of the present invention. As shown in fig. 2, the technical solution of the embodiment of the present invention comprises five parts: data acquisition, data reporting, data warehousing, alarm processing and front-end display, which are described in detail below.
(1) Data acquisition
The client agent is responsible for collecting data and reporting the collected data to the proxy server (proxy). It obtains collection strategies by requesting an interface of the proxy. The collection strategies (or collection rules) are manually configured and stored in the database; each collection strategy defines what information the client needs to collect, such as cpu information, system load and memory usage rate, and what script or command should be executed. Data can be collected and reported in the following modes.
Acquisition mode 1: the client agent actively reads cdn device information, such as basic information on the cpu, network card and load, and sends the information to the proxy through an http request.
Acquisition mode 2: at regular intervals, the client agent obtains the collection strategy stored at the center through an api (Application Programming Interface) of the proxy server (that is, the proxy server reads the database to obtain the configured collection strategy), and from it obtains the collection script to be executed. The agent executes the collection script periodically according to the obtained collection strategy, collects the information printed by the script, and reports the collected information to the proxy. Printed information here means the content that the script outputs to the screen. The agent connects to the proxy, which helps keep the network path smooth. In some embodiments, the api is part of the proxy program. Optionally, the stored collection strategy may be a script and a command that specifically define how to collect and what to collect.
Acquisition mode 3: other programs periodically report data to the port on which the agent listens, and the agent forwards the reported data to the proxy through an http post request. The other programs may be any other programs on the linux system; such a program obtains its own information and reports it directly, through an http post request, to the port on which the agent listens. The http post request carries a number of fields, as in the following example.
As an example, the data collected is as follows:
metric: cpu, representing the monitoring item name;
tags: idle, representing the monitoring item mark; the mark can be defined arbitrarily and serves to subdivide a monitoring item; for example, cpu has several values such as idle and used;
step: 60, representing the monitoring item collection period;
device: 192.168.1.1, representing the monitoring device unique mark;
value: 99, representing the monitoring item collected value;
time: 1593748649, a unix timestamp representing the collection time. The time is used later in alarm events and in plotting: an alarm needs to record the point in time at which the alarm event occurred, and a graph is plotted with time on the x-axis and the collected value on the y-axis.
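By way of non-limiting illustration, acquisition mode 2 and the example fields above can be sketched as follows: a collection script is executed, its printed output is captured, and a data point with the six fields is assembled. The function names, the use of a command that prints a single number, and the JSON encoding of the request body are assumptions made for the sketch; the source does not specify the wire format of the http request.

```python
import json
import subprocess
import sys
import time
from typing import Dict, List, Union

DataPoint = Dict[str, Union[str, int, float]]


def run_collection_script(command: List[str], timeout_s: float = 10.0) -> str:
    """Run a collection script and return what it printed to standard output."""
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_s, check=True
    )
    return result.stdout.strip()


def build_data_point(metric: str, tags: str, step: int, device: str,
                     value: float, timestamp: int) -> DataPoint:
    """Assemble one data point using the six fields of the example."""
    return {"metric": metric, "tags": tags, "step": step,
            "device": device, "value": value, "time": timestamp}


if __name__ == "__main__":
    # Stand-in for a real collection script: a command that prints one number.
    printed = run_collection_script([sys.executable, "-c", "print(99)"])
    point = build_data_point("cpu", "idle", 60, "192.168.1.1",
                             float(printed), int(time.time()))
    # A body that an agent could place in an http post request (assumed JSON).
    print(json.dumps(point))
```

Keeping script execution separate from data point assembly means the same assembly step could also serve modes 1 and 3, where the value comes from a direct read or from another program.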
(2) Data reporting
The proxy is responsible for distributing data to the graph data storage component (graph) and the alarm component (alarm). Specifically, the proxy makes two copies of the data: one copy is forwarded to the graph data storage component graph for storage, and the other copy is forwarded to the alarm component alarm for alarm judgment.
The proxy also proxies the strategy interface of the api, so that the agent can obtain the collection strategy.
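By way of non-limiting illustration, the distribution step in this part, in which each data point is copied and forwarded to both graph and alarm, can be sketched as a simple fan-out. The class name Proxy and the use of in-memory lists in place of the two components are assumptions made for the sketch.

```python
import copy
from typing import Any, Callable, Dict, List

DataPoint = Dict[str, Any]
Sink = Callable[[DataPoint], None]


class Proxy:
    """Forwards an independent copy of each data point to every registered sink."""

    def __init__(self, sinks: List[Sink]) -> None:
        self._sinks = list(sinks)

    def distribute(self, point: DataPoint) -> None:
        for sink in self._sinks:
            # Deep copy so that one component cannot mutate another's data.
            sink(copy.deepcopy(point))


graph_inbox: List[DataPoint] = []   # stand-in for the graph storage component
alarm_inbox: List[DataPoint] = []   # stand-in for the alarm component

proxy = Proxy([graph_inbox.append, alarm_inbox.append])
proxy.distribute({"metric": "cpu", "tags": "idle", "step": 60,
                  "device": "192.168.1.1", "value": 99, "time": 1593748649})
```

Because each sink receives its own copy, storage and alarm judgment can proceed independently of each other.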
(3) Data warehousing
When the proxy forwards data to the back-end graph, it uses device-metric-tags as the key and forwards the data to one of a plurality of graph instances according to a hash rule. This addresses the problem of slow hard disk reads and writes when the data volume is large. In other words, graph is a cluster, and the proxy writes data to it by hashing and in multiple copies.
For example, suppose there are two graph clusters, A and B. Cluster A comprises two devices, A1 and A2; cluster B comprises two devices, B1 and B2. Cluster B exists as a copy of cluster A. When the proxy forwards a request to graph, the hash rule sends two copies of the data at the same time, for example one to A1 and one to B1. The hash rule hashes according to the IP addresses configured for the devices in a cluster. If a device in cluster A goes down or its IP address becomes faulty, the hash rule is disrupted and some query requests cannot obtain their data; cluster B is then needed, which ensures that the data is not lost.
When a graph cluster receives a large number of requests, the hash rule distributes them across the devices of the cluster according to key + ip, so that the io pressure is spread evenly over the devices. At query time, the target device is computed directly from the key and the ip, and the corresponding value is read from the store of that device.
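By way of non-limiting illustration, the hashed, multi-copy write to clusters A and B can be sketched as follows. The source does not specify the hash function or how the key and the IP addresses are combined; the sketch assumes a stable md5 digest of the key taken modulo the number of configured devices, and uses the device names A1, A2, B1 and B2 in place of IP addresses.

```python
import hashlib
from typing import Dict, List

# Configured devices per cluster; cluster B holds a copy of what is written to A.
CLUSTERS: Dict[str, List[str]] = {
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
}


def make_key(device: str, metric: str, tags: str) -> str:
    """Build the device-metric-tags key used for routing."""
    return f"{device}-{metric}-{tags}"


def pick_node(key: str, nodes: List[str]) -> str:
    """Map a key to one device with a stable hash, modulo the device count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def write_targets(key: str) -> List[str]:
    """Return one target device per cluster, so each point is written twice."""
    return [pick_node(key, nodes) for nodes in CLUSTERS.values()]


key = make_key("192.168.1.1", "cpu", "idle")
targets = write_targets(key)
```

The same computation can be repeated at query time to find where a key was written. With modulo hashing, removing a device from the configured list changes the mapping of many keys, which is consistent with the disruption described above and with the need for the copy in cluster B.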
(4) Alarm system
When alarm starts, it obtains the user-defined alarm strategies from mysql (a relational database management system), for example: generate an alarm event if the value of metric: cpu, tags: idle exceeds a threshold m. After receiving data forwarded by the proxy, alarm judges according to the alarm strategy whether to raise an alarm; if so, it generates an alarm event and writes it into a redis (a key-value storage system) queue. Consumer programs such as the electronic mail sender mail-sender, the short message service sender sms-sender and a WeChat sender then consume (read) each alarm event and send it to the user.
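By way of non-limiting illustration, the judge, queue and send flow described in this paragraph can be sketched as follows. An in-memory deque stands in for the redis queue and a list stands in for a sender; the function names, the sample rule and the contact address are hypothetical.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Optional

Point = Dict[str, Any]
Rule = Dict[str, Any]
Event = Dict[str, Any]


def judge(point: Point, rules: List[Rule]) -> Optional[Event]:
    """Return an alarm event if the point matches a rule and exceeds its threshold."""
    for rule in rules:
        if (point["metric"] == rule["metric"]
                and point["tags"] == rule["tags"]
                and float(point["value"]) > float(rule["threshold"])):
            # The event carries the fields of both the rule and the data point.
            return {**rule, **point, "status": "alarm"}
    return None


def drain(queue: Deque[Event], send: Callable[[Event], None]) -> int:
    """Consume every queued event in first-in, first-out order; return the count."""
    sent = 0
    while queue:
        send(queue.popleft())
        sent += 1
    return sent


rules = [{"metric": "cpu", "tags": "used", "threshold": 90,
          "contact": "ops@example.com"}]
queue: Deque[Event] = deque()   # stand-in for the redis queue
outbox: List[Event] = []        # stand-in for a mail or short message sender

for value in (50, 95):
    event = judge({"metric": "cpu", "tags": "used", "device": "192.168.1.1",
                   "value": value, "time": 1593748649}, rules)
    if event is not None:
        queue.append(event)

sent_count = drain(queue, outbox.append)
```

Placing a queue between judgment and sending lets several sender programs consume events at their own pace without blocking alarm judgment.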
As an example, the main fields of an alarm strategy (or rule) may include: collection time, monitoring item name (metric), monitoring item mark (tags), threshold, threshold times and alarm contact. The alarm contact field defines the contact details of the specific technical personnel to be notified, such as a mail address, a short message number, or WeChat, microblog and other self-media accounts. An alarm event generated according to the alarm strategy has a corresponding alarm state, namely alarm or recovery.
In some embodiments, an alarm event may contain all fields of the collected data and all fields of the alarm strategy, and is also written to mysql for front-end display.
In some embodiments, when a monitoring item recovers, that is, when the collected data falls below the threshold, a recovery alarm event is also written, and the alarm event state in mysql is then updated to recovered.
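By way of non-limiting illustration, the alarm and recovery states can be sketched as a small state tracker for one monitoring item. The source lists a threshold times field without defining it; the sketch assumes it means the number of consecutive collections above the threshold required before an alarm is raised. The class name AlarmState is hypothetical, and a value equal to the threshold is treated as not exceeding it.

```python
from typing import List, Optional


class AlarmState:
    """Tracks one monitoring item and reports 'alarm' or 'recovery' on state changes."""

    def __init__(self, threshold: float, times: int) -> None:
        self.threshold = threshold
        self.times = times        # consecutive breaches required (assumed meaning)
        self._breaches = 0
        self._alarming = False

    def update(self, value: float) -> Optional[str]:
        """Feed one collected value; return the new state if it changed, else None."""
        if value > self.threshold:
            self._breaches += 1
            if not self._alarming and self._breaches >= self.times:
                self._alarming = True
                return "alarm"
        else:
            self._breaches = 0
            if self._alarming:
                self._alarming = False
                return "recovery"
        return None


state = AlarmState(threshold=90, times=2)
transitions: List[Optional[str]] = [state.update(v) for v in (95, 96, 97, 40, 30)]
```

Only the second consecutive breach produces an alarm, and only the first value back under the threshold produces a recovery, so repeated notifications for the same ongoing condition are avoided.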
(5) Front end page configuration display
The front end is responsible for calling the api to display data of each monitoring item, configuring an acquisition strategy, configuring an alarm strategy and the like.
The api needs to communicate with the proxy, graph and mysql. The proxy mainly provides a proxy service for the api so that collection strategies can be obtained; graph provides the api with raw data for front-end plotting, for example a curve chart from which technicians can judge metric fluctuations and spot anomalies; and mysql is mainly used to store collection strategies, alarm strategies and alarm contact information.
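By way of non-limiting illustration, preparing raw data for front-end plotting can be sketched as selecting the points of one monitoring item within a time range and splitting them into x-axis times and y-axis values. The function name to_series and the sample values 97 and 3 are hypothetical; the actual api and chart library are not specified in the source.

```python
from typing import Any, Dict, List, Tuple

Point = Dict[str, Any]


def to_series(points: List[Point], device: str, metric: str, tags: str,
              start: int, end: int) -> Tuple[List[int], List[float]]:
    """Select one monitoring item in [start, end] and split it into x and y lists."""
    selected = sorted(
        (p for p in points
         if (p["device"], p["metric"], p["tags"]) == (device, metric, tags)
         and start <= int(p["time"]) <= end),
        key=lambda p: int(p["time"]),
    )
    xs = [int(p["time"]) for p in selected]      # x-axis: collection time
    ys = [float(p["value"]) for p in selected]   # y-axis: collected value
    return xs, ys


stored = [
    {"device": "192.168.1.1", "metric": "cpu", "tags": "idle",
     "value": 97, "time": 1593748709},
    {"device": "192.168.1.1", "metric": "cpu", "tags": "idle",
     "value": 99, "time": 1593748649},
    {"device": "192.168.1.1", "metric": "cpu", "tags": "used",
     "value": 3, "time": 1593748649},
]
xs, ys = to_series(stored, "192.168.1.1", "cpu", "idle", 1593748600, 1593748800)
```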
The technical scheme of the embodiment of the invention has the following advantages:
the technical scheme of the embodiment of the invention has a higher read and write speed than the prior-art scheme; it adopts distributed storage and locates data by the hash key (device-metric-tags), which is much faster than reading a mysql database stored on a hard disk.
The technical scheme of the embodiment of the invention uses distributed time-series database processing, whereas the prior-art scheme stores and reads data in mysql, so that a large number of read and write operations fall on mysql and query efficiency is low.
In the technical scheme of the embodiment of the invention, the whole system has no core single point, is easy to operate and maintain, is easy to deploy and can be horizontally expanded.
Fig. 3 is a flowchart of a monitoring method of a server according to an embodiment of the present invention. As shown in fig. 3, the monitoring method of the server includes the following steps:
S110: the monitoring data acquisition module acquires monitoring data of a server to be monitored and reports the monitoring data to the proxy server;
S120: the proxy server distributes the monitoring data to the graph storage module and the alarm module;
S130: the graph storage module stores the monitoring data into a database;
S140: the alarm module acquires an alarm rule from the database, determines whether to alarm or not according to the alarm rule and the monitoring data, generates an alarm event if the alarm is determined, and writes the alarm event into the storage queue module;
S150: and the sending module acquires the alarm event from the storage queue module and sends the alarm event.
In some embodiments, in S110, the monitoring data acquisition module acquiring monitoring data of a server to be monitored and reporting the monitoring data to the proxy server may specifically include:
the monitoring data acquisition module actively reads basic information of a server to be monitored as monitoring data, and sends the basic information to the proxy server through a hypertext transfer protocol (http) request; or,
the monitoring data acquisition module periodically requests an Application Program Interface (API) to obtain an acquisition rule, obtains from the acquisition rule the acquisition script to be executed, executes the acquisition script periodically according to the acquisition rule, collects the information printed by the script, and reports the collected information to the proxy server as monitoring data; or,
the monitoring data acquisition module receives, through its listening port, monitoring data reported periodically by other programs, and then sends a request carrying the monitoring data to the proxy server.
In some embodiments, in S120, the proxy server distributing the monitoring data to the graph storage module may specifically include:
the proxy server takes the combination of the monitoring device unique mark, the monitoring item name and the monitoring item mark (device-metric-tags) as the key, takes the collected value of the monitoring item as the value, and forwards the monitoring data in this key-value structure to a plurality of graph storage modules according to a hash rule. That is, the name combination (device mark, monitoring item name, monitoring item mark) is used as the key.
In some embodiments, the fields of the monitoring data include: monitoring item name, monitoring item mark, monitoring item collection period, monitoring device unique mark, monitoring item collected value and collection time. Of these, the collection time (timestamp) field and the monitoring item collection period (step) field are used in alarming and plotting.
The fields of the alarm rule include: collection time, monitoring item name, monitoring item mark, threshold, threshold times and contact information of the alarm recipients to be notified;
the alarm event comprises all fields of the monitoring data and all fields of the alarm rule;
the monitoring method may further include: the alarm module writes the alarm event into a database for the front-end module to display; and/or, when the alarm module determines from the reported monitoring data that the monitoring item corresponding to the alarm event has recovered, the alarm module writes a recovery alarm event and then updates the alarm event state in the database to recovered.
In some embodiments, the monitoring method may further include: the front-end module calls an application program interface API, acquires monitoring data from the graph storage module, generates a visual graph according to the monitoring data, and calls the application program interface API to configure acquisition rules and alarm rules in a database; the server to be monitored comprises a content delivery network CDN server or a Linux virtual server.
The technical scheme of the embodiment of the invention has the following advantages:
the technical scheme of the embodiment of the invention has a higher read and write speed than the prior-art scheme; it adopts distributed storage and computes the location of the data from the hash key, which is much faster than reading a mysql database stored on a hard disk.
The technical scheme of the embodiment of the invention uses distributed time-series database processing, whereas the prior-art scheme stores and reads data in mysql, so that a large number of read and write operations fall on mysql and query efficiency is low.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored in the computer readable storage medium, and when being executed by a processor, the computer program realizes the steps of the monitoring method of the server.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner; the same or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the device, electronic device and readable storage medium embodiments are substantially similar to the method embodiments, they are described briefly, and the relevant points can be found in the corresponding description of the method embodiments.
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (11)

1. A method for monitoring a server, comprising:
the monitoring data acquisition module acquires monitoring data of a server to be monitored and reports the monitoring data to the proxy server;
the proxy server distributes the monitoring data to a graph storage module and an alarm module;
the graph storage module stores the monitoring data into a database;
the alarm module acquires an alarm rule from the database, determines whether to alarm or not according to the alarm rule and the monitoring data, generates an alarm event if the alarm is determined, and writes the alarm event into a storage queue module;
and the sending module acquires the alarm event from the storage queue module and sends the alarm event.
2. The method according to claim 1, wherein the monitoring data acquisition module acquiring monitoring data of a server to be monitored and reporting the monitoring data to the proxy server specifically comprises:
the monitoring data acquisition module actively reads basic information of the server to be monitored as monitoring data, and sends the basic information to the proxy server through a Hypertext Transfer Protocol (HTTP) request; or,
the monitoring data acquisition module obtains an acquisition rule by periodically requesting an application program interface (API), obtains from the acquisition rule an acquisition script to be executed, executes the acquisition script periodically according to the acquisition rule, collects the output printed by the script, and reports the collected output to the proxy server as monitoring data; or,
the monitoring data acquisition module receives, through its own listening port, monitoring data reported periodically by other programs, and then sends a request carrying the monitoring data to the proxy server.
3. The method according to claim 1 or 2, wherein
the distributing, by the proxy server, of the monitoring data to a graph storage module specifically comprises:
the proxy server uses the combination "monitoring equipment unique mark - monitoring item name - monitoring item mark" (device-measure-tags) as a key, uses the collected monitoring item value as a value, and forwards the monitoring data in this key-value structure to a plurality of graph storage modules according to a hash rule.
4. The method according to claim 1 or 2, wherein the fields of the monitoring data comprise: monitoring item name, monitoring item mark, monitoring item acquisition period, monitoring equipment unique mark, monitoring item acquisition value and acquisition time;
the fields of the alarm rule include: acquisition time, monitoring item name, monitoring item mark, threshold, threshold count, and contact information of the persons to be notified of an alarm;
the alarm event comprises all fields of the monitoring data and all fields of the alarm rule;
the method further comprises: the alarm module writes the alarm event into a database for display by a front-end module; and/or, when the alarm module determines from the reported monitoring data that the monitoring item corresponding to the alarm event has recovered, the alarm module writes a recovered alarm event and then updates the state of the alarm event in the database to recovered.
5. The method of claim 1 or 2, further comprising: the front-end module calls an application program interface API, acquires the monitoring data from the graph storage module and generates a visual graph according to the monitoring data, and the front-end module configures the acquisition rule and the alarm rule in the database by calling the application program interface API; the server to be monitored comprises a content delivery network CDN server or a Linux virtual server.
6. A monitoring system for a server, comprising:
the monitoring data acquisition module is used for acquiring monitoring data of a server to be monitored and reporting the monitoring data to the proxy server;
the proxy server is used for distributing the monitoring data to a graph storage module and an alarm module;
the graph storage module is used for storing the monitoring data into a database;
the alarm module is used for acquiring an alarm rule from the database, determining whether to alarm or not according to the alarm rule and the monitoring data, generating an alarm event if the alarm is determined, and writing the alarm event into a storage queue module;
and the sending module is used for acquiring the alarm event from the storage queue module and sending the alarm event.
7. The system of claim 6, wherein the monitoring data acquisition module is specifically configured to:
actively read basic information of the server to be monitored as monitoring data, and send the basic information to the proxy server through a Hypertext Transfer Protocol (HTTP) request; or,
obtain an acquisition rule by periodically requesting an application program interface (API), obtain from the acquisition rule an acquisition script to be executed, execute the acquisition script periodically according to the acquisition rule, collect the output printed by the script, and report the collected output to the proxy server as monitoring data; or,
receive, through its own listening port, monitoring data reported periodically by other programs, and then send a request carrying the monitoring data to the proxy server.
8. The system according to claim 6 or 7, wherein the proxy server is specifically configured to: use the combination "monitoring equipment unique mark - monitoring item name - monitoring item mark" (device-measure-tags) as a key, use the collected monitoring item value as a value, and forward the monitoring data in this key-value structure to a plurality of graph storage modules according to a hash rule.
9. The system according to claim 6 or 7, wherein the fields of the monitoring data comprise: monitoring item name, monitoring item mark, monitoring item acquisition period, monitoring equipment unique mark, monitoring item acquisition value and acquisition time;
the fields of the alarm rule include: acquisition time, monitoring item name, monitoring item mark, threshold, threshold count, and contact information of the persons to be notified of an alarm;
the alarm event comprises all fields of the monitoring data and all fields of the alarm rule;
the alarm module is further configured to: write the alarm event into a database for display by a front-end module; and/or, upon determining from the reported monitoring data that the monitoring item corresponding to the alarm event has recovered, write a recovered alarm event and then update the state of the alarm event in the database to recovered.
10. The system of claim 6 or 7, further comprising: the front-end module is used for calling an Application Program Interface (API), acquiring the monitoring data from the graph storage module and generating a visual graph according to the monitoring data, and calling the API to configure the acquisition rule and the alarm rule in the database; the server to be monitored comprises a content delivery network CDN server or a Linux virtual server.
11. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out a method of monitoring a server according to any one of claims 1-5.
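
For illustration only, the data flow recited in claims 1, 3 and 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the class and function names, the use of CRC32 as the hash rule, the "greater than" comparison, the interpretation of the threshold count as consecutive breaches, and the in-memory dictionaries and deque standing in for the graph storage modules and the storage queue module are all hypothetical choices that the claims do not specify.

```python
import zlib
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    device: str     # unique mark of the monitored equipment
    measure: str    # monitoring item name
    tags: str       # monitoring item mark
    value: float    # collected monitoring item value
    timestamp: int  # acquisition time


@dataclass
class AlarmRule:
    measure: str
    tags: str
    threshold: float  # assumed: alarm when value is greater than this
    times: int        # assumed: consecutive breaches required
    contacts: list    # contact information of persons to notify


def make_key(s):
    """Build the device-measure-tags key described in claim 3."""
    return f"{s.device}-{s.measure}-{s.tags}"


def pick_store(key, n_stores):
    """Hash rule (assumed CRC32): map a key to one graph storage module."""
    return zlib.crc32(key.encode("utf-8")) % n_stores


class AlarmModule:
    def __init__(self, rules, queue):
        self.rules = rules
        self.queue = queue    # stands in for the storage queue module
        self.breaches = {}    # (key, rule index) -> consecutive breaches
        self.firing = set()   # (key, rule index) currently in alarm

    def handle(self, s):
        key = make_key(s)
        for i, rule in enumerate(self.rules):
            if (rule.measure, rule.tags) != (s.measure, s.tags):
                continue
            slot = (key, i)
            if s.value > rule.threshold:
                self.breaches[slot] = self.breaches.get(slot, 0) + 1
                if self.breaches[slot] >= rule.times and slot not in self.firing:
                    self.firing.add(slot)
                    self._emit(key, "alarm", s, rule)
            else:
                self.breaches[slot] = 0
                if slot in self.firing:
                    self.firing.discard(slot)
                    self._emit(key, "recovered", s, rule)

    def _emit(self, key, status, s, rule):
        # The event carries fields from both the sample and the rule.
        self.queue.append({"key": key, "status": status, "value": s.value,
                           "timestamp": s.timestamp,
                           "threshold": rule.threshold,
                           "contacts": rule.contacts})


def distribute(s, stores, alarm):
    """Proxy step: send one sample to a graph store and to the alarm module."""
    key = make_key(s)
    store = stores[pick_store(key, len(stores))]
    store.setdefault(key, []).append((s.timestamp, s.value))
    alarm.handle(s)
```

In this sketch, a sending module would simply pop events from the deque and deliver them to the listed contacts. Because the store index depends only on the key, all samples of one monitoring item on one device land in the same graph storage module.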
CN202010860648.1A 2020-08-25 2020-08-25 Server monitoring method and system Pending CN112131073A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010860648.1A CN112131073A (en) 2020-08-25 2020-08-25 Server monitoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010860648.1A CN112131073A (en) 2020-08-25 2020-08-25 Server monitoring method and system

Publications (1)

Publication Number Publication Date
CN112131073A (en) 2020-12-25

Family

ID=73848520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010860648.1A Pending CN112131073A (en) 2020-08-25 2020-08-25 Server monitoring method and system

Country Status (1)

Country Link
CN (1) CN112131073A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112671587A (en) * 2020-12-28 2021-04-16 紫光云技术有限公司 Alarm method for failure of equipment issuing configuration
CN113448801A (en) * 2021-06-15 2021-09-28 新浪网技术(中国)有限公司 Database monitoring method and system
CN113688149A (en) * 2021-07-20 2021-11-23 青岛海尔科技有限公司 Monitoring method and device
CN113778001A (en) * 2021-09-28 2021-12-10 上海市大数据股份有限公司 Real-time data monitoring system suitable for application system
CN113783890A (en) * 2021-09-24 2021-12-10 国网山西省电力公司电力科学研究院 Intelligent Internet of things system Internet of things terminal safety monitoring system based on edge calculation
CN113806166A (en) * 2021-08-25 2021-12-17 合众人寿保险股份有限公司 Object monitoring method and device, storage medium and electronic equipment
CN115499431A (en) * 2022-07-29 2022-12-20 天翼云科技有限公司 Public cloud multi-resource pool operation and maintenance monitoring system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102142720A (en) * 2011-04-29 2011-08-03 珠海市鸿瑞软件技术有限公司 Network communication recorder and network communication record analysis system
CN106445789A (en) * 2016-10-11 2017-02-22 北京集奥聚合科技有限公司 Monitoring visualizing method and system
CN110048888A (en) * 2019-04-16 2019-07-23 深圳市致宸信息科技有限公司 A kind of method based on zabbix monitoring alarm, server, equipment and storage medium
US20200053127A1 (en) * 2018-08-10 2020-02-13 Servicenow, Inc. Creating security incident records using a remote network management platform
CN111352800A (en) * 2020-02-25 2020-06-30 京东数字科技控股有限公司 Big data cluster monitoring method and related equipment
CN111447109A (en) * 2020-03-23 2020-07-24 京东方科技集团股份有限公司 Monitoring management apparatus and method, computer readable storage medium

Similar Documents

Publication Publication Date Title
CN112131073A (en) Server monitoring method and system
US10348809B2 (en) Naming of distributed business transactions
CN108763038B (en) Alarm data management method and device, computer equipment and storage medium
US10860406B2 (en) Information processing device and monitoring method
CN110232010A (en) A kind of alarm method, alarm server and monitoring server
CN111669295B (en) Service management method and device
CN111538563A (en) Event analysis method and device for Kubernetes
CN112039726A (en) Data monitoring method and system for content delivery network CDN device
CN111770002A (en) Test data forwarding control method and device, readable storage medium and electronic equipment
US7895247B2 (en) Tracking space usage in a database
CN111010318A (en) Method and system for discovering loss of connection of terminal equipment of Internet of things and equipment shadow server
CN109766198B (en) Stream processing method, device, equipment and computer readable storage medium
CN111597056A (en) Distributed scheduling method, system, storage medium and device
CN111026606A (en) Alarm method and device based on hystrix fuse monitoring and computer equipment
CN114637656B (en) Redis-based monitoring method and device, storage medium and equipment
CN115658745A (en) Data processing method, data processing device, computer equipment and computer readable storage medium
CN107368355B (en) Dynamic scheduling method and device of virtual machine
CN112202895B (en) Method and system for collecting monitoring index data, electronic equipment and storage medium
CN112685157B (en) Task processing method, device, computer equipment and storage medium
CN113595776A (en) Monitoring data processing method and system
CN114356970A (en) Storage system resource caching method and device
CN113138896A (en) Application running condition monitoring method, device and equipment
CN112433891A (en) Data processing method and device and server
CN111290909A (en) System and method for monitoring and alarming ceph cluster
CN112363905B (en) Application log collection system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230406

Address after: Room 501-502, 5/F, Sina Headquarters Scientific Research Building, Block N-1 and N-2, Zhongguancun Software Park, Dongbei Wangxi Road, Haidian District, Beijing, 100193

Applicant after: Sina Technology (China) Co.,Ltd.

Address before: 100193 7th floor, scientific research building, Sina headquarters, plot n-1, n-2, Zhongguancun Software Park, Dongbei Wangxi Road, Haidian District, Beijing, 100193

Applicant before: Sina.com Technology (China) Co.,Ltd.