CN114020558A - Alarm callback method, platform, system, device, equipment and storage medium - Google Patents

Publication number
CN114020558A
Authority
CN
China
Prior art keywords
alarm
callback
event
index
alert
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111177540.3A
Other languages
Chinese (zh)
Inventor
黄咏康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimin Insurance Agency Co Ltd
Original Assignee
Weimin Insurance Agency Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimin Insurance Agency Co Ltd
Priority to CN202111177540.3A
Publication of CN114020558A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes

Abstract

The application relates to an alarm callback method, platform, system, device, equipment and storage medium. The method comprises: obtaining an alarm event from an alarm event monitoring system; obtaining an alarm callback configuration parameter corresponding to the alarm event; generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameter; and calling back the alarm callback content to a target address in the alarm callback configuration parameter. In the embodiments of the application, the callback process is separated from the alarm event monitoring system: an independent alarm callback platform calls back the alarm event, and the parameters related to the callback process are configured in that platform. This addresses the problem in the related art that, when the alarm event monitoring system itself calls back alarm events, it supports only a narrow set of callback configuration parameters, which severely limits its practical use in an enterprise.

Description

Alarm callback method, platform, system, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an alarm callback method, platform, system, device, apparatus, and storage medium.
Background
With the continuous development of electronic information technology, cloud computing, big data and other technologies, various industries have begun to adopt service systems deployed as clusters to realize electronic automation. A clustered service system may include a large number of resources such as physical hosts and virtual machines, and in order to ensure its normal operation, the service system is provided with a corresponding monitoring platform that monitors the operating state of each resource. Prometheus is one such monitoring platform. It is mainly used to monitor infrastructure such as servers, databases, and Virtual Private Servers (VPSs).
Currently, a common way to monitor with Prometheus is to install a corresponding collection component for each third-party application. The collection component gathers data and provides it to the Prometheus server, and the Prometheus server calls the collected data back to a third-party service in JSON format through an alert webhook.
However, when collected data is called back through the alert webhook, the callback configuration parameters offered by Prometheus are limited, which significantly restricts the use of the Prometheus alert webhook in real enterprise applications.
Disclosure of Invention
The application provides an alarm callback method, a platform, a system, a device, equipment and a storage medium, which are used to solve the problem that calling back data through the Prometheus alert webhook is severely limited in practical enterprise applications.
An alarm callback method comprises the following steps:
acquiring an alarm event from an alarm event monitoring system;
acquiring an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to an alarm callback process for calling back the alarm event;
generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and calling back the alarm callback content to a target address in the alarm callback configuration parameters.
An alert callback platform comprising:
an alarm processing component for receiving an alarm event from an alarm event monitoring system;
the alarm consumption component is used for acquiring an alarm event from the alarm event monitoring system; acquiring an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to an alarm callback process for calling back the alarm event; generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and the alarm callback component is used for calling back the alarm callback content to a target address in the alarm callback configuration parameters.
Optionally, the platform further comprises:
a message queue configured to store the alarm event according to the alarm level of the alarm event.
Optionally, the message queue includes a partition for storing the alert event and alert events having the same alert fingerprint as the alert event.
Optionally, the alert consumption component includes at least one consumption component, and each consumption component corresponds to one message queue.
An alert callback device comprising:
the first acquisition unit is used for acquiring an alarm event from an alarm event monitoring system;
a second obtaining unit, configured to obtain an alarm callback configuration parameter corresponding to the alarm event, where the alarm callback configuration parameter includes a parameter related to an alarm callback process for calling back the alarm event;
the generating unit is used for generating the alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and the callback unit is used for calling back the alarm callback content to the target address in the alarm callback configuration parameters.
An alert callback system comprising:
the system comprises an alarm event monitoring system and an alarm callback platform or an alarm callback device which is communicated with the alarm event monitoring system;
the alarm event monitoring system is used for triggering an alarm event;
the alarm callback platform or the alarm callback device is used for acquiring an alarm event from an alarm event monitoring system; acquiring an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to an alarm callback process for calling back the alarm event; generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters; and calling back the alarm callback content to a target address in the alarm callback configuration parameters.
An electronic device, comprising: the system comprises a processor, a memory and a communication bus, wherein the processor and the memory are communicated with each other through the communication bus;
the memory for storing a computer program;
the processor is used for executing the program stored in the memory and realizing the alarm callback method.
A computer-readable storage medium, storing a computer program which, when executed by a processor, implements the alert callback method described above.
Compared with the prior art, the technical scheme provided by the embodiments of the application has the following advantages. An alarm event is obtained from an alarm event monitoring system; an alarm callback configuration parameter corresponding to the alarm event is obtained, wherein the alarm callback configuration parameter comprises a parameter related to the alarm callback process for calling back the alarm event; alarm callback content corresponding to the alarm event is generated based on the alarm callback configuration parameter; and the alarm callback content is called back to a target address in the alarm callback configuration parameter. The callback process is thereby separated from the alarm event monitoring system: an independent alarm callback platform calls back the alarm event, and the parameters related to the callback process are configured in that platform. This addresses the problem in the related art that, when the alarm event monitoring system itself calls back alarm events, it supports only a narrow set of callback configuration parameters, which severely limits its practical use in an enterprise.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a schematic structural diagram of an alarm callback system in an embodiment of the present application;
FIG. 2 is a flowchart illustrating an alarm callback method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of another structure of an alarm callback system in the embodiment of the present application;
FIG. 4 is a schematic structural diagram of an alarm callback device in the embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that for convenience of illustration in the following description, the alarm event monitoring system is exemplified by prometheus, but it should be understood that the application does not limit the alarm event monitoring system to prometheus, and the alarm event monitoring system in application may also be other underlying monitoring systems, such as open-falcon, zabbix, Nightingale; of course, the alarm monitoring system can also be adapted middleware developed separately as required.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be applied to the following explanations.
Alarm policy: alarm policy configuration information is managed by an alarm policy management platform developed on top of Prometheus, and each alarm policy has a unique identifier (uuid). The corresponding alarm policy configuration information can be looked up by the alarm policy uuid. The uuid of the alarm policy is written as a label (labels) into the alarm policy (alerting rules) configuration of Prometheus, and when an alarm event is triggered, Prometheus attaches the uuid of the alarm policy that generated the alarm event to the labels of the alarm event.
Fingerprint: the alarm fingerprint is an identification code generated with md5 when Prometheus triggers an alarm event, computed from the alarm content of the alarm event and the labels it contains, and used to identify the alarm event. It should be appreciated that the alarm fingerprints of different alarm events may be the same, and that multiple alarm events having the same fingerprint may be considered multiple triggers of the same alarm event.
For a Prometheus-type alarm event, the alarm fingerprint algorithm is: sort the dictionary of alarm labels by label name, iterate over the sorted label dictionary, and add each label name and its corresponding label value to a hash to compute the alarm fingerprint. For a log alarm or an alarm reported through the custom interface, the alarm fingerprint algorithm is: md5(alarm level + alarm object + alarm content + alarm labels), where the alarm labels contribute a hash value calculated after sorting by label. For an alarm reported through the custom interface, a fingerprint reported by the user takes precedence; if the user does not set an alarm fingerprint, the log alarm fingerprint calculation is used by default.
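The two fingerprint calculations described above can be sketched as follows. This is a minimal illustration only: the function names, the separator byte between label names and values, and the plain string concatenation of the fields are assumptions, since the application does not specify the exact byte layout.

```python
import hashlib


def label_hash(labels):
    """Hash label names and values in sorted-name order (order-independent)."""
    h = hashlib.md5()
    for name in sorted(labels):
        # The 0xff separator is an assumption; it only keeps ("ab", "c")
        # from colliding with ("a", "bc").
        h.update(name.encode("utf-8") + b"\xff")
        h.update(labels[name].encode("utf-8") + b"\xff")
    return h.hexdigest()


def log_alarm_fingerprint(level, alarm_object, content, labels, user_fingerprint=None):
    """md5(level + object + content + sorted-label hash), unless the user set one."""
    if user_fingerprint:
        # A fingerprint reported through the custom interface takes precedence.
        return user_fingerprint
    raw = level + alarm_object + content + label_hash(labels)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

Because the labels are sorted before hashing, two events that carry the same labels in a different order receive the same fingerprint and are therefore treated as repeated triggers of the same alarm event.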
Alarm callback: after an alarm event occurs, the corresponding service address is called in a specific manner according to configured conditions, triggering a specific function of a third-party service.
Alertmanager component: Alertmanager, a separate component in Prometheus, is responsible for receiving and handling alarm events from the Prometheus server. Alertmanager can further process these alarm events, for example by eliminating duplicate alarm information when a large number of duplicate alarm events are received, and by grouping and routing alarm events to the correct notifier. It also supports integration with Webhook to cover more customized scenarios, and provides silencing and alarm suppression mechanisms to optimize alarm notification behavior.
Alertmanager cluster: Alertmanager nodes are stateless nodes in Prometheus. An Alertmanager cluster can be formed through communication configuration; the Alertmanager nodes in the cluster communicate with one another through the Gossip protocol, which ensures that the same alarm is triggered only once.
The alarm callback configuration parameters involved in this embodiment are as follows:
Alarm callback condition: the precondition for calling back an alarm. The user can set conditions on dimensions such as the alarm object, the alarm policy, and keywords of the alarm content. The alarm policy is associated directly through the alarm policy uuid; the alarm object is matched by string comparison; and the alarm content is matched by fuzzy matching, regular-expression matching, or the like. After an alarm callback condition is created, an alarm callback id is generated.
Alarm aggregation waiting time: within the alarm aggregation waiting time, alarm events with the same fingerprint are not called back immediately. A single callback is triggered after the waiting time ends, and the total number of times alarm events with that fingerprint were triggered within the waiting time is recorded in the alarm callback content. It should be understood that an alarm aggregation waiting time of 0 s indicates that the alarm event is to be called back immediately.
Target address of alarm callback: the address called by the background service when an alarm callback is triggered. The alarm callback function requires that the network between the alarm callback platform and the target address be reachable, either directly or through a proxy. WebFront provides an alarm callback test function so that a user can check in advance whether the background service of the alarm callback platform can reach the target address. The user may set multiple callback addresses.
Alarm callback proxy: if the background service of the alarm callback platform cannot connect to the target address directly, communication can go through a configured forward proxy node.
Alarm encryption mode: the alarm callback content supports encryption when an alarm is called back. The encryption methods include, but are not limited to, AES symmetric encryption with CBC as the block mode. For efficiency reasons, intranet users may also choose no encryption.
AES key: when AES encryption is selected, the user needs to fill in a 32-to-64-bit encryption key, and after the key is set, the background key cache is refreshed immediately.
Alarm frequency control: the user can set the number of alarm callbacks allowed for the same alarm callback configuration parameters within a fixed time interval.
Custom callback content: by default, the alarm callback platform expresses the alarm callback content in JSON. If the user needs customized alarm callback content, the alarm callback platform also supports user-defined content, including the use of go template, and injects values from the alarm callback configuration parameters predefined by the platform into the variable areas of the customized content.
Alarm callback protocol: the alarm callback platform supports http, https or grpc alarm callbacks. For https, the certificate address needs to be configured. For grpc, the grpc service address and the interface name need to be configured.
Callback failure retry number: this parameter sets the number of retries executed after an alarm callback fails. It should be understood that a retry number of 0 indicates that no retry is required.
Alarm callback timeout: this parameter sets the total timeout of each alarm callback. When the response from the target address times out, the alarm callback platform closes the current connection.
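As an illustration of the alarm aggregation waiting time described in the parameter list above, the following sketch counts triggers per fingerprint and releases one aggregated callback once the waiting time has elapsed. The class and method names are hypothetical, and timestamps are passed in explicitly (in seconds) so the behaviour is easy to follow; a real platform would presumably use its own clock and storage.

```python
class AlarmAggregator:
    """Collects alarm triggers per fingerprint during the aggregation waiting time."""

    def __init__(self, wait_seconds):
        self.wait_seconds = wait_seconds
        self._windows = {}  # fingerprint -> [window start time, trigger count]

    def on_event(self, fingerprint, now):
        """Record one trigger. Returns 1 if it must be called back immediately."""
        if self.wait_seconds == 0:
            # A waiting time of 0 s means no aggregation: call back right away.
            return 1
        window = self._windows.setdefault(fingerprint, [now, 0])
        window[1] += 1
        return None

    def flush(self, now):
        """Return {fingerprint: total triggers} for every window that has expired."""
        due = {
            fp: count
            for fp, (start, count) in self._windows.items()
            if now - start >= self.wait_seconds
        }
        for fp in due:
            del self._windows[fp]
        return due
```

Each value returned by `flush` corresponds to the total trigger count that the description says is recorded in the alarm callback content.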
The following describes an application scenario related to the present application.
Referring to FIG. 1, FIG. 1 shows an alarm callback system according to an embodiment of the present application, where the alarm callback system includes:
Prometheus 101 and an alarm callback platform 102, the alarm callback platform 102 including an AlarmHook component, a Kafka component, an AlarmCluster component, a WebFront component, a Callback component, a Config component, and a database (mysql);
the Prometheus 101 may be a single-node Prometheus monitoring system, which supports flexible configuration of alarm conditions, alarm aggregation, and alarm events. The single-node Prometheus monitoring system serves as the producer of alarm events for the alarm callback platform 102: when a configured alarm condition is triggered, the Prometheus server pushes the alarm event to the Alertmanager cluster, and the Alertmanager webhook function sends the alarm event to the AlarmHook component through an http interface.
The AlarmHook component is used for receiving an alarm event from the Prometheus 101, performing data cleaning on the alarm event, acquiring the alarm policy identifier (uuid) of the alarm event, and, in combination with that uuid, converting the alarm event into an alarm event that meets the enterprise standard and contains a monitoring classification, a monitoring scene, a monitoring group, a monitoring index, a monitoring object and other alarm-related content.
In order to improve the efficiency with which the AlarmHook component cleans alarm data, the AlarmHook component can periodically acquire alarm policy information from the database (mysql), store it in a local cache or a redis cache, and expose a cache refresh interface for refreshing the cache in real time.
After the AlarmHook component finishes cleaning the alarm information, it writes the alarm event into different Kafka topic queues according to alarm level. When writing to a topic queue, alarm events of the same alarm level are distributed across the partitions of the queue using the fingerprint as the key.
Since the alarm fingerprint identifies occurrences of the same alarm event, alarm events with the same fingerprint are kept in order within Kafka, which ensures that downstream consumer components can consume them in order.
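The routing rule just described (topic chosen by alarm level, partition chosen by fingerprint) can be illustrated without a Kafka client. In the sketch below, the `alarm.<level>` topic naming and the use of MD5 to map a key to a partition are assumptions made for illustration; in a real deployment the Kafka producer's own partitioner would map the message key to a partition.

```python
import hashlib


def route_alarm(level, fingerprint, num_partitions):
    """Return (topic, partition) for an alarm event.

    Events with the same fingerprint always map to the same partition,
    so their relative order is preserved for the consumer.
    """
    topic = "alarm." + level  # hypothetical topic naming, one topic per level
    digest = hashlib.md5(fingerprint.encode("utf-8")).digest()
    partition = int.from_bytes(digest[:4], "big") % num_partitions
    return topic, partition
```

Keying by fingerprint trades even load distribution for ordering: a very noisy alarm concentrates its events on one partition, but its trigger and release states can never be consumed out of order.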
After the AlarmHook component sends the cleaned alarm event to Kafka, the event is consumed by the AlarmCluster component. The AlarmCluster component consists of a set of Alarm components, each of which can be configured with the topic it consumes from Kafka; as shown in FIG. 1, Alarm-P0 denotes an Alarm component configured to consume only the prometheus.P0 queue in Kafka. Alarm components consuming the same queue can be bound to the same Kafka consumer group, and, by virtue of Kafka partitioning, the Alarm components in one consumer group can attach to different partitions to consume alarm events. Because alarm events with the same fingerprint are stored in the same partition, each Alarm component in the AlarmCluster component consumes them in order, which guarantees the order of the alarm states.
To improve consumption efficiency, the Alarm components bound to different queues may consume Alarm events in the bound queues in parallel.
The WebFront component serves as a foreground module of the alarm callback platform 102, and is configured to receive an alarm callback configuration parameter of a user, and store the alarm callback configuration parameter to mysql.
The Config component is a cache component of the alarm callback configuration parameters, and the component can acquire the latest alarm callback configuration parameters from the mysql at regular time and cache the latest alarm callback configuration parameters. The Config component provides an http or grpc interface to provide alarm callback configuration parameters for the AlarmCluster component.
The AlarmCluster component can periodically cache the alarm callback configuration parameters from the Config component. When an Alarm component in the AlarmCluster component consumes an alarm event, it matches the event against the cached alarm callback configuration parameters according to the alarm callback conditions, such as the alarm policy uuid, the alarm object or the alarm content. If the match succeeds, whether the alarm event needs to be called back is calculated according to the alarm callback configuration parameters, and if so, a callback request is sent to the Callback cluster.
After receiving the callback request, the Callback cluster calls back the alarm callback content to the target address over http, https, grpc or the like; the specific protocol is determined by the setting in the alarm callback configuration parameters.
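How the Callback cluster might apply the retry number and timeout parameters can be sketched as below. The transport is injected as a `send` function so the sketch stays independent of http, https or grpc; that function, its signature, and the `(success, attempts)` return value are assumptions for illustration rather than the platform's actual interface.

```python
def call_back(send, target_address, body, retries, timeout_seconds):
    """Try the callback once, then up to `retries` more times on failure.

    `send(target_address, body, timeout_seconds)` is a hypothetical transport
    function that returns True on success and may raise on timeout or
    connection failure.
    """
    attempts = 0
    for _ in range(retries + 1):  # retries == 0 means a single attempt
        attempts += 1
        try:
            if send(target_address, body, timeout_seconds):
                return True, attempts
        except (TimeoutError, ConnectionError):
            # A timed-out or broken connection counts as a failed attempt.
            pass
    return False, attempts
```

A caller would pass a transport that honours the timeout, for example a thin wrapper around an HTTP client, and could log the attempt count alongside the alarm callback id.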
The alarm callback method provided by the embodiment of the present application is described below from the perspective of the AlarmCluster component of the alarm callback platform. As shown in FIG. 2, the method may include the following steps:
Step 201, acquiring an alarm event from an alarm event monitoring system.
In this embodiment, since the component directly interfaced with the alarm event monitoring system is an AlarmHook component in the alarm callback platform, the format of the alarm event is a json format conforming to a standard protocol of the AlarmHook.
In application, fields in json format of an alarm event may include alarm content, monitoring classification, alarm scene, alarm hierarchy, alarm object, alarm index, alarm fingerprint, alarm level, alarm principal, alarm receiver, alarm grouping key, alarm trigger time, alarm request id, callback native message of underlying monitoring system, extension information, alarm tag string array, unit of alarm index, alarm current value, alarm policy uuid, alarm state, alarm release time, and alarm description tag.
The alarm hierarchy comprises an L1 basic resource, an L2 application layer and an L3 service layer.
Alarm scene: the user can define an alarm scene along dimensions such as resources, products and business activities. Each alarm policy is bound to an alarm scene.
Alarm object: the object of the current alarm event. The alarm object should support associated queries for specific information in the cmdb (configuration management database) or another business database. For example, in a CVM scene the alarm object is a host IP. The alarm objects of each alarm scene differ and are specified when a user configures an alarm policy.
Alarm index: the user can configure monitoring indexes in the alarm scene to briefly identify the type of the alarm event, such as CPU usage, disk usage and memory usage configured in a CVM scene. The alarm index can be used for statistical analysis of alarm events.
Alarm level: there are three levels, P0, P1 and P2. The P0 level indicates that a core function of the service is unavailable; the P1 level indicates that a non-core function of the service has a problem, or that the efficiency or timeliness of service processing is reduced; the P2 level indicates an internal early-warning problem that does not affect service functions, such as disk space or CPU usage.
Alarm principal: the person who must follow up on the alarm event after it is triggered and mark the fault type and the party mainly responsible for the cause of the fault.
Alarm grouping key: related to the alarm aggregation setting and defined by the user; it can be set according to a given label dimension. For example, alarms with the same activity label can be placed in the same alarm group. The alarm grouping key can be used to aggregate the number of alarms within a specified time, thereby avoiding an alarm storm.
Extension information: extended information set by the alarm event source, used for developing personalized functions for the alarm event.
Alarm tag string array: user-defined tags that can be bound to the alarm event, such as a "critical link" tag bound to certain alarm policies.
Unit of alarm index: for example, the unit of the disk usage index is %.
Alarm current value: the currently monitored value when the alarm event is triggered.
Alarm state: alarm triggered or alarm released.
Alarm description label: alarm label attributes attached by the Prometheus system by default; present only for Prometheus-type events. For example, each Prometheus alarm event adds a link for viewing the alarm data curve in this field, which supports jumping quickly to the alarm dashboard associated with the alarm event.
Step 202, obtaining an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to the alarm callback process for calling back the alarm event.
Step 203, generating the alarm callback content corresponding to the alarm event based on the alarm callback configuration parameter.
Step 204, calling back the alarm callback content to the target address in the alarm callback configuration parameter.
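Steps 201 to 204 can be read as a small pipeline. The following sketch is one possible arrangement under stated assumptions: the event and configuration are plain dictionaries, the key names (`policy_uuid`, `callback_id`, `target_address`) are hypothetical, the content is the default JSON form, and delivery is delegated to an injected `send` function.

```python
import json


def handle_alarm_event(event, config_by_policy, send):
    """Process one alarm event already obtained from the monitoring system (step 201)."""
    # Step 202: find the callback configuration associated with the event's policy uuid.
    config = config_by_policy.get(event.get("policy_uuid"))
    if config is None:
        return None  # no callback configured for this alarm policy

    # Step 203: generate the alarm callback content (default JSON representation).
    content = json.dumps(
        {"callback_id": config["callback_id"], "event": event}, sort_keys=True
    )

    # Step 204: call the content back to the configured target address.
    send(config["target_address"], content)
    return content
```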
In this embodiment, the alarm callback configuration parameter may be indicated by the WebFront component of the alarm callback platform, or determined based on the alarm callback configuration parameter of the alarm event of the historical callback, or determined based on the alarm callback configuration parameter of the default setting of the alarm callback platform, and the like, which is not specifically limited in this embodiment.
In this embodiment, an alarm callback configuration list of alarm callback configuration parameters is cached in the alarm callback platform, the alarm callback configuration parameters have identifiers of alarm policies, and an alarm event and the alarm callback configuration parameters are associated by the identifiers of the alarm policies, so that when the alarm callback configuration parameters corresponding to the alarm event are obtained, a target alarm policy identifier carried in a tag of the alarm event is obtained, and the target alarm callback configuration parameters of which the identifiers of the alarm policies are the target alarm policy identifier are searched from the alarm callback configuration list; and taking the target alarm callback configuration parameter as an alarm callback configuration parameter corresponding to the alarm event.
In practice, since the identifier of the alarm policy is usually a basic element of an alarm event, the identifier of the alarm policy is used as an alarm callback condition within the alarm callback configuration parameter, so as to distinguish it from the other parameters in the alarm callback configuration parameter.
Correspondingly, when the target alarm callback configuration parameter is searched for in the alarm callback configuration list, the target alarm callback condition whose alarm policy identifier is the target alarm policy identifier is searched for among the alarm callback conditions, and the alarm callback configuration parameter corresponding to the target alarm callback condition is taken as the target alarm callback configuration parameter.
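The lookup described above can be sketched as follows. This is a minimal illustration only; the class names, the `policy_id` label key and the field names are assumptions made for the example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CallbackCondition:
    policy_id: str  # identifier of the alarm policy (hypothetical field name)


@dataclass
class CallbackConfig:
    config_id: str          # identifier of the alarm callback configuration parameter
    target_address: str     # callback target address
    condition: CallbackCondition


def find_callback_config(event_labels: dict,
                         config_list: List[CallbackConfig]) -> Optional[CallbackConfig]:
    """Return the configuration whose condition carries the event's policy identifier."""
    target_policy_id = event_labels.get("policy_id")  # label key is an assumption
    for config in config_list:
        if config.condition.policy_id == target_policy_id:
            return config
    return None  # no configuration is associated with this alarm policy


configs = [
    CallbackConfig("cfg-1", "http://example.invalid/a", CallbackCondition("policy-10")),
    CallbackConfig("cfg-2", "http://example.invalid/b", CallbackCondition("policy-20")),
]
match = find_callback_config({"policy_id": "policy-20"}, configs)
```

Here `match` is the second entry of the cached list; an event whose policy identifier is not in the list yields `None`, meaning no callback is configured for it.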
In practice, besides the identifier of the alarm policy, the alarm object and the alarm content can also be set as basic elements of the alarm event, so the alarm callback condition may also include the alarm object and the alarm content.
When the target alarm callback condition includes the alarm object and/or the alarm content, then before the alarm callback configuration parameter corresponding to the target alarm callback condition is used as the target alarm callback configuration parameter, it is determined that the alarm object in the target alarm callback condition is consistent with the alarm object in the alarm event, and/or that the alarm content in the target alarm callback condition matches the alarm content in the alarm event.
In practice, the alarm object is generally identified by the name or the like of the monitored object monitored by the alarm event monitoring system, so the alarm object in the target alarm callback condition can be matched against the alarm object in the alarm event by character string matching. The alarm content typically includes a plurality of alarm fields, each indicating one aspect of the monitored object, and the set of alarm fields of an alarm event is typically a subset of the set of alarm fields in the target alarm callback condition. The alarm content in the target alarm callback condition can be matched against the alarm content in the alarm event by fuzzy matching or regular expression matching.
In this embodiment, in order to improve matching efficiency, when the target alarm callback condition includes both an alarm object and alarm content, it is first checked whether the alarm object in the target alarm callback condition is consistent with the alarm object in the alarm event. If so, it is then checked whether the alarm content in the target alarm callback condition matches the alarm content in the alarm event. If it does, the current alarm event needs to be called back.
It should be understood that when the alarm object in the target alarm callback condition is not consistent with the alarm object in the alarm event, the alarm callback condition does not take effect, and the alarm event does not need to be called back.
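The two-stage check described above (the cheap string comparison on the alarm object first, the regular expression match on the alarm content second) can be sketched as follows. The dictionary keys are hypothetical, and the sketch covers only the regular expression variant of content matching.

```python
import re


def condition_matches(condition: dict, event: dict) -> bool:
    """Check the alarm object first, then the alarm content."""
    expected_object = condition.get("alarm_object")
    if expected_object is not None and expected_object != event.get("alarm_object"):
        # Objects differ: the condition does not take effect and the
        # content pattern is never evaluated.
        return False
    content_pattern = condition.get("alarm_content")
    if content_pattern is not None:
        # Regular expression match against the event's alarm content.
        if re.search(content_pattern, event.get("alarm_content", "")) is None:
            return False
    return True


condition = {"alarm_object": "db-01", "alarm_content": r"disk usage \d+%"}
event = {"alarm_object": "db-01", "alarm_content": "disk usage 93% on /data"}
needs_callback = condition_matches(condition, event)
```

A condition that omits either key is treated as not constraining that element, which mirrors the "and/or" wording of the description.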
In this embodiment, it is considered that some usage scenarios produce many alarm events and that not every alarm event needs to be called back individually. Generally, the alarm events within a certain period of time are collected, their total number is counted, and the callback is performed at a fixed time interval after the relevant information has been summarized. In these scenarios, the parameters related to the alarm callback process that the alarm callback configuration parameter indicates may include, in addition to the alarm callback condition, an alarm aggregation waiting time. Therefore, when the alarm aggregation waiting time is included in the target alarm callback condition, it needs to be determined that the alarm aggregation waiting time is over before the alarm callback content is called back.
In a specific implementation, the time at which the alarm event was first triggered by the alarm event monitoring system is acquired; a timestamp generated based on the time of the first trigger and the alarm aggregation waiting time is acquired; and when the timestamp is earlier than the current time, it is determined that the alarm aggregation waiting time is over.
It should be understood that the timestamp here actually indicates the time at which the alarm aggregation waiting time ends, so a timestamp earlier than the current time indicates that the alarm aggregation waiting time is over.
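As a small worked example of this check, assuming times are expressed in seconds and that the timestamp is simply the first trigger time plus the waiting time (the function names are illustrative):

```python
def aggregation_deadline(first_trigger_time: float, wait_seconds: float) -> float:
    """Timestamp at which the alarm aggregation waiting time ends."""
    return first_trigger_time + wait_seconds


def aggregation_wait_over(deadline: float, now: float) -> bool:
    """The wait is over once the deadline is earlier than the current time."""
    return deadline < now


deadline = aggregation_deadline(first_trigger_time=1000.0, wait_seconds=60.0)
still_waiting = not aggregation_wait_over(deadline, now=1030.0)
finished = aggregation_wait_over(deadline, now=1061.0)
```

With a first trigger at 1000 s and a 60 s wait, the deadline is 1060 s, so the callback is held back at 1030 s and released at 1061 s.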
In this embodiment, the cache is used to determine when an alarm event was first triggered. In a specific implementation, the alarm fingerprint carried in the label of the alarm event is acquired; a first index of the alarm event is generated based on the alarm fingerprint, the identifier of the alarm callback configuration parameter and a preset character string; when the first index is already stored in the cache, the time at which the first index was stored in the cache is acquired and taken as the time at which the alarm event was first triggered; and when the first index is not stored in the cache, the first index is stored in the cache, and the time at which it is stored is taken as the time at which the alarm event was first triggered.
Since the alarm event processed by the AlarmCluster component actually comes from a Topic queue of kafka, the preset character string used for generating the first index may include the identifier of the queue storing the alarm event, so as to simplify the procedure and improve processing efficiency. Further, to indicate that the first index is used for calling back an alarm event, the preset character string may also include an identifier of the alarm callback component, such as alert_callback.
Illustratively, the first index may be generated with the expression "alert_callback_" + Topic queue name + $fingerprint + $(identifier of the alarm callback configuration parameter).
In practice, since the Topic queues of kafka store different alarm events according to alarm level, the identifier of the queue storing the alarm event is determined based on the alarm level of the alarm event.
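A minimal sketch of building such an index is given below. The level-to-topic mapping, the topic names, the label keys and the underscore separators between the parts are assumptions for illustration; the application only states which parts are concatenated.

```python
PREFIX = "alert_callback_"

# Hypothetical mapping from alarm level to the kafka Topic queue name.
LEVEL_TO_TOPIC = {
    "critical": "alarm_critical",
    "warning": "alarm_warning",
}


def build_first_index(labels: dict, config_id: str) -> str:
    """Concatenate prefix, queue name, alarm fingerprint and configuration id."""
    topic = LEVEL_TO_TOPIC[labels["level"]]  # queue identifier derived from the alarm level
    return f"{PREFIX}{topic}_{labels['fingerprint']}_{config_id}"


index = build_first_index({"level": "critical", "fingerprint": "ab12"}, "cfg7")
```

Because the fingerprint and the configuration identifier are both part of the key, the same alarm matched by two different callback configurations is aggregated separately.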
In this embodiment, to support the subsequent alarm callback, a first index value may be generated together with the first index of the alarm event. When the first index is already stored in the cache, the index value corresponding to the first index in the cache is updated with the newly generated first index value; when the first index is not stored in the cache, the first index and the first index value are stored in the cache in key-value form.
It should be understood that the first index value is generated based on the alarm event, the alarm callback configuration parameter and the trigger count of the alarm event. When the first index is not stored in the cache, the trigger count of the alarm event is determined to be 1. When the first index is already stored in the cache, the trigger count has changed, so the first index value in the cache needs to be updated. In a specific implementation, the value corresponding to the first index is obtained from the cache and parsed, the trigger count is incremented by 1, an updated first index value is generated based on the updated trigger count, the alarm event and the alarm callback configuration parameter, and the updated first index value is written back to the cache.
It should be understood that when the alarm callback configuration parameter indicates the alarm aggregation waiting time and the alarm aggregation waiting time is over, the first index value corresponding to the first index in the cache is actually the alarm callback content. And the triggering times of the alarm events at the end of the alarm aggregation waiting time are the times of generating the first index in the alarm aggregation waiting time.
To support concurrent writes to the cache, this embodiment writes the first index and the first index value (or the updated first index value) into the cache in an atomic write manner, according to the correspondence between the first index and the first index value.
In this embodiment, the first index is used as a distributed lock, which guarantees that incrementing the trigger count by 1 is safe and avoids concurrent dirty writes.
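The read, increment and write-back cycle can be illustrated with an in-process stand-in. In the sketch below a single `threading.Lock` plays the role of the distributed lock and a dictionary plays the role of the cache; both are simplifications for illustration (the description locks per index in a shared cache), and the JSON layout of the value is an assumption.

```python
import json
import threading


class AggregationCache:
    """In-memory stand-in for the shared cache holding first index values."""

    def __init__(self):
        self._store = {}
        self._lock = threading.Lock()  # stand-in for the distributed lock

    def record_trigger(self, index: str, event: dict, config_id: str) -> int:
        """Create or update the value for `index` and return the trigger count."""
        with self._lock:  # held from reading the value until it is written back
            raw = self._store.get(index)
            if raw is None:
                count = 1  # first trigger: the index is not yet cached
            else:
                count = json.loads(raw)["trigger_count"] + 1
            self._store[index] = json.dumps(
                {"event": event, "config_id": config_id, "trigger_count": count}
            )
            return count

    def get(self, index: str):
        raw = self._store.get(index)
        return json.loads(raw) if raw is not None else None


def hammer(cache: AggregationCache, index: str, times: int) -> None:
    for _ in range(times):
        cache.record_trigger(index, {"name": "disk_full"}, "cfg7")


cache = AggregationCache()
workers = [threading.Thread(target=hammer, args=(cache, "k", 50)) for _ in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
final_count = cache.get("k")["trigger_count"]
```

Without the lock, two writers could read the same count and both write back the same incremented value, losing one trigger; holding the lock across the whole cycle is what prevents that dirty write.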
In practice, a program exception may prevent the callback event from being consumed, in which case the cache would never be cleared. To avoid this, an expiration time may be set for the first index in the cache after the first index is generated, so that the first index is cleared from the cache once its storage time exceeds the expiration time. The expiration time of the first index is set based on the alarm aggregation waiting time and a preset time, where the preset time may be set manually in advance, for example to 5 min.
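The expiry rule can be sketched as follows, assuming the expiration time is the sum of the alarm aggregation waiting time and the preset time, all in seconds. The `ExpiringCache` class is a hypothetical in-memory stand-in that takes the current time as an argument so the behaviour is easy to follow.

```python
PRESET_SECONDS = 5 * 60  # the 5 min example value from the description


def first_index_ttl(wait_seconds: float) -> float:
    """Expiration time of the first index: aggregation wait plus the preset time."""
    return wait_seconds + PRESET_SECONDS


class ExpiringCache:
    """Tiny in-memory cache whose entries are cleared once they expire."""

    def __init__(self):
        self._entries = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl: float, now: float) -> None:
        self._entries[key] = (value, now + ttl)

    def get(self, key, now: float):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now > expires_at:  # storage time exceeds the expiration time
            del self._entries[key]
            return None
        return value


ttl = first_index_ttl(wait_seconds=60)
cache = ExpiringCache()
cache.set("alert_callback_example", {"trigger_count": 1}, ttl=ttl, now=0)
```

The extra preset time gives the consumer a margin to pick up the entry after the aggregation wait ends; only entries that were never consumed are removed by expiry.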
It should be understood that after the alarm callback content has been called back to the target address, the first index needs to be deleted from the cache so as not to affect subsequent callbacks for events having the same alarm fingerprint as the alarm event.
The detection of the alarm aggregation waiting time is described below in terms of the components of the alarm callback platform:
When the alarm aggregation waiting time is set in the alarm callback configuration parameter, the Alarm component handling the alarm event creates a first index (hereinafter referred to as the alarm aggregation key) in the format "alert_callback_" + Topic queue name + $fingerprint + $(identifier of the alarm callback configuration parameter).
When the alarm aggregation key does not exist in redis, the alarm event, the alarm callback configuration parameter and the trigger count are packed as the value of the key and written into redis in an atomic write manner. Then, based on the time at which the alarm aggregation key was stored in redis and the alarm aggregation waiting time, a timestamp marking the end of the alarm aggregation waiting time is generated and written into a sorted set in redis.
When the alarm aggregation key already exists in redis, the value is parsed, the trigger count is incremented by 1, and the result is written back to redis. Throughout the process from reading the value to writing it back, the alarm aggregation key is used as a distributed lock, which guarantees atomicity when the trigger count is incremented by 1 and avoids concurrent dirty writes. An expiration time is set on the alarm aggregation key, equal to the alarm aggregation waiting time plus the preset time, which avoids the cache never being cleared when a program exception prevents the event from being consumed. Illustratively, the preset time may be 5 min.
The Alarm component, acting as the consumer, reads the alarm aggregation sorted set in a loop, obtains all timestamps that are earlier than the current time, determines the alarm aggregation keys corresponding to those timestamps, and then obtains the value of each alarm aggregation key. After an alarm aggregation key has been extracted, it is deleted directly.
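The producer and consumer sides of this flow can be sketched with plain dictionaries standing in for redis. This is an illustrative model only: `deadlines` plays the role of the sorted set (member to score), the method names are invented for the example, and locking and key expiry are left out for brevity.

```python
class AggregationQueue:
    """In-memory model of the alarm aggregation key store and its sorted set."""

    def __init__(self):
        self.values = {}     # alarm aggregation key -> packed value
        self.deadlines = {}  # alarm aggregation key -> end-of-wait timestamp

    def on_event(self, key: str, event: dict, wait_seconds: float, now: float) -> None:
        entry = self.values.get(key)
        if entry is None:
            # First trigger: store the value and register the deadline.
            self.values[key] = {"event": event, "trigger_count": 1}
            self.deadlines[key] = now + wait_seconds
        else:
            # Repeated trigger inside the wait: only the count changes.
            entry["trigger_count"] += 1

    def pop_due(self, now: float) -> list:
        """Return (key, value) pairs whose deadline is earlier than `now`, then delete them."""
        due = sorted((deadline, key) for key, deadline in self.deadlines.items()
                     if deadline < now)
        results = []
        for _, key in due:
            results.append((key, self.values.pop(key)))
            del self.deadlines[key]
        return results


queue = AggregationQueue()
queue.on_event("k1", {"name": "cpu_high"}, wait_seconds=60, now=0)
queue.on_event("k1", {"name": "cpu_high"}, wait_seconds=60, now=10)
queue.on_event("k2", {"name": "disk_full"}, wait_seconds=120, now=0)
ready = queue.pop_due(now=61)
```

At 61 s only `k1` is due, and its value reports two triggers; `k2` stays queued until its own deadline has passed. A consumer would call `pop_due` in a loop and hand each returned value on as alarm callback content.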
In this embodiment, the alarm callback configuration parameters may further include an alarm callback frequency, which indicates the number of alarm callbacks allowed for an alarm event within a preset time interval. Therefore, before the alarm callback content is called back, the alarm callback frequency of the alarm event needs to be checked. The preset time interval may be set manually based on experience.
In a specific embodiment, an alarm fingerprint carried in a tag of an alarm event is acquired; generating a second index of the alarm event based on the alarm fingerprint and the identifier of the alarm callback configuration parameter; and determining that the storage time of the second index in the cache does not exceed the expiration time of the second index in the cache, wherein the expiration time of the second index in the cache is set based on the alarm call-back frequency and the time when the alarm event is triggered for the first time.
It should be noted that the time at which the alarm event is first triggered here is the same as the time of first triggering described above for the alarm aggregation waiting time. The manner of acquiring it can therefore be found in the foregoing description and is not repeated here.
In practice, since the second index is usually deleted once its storage time in the cache exceeds its expiration time, whether the storage time has exceeded the expiration time can be determined by whether the second index is still stored in the cache. Specifically, when the second index is stored in the cache, the alarm frequency limit has not been lifted and no callback is currently required; when the second index is not stored in the cache, the alarm frequency limit has been lifted, so the second index is written into the cache and its expiration time is set.
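The presence check described in the preceding paragraph can be sketched as follows. The sketch is a simplification: it limits callbacks to one per interval, computes the expiry from the moment the second index is written, and uses an underscore separator in the key, all of which are assumptions for the example.

```python
class FrequencyLimiter:
    """In-memory stand-in for the second index and its expiration time."""

    def __init__(self):
        self._expires_at = {}  # second index -> expiry timestamp

    def allow_callback(self, fingerprint: str, config_id: str,
                       interval_seconds: float, now: float) -> bool:
        second_index = f"{fingerprint}_{config_id}"
        expires_at = self._expires_at.get(second_index)
        if expires_at is not None and now <= expires_at:
            # The second index is still cached: the limit has not been lifted.
            return False
        # Absent or expired: the limit is lifted, so record a new second index.
        self._expires_at[second_index] = now + interval_seconds
        return True


limiter = FrequencyLimiter()
first = limiter.allow_callback("ab12", "cfg7", interval_seconds=300, now=0)
second = limiter.allow_callback("ab12", "cfg7", interval_seconds=300, now=100)
third = limiter.allow_callback("ab12", "cfg7", interval_seconds=300, now=301)
```

The first call is allowed and starts the interval, the second falls inside it and is suppressed, and the third arrives after expiry and is allowed again.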
In this embodiment, the alarm callback configuration parameters may further include an alarm encryption mode, a callback failure retry parameter, and/or an alarm callback proxy parameter.
The alarm encryption mode is used for encrypting the alarm callback content when the alarm callback content is called back. The alarm encryption mode comprises parameters such as the encryption scheme and the encryption key used by that scheme. Illustratively, the encryption scheme includes, but is not limited to, a symmetric encryption scheme. It should be understood that when the alarm callback content is transmitted over https, or when the callback target address and the alarm callback platform both belong to the intranet, no additional encryption is required. To further ensure the security of the alarm callback content in practice, an md5 signature is computed over the alarm callback content in addition to encrypting it. After the callback target address receives the callback request, it decrypts the content with the key to obtain the original data before encryption, and then computes an md5 signature over the original data. If the two signatures are consistent, the current alarm callback content has not been tampered with, which ensures data security.
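The signature check on both sides can be sketched with the standard library. Only the md5 step is shown; the symmetric encryption and decryption steps are omitted because the description does not name a cipher. The JSON serialization and the function names are assumptions for the example.

```python
import hashlib
import hmac
import json


def md5_signature(payload: bytes) -> str:
    """Hex md5 digest of the original (unencrypted) callback content."""
    return hashlib.md5(payload).hexdigest()


def build_callback_body(content: dict):
    """Sender side: serialize the content deterministically and sign it."""
    payload = json.dumps(content, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return payload, md5_signature(payload)


def verify_callback_body(payload: bytes, signature: str) -> bool:
    """Receiver side: recompute the signature over the decrypted data and compare."""
    return hmac.compare_digest(md5_signature(payload), signature)


payload, signature = build_callback_body({"alarm": "disk_full", "trigger_count": 3})
untampered = verify_callback_body(payload, signature)
tampered = verify_callback_body(payload + b" ", signature)
```

Serializing with sorted keys matters here: both sides must hash exactly the same bytes, so any serialization that can reorder fields would make a valid payload fail verification.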
The callback failure retry parameter comprises a parameter indicating whether to retry and the number of retries. In practice, after the alarm callback platform calls back the alarm callback content to the target address, if no response from the target address is received within a specified time, or a response is received indicating that the target address failed to receive the alarm callback content, whether the alarm event needs to be called back again can be determined according to the callback failure retry parameter. Specifically, when the callback failure retry parameter indicates that a retry is required and the number of retries is not 0, the callback to the target address is made again.
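A minimal sketch of this retry decision is shown below. The `send` callable is a hypothetical stand-in for the actual call to the target address: it returns `True` on a success response, returns `False` on a failure response, and raises `TimeoutError` when no response arrives within the specified time.

```python
def callback_with_retry(send, content: dict, retry_enabled: bool, max_retries: int):
    """Call back once, then retry while allowed. Returns (succeeded, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            ok = send(content)
        except TimeoutError:
            ok = False  # no response within the specified time counts as a failure
        if ok:
            return True, attempts
        if not retry_enabled or attempts > max_retries:
            return False, attempts


def make_flaky_sender(failures_before_success: int):
    """Hypothetical sender that times out a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def send(_content: dict) -> bool:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TimeoutError("no response from target address")
        return True

    return send


result = callback_with_retry(make_flaky_sender(2), {"alarm": "disk_full"},
                             retry_enabled=True, max_retries=3)
```

Here the first two attempts time out and the third succeeds, so `result` is `(True, 3)`. With `max_retries=2` and a sender that always fails, the platform would stop after three attempts in total: the initial callback plus two retries.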
The alarm callback proxy parameter is used to indicate that the alarm callback content is to be called back to the target address through a proxy node. In practice, when the alarm callback platform and the target address are not in the same local area network, the alarm callback content is called back to the target address through the proxy node indicated in the alarm callback proxy parameter.
In this embodiment, in order to facilitate subsequent callback queries, a callback record is generated after the callback of the alarm callback content is completed, and the callback record is stored in the callback log record database. The callback record comprises data such as the alarm callback id, the target address, the number of successful alarm callbacks, the number of failed alarm callbacks, the result returned by the target address, the http status code returned by the target address, and the alarm callback mode.
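One possible shape for such a record is sketched below. The field names and example values are assumptions chosen to mirror the listed data items, and converting the record to a dictionary stands in for writing it to the callback log record database.

```python
from dataclasses import asdict, dataclass


@dataclass
class CallbackRecord:
    callback_id: str      # alarm callback id
    target_address: str   # callback target address
    success_count: int    # number of successful alarm callbacks
    failure_count: int    # number of failed alarm callbacks
    target_response: str  # result returned by the target address
    http_status: int      # http status code returned by the target address
    callback_mode: str    # alarm callback mode


def to_row(record: CallbackRecord) -> dict:
    """Flatten the record into a column-name to value mapping for storage."""
    return asdict(record)


row = to_row(CallbackRecord(
    callback_id="cb-0001",
    target_address="http://example.invalid/hook",
    success_count=1,
    failure_count=2,
    target_response="ok",
    http_status=200,
    callback_mode="aggregated",
))
```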
In this embodiment, the process by which the alarm callback platform calls back an alarm event to the target address may also be combined with different service systems to implement different scenarios, such as the following:
The platform may be integrated with an internal script platform or a host agent, so that a given script is triggered automatically when an alarm is triggered, thereby realizing an automatic operation and maintenance function. For example, a server disk alarm can trigger a host disk cleaning script, so that the disk is cleaned automatically when the alarm occurs.
The platform may also be combined with a service system. For example, when a certain service index reaches an alarm threshold, the service system is called back automatically, so as to dynamically start a certain activity or adjust parameters related to the service activity.
The platform may also be combined with an operation and maintenance management platform, namely a k8s management platform, to implement dynamic capacity expansion after an alarm, and the like.
In the technical solution provided by the embodiment of the present application, an alarm event from an alarm event monitoring system is acquired; an alarm callback configuration parameter corresponding to the alarm event is acquired, wherein the alarm callback configuration parameter comprises a parameter related to the alarm callback process of calling back the alarm event; alarm callback content corresponding to the alarm event is generated based on the alarm callback configuration parameter; and the alarm callback content is called back to the target address in the alarm callback configuration parameter. In the embodiment of the present application, the callback process of the alarm event is separated from the alarm event monitoring system, an independent alarm callback platform is provided to call back the alarm event, and the parameters related to the callback process are configured in the alarm callback platform. This solves the problem in the related art that, when the alarm event monitoring system itself calls back the alarm event, the callback configuration parameters it supports are limited, which greatly restricts practical enterprise applications.
Based on the same concept, an embodiment of the present application provides an alert callback device. For the specific implementation of the device, reference may be made to the description in the method embodiment section, and repeated details are not described again. As shown in fig. 3, the device includes:
a first obtaining unit 301, configured to obtain an alarm event from an alarm event monitoring system;
a second obtaining unit 302, configured to obtain an alarm callback configuration parameter corresponding to the alarm event, where the alarm callback configuration parameter includes a parameter related to the alarm callback process of calling back the alarm event;
a generating unit 303, configured to generate an alarm callback content corresponding to the alarm event based on the alarm callback configuration parameter;
and a callback unit 304, configured to call back the alarm callback content to the target address in the alarm callback configuration parameter.
Optionally, the second obtaining unit 302 is configured to:
acquiring a target alarm policy identifier carried in a label of the alarm event, wherein the alarm policy corresponding to the target alarm policy identifier is used for generating the alarm event;
searching an alarm callback configuration list for the target alarm callback configuration parameter whose alarm policy identifier is the target alarm policy identifier, wherein each alarm callback configuration parameter in the alarm callback configuration list comprises an identifier of an alarm policy;
and taking the target alarm callback configuration parameter as an alarm callback configuration parameter corresponding to the alarm event.
Optionally, each alarm callback configuration parameter in the alarm callback configuration list includes an alarm callback condition, and each alarm callback condition includes an identifier of an alarm policy;
the second obtaining unit 302 is configured to:
searching the alarm callback conditions for the target alarm callback condition whose alarm policy identifier is the target alarm policy identifier;
and taking the alarm callback configuration parameter corresponding to the target alarm callback condition as a target alarm callback configuration parameter.
Optionally, the target alarm callback condition further includes an alarm object and/or alarm content;
the alert callback device is further configured to:
before the alarm callback configuration parameter corresponding to the target alarm callback condition is used as the target alarm callback configuration parameter, determining that the alarm object is consistent with the alarm object of the alarm event, and/or determining that the alarm content is matched with the alarm content of the alarm event.
Optionally, the alarm callback configuration parameter includes an alarm aggregation waiting time, where the alarm aggregation waiting time is used to indicate the waiting time required before the alarm event is called back;
the alert callback device is further configured to:
determining that the alarm aggregation waiting time is over before the alarm callback content is called back to the target address in the alarm callback configuration parameter.
Optionally, the alert callback device is configured to:
acquiring the time at which the alarm event was first triggered by the alarm event monitoring system;
acquiring a timestamp generated based on the time of the first trigger and the alarm aggregation waiting time;
and when the timestamp is earlier than the current time, determining that the alarm aggregation waiting time is over.
Optionally, the alert callback device is configured to:
acquiring an alarm fingerprint carried in a label of an alarm event;
generating a first index of an alarm event based on the alarm fingerprint, the identifier of the alarm callback configuration parameter and a preset character string;
when the first index is stored in the cache, acquiring the time at which the first index was stored in the cache, and taking that time as the time at which the alarm event was first triggered;
and when the first index is not stored in the cache, storing the first index in the cache, and taking the time at which the first index is stored in the cache as the time at which the alarm event was first triggered.
Optionally, the preset string includes an identification of a queue storing the alarm event.
Optionally, the alert callback device is further configured to:
acquiring an alarm level carried in a label of an alarm event before generating a first index of the alarm event based on the alarm fingerprint, the identifier of the alarm callback configuration parameter and a preset character string;
and determining the identifier of the queue storing the alarm event based on the alarm level.
Optionally, the alert callback device is further configured to:
when the first index is not stored in the cache, setting the expiration time of the first index in the cache based on alarm aggregation waiting time and preset time after the first index is stored in the cache;
and when the storage time of the first index in the cache exceeds the expiration time, clearing the first index in the cache.
Optionally, the alert callback device is further configured to:
and after the content of the alarm callback is called back to the target address in the alarm callback configuration parameters, if the first index is still stored in the cache, deleting the first index from the cache.
Optionally, the generating unit 303 is configured to:
when the alarm aggregation waiting time is over, acquiring a first index value corresponding to a first index in a cache, wherein the first index value is generated based on an alarm event, an alarm callback configuration parameter and the triggering times of the alarm event, and the triggering times of the alarm event are the times of generating the first index in the alarm aggregation waiting time;
and taking the first index value as the content of the alarm callback.
Optionally, the alert callback device is further configured to:
when the first index and the first index value are stored in the cache, the first index and the first index value are written into the cache in an atomic writing mode according to the corresponding relation between the first index and the first index value.
Optionally, the alarm callback configuration parameters further include an alarm callback frequency, where the alarm callback frequency is used to indicate the number of alarm callbacks of the alarm event in a preset time interval;
Optionally, the alert callback device is further configured to:
after the alarm aggregation waiting time is determined to be over, and before the alarm callback content is called back to the target address in the alarm callback configuration parameter, acquiring the alarm fingerprint carried in the label of the alarm event;
generating a second index of the alarm event based on the alarm fingerprint and the identifier of the alarm callback configuration parameter;
and determining that the storage time of the second index in the cache does not exceed the expiration time of the second index in the cache, wherein the expiration time of the second index in the cache is set based on the alarm callback frequency and the time at which the alarm event was first triggered.
Optionally, the alert callback configuration parameters include at least one of:
an alarm encryption mode, used for encrypting the alarm callback content when the alarm callback content is called back;
a callback failure retry parameter, comprising a parameter indicating whether to retry and the number of retries;
and an alarm callback proxy parameter, used for indicating that the alarm callback content is called back to the target address through a proxy node.
In the technical solution provided by the embodiment of the present application, the alarm event from the alarm event monitoring system is acquired by the first obtaining unit, the alarm callback configuration parameter corresponding to the alarm event is acquired by the second obtaining unit, the alarm callback content corresponding to the alarm event is generated by the generating unit, and the alarm callback content is called back to the target address in the alarm callback configuration parameter by the callback unit. In the embodiment of the present application, the callback process of the alarm event is separated from the alarm event monitoring system, an independent alarm callback device is provided to call back the alarm event, and the parameters related to the callback process are configured in the alarm callback device. This solves the problem in the related art that, when the alarm event monitoring system itself calls back the alarm event, the callback configuration parameters it supports are limited, which greatly restricts practical enterprise applications.
Based on the same concept, an embodiment of the present application provides an alert callback system. For the specific implementation of the system, reference may be made to the description in the method embodiment section, and repeated details are not described again. As shown in fig. 4, the system includes:
an alarm event monitoring system 401, and an alarm callback platform 402 or an alarm callback device 402 in communication with the alarm event monitoring system;
an alarm event monitoring system 401 for triggering an alarm event;
the alarm callback platform 402 or the alarm callback device 402 is used for acquiring the alarm event from the alarm event monitoring system 401; acquiring an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to the alarm callback process of calling back the alarm event; generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameter; and calling back the alarm callback content to the target address in the alarm callback configuration parameter.
In the technical solution provided by the embodiment of the present application, the alarm event monitoring system triggers the alarm event, the alarm callback platform calls back the alarm event, and the parameters related to the callback process are configured in the alarm callback platform. This solves the problem in the related art that, when the alarm event monitoring system itself calls back the alarm event, the callback configuration parameters it supports are limited, which greatly restricts practical enterprise applications.
Based on the same concept, an embodiment of the present application further provides an electronic device, as shown in fig. 5, the electronic device mainly includes: a processor 501, a memory 502 and a communication bus 503, wherein the processor 501 and the memory 502 communicate with each other through the communication bus 503. The memory 502 stores a program executable by the processor 501, and the processor 501 executes the program stored in the memory 502, so as to implement the following steps:
acquiring an alarm event from an alarm event monitoring system;
acquiring an alarm callback configuration parameter corresponding to an alarm event, wherein the alarm callback configuration parameter comprises a parameter related to an alarm callback process for calling back the alarm event;
generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and calling back the alarm callback content to the target address in the alarm callback configuration parameter.
The communication bus 503 mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 503 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
The Memory 502 may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the aforementioned processor 501.
The Processor 501 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc., and may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic devices, discrete gates or transistor logic devices, and discrete hardware components.
In yet another embodiment of the present application, a computer-readable storage medium is further provided, in which a computer program is stored, and when the computer program runs on a computer, the computer program causes the computer to execute the alert callback method described in the above embodiment.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the alarm callback method provided in the above-mentioned various alternative implementations.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), among others.
It is noted that, in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (23)

1. An alarm callback method is characterized by comprising the following steps:
acquiring an alarm event from an alarm event monitoring system;
acquiring an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to an alarm callback process for calling back the alarm event;
generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and calling back the alarm callback content to a target address in the alarm callback configuration parameters.
2. The method of claim 1, wherein obtaining the alert callback configuration parameters corresponding to the alert event comprises:
acquiring a target alarm strategy identifier carried in a label of the alarm event, wherein an alarm strategy corresponding to the target alarm strategy identifier is used for generating the alarm event;
searching an alarm callback configuration list for a target alarm callback configuration parameter whose alarm strategy identifier is the target alarm strategy identifier, wherein each alarm callback configuration parameter in the alarm callback configuration list comprises an alarm strategy identifier;
and taking the target alarm callback configuration parameter as an alarm callback configuration parameter corresponding to the alarm event.
3. The method of claim 2, wherein each alarm callback configuration parameter in said alarm callback configuration list comprises an alarm callback condition, each said alarm callback condition comprising an identification of an alarm policy;
searching the alarm callback configuration list for the target alarm callback configuration parameter whose alarm policy identifier is the target alarm policy identifier includes:
searching for a target alarm callback condition with the identifier of the alarm strategy as the identifier of the target alarm strategy from each alarm callback condition;
and taking the alarm callback configuration parameter corresponding to the target alarm callback condition as the target alarm callback configuration parameter.
4. The method of claim 3, wherein the target alert callback condition further comprises an alert object and/or alert content;
before the alarm callback configuration parameter corresponding to the target alarm callback condition is used as the target alarm callback configuration parameter, the method further includes:
determining that the alarm object is consistent with the alarm object of the alarm event, and/or determining that the alarm content is matched with the alarm content of the alarm event.
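The lookup described in claims 2 to 4 can be sketched as follows. This is an illustration under stated assumptions only: the field names, the event layout, and the use of substring containment as the "content matching" test are hypothetical, since the claims do not fix how alarm content is matched.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CallbackCondition:
    policy_id: str
    alarm_object: Optional[str] = None     # None means "do not check the object"
    content_keyword: Optional[str] = None  # None means "do not check the content"


@dataclass
class CallbackConfig:
    config_id: str
    condition: CallbackCondition
    target_address: str


def find_config(event: dict, config_list: List[CallbackConfig]) -> Optional[CallbackConfig]:
    # The target policy identifier is carried in the label of the alarm event.
    policy_id = event["labels"]["policy_id"]
    for config in config_list:
        cond = config.condition
        if cond.policy_id != policy_id:
            continue
        # Optional checks on the alarm object and the alarm content.
        if cond.alarm_object is not None and cond.alarm_object != event["object"]:
            continue
        if cond.content_keyword is not None and cond.content_keyword not in event["content"]:
            continue
        return config
    return None
```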
5. The method of claim 1, wherein the alarm callback configuration parameters include an alarm aggregation waiting time indicating how long to wait before the alarm event is called back;
before the alarm callback content is called back to the target address in the alarm callback configuration parameters, the method further includes:
and determining that the alarm aggregation waiting time is over.
6. The method of claim 5, wherein determining that the alarm aggregation latency is over comprises:
acquiring the time of the first triggering of the alarm event by the alarm event monitoring system;
acquiring a timestamp generated based on the time of the first trigger and the alarm aggregation waiting time;
and when the time stamp is earlier than the current time, determining that the alarm aggregation waiting time is ended.
7. The method of claim 6, wherein obtaining the time at which the alarm event first triggered comprises:
acquiring an alarm fingerprint carried in a label of the alarm event;
generating a first index of the alarm event based on the alarm fingerprint, the identifier of the alarm callback configuration parameter and a preset character string;
when the first index is stored in the cache, acquiring the time at which the first index was first stored in the cache, and taking that time as the time at which the alarm event was first triggered;
and when the first index is not stored in the cache, storing the first index in the cache, and taking the time at which the first index is stored in the cache as the time at which the alarm event was first triggered.
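The aggregation wait of claims 5 to 7 can be sketched as follows. This is a minimal illustration, not the claimed implementation: a plain dictionary stands in for the cache, the index layout is hypothetical, and the timestamp is assumed to be the first trigger time plus the waiting time.

```python
from typing import Dict


def make_first_index(fingerprint: str, config_id: str, preset: str) -> str:
    # Hypothetical layout: preset string, configuration identifier, alarm fingerprint.
    return f"{preset}:{config_id}:{fingerprint}"


def first_trigger_time(cache: Dict[str, float], index: str, now: float) -> float:
    # If the index is already cached, return the time it was first stored;
    # otherwise store it now and treat `now` as the first trigger time.
    return cache.setdefault(index, now)


def wait_is_over(first_triggered: float, wait_seconds: float, now: float) -> bool:
    # Assumed timestamp: first trigger time plus the aggregation waiting time.
    deadline = first_triggered + wait_seconds
    return deadline < now
```

Repeated occurrences of the same alarm within the waiting time map to the same index, so they share one first trigger time and one deadline.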
8. The method of claim 7, wherein the predetermined string comprises an identification of a queue storing the alarm event.
9. The method of claim 8, wherein prior to generating the first index of alarm events based on the alarm fingerprint, the identification of the alarm callback configuration parameter, and a preset string, further comprising:
acquiring the alarm level carried in the label of the alarm event;
determining an identification of a queue storing the alarm event based on the alarm level.
10. The method of claim 7, wherein when the first index is not stored in the cache, after storing the first index into the cache, further comprising:
setting the expiration time of the first index in the cache based on the alarm aggregation waiting time and preset time;
when the storage time of the first index in the cache exceeds the expiration time, clearing the first index in the cache.
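The expiration handling of claim 10 can be sketched as follows. The sketch assumes that the expiration time is the sum of the alarm aggregation waiting time and the preset time, which the claim does not state explicitly; the class and method names are hypothetical.

```python
from typing import Dict, List, Tuple


class ExpiringIndexCache:
    """In-memory stand-in for the cache, with a per-index expiration time."""

    def __init__(self) -> None:
        # index -> (time stored, seconds until expiration)
        self._entries: Dict[str, Tuple[float, float]] = {}

    def store(self, index: str, now: float, wait_seconds: float, preset_seconds: float) -> None:
        # Assumption: expiration = aggregation waiting time + preset time.
        self._entries[index] = (now, wait_seconds + preset_seconds)

    def purge_expired(self, now: float) -> List[str]:
        # Clear every index whose storage time exceeds its expiration time.
        expired = [
            index
            for index, (stored_at, ttl) in self._entries.items()
            if now - stored_at > ttl
        ]
        for index in expired:
            del self._entries[index]
        return expired

    def __contains__(self, index: str) -> bool:
        return index in self._entries
```

Setting the expiration later than the waiting time leaves a margin in which the index can still be read when the wait ends, while ensuring that an index whose callback never happens does not stay in the cache indefinitely.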
11. The method of claim 10, wherein after calling back the alert callback content to the target address in the alert callback configuration parameters, the method further comprises:
and if the first index is still stored in the cache, deleting the first index from the cache.
12. The method of claim 7, wherein generating the alert callback content corresponding to the alert event based on the alert callback configuration parameters comprises:
acquiring a first index value corresponding to the first index in the cache when the alarm aggregation waiting time is over, wherein the first index value is generated based on the alarm event, the alarm callback configuration parameter and the triggering times of the alarm event, and the triggering times of the alarm event are the times of generating the first index in the alarm aggregation waiting time;
and taking the first index value as the alarm callback content.
13. The method according to claim 12, wherein when storing the first index and the first index value in the cache, the first index and the first index value are written to the cache in an atomic write manner and according to a correspondence relationship between the first index and the first index value.
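The index value of claims 12 and 13 can be sketched as follows. This is an illustration under stated assumptions: an in-process lock stands in for the atomic write of the cache, the value layout is hypothetical, and the sketch keeps the first event of the window while counting later triggers.

```python
import threading
from typing import Dict, Optional


class AggregationCache:
    """In-memory stand-in for the cache; a lock makes each update atomic."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: Dict[str, dict] = {}

    def record_trigger(self, index: str, event: dict, config_id: str) -> dict:
        # The index and its value are written together under one lock, so a
        # concurrent reader never observes an index without its value.
        with self._lock:
            value = self._values.get(index)
            if value is None:
                value = {"event": event, "config_id": config_id, "count": 0}
                self._values[index] = value
            value["count"] += 1  # number of triggers within the waiting time
            return dict(value)

    def take(self, index: str) -> Optional[dict]:
        # When the waiting time is over, read the value and remove the index.
        with self._lock:
            return self._values.pop(index, None)
```

The value returned by `take` can then serve as the alarm callback content, carrying both the event and the number of times it was triggered during the wait.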
14. The method of claim 7, wherein the alert callback configuration parameters further comprise an alert callback frequency indicating a number of alert callbacks for the alert event within a preset time interval;
after determining that the alarm aggregation waiting time is over and before calling back the alarm callback content to the target address in the alarm callback configuration parameters, the method further includes:
acquiring an alarm fingerprint carried in a label of the alarm event;
generating a second index of the alarm event based on the alarm fingerprint and the identifier of the alarm callback configuration parameter;
determining that the storage time of the second index in the cache does not exceed the expiration time of the second index in the cache, wherein the expiration time of the second index in the cache is set based on the alarm callback frequency and the time when the alarm event is triggered for the first time.
15. The method of claim 1, wherein the alert callback configuration parameters comprise at least one of:
an alarm encryption mode, used for encrypting the alarm callback content when the alarm callback content is called back;
a callback failure retry parameter, comprising a parameter indicating whether to retry and a number of retries;
and an alarm callback proxy parameter, used for indicating that the alarm callback content is called back to the target address through a proxy node.
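The callback failure retry parameter of claim 15 can be sketched as follows. The names `RetryParams` and `callback_with_retry` are hypothetical, and the sketch assumes that the retry number counts attempts made in addition to the first one.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RetryParams:
    retry: bool = True    # whether to retry after a failed callback
    max_retries: int = 3  # number of retries after the first attempt


def callback_with_retry(send: Callable[[], bool], params: RetryParams) -> Tuple[bool, int]:
    """Call `send` until it succeeds or the attempts run out.

    Returns (succeeded, number of attempts made).
    """
    attempts_allowed = 1 + (params.max_retries if params.retry else 0)
    for attempt in range(1, attempts_allowed + 1):
        if send():
            return True, attempt
    return False, attempts_allowed
```

A practical version would likely wait between attempts; the delay is omitted here because the claim names only the retry switch and the retry number.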
16. An alert callback platform, comprising:
an alarm processing component for receiving an alarm event from an alarm event monitoring system;
the alarm consumption component is used for acquiring alarm callback configuration parameters corresponding to the alarm event, and the alarm callback configuration parameters comprise parameters related to an alarm callback process for calling back the alarm event; generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and the alarm callback component is used for calling back the alarm callback content to a target address in the alarm callback configuration parameters.
17. The alert callback platform of claim 16, further comprising:
and the message queue is used for storing the alarm event according to the alarm level of the alarm event.
18. The alert callback platform of claim 17, wherein the message queue comprises a partition for storing the alert event and alert events having the same alert fingerprint as the alert event.
19. The alert callback platform of claim 17, wherein the alert consumption components comprise at least one consumption component, each of the consumption components corresponding to one of the message queues.
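The queue arrangement of claims 17 to 19 can be sketched as follows. The level names, queue names and partition count are hypothetical, and hashing the fingerprint is only one assumed way of keeping alarm events with the same fingerprint in the same partition.

```python
import zlib
from typing import Tuple

# Hypothetical mapping from alarm level to message queue identifier.
LEVEL_TO_QUEUE = {
    "critical": "alarm-high",
    "warning": "alarm-medium",
    "info": "alarm-low",
}


def route(event: dict, partitions_per_queue: int = 4) -> Tuple[str, int]:
    labels = event["labels"]
    # The queue is chosen by alarm level; unknown levels go to the lowest queue.
    queue = LEVEL_TO_QUEUE.get(labels["level"], "alarm-low")
    # A stable hash of the fingerprint sends identical alarms to one partition.
    digest = zlib.crc32(labels["fingerprint"].encode("utf-8"))
    return queue, digest % partitions_per_queue
```

`zlib.crc32` is used instead of the built-in `hash` because the latter is randomized per process for strings, which would scatter the same fingerprint across partitions on different producers.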
20. An alert callback device, comprising:
the first acquisition unit is used for acquiring an alarm event from an alarm event monitoring system;
a second obtaining unit, configured to obtain an alarm callback configuration parameter corresponding to the alarm event, where the alarm callback configuration parameter includes a parameter related to an alarm callback process for calling back the alarm event;
the generating unit is used for generating the alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters;
and the callback unit is used for calling back the alarm callback content to the target address in the alarm callback configuration parameters.
21. An alert callback system, comprising:
an alarm event monitoring system, and an alarm callback platform or an alarm callback device that communicates with the alarm event monitoring system;
the alarm event monitoring system is used for triggering an alarm event;
the alarm callback platform or the alarm callback device is used for acquiring an alarm event from an alarm event monitoring system; acquiring an alarm callback configuration parameter corresponding to the alarm event, wherein the alarm callback configuration parameter comprises a parameter related to an alarm callback process for calling back the alarm event; generating alarm callback content corresponding to the alarm event based on the alarm callback configuration parameters; and calling back the alarm callback content to a target address in the alarm callback configuration parameters.
22. An electronic device, comprising: a processor, a memory and a communication bus, wherein the processor and the memory communicate with each other through the communication bus;
the memory for storing a computer program;
the processor, configured to execute the program stored in the memory, and implement the alert callback method according to any one of claims 1 to 15.
23. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the alert callback method of any of claims 1-15.
CN202111177540.3A 2021-10-09 2021-10-09 Alarm callback method, platform, system, device, equipment and storage medium Pending CN114020558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111177540.3A CN114020558A (en) 2021-10-09 2021-10-09 Alarm callback method, platform, system, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114020558A true CN114020558A (en) 2022-02-08

Family

ID=80055841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111177540.3A Pending CN114020558A (en) 2021-10-09 2021-10-09 Alarm callback method, platform, system, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114020558A (en)

Legal Events

Date Code Title Description
PB01 Publication