CN118069450A - Alarm processing method, device and storage medium - Google Patents


Publication number
CN118069450A
Authority
CN
China
Prior art keywords
service
micro
monomer
alarm
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211415239.6A
Other languages
Chinese (zh)
Inventor
李亚茹
庄孺义
邱璐
李明亮
闫颖莹
邓欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN202211415239.6A
Publication of CN118069450A
Legal status: Pending

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The application provides an alarm processing method, an alarm processing device and a storage medium in the field of computer technology. The method comprises: acquiring execution information of a plurality of service processing dimensions of each micro service monomer in a micro service system within a first preset time period, wherein the execution information of a service processing dimension comprises a plurality of execution parameters of service processing; if the execution parameters of a micro service monomer meet the alarm condition of at least one service processing dimension, determining that the micro service monomer has failed; and sending an alarm message of the determined faulty micro service monomer to a maintenance platform corresponding to the micro service system, wherein the alarm message contains first-class abnormal execution parameters, which are the execution parameters that meet the alarm condition. The method solves the problem that existing fault monitoring methods generate a large number of alarm messages in a short time.

Description

Alarm processing method, device and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an alarm processing method, an alarm processing device, and a storage medium.
Background
A micro service system is easy to manage, deploy and scale. As shown in fig. 1, the business system of an operator is a micro service system composed of a plurality of micro service monomers 12. Each micro service monomer 12 has its own independent service function. The micro service monomers 12 call one another according to service logic to complete the processing of the service corresponding to that logic.
To keep services running normally, the monitoring device 11 performs fault monitoring and alarming on the micro service system. Specifically, the monitoring device 11 monitors the execution parameters with which each micro service monomer 12 in the micro service system processes services. If the monitoring device 11 determines that an execution parameter of a micro service monomer 12 exceeds the corresponding parameter threshold within a preset time period, it determines that the micro service monomer 12 has failed and generates an alarm message for the faulty micro service monomer 12. The monitoring device 11 sends the alarm message to the maintenance platform 13 corresponding to the service system, so that maintenance personnel can troubleshoot the fault according to the alarm message on the maintenance platform 13.
When the existing fault monitoring method is used to monitor a micro service system, a large number of alarm messages are often generated in a short time. Maintenance personnel then spend a great deal of time on troubleshooting, which in turn delays fault handling in the micro service system.
Disclosure of Invention
The application provides an alarm processing method, an alarm processing device and a storage medium, which are used for solving the problem that a large number of alarm messages are generated in a short time in the existing fault monitoring method.
In a first aspect, the present application provides an alarm processing method, including:
acquiring execution information of a plurality of service processing dimensions of each micro service monomer in a micro service system within a first preset time period, wherein the execution information of a service processing dimension comprises a plurality of execution parameters of service processing;
if the execution parameters of a micro service monomer meet the alarm condition of at least one service processing dimension, determining that the micro service monomer has failed;
sending an alarm message of the determined faulty micro service monomer to a maintenance platform corresponding to the micro service system, wherein the alarm message contains first-class abnormal execution parameters, which are the execution parameters that meet the alarm condition.
Optionally, the service processing dimension is any one of a service call dimension, a function execution dimension, an interface call dimension and an instance configuration dimension of the micro service monomer;
the determining that the micro service monomer has failed if its execution parameters meet the alarm condition of at least one service processing dimension includes:
if the execution parameters of the micro service monomer meet the alarm condition of at least one of the service call dimension, the function execution dimension, the interface call dimension and the instance configuration dimension, determining that the micro service monomer has failed.
Optionally, the sending an alarm message of the determined faulty micro service monomer to the maintenance platform corresponding to the micro service system includes:
acquiring a system call chain representing the service call relationships between the micro service monomers in the micro service system;
based on the system call chain, determining, among the plurality of faulty micro service monomers of the micro service system, the faulty micro service monomers that are adjacent and related to one another as first faulty micro service monomers, and determining the faulty micro service monomers that are not first faulty micro service monomers as second faulty micro service monomers;
determining an initial faulty micro service monomer from the plurality of first faulty micro service monomers based on the system call chain;
sending the alarm messages of the initial faulty micro service monomer and of the second faulty micro service monomers to the maintenance platform corresponding to the micro service system.
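The call-chain filtering described above can be sketched as follows. This is only an illustration under stated assumptions: the function name select_alarm_sources and the dictionary representation of the call chain are hypothetical, "adjacent and related" is read as "connected by a direct call in the chain", and the initial faulty monomer of a group is assumed to be the member that calls no other faulty member of that group. The text above does not fix these details.

```python
def select_alarm_sources(call_chain, faulty):
    """Split faulty monomers into (initial, second) sets.

    call_chain: dict mapping a caller id to the set of callee ids (assumed shape).
    faulty:     set of ids of monomers already determined to be faulty.
    """
    # Undirected adjacency between faulty monomers that call one another.
    neighbours = {m: set() for m in faulty}
    for caller, callees in call_chain.items():
        for callee in callees:
            if caller in faulty and callee in faulty and caller != callee:
                neighbours[caller].add(callee)
                neighbours[callee].add(caller)

    initial, second, seen = set(), set(), set()
    for start in sorted(faulty):
        if start in seen:
            continue
        # Collect the group of faulty monomers connected to `start`.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(neighbours[node] - group)
        seen |= group
        if len(group) == 1:
            # No faulty neighbour: a "second" faulty monomer, always reported.
            second |= group
            continue
        # Assumption: the initial fault is the member that calls no other
        # faulty member of its group; fall back to the whole group on cycles.
        roots = {m for m in group
                 if not (set(call_chain.get(m, ())) & (group - {m}))}
        initial |= roots or group
    return initial, second
```

With the chain A calls B, B calls C, D calls E and the faulty set {A, B, C, E}, this sketch reports only C and E, so two alarm messages are sent instead of four.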
Optionally, the alarm message further includes second-class abnormal execution parameters; a second-class abnormal execution parameter is an execution parameter that does not meet the alarm condition of the service processing dimension to which it belongs but exceeds the corresponding preset parameter threshold;
the sending an alarm message of the determined faulty micro service monomer to the maintenance platform corresponding to the micro service system includes:
sending the alarm message of the faulty micro service monomer, containing the first-class abnormal execution parameters and the second-class abnormal execution parameters, to the maintenance platform corresponding to the micro service system.
Optionally, after the determining that the micro service monomer has failed, the method further includes:
if the number of times a specified micro service monomer fails within a second preset time period exceeds a frequency threshold, determining to send the alarm message of the specified micro service monomer to the maintenance platform corresponding to the micro service system.
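A minimal sketch of this frequency gate, assuming a sliding window over fault timestamps. The class name FaultFrequencyGate and the use of seconds as the time unit are hypothetical choices for illustration, not part of the application.

```python
from collections import deque


class FaultFrequencyGate:
    """Decide whether a monomer's alarm should be sent, based on fault frequency."""

    def __init__(self, window_s, threshold):
        self.window_s = window_s      # length of the second preset time period
        self.threshold = threshold    # frequency threshold
        self._times = deque()         # timestamps of recent faults

    def record_fault(self, now_s):
        """Record a fault at time now_s; return True if the alarm should be sent."""
        self._times.append(now_s)
        # Drop faults that fall outside the window ending at now_s.
        while self._times and now_s - self._times[0] >= self.window_s:
            self._times.popleft()
        # "Exceeds" is read strictly: the count must be greater than the threshold.
        return len(self._times) > self.threshold
```

One gate instance would be kept per specified micro service monomer, so that an occasional isolated fault does not reach the maintenance platform.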
Optionally, the sending an alarm message of the determined faulty micro service monomer to the maintenance platform corresponding to the micro service system includes:
determining the number of services associated with the faulty micro service monomer;
marking the alarm message of the faulty micro service monomer with an alarm level based on the correspondence between the number of services and the alarm level;
sending the alarm message marked with the alarm level to the maintenance platform corresponding to the micro service system.
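The level marking can be sketched as a lookup over service-count bands. The band boundaries (5 and 20) and the level names below are hypothetical; the application only states that a correspondence between the number of services and the alarm level exists.

```python
import bisect

# Hypothetical correspondence: fewer than 5 associated services -> "minor",
# 5 to 19 -> "major", 20 or more -> "critical".
LEVEL_BOUNDS = [5, 20]
LEVELS = ["minor", "major", "critical"]


def alarm_level(service_count):
    """Map the number of services associated with a faulty monomer to a level."""
    return LEVELS[bisect.bisect_right(LEVEL_BOUNDS, service_count)]


def mark_alarm(alarm, service_count):
    """Return a copy of the alarm message marked with its alarm level."""
    return {**alarm, "level": alarm_level(service_count)}
```

Keeping the correspondence in a data table rather than in branching code makes it straightforward to configure per micro service system.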
Optionally, after the determining that the micro service monomer has failed, the method further includes:
sending a system alarm message of the micro service system to the maintenance platform corresponding to the micro service system, wherein the system alarm message comprises the first-class abnormal execution parameters, which meet the alarm conditions, of each faulty micro service monomer in the micro service system.
Optionally, the sending an alarm message of the determined faulty micro service monomer to the maintenance platform corresponding to the micro service system includes:
if the time at which the faulty micro service monomer failed does not fall within a third preset time period, marking the alarm message of the faulty micro service monomer as to be sent;
after detecting that a timer has reached the starting time point of the third preset time period of the next cycle, sending the alarm messages marked as to be sent to the maintenance platform corresponding to the micro service system, wherein the expiry time point of the timer is the starting time point of the third preset time period.
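The deferred sending described above can be sketched as a small queue. The class name DeferredSender, the in-memory sent list that stands in for delivery to the maintenance platform, and the assumption that the third preset time period does not span midnight are all illustrative choices.

```python
from datetime import datetime, time


class DeferredSender:
    """Hold alarms raised outside the sending window until the window opens."""

    def __init__(self, start, end):
        self.start, self.end = start, end   # third preset time period (same day)
        self.pending = []                   # alarms marked "to be sent"
        self.sent = []                      # stand-in for the maintenance platform

    def in_window(self, moment):
        return self.start <= moment.time() < self.end

    def submit(self, alarm, fault_time):
        """Send immediately if the fault occurred inside the window, else defer."""
        if self.in_window(fault_time):
            self.sent.append(alarm)
        else:
            self.pending.append(alarm)

    def on_timer(self, now):
        """Called when the timer expires at the start of the next window."""
        if self.in_window(now):
            self.sent.extend(self.pending)
            self.pending.clear()
```

For example, with a window of 08:00 to 20:00, an alarm for a fault at 23:00 is held and delivered when the timer fires at 08:00 the next day.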
In a second aspect, the present application provides an alarm processing apparatus, the apparatus comprising:
a transceiver module and a processing module;
the transceiver module is used for acquiring execution information of a plurality of service processing dimensions of each micro service monomer in the micro service system within a first preset time period, wherein the execution information of a service processing dimension comprises a plurality of execution parameters of service processing;
the processing module is used for determining that a micro service monomer has failed if the execution parameters of the micro service monomer meet the alarm condition of at least one service processing dimension;
the transceiver module is further used for sending an alarm message of the determined faulty micro service monomer to a maintenance platform corresponding to the micro service system, wherein the alarm message contains first-class abnormal execution parameters, which are the execution parameters that meet the alarm condition.
In a third aspect, the present application provides an alarm processing apparatus, the apparatus comprising:
A processor and a memory;
The memory stores executable instructions executable by the processor;
Wherein the processor executes the executable instructions stored by the memory, causing the processor to perform the method as described above.
In a fourth aspect, the present application provides a storage medium having stored therein computer-executable instructions for performing the method as described above when executed by a processor.
According to the alarm processing method, the alarm processing device and the storage medium provided by the application, a micro service monomer in the micro service system is determined to be faulty within the first preset time period when its execution parameters meet the alarm condition of at least one service processing dimension, and the alarm message of the determined faulty micro service monomer is sent to the maintenance platform for fault handling. Because alarms are determined per service processing dimension rather than per execution parameter, the number of alarm messages is greatly reduced. The alarm message also makes clear in which service processing dimension the micro service monomer failed, which helps maintenance personnel troubleshoot and improves the timeliness of fault handling in the micro service system. The application thereby solves the problem that existing fault monitoring methods generate a large number of alarm messages in a short time.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a diagram of a prior art fault monitoring system architecture;
FIG. 2 is a diagram of an alarm processing system according to an embodiment of the present application;
FIG. 3 is a flowchart of an alarm processing method provided in an embodiment of the present application;
FIG. 4 is a diagram of an alarm processing device according to an embodiment of the present application;
FIG. 5 is a structural diagram of an alarm processing device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
As shown in fig. 1, the service system of the operator (e.g., cbss2.0 system) is a micro service system composed of a plurality of micro service monomers 12. The micro service monomers 12 are, for example, the micro service monomers A, B, C, D, E and F shown in fig. 1. The monitoring device 11 monitors a plurality of execution parameters while each micro service monomer 12 processes services. If the monitoring device 11 determines that an execution parameter of a micro service monomer 12 exceeds the corresponding parameter threshold within the preset time period, it determines that the micro service monomer 12 has failed and generates an alarm message for the faulty micro service monomer 12. Each execution parameter that exceeds its parameter threshold produces one alarm message. The execution parameters include: the number of times the micro service monomer 12 calls an associated micro service monomer 12, the success rate of those calls, the number of times the micro service monomer 12 fails to execute a service function, the execution duration of stack memory reclamation by the micro service monomer 12, and the like. The monitoring device 11 sends the alarm messages to the maintenance platform 13 corresponding to the micro service system, so that maintenance personnel can troubleshoot the fault according to the alarm messages on the maintenance platform 13.
Processing one service involves a plurality of micro service monomers 12 that call one another, so the failure of one micro service monomer 12 causes the other micro service monomers 12 associated with that service to fail as well. In addition, each micro service monomer 12 has a plurality of execution parameters that are correlated with one another: when one execution parameter causes the micro service monomer 12 to fail, other execution parameters of the same micro service monomer 12 may also exceed their parameter thresholds. Consequently, when the existing fault monitoring method is used to monitor a micro service system, a large number of alarm messages are often generated in a short time. This makes troubleshooting harder and more time-consuming for maintenance personnel, and in turn delays fault handling in the micro service system.
The execution parameters of a micro service monomer are correlated within each service processing dimension. If the execution parameters are classified by service processing dimension and alarms are determined per dimension, the number of alarm messages can be reduced and troubleshooting becomes easier. To this end, the application provides an alarm processing method that: acquires execution information of a plurality of service processing dimensions of each micro service monomer in a micro service system within a first preset time period, the execution information of a service processing dimension comprising a plurality of execution parameters of service processing; determines that a micro service monomer has failed if its execution parameters meet the alarm condition of at least one service processing dimension; and sends the alarm message of the determined faulty micro service monomer to the maintenance platform corresponding to the micro service system, the alarm message containing first-class abnormal execution parameters, which are the execution parameters that meet the alarm condition. By classifying the execution parameters of each micro service monomer by service processing dimension and determining alarms per dimension, the application reduces the number of alarm messages and solves the problem that the existing fault monitoring method generates a large number of alarm messages in a short time. Because the fault is determined per service processing dimension, maintenance personnel can troubleshoot the faulty micro service monomer more easily, which improves the timeliness of fault handling in the micro service system.
The alarm processing method provided by the application is described below with reference to some embodiments.
FIG. 2 is a diagram of an alarm processing system according to an embodiment of the present application. As shown in fig. 2, the system includes: an alarm processing device 21, a micro service system composed of a plurality of micro service monomers 12, a maintenance platform 13, and a client 14. A micro service monomer 12 may be a server or another device capable of processing service functions. The micro service monomers 12 are shown in fig. 2 as the micro service monomers A, B, C, D, E and F. The client 14 receives a service request entered by a user and sends it to the micro service system through a gateway interface on the client 14. Based on the service logic corresponding to the received service request, the micro service monomers 12 in the micro service system call the micro service monomers 12 associated with that service logic to complete the processing of the requested service.
To ensure the normal operation of the micro service system, the alarm processing device 21 performs alarm processing on the micro service system as follows. The alarm processing device 21 acquires execution information of a plurality of service processing dimensions of each micro service monomer 12 in the micro service system within a first preset time period. The execution information of a service processing dimension includes a plurality of execution parameters of service processing. If the execution parameters of a micro service monomer 12 meet the alarm condition of at least one service processing dimension, the alarm processing device 21 determines that the micro service monomer 12 has failed. The alarm processing device 21 sends the alarm message of the determined faulty micro service monomer 12 to the maintenance platform 13 corresponding to the micro service system. The alarm message contains first-class abnormal execution parameters, which are the execution parameters that meet the alarm condition. The first preset time period may be 1 minute.
The service processing dimension may be any one of a service call dimension, a function execution dimension, an interface call dimension, and an instance configuration dimension of the micro service monomer 12. The execution parameters of the service call dimension include: the call volume, failure rate, timeout volume, timeout rate, etc. with which the micro service monomer 12 calls the associated micro service monomers 12 within the first preset time period. The execution parameters of the function execution dimension include: the number of times the micro service monomer 12 executes the service function within the first preset time period, and the execution success rate. The execution parameters of the interface call dimension include: the call volume, failure rate, timeout volume, timeout rate, etc. with which each interface of the micro service monomer 12 is called. An interface is, for example, an API on the micro service monomer 12; typically, a micro service monomer 12 provides a plurality of interfaces for different kinds of service processing. The execution parameters of the instance configuration dimension include: the execution duration of stack memory reclamation by the micro service monomer 12, the stack memory utilization rate of the micro service monomer 12, and the like.
Illustratively, the alert processing apparatus 21 may acquire the execution parameters of the multiple service processing dimensions of each micro service monomer 12 in the micro service system within the first preset period of time by:
Assume that the micro service monomer A calls the micro service monomer B. The micro service monomer A sends the call request to the micro service monomer B through one interface of the micro service monomer B (e.g., interface B1). The micro service monomer A receives the call response message returned by the micro service monomer B through the interface B1. The call response message includes an identifier indicating either a call acknowledgement or a call negative acknowledgement.
The alarm processing device 21 may acquire, through a probe of the alarm processing device 21, the service call state information of the micro service monomer A calling the associated micro service monomer B. The service call state information comprises the call request, the call response message and the call time consumption. The call time consumption is the time from the micro service monomer A sending the call request to its receiving the call response message, or the time from the micro service monomer B receiving the call request to its returning the call response message. The call request includes the identifier of the micro service monomer A and the identifier of the micro service monomer B. The call response message also includes the identifier of the micro service monomer A and the identifier of the interface B1. One piece of service call state information represents one call. Alternatively, the alarm processing device 21 may acquire the service call state information of each micro service monomer 12 of the micro service system within the first preset time period from the micro service monomer 12 used for data aggregation and storage in the micro service system (e.g., the micro service monomer F shown in fig. 2). The service call state information may be stored in a columnar storage database module (ClickHouse) of the micro service monomer F. The micro service monomer F collects and stores the data generated by each micro service monomer 12 that executes service functions in the micro service system. The micro service monomers 12 that execute service functions are shown in fig. 2 as the micro service monomers A, B, C, D and E.
The alarm processing device 21 performs statistical calculation on the service call state information of the micro service system acquired in the first preset time period to obtain the following execution parameters of a specified micro service monomer 12 (such as the micro service monomer A) calling the associated micro service monomers 12 within the first preset time period: the number of calls Mz, the number of calls Mf whose response contains the negative acknowledgement identifier, the number of timeout calls Mc whose call time consumption exceeds the corresponding time-consuming threshold, and the timeout duration Ti of each timeout call. The alarm processing device 21 determines the failure rate Kf with which the specified micro service monomer 12 calls the associated micro service monomers 12 within the first preset time period using the formula Kf = Mf / Mz, and determines the corresponding timeout rate Kc using the formula Kc = Mc / Mz. The alarm processing device 21 sums the timeout durations Ti of all timeout calls of the specified micro service monomer 12 within the first preset time period to obtain the timeout amount Tc. The alarm processing device 21 thereby obtains the following execution parameters of the service call dimension of the specified micro service monomer 12 within the first preset time period: service call amount Mz, service failure amount Mf, service failure rate Kf, service timeout amount Tc, and service timeout rate Kc.
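The statistics of the service call dimension can be illustrated with a short sketch. The CallRecord fields and the function name are hypothetical, a single time-consuming threshold is assumed for all calls, and the timeout duration Ti of a call is assumed to be the amount by which its call time consumption exceeds that threshold; the text does not state whether Ti is the excess or the full duration.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """One piece of service call state information (assumed fields)."""
    caller: str         # identifier of the calling micro service monomer
    callee: str         # identifier of the called micro service monomer
    interface: str      # identifier of the called interface, e.g. "B1"
    acknowledged: bool  # False when the response carries a negative acknowledgement
    elapsed_ms: float   # call time consumption


def service_call_parameters(records, caller, threshold_ms):
    """Compute Mz, Mf, Kf, Mc, Kc and Tc for one caller over one time window."""
    own = [r for r in records if r.caller == caller]
    mz = len(own)                                       # number of calls
    mf = sum(1 for r in own if not r.acknowledged)      # failed calls
    overruns = [r.elapsed_ms - threshold_ms
                for r in own if r.elapsed_ms > threshold_ms]
    mc = len(overruns)                                  # timeout calls
    tc = sum(overruns)                                  # sum of Ti (assumed: excess time)
    kf = mf / mz if mz else 0.0                         # Kf = Mf / Mz
    kc = mc / mz if mz else 0.0                         # Kc = Mc / Mz
    return {"Mz": mz, "Mf": mf, "Kf": kf, "Mc": mc, "Kc": kc, "Tc": tc}
```

The guard for Mz equal to zero is an addition of this sketch, so that a monomer that made no calls in the window yields rates of zero rather than a division error.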
Similarly, based on the interface identifier of a specified interface (such as the interface B1), the alarm processing device 21 performs statistical calculation on the service call state information of the micro service system acquired in the first preset time period to obtain the following execution parameters of the specified interface being called within the first preset time period: the number of calls Mzj, the number of calls Mfj whose response contains the negative acknowledgement identifier, the number of timeout calls Mcj whose call time consumption exceeds the corresponding time-consuming threshold, and the timeout duration Tij of each timeout call. The alarm processing device 21 determines the failure rate Kfj with which the specified interface is called within the first preset time period using the formula Kfj = Mfj / Mzj, and determines the corresponding timeout rate Kcj using the formula Kcj = Mcj / Mzj. The alarm processing device 21 sums the timeout durations Tij of all timeout calls of the specified interface within the first preset time period to obtain the timeout amount Tcj of the specified interface. The alarm processing device 21 thereby obtains the following execution parameters of the interface call dimension of the specified interface of the micro service monomer 12 within the first preset time period: interface call amount Mzj, interface failure amount Mfj, interface failure rate Kfj, interface timeout amount Tcj, and interface timeout rate Kcj. Alternatively, the alarm processing device 21 may acquire the correspondence between interfaces and micro service monomers 12 from a remote dictionary service module (Redis) of the micro service monomer F. Based on the acquired correspondence, the alarm processing device 21 performs statistical calculation on the service call state information of the micro service system acquired in the first preset time period to obtain the execution parameters of each interface being called within the first preset time period.
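The per-interface statistics can be sketched in one pass over the call records. The tuple layout of the records and the function name are hypothetical, and, as an assumption of this sketch, Tij is taken to be the time by which a call exceeds the time-consuming threshold.

```python
from collections import defaultdict


def interface_call_parameters(records, threshold_ms):
    """Aggregate call records per interface.

    records: iterable of (interface_id, acknowledged, elapsed_ms) tuples (assumed shape).
    Returns a dict mapping interface_id to Mzj, Mfj, Mcj, Tcj, Kfj and Kcj.
    """
    stats = defaultdict(lambda: {"Mzj": 0, "Mfj": 0, "Mcj": 0, "Tcj": 0.0})
    for interface_id, acknowledged, elapsed_ms in records:
        s = stats[interface_id]
        s["Mzj"] += 1
        if not acknowledged:
            s["Mfj"] += 1
        if elapsed_ms > threshold_ms:
            s["Mcj"] += 1
            s["Tcj"] += elapsed_ms - threshold_ms   # assumed meaning of Tij
    for s in stats.values():
        s["Kfj"] = s["Mfj"] / s["Mzj"]              # Kfj = Mfj / Mzj
        s["Kcj"] = s["Mcj"] / s["Mzj"]              # Kcj = Mcj / Mzj
    return dict(stats)
```

Every interface that appears in the result has at least one call, so the divisions are safe. The resulting keys could then be joined with the interface-to-monomer correspondence to attribute each interface to its micro service monomer.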
After receiving a call request or a service request, the micro service monomer 12 executes the corresponding service function and, once execution is complete, returns execution result information to the device that sent the request. The execution result information includes an execution acknowledgement or negative acknowledgement identifier. Similarly, the alarm processing device 21 may acquire, through the probe of the alarm processing device 21, the execution result information of each micro service monomer 12 in the micro service system within the first preset time period, and perform statistical calculation on it to obtain the following execution parameters of the function execution dimension of the specified micro service monomer 12: the number of executions of the service function and the execution success rate.
The alarm processing device 21 may directly obtain, from the specified micro service monomer 12, the execution duration of the stack memory recovery performed by the specified micro service monomer 12 in the first preset time period and the stack memory usage rate of the micro service monomer 12 through the probe of the alarm processing device 21, so as to obtain the execution parameters of the instance configuration dimensions of each micro service monomer 12 in the micro service system in the first preset time period.
Illustratively, if the execution parameters of the micro-service monomer 12 satisfy the alarm condition of at least one of the service processing dimensions as shown in table 1, the alarm processing device 21 determines that the micro-service monomer 12 is malfunctioning.
Table 1. Alarm conditions for each service processing dimension
Optionally, the alarm conditions of each service processing dimension may be configured according to the requirements of different micro service systems, and the alarm conditions for each micro service system may be stored in the micro service monomer 12 used for data aggregation and storage in that micro service system (e.g., the micro service monomer F shown in fig. 2). For example, the alarm conditions of the service processing dimensions may be stored in the relational database management system module (MySQL) of the micro service monomer F.
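How configurable, per-dimension alarm conditions lead to a single alarm message per faulty monomer can be sketched as follows. All parameter names and threshold values here are hypothetical and are not taken from Table 1; a condition is assumed to be met when every parameter listed for the dimension reaches its limit.

```python
# Hypothetical alarm conditions, keyed by service processing dimension.
EXAMPLE_CONDITIONS = {
    "service_call": {"Mz": 100, "Kf": 0.05},
    "function_execution": {"failure_rate": 0.10},
    "interface_call": {"Mzj": 100, "Kcj": 0.10},
    "instance_configuration": {"stack_memory_usage": 0.90},
}


def build_alarm(monomer_id, execution_info, conditions):
    """Return one alarm message for the monomer, or None if it is healthy.

    execution_info: dict mapping a dimension name to its execution parameters.
    """
    abnormal = {}
    for dimension, limits in conditions.items():
        params = execution_info.get(dimension, {})
        # Assumed rule: all listed parameters must reach their limits.
        if limits and all(params.get(name, 0) >= limit
                          for name, limit in limits.items()):
            # First-class abnormal parameters: those that met the condition.
            abnormal[dimension] = {name: params[name] for name in limits}
    if not abnormal:
        return None
    return {"monomer": monomer_id, "abnormal_parameters": abnormal}
```

However many parameters of a dimension are abnormal, the monomer yields at most one message, which is the source of the reduction in alarm volume described in this embodiment.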
If the alarm processing device 21 determines that the micro service unit 12 fails, the alarm processing device 21 sends an alarm message of the determined failed micro service unit 12 to the maintenance platform 13 corresponding to the micro service system, so that maintenance personnel can maintain the failed micro service unit 12 in the micro service system. For example, maintenance personnel may remotely troubleshoot and maintain the faulty micro-service unit 12 via the maintenance platform 13.
In the alarm processing method provided by the application, a micro service monomer in the micro service system is determined to be faulty within the first preset time period when its execution parameters meet the alarm condition of at least one service processing dimension. Because the fault is determined from the plurality of execution parameters of a service processing dimension, and only the alarm message of the determined faulty micro service monomer is sent to the maintenance platform for fault handling, the number of alarm messages is greatly reduced, which solves the problem that the existing fault monitoring method generates a large number of alarm messages in a short time. In addition, because the fault is determined per service processing dimension, the dimension in which the micro service monomer failed is known, which helps maintenance personnel troubleshoot and improves the timeliness of fault handling in the micro service system.
The following describes the alarm processing method provided by the present application in detail with reference to fig. 3. FIG. 3 is a flowchart of an alarm processing according to an embodiment of the present application. The embodiment shown in fig. 3 is an alarm processing device 21 in the embodiment shown in fig. 2. As shown in fig. 3, the method includes:
S101, acquiring execution information of a plurality of service processing dimensions of each micro service monomer in a micro service system within a first preset time period; the execution information of the business process dimension includes a plurality of execution parameters of the business process.
Specifically, the alarm processing device 21 acquires execution information of a plurality of service processing dimensions of each micro service monomer in the micro service system within a first preset time period. The execution information of a service processing dimension includes a plurality of execution parameters of service processing. For the specific manner in which the alarm processing device 21 acquires this execution information from each micro service unit 12 within the first preset time period, refer to the embodiment shown in fig. 2; it is not described again here.
S102, if the execution parameters of the micro service monomer meet the alarm condition of at least one service processing dimension, determining that the micro service monomer fails.
Specifically, if the execution parameters of the micro service monomer satisfy the alarm condition of at least one service processing dimension, the alarm processing device 21 determines that the micro service monomer 12 fails.
Optionally, the service processing dimension is any one of a service call dimension, a function execution dimension, an interface call dimension, and an instance configuration dimension of the micro-service monomer 12. If the execution parameters of the micro service unit 12 satisfy the alarm condition of at least one of the service call dimension, the function execution dimension, the interface call dimension, and the instance configuration dimension, the alarm processing device 21 determines that the micro service unit 12 fails. The respective alarm conditions of the service call dimension, the function execution dimension, the interface call dimension, and the instance configuration dimension may be the alarm conditions shown in table 1.
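As an illustration of step S102, the following is a minimal sketch, not the implementation of the application. The parameter names, the threshold values, and the rule that a dimension alarms only when every listed parameter exceeds its threshold are all assumptions made for this example; the actual alarm conditions are those of table 1.

```python
# Hypothetical alarm conditions per service processing dimension.
# All parameter names and threshold values are illustrative assumptions.
ALARM_CONDITIONS = {
    "service_call": {"service_call_volume": 1000, "service_failure_rate": 0.05},
    "function_execution": {"function_error_count": 20, "function_avg_latency_ms": 800},
    "interface_call": {"interface_call_volume": 500, "interface_timeout_rate": 0.10},
    "instance_config": {"cpu_usage": 0.90, "memory_usage": 0.90},
}


def dimension_alarms(params, thresholds):
    """Assumed rule: a dimension alarms only if every listed parameter exceeds its threshold."""
    return all(params.get(name, 0) > limit for name, limit in thresholds.items())


def failed_dimensions(params, conditions=ALARM_CONDITIONS):
    """Return the dimensions whose alarm condition is satisfied by the execution parameters."""
    return [dim for dim, thresholds in conditions.items()
            if dimension_alarms(params, thresholds)]


def has_failed(params, conditions=ALARM_CONDITIONS):
    """A micro service monomer is treated as faulty if at least one dimension alarms."""
    return bool(failed_dimensions(params, conditions))
```

Returning the list of alarming dimensions, rather than only a boolean, keeps the information that maintenance personnel need to see in which service processing dimension the fault occurred.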
Optionally, after the alarm processing device 21 determines that the micro service unit 12 fails, the method provided by the present application further includes: in the second preset period, if the number of times that the specified micro service unit 12 fails exceeds the number threshold, the alarm processing device 21 determines to send an alarm message of the specified micro service unit 12 to the maintenance platform 13 corresponding to the micro service system.
Typically, the second preset time period is longer than the first preset time period. The second preset time period may be n times the first preset time period, where n is a natural number greater than or equal to 2. The number threshold may be 2. For example, the first preset time period is 1 minute and the second preset time period is 5 minutes. Typically, the micro service unit 12 has an automatic maintenance function, which can repair abnormal jitter that occurs while the micro service unit 12 processes services. Abnormal jitter occurs, for example, when a large number of service requests suddenly flood into a micro service unit 12 at a certain moment, causing the execution parameters of the micro service unit 12 to satisfy the alarm condition of a service processing dimension within one first preset time period. If the alarm processing device 21 determines, according to steps S101-S102, that the number of times the specified micro service unit 12 fails within the second preset time period exceeds the number threshold, the alarm processing device 21 determines that the fault of the specified micro service unit 12 is not abnormal jitter and cannot be repaired by the automatic maintenance function of that unit. The alarm processing device 21 then sends an alarm message of the specified micro service unit 12 to the maintenance platform 13 corresponding to the micro service system, so that maintenance personnel can troubleshoot and maintain the micro service system in time.
Further, the alarm processing device 21 sends the alarm messages of the previous second preset time period within a preset transmission time period. The starting time point of the preset transmission time period is the ending time point of the previous second preset time period. The preset transmission time period may be 0.5 minutes or 1 minute.
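The jitter filtering over the second preset time period can be sketched as follows. This is a hedged illustration: the function name, the representation of each first preset time period as a set of faulty monomer identifiers, and the default threshold of 2 are assumptions for the example.

```python
from collections import defaultdict


def monomers_to_alert(window_results, count_threshold=2):
    """Select monomers whose fault count exceeds the threshold.

    window_results: list of sets; each set holds the IDs of the monomers found
    faulty (steps S101-S102) in one first preset time period. The whole list
    covers one second preset time period, e.g. five 1-minute windows.
    """
    counts = defaultdict(int)
    for faulty_ids in window_results:
        for monomer_id in faulty_ids:
            counts[monomer_id] += 1
    # "Exceeds" is read strictly here: the count must be greater than the threshold.
    return sorted(m for m, c in counts.items() if c > count_threshold)
```

With five 1-minute windows and a threshold of 2, a monomer that alarms in only one or two windows is treated as abnormal jitter and produces no alarm message.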
Optionally, after the alarm processing device 21 determines that the micro service unit 12 fails, the alarm processing method provided by the present application further includes: the alarm processing device 21 sends a system alarm message of the micro service system to the maintenance platform 13 corresponding to the micro service system. The system alarm message includes the class-one abnormal execution parameters, that is, the execution parameters satisfying the alarm condition, of each faulty micro service unit 12 in the micro service system.
For example, after determining each faulty micro service unit 12 in the micro service system, the alarm processing device 21 merges the class-one abnormal execution parameters of each faulty micro service unit 12 into one system alarm message of the micro service system and sends the system alarm message to the maintenance platform 13. When the alarm processing device 21 performs alarm processing on a plurality of micro service systems at the same time, it sends the system alarm message of each micro service system to the maintenance platform 13 corresponding to that micro service system. This helps a manager promptly dispatch maintenance personnel to the corresponding micro service system according to the system alarm message, and saves the time that the manager or the maintenance platform 13 would otherwise spend classifying and distributing, by micro service system, the alarm messages of micro service units 12 received from different micro service systems, thereby ensuring the maintenance timeliness of the micro service systems.
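The merging of per-monomer alarms into one system alarm message per micro service system can be sketched as follows. The message layout and field names are hypothetical; the application does not specify a message format.

```python
def build_system_alarms(faulty_monomers):
    """Group faulty monomers by micro service system.

    faulty_monomers: iterable of (system_id, monomer_id, abnormal_params) tuples,
    where abnormal_params is a dict of the execution parameters that satisfied
    the alarm condition. Returns one message dict per system.
    """
    messages = {}
    for system_id, monomer_id, abnormal_params in faulty_monomers:
        message = messages.setdefault(
            system_id, {"system": system_id, "faulty_monomers": {}}
        )
        message["faulty_monomers"][monomer_id] = abnormal_params
    return messages
```

Each resulting message would then be sent to the maintenance platform of its own system, so no later per-system sorting of individual monomer alarms is needed.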
S103, sending the alarm message of the determined fault micro-service monomer to a maintenance platform corresponding to the micro-service system; the alarm message contains the class-one abnormal execution parameters, which satisfy the alarm condition.
Specifically, the alarm processing device 21 sends the alarm message of the determined faulty micro service unit 12 to the maintenance platform 13 corresponding to the micro service system. The alarm message contains the class-one abnormal execution parameters, which satisfy the alarm condition.
Optionally, the alarm processing device 21 may send the alarm message of the determined faulty micro service monomer 12 to the maintenance platform 13 corresponding to the micro service system in the manner shown in steps S201 to S204:
S201, the alarm processing device 21 acquires a system call chain which represents the service call relation between micro service monomers in the micro service system.
Illustratively, the system call chain is the service call relationship between micro service monomers, which is preset and stored by a service developer through an operation device based on the service logic of each service. The operation device may be any one of the client 14, the alarm processing device 21, the maintenance platform 13, and the micro service unit 12, or may be another device that communicates with the micro service system. The system call chain may be stored in the micro service unit 12 used for aggregated data storage of the micro service system (e.g., the micro service unit F shown in fig. 2), or in a database device corresponding to the micro service system. For example, the system call chain may be stored in the graph database module (Neo4j) of the micro service monomer F, and the alarm processing device 21 may acquire the system call chain of the micro service system from the micro service monomer F.
S202, the alarm processing device 21 determines, from among the plurality of fault micro service monomers 12 in the micro service system, the adjacent and associated fault micro service monomer 12 as a first fault micro service monomer 12, and determines the fault micro service monomer 12 not belonging to the first fault micro service monomer 12 as a second fault micro service monomer 12 based on the system call chain.
S203, the alarm processing device 21 determines the initial faulty micro-service monomer 12 from the plurality of first faulty micro-service monomers 12 based on the system call chain.
S204, the alarm processing device 21 sends the alarm message of each of the initial fault micro-service monomer 12 and the second fault micro-service monomer 12 to the corresponding maintenance platform 13 of the micro-service system.
Typically, the faults of a plurality of adjacent and associated first faulty micro-service monomers 12 are caused by the fault of the initial faulty micro-service monomer 12 on the system call chain rather than arising spontaneously. After the fault of the initial faulty micro-service monomer 12 is resolved, the faults that it caused in the other associated first faulty micro-service monomers 12 are eliminated automatically. For example, based on the system call chain, the alarm processing device 21 may determine, from the plurality of adjacent and associated first faulty micro-service monomers 12, the first faulty micro-service monomer 12 located at a low level of the call chain as the initial faulty micro-service monomer 12.
For example, as shown in fig. 2, assume that the alarm processing device 21 determines that the micro service monomer C, the micro service monomer D, and the micro service monomer E are all first faulty micro service monomers 12. Based on the system call chain, the alarm processing device 21 determines the micro-service monomer C, which is located at a low level of the call chain (e.g., level two), as the initial faulty micro-service monomer 12.
Optionally, the system call chain is composed of a service call chain corresponding to each of the plurality of service logics. The alarm processing device 21 may also determine, based on the service call chain, that the first faulty micro-service monomer 12 located at the low level of the call chain is the initial faulty micro-service monomer 12 from among the plurality of first faulty micro-service monomers 12 located on the same service call chain.
Based on the system call chain or the service call chain, the alarm processing device 21 determines the initial faulty micro-service monomer 12 and sends an alarm message of the initial faulty micro-service monomer 12 to the maintenance platform 13, so that maintenance personnel can troubleshoot and maintain the initial faulty micro-service monomer 12. This reduces the number of alarm messages and ensures the maintenance timeliness of the micro service system.
If no other faulty micro service monomer 12 is adjacent to and associated with a second faulty micro service monomer 12, the fault of that second faulty micro service monomer 12 is not caused by another faulty micro service monomer 12 associated with its services but arises spontaneously. Therefore, the alarm processing device 21 sends the alarm message of the second faulty micro service monomer 12 to the maintenance platform 13, so that maintenance personnel can maintain the second faulty micro service monomer 12 in time, which ensures the maintenance timeliness of the micro service system.
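Steps S202 to S204 can be sketched as a grouping over the call chain. This is a minimal sketch under stated assumptions: the call chain is given as caller-callee pairs, "adjacent and associated" is read as "connected through call edges whose two ends are both faulty", and the level of each monomer is supplied as a number where a smaller number means a lower level. The function and argument names are hypothetical.

```python
def select_alert_targets(call_edges, levels, faulty):
    """Return (initial_faulty, second_faulty) monomer ID lists.

    call_edges: iterable of (caller, callee) pairs forming the system call chain.
    levels: dict mapping monomer ID to its level on the call chain (assumed input).
    faulty: iterable of faulty monomer IDs.
    """
    faulty = set(faulty)
    # Adjacency restricted to faulty monomers, ignoring call direction.
    neighbours = {m: set() for m in faulty}
    for caller, callee in call_edges:
        if caller in faulty and callee in faulty:
            neighbours[caller].add(callee)
            neighbours[callee].add(caller)

    initial, second, seen = [], [], set()
    for start in sorted(faulty):
        if start in seen:
            continue
        # Depth-first search collects one group of connected faulty monomers.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        if len(group) == 1:
            second.append(start)  # isolated fault, treated as spontaneous
        else:
            # Within a group of first faulty monomers, pick the lowest level.
            initial.append(min(group, key=lambda m: (levels[m], m)))
    return initial, second
```

Only the monomers in the two returned lists would receive alarm messages; the remaining first faulty monomers are suppressed because their faults are expected to clear once the initial fault is resolved.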
Further, the alarm message also includes class-two abnormal execution parameters. A class-two abnormal execution parameter is an execution parameter that exceeds its corresponding preset parameter threshold although the alarm condition of the service processing dimension to which it belongs is not satisfied. The alarm processing device 21 may send the alarm message of the determined faulty micro service monomer 12 to the maintenance platform 13 corresponding to the micro service system as follows: the alarm processing device 21 sends an alarm message of the faulty micro-service monomer 12 containing both the class-one abnormal execution parameters and the class-two abnormal execution parameters to the maintenance platform 13 corresponding to the micro-service system, to help maintenance personnel troubleshoot according to the abnormal execution parameters.
For example, among the execution parameters of a micro-service monomer 12, the interface call volume exceeds the interface call volume threshold, the interface timeout rate exceeds the interface timeout rate threshold, and the service call volume exceeds the service call volume threshold. The alarm processing device 21 determines that the execution parameters of the interface call dimension of the micro service monomer 12 satisfy the alarm condition of the interface call dimension shown in table 1, but that the execution parameters of its service call dimension do not satisfy the alarm condition of the service call dimension shown in table 1. The alarm processing device 21 determines the interface call volume and the interface timeout rate of the micro service unit 12 as class-one abnormal execution parameters, and determines the service call volume of the micro service unit 12 as a class-two abnormal execution parameter. The alarm processing device 21 sends an alarm message containing the interface call volume, the interface timeout rate and the service call volume of the micro service unit 12 to the maintenance platform 13 corresponding to the micro service system, to help maintenance personnel troubleshoot and maintain the micro service unit 12 according to the abnormal interface call volume, interface timeout rate and service call volume in the alarm message.
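The split into class-one and class-two abnormal execution parameters in this example can be sketched as follows. The threshold values, the `service_failure_rate` parameter, and the rule that a dimension's alarm condition requires every listed parameter to exceed its threshold are assumptions for illustration; the real conditions are those of table 1.

```python
def classify_abnormal_params(params, conditions):
    """Split over-threshold parameters into class one and class two.

    params: dict of execution parameter name to measured value.
    conditions: dict of dimension name to {parameter name: threshold}.
    Class one: over-threshold parameters of a dimension whose alarm condition
    is satisfied (assumed: all of its parameters exceed their thresholds).
    Class two: over-threshold parameters of a dimension that does not alarm.
    """
    class_one, class_two = {}, {}
    for thresholds in conditions.values():
        over = {name: params[name] for name, limit in thresholds.items()
                if name in params and params[name] > limit}
        if over and len(over) == len(thresholds):
            class_one.update(over)
        else:
            class_two.update(over)
    return class_one, class_two
```

Applied to the example above, the interface call volume and interface timeout rate fall into class one and the service call volume into class two, and all three would be placed in the alarm message.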
Optionally, the alarm processing device 21 may also send the alarm message of the determined faulty micro service monomer 12 to the maintenance platform 13 corresponding to the micro service system in the manner shown in steps S301-S303:
S301, the alarm processing device 21 determines the number of services associated with the faulty micro-service unit 12.
S302, the alarm processing device 21 marks the alarm level of the alarm message of the fault micro-service unit 12 based on the corresponding relation between the service quantity and the alarm level.
S303, the alarm processing device 21 sends the alarm message marked by the alarm level to the maintenance platform 13 corresponding to the micro service system.
In general, the greater the number of services associated with the faulty micro-service unit 12, the greater the impact range of the faulty micro-service unit 12 on the services, and the higher the alarm level of the faulty micro-service unit 12, and therefore, the maintenance personnel is required to perform fault troubleshooting and maintenance on the faulty micro-service unit 12 preferentially. The alarm processing device 21 marks the alarm level of the alarm message of the fault micro-service unit 12 according to the corresponding relation between the service quantity and the alarm level associated with the fault micro-service unit 12, so that maintenance personnel can conveniently conduct fault investigation and maintenance according to the order of the alarm level, and adverse effects of the fault micro-service unit 12 on service processing can be reduced.
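Steps S301 to S303 amount to a lookup from the number of associated services to an alarm level. The sketch below assumes a hypothetical three-tier correspondence; the application does not give concrete boundaries or level names.

```python
# Hypothetical correspondence between service quantity and alarm level,
# ordered from the highest minimum count to the lowest.
LEVEL_RULES = [(10, "critical"), (5, "major"), (1, "minor")]


def alarm_level(service_count, rules=LEVEL_RULES):
    """Return the first level whose minimum service count is reached."""
    for minimum, level in rules:
        if service_count >= minimum:
            return level
    return "info"


def mark_alarm(message, service_count):
    """Return a copy of the alarm message with the alarm level attached."""
    return {**message, "alarm_level": alarm_level(service_count)}
```

Maintenance personnel could then sort the received messages by `alarm_level` to handle the faults that affect the most services first.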
Optionally, the alarm processing device 21 may also send the alarm message of the determined faulty micro service monomer 12 to the maintenance platform 13 corresponding to the micro service system in the manner shown in steps S401-S402:
S401, if the time at which the faulty micro service unit 12 fails does not fall within the third preset time period, the alarm processing device 21 marks the alarm message of the faulty micro service unit 12 as to be sent.
S402, after detecting that a timer has timed out at the starting time point of the third preset time period of the next cycle, the alarm processing device 21 sends the alarm message marked as to be sent to the maintenance platform 13 corresponding to the micro service system. The timeout time point of the timer is the starting time point of the third preset time period.
For example, the third preset time period may be 06:00-23:00 per day. Typically, the time outside the third preset period is the time at which the micro service unit 12 is automatically maintained. The alarm processing device 21 sends an alarm message to the maintenance platform 13 in a third preset time period, so that not only can the conflict between manual maintenance and automatic maintenance be avoided, but also the maintenance efficiency of the micro-service system can be improved on the basis of saving human resources.
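The deferral in steps S401 and S402 can be sketched as computing the earliest permitted send time. The sketch uses the 06:00-23:00 window from the example and assumes that 23:00 itself lies outside the window; the function name is hypothetical, and a real device would use a timer rather than compute the time directly.

```python
from datetime import datetime, time, timedelta

# Third preset time period from the example: 06:00-23:00 each day.
WINDOW_START = time(6, 0)
WINDOW_END = time(23, 0)  # assumed exclusive


def send_time(fault_time):
    """Return when the alarm message for a fault at fault_time may be sent."""
    if WINDOW_START <= fault_time.time() < WINDOW_END:
        return fault_time  # inside the window: send immediately
    day = fault_time.date()
    if fault_time.time() >= WINDOW_END:
        day += timedelta(days=1)  # after 23:00: wait for the next day's window
    return datetime.combine(day, WINDOW_START)
```

A fault detected at 02:00 is therefore held until 06:00 the same day, and a fault detected at 23:30 until 06:00 the following day.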
According to the alarm processing method provided by the application, a faulty micro service monomer in the micro service system is determined within the first preset time period by checking whether the execution parameters of the micro service monomer satisfy the alarm condition of at least one service processing dimension. Based on the system call chain, the initial faulty micro service monomer and the second faulty micro service monomers, which fail spontaneously, are then determined from the plurality of faulty micro service monomers of the micro service system, and only their alarm messages are sent to the maintenance platform for maintenance. This greatly reduces the number of alarm messages, solves the problem that the existing fault monitoring method generates a large number of alarm messages in a short time, and ensures the timeliness of fault handling in the micro service system. In addition, when alarm processing is performed on a plurality of micro service systems, the abnormal execution parameters of the micro service monomers belonging to each micro service system are merged into the system alarm message of that micro service system for sending, which further reduces the number of alarm messages and ensures the timeliness of fault handling in each micro service system.
The embodiment of the application also provides an alarm processing device, and fig. 4 is a structural diagram of the alarm processing device provided by the embodiment of the application. As shown in fig. 4, the apparatus includes: a transceiver module 41 and a processing module 42.
The transceiver module 41 is configured to obtain execution information of a plurality of service processing dimensions of each micro service unit in the micro service system within a first preset period of time. The execution information of the business process dimension includes a plurality of execution parameters of the business process.
The processing module 42 is configured to determine that the micro service monomer fails if the execution parameter of the micro service monomer satisfies the alarm condition of at least one service processing dimension.
The transceiver module 41 is further configured to send an alarm message of the determined faulty micro-service monomer to a maintenance platform corresponding to the micro-service system; the alert message contains a class of abnormal execution parameters that satisfy the alert condition.
The specific implementation principle and technical effect of the alarm processing device provided in this embodiment are similar to those of the embodiment shown in fig. 3, and the description of this embodiment is omitted here.
The embodiment of the application also provides an alarm processing device. Fig. 5 is a structural diagram of an alarm processing device according to an embodiment of the present application. As shown in fig. 5, the alarm processing device includes a processor 51 and a memory 52, where the memory 52 stores instructions executable by the processor 51, so that the processor 51 can execute the technical scheme of the above method embodiment. The implementation principle and technical effect are similar and are not repeated here. It should be understood that the processor 51 may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the present application may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor. The memory 52 may include a high-speed random access memory (RAM), and may further include a non-volatile memory (NVM), such as at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
The embodiment of the application also provides a storage medium in which computer-executable instructions are stored; when the computer-executable instructions are executed by a processor, the alarm processing method is implemented. The storage medium may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (ASIC). The processor and the storage medium may also reside as discrete components in an electronic device or a master device.
The embodiment of the application also provides a program product, such as a computer program, which realizes the alarm processing method covered by the application when being executed by a processor.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (11)

1. An alarm processing method, comprising:
Acquiring execution information of a plurality of service processing dimensions of each micro service monomer in a micro service system within a first preset time period; the execution information of the service processing dimension comprises a plurality of execution parameters of service processing;
if the execution parameters of the micro service monomers meet the alarm conditions of at least one service processing dimension, determining that the micro service monomers have faults;
Sending the alarm message of the determined fault micro-service monomer to a maintenance platform corresponding to the micro-service system; the alert message contains a class of abnormal execution parameters that satisfy the alert condition.
2. The method of claim 1, wherein the business processing dimension is any one of a service call dimension, a function execution dimension, an interface call dimension, and an instance configuration dimension of a micro-service monomer;
wherein the determining that the micro service monomer fails if the execution parameters of the micro service monomer meet the alarm condition of at least one service processing dimension comprises:
if the execution parameters of the micro service monomer meet the alarm condition of at least one service processing dimension among the service call dimension, the function execution dimension, the interface call dimension and the instance configuration dimension, determining that the micro service monomer fails.
3. The method according to claim 1, wherein the sending the alarm message of the determined faulty micro-service monomer to the maintenance platform corresponding to the micro-service system comprises:
acquiring a system call chain representing a service call relationship between micro service monomers in the micro service system;
Determining adjacent and related fault micro-service monomers from a plurality of fault micro-service monomers of the micro-service system as first fault micro-service monomers based on the system call chain, and determining fault micro-service monomers which do not belong to the first fault micro-service monomers as second fault micro-service monomers;
Determining an initial fault micro-service monomer from a plurality of first fault micro-service monomers based on the system call chain;
and sending the alarm messages of the initial fault micro-service monomer and the second fault micro-service monomer to a maintenance platform corresponding to the micro-service system.
4. The method of claim 1, wherein the alarm message further includes class-two abnormal execution parameters; a class-two abnormal execution parameter is an execution parameter that exceeds its corresponding preset parameter threshold although the alarm condition of the service processing dimension to which it belongs is not satisfied;
the sending the alarm message of the determined fault micro-service monomer to the maintenance platform corresponding to the micro-service system comprises:
sending the alarm message of the fault micro-service monomer, containing the class-one abnormal execution parameters and the class-two abnormal execution parameters, to the maintenance platform corresponding to the micro-service system.
5. The method of claim 1, wherein after said determining that said micro-service monomer has failed, said method further comprises:
And in a second preset time period, if the frequency of faults of the appointed micro service monomer exceeds a frequency threshold, determining to send an alarm message of the appointed micro service monomer to a maintenance platform corresponding to the micro service system.
6. The method according to any one of claims 1-5, wherein the sending the alarm message of the determined faulty micro-service monomer to the maintenance platform corresponding to the micro-service system comprises:
determining the service quantity associated with the fault micro-service monomer;
based on the corresponding relation between the service quantity and the alarm level, the alarm level marking is carried out on the alarm message of the fault micro-service monomer;
and sending the alarm information marked by the alarm level to a maintenance platform corresponding to the micro-service system.
7. The method of any of claims 1-5, wherein after the determining that the micro-service monomer has failed, the method further comprises:
the system alarm message of the micro service system is sent to a maintenance platform corresponding to the micro service system; the system alarm message comprises a class of abnormal execution parameters meeting alarm conditions of each fault micro-service monomer in the micro-service system.
8. The method according to any one of claims 1-5, wherein the sending the alarm message of the determined faulty micro-service monomer to the maintenance platform corresponding to the micro-service system comprises:
if the time at which the fault micro-service monomer fails does not fall within a third preset time period, marking the alarm message of the fault micro-service monomer as to be sent;
after detecting that a timer has timed out at the starting time point of the third preset time period of the next cycle, sending the alarm message marked as to be sent to the maintenance platform corresponding to the micro-service system; the timeout time point of the timer is the starting time point of the third preset time period.
9. An alarm processing apparatus, the apparatus comprising:
A transceiver module and a processing module;
the receiving and transmitting module is used for acquiring execution information of a plurality of service processing dimensions of each micro service monomer in the micro service system in a first preset time period; the execution information of the service processing dimension comprises a plurality of execution parameters of service processing;
The processing module is used for determining that the micro-service monomer fails if the execution parameters of the micro-service monomer meet the alarm conditions of at least one service processing dimension;
The receiving and transmitting module is further used for sending the alarm message of the determined fault micro-service monomer to a maintenance platform corresponding to the micro-service system; the alert message contains a class of abnormal execution parameters that satisfy the alert condition.
10. An alarm processing apparatus, the apparatus comprising:
A processor and a memory;
The memory stores executable instructions executable by the processor;
Wherein execution of the executable instructions stored by the memory by the processor causes the processor to perform the method of any one of claims 1-8.
11. A storage medium having stored therein computer-executable instructions which, when executed by a processor, are adapted to carry out the method of any one of claims 1-8.
CN202211415239.6A 2022-11-11 2022-11-11 Alarm processing method, device and storage medium Pending CN118069450A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211415239.6A CN118069450A (en) 2022-11-11 2022-11-11 Alarm processing method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211415239.6A CN118069450A (en) 2022-11-11 2022-11-11 Alarm processing method, device and storage medium

Publications (1)

Publication Number Publication Date
CN118069450A (en) 2024-05-24

Family

ID=91102652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211415239.6A Pending CN118069450A (en) 2022-11-11 2022-11-11 Alarm processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN118069450A (en)

Similar Documents

Publication Publication Date Title
CN107515796B (en) Equipment abnormity monitoring processing method and device
TWI746512B (en) Physical machine fault classification processing method and device, and virtual machine recovery method and system
WO2021008031A1 (en) Processing method for implementing monitoring intellectualization on the basis of micro-services, and electronic device
WO2020228276A1 (en) Network alert method and device
WO2008083890A1 (en) Method, system and program product for alerting an information technology support organization of a security event
CN107800783B (en) Method and device for remotely monitoring server
CN109245966A (en) The monitoring method and device of the service state of cloud platform
CN113452607A (en) Distributed link acquisition method and device, computing equipment and storage medium
KR20180037342A (en) Application software error monitoring, statistics management service and solution method.
CN111464589A (en) Intelligent contract processing method, computer equipment and storage medium
CN114356499A (en) Kubernetes cluster alarm root cause analysis method and device
CN111628903B (en) Monitoring method and monitoring system for transaction system running state
KR20110037969A (en) Targeted user notification of messages in a monitoring system
CN116795643A (en) Alarm management method
CN109558272A (en) The fault recovery method and device of server
CN118069450A (en) Alarm processing method, device and storage medium
CN111190754A (en) Block chain event notification method and block chain system
CN115190052A (en) Long connection management method, system and control unit
CN111967968B (en) Block chain-based vulnerability processing method and device
CN112445597B (en) Timing task scheduling method and device
CN112631866A (en) Server hardware state monitoring method and device, electronic equipment and medium
CN106850283B (en) Event-driven cloud AC alarm processing system and method
CN110188019A (en) Resource condition monitoring method, apparatus, device and readable storage medium
CN115022243B (en) Data flow control method, device and system, electronic equipment and storage medium
CN111130919B (en) Interface monitoring method, device and system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination