CN113821413A - Alarm analysis method and device - Google Patents


Info

Publication number
CN113821413A
Authority
CN
China
Prior art keywords
alarm
transaction
determining
information
root cause
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111134371.5A
Other languages
Chinese (zh)
Inventor
李国莹
王艳华
周明宏
常冬冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp filed Critical China Construction Bank Corp
Priority to CN202111134371.5A priority Critical patent/CN113821413A/en
Publication of CN113821413A publication Critical patent/CN113821413A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/32Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324Display of status information
    • G06F11/327Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention provides an alarm analysis method and device, wherein the method comprises the following steps: determining a calling chain corresponding to a transaction alarm under the condition that the system is monitored to generate the transaction alarm; analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm or not; if yes, sending alarm information corresponding to the transaction alarm to operation and maintenance personnel; if not, determining whether the alarm frequency of the system for continuously generating the transaction alarm is greater than or equal to an alarm threshold value, and if so, sending alarm information corresponding to the transaction alarm to the operation and maintenance personnel. By analyzing the transaction alarm, when the alarm type of the transaction alarm is a preset alarm type or the alarm frequency of the continuous transaction alarm of the system is greater than or equal to an alarm threshold value, alarm information is sent to operation and maintenance personnel, the alarm frequency of the system is reduced, the attention and the workload of the operation and maintenance personnel to the system are reduced, and the working efficiency of the operation and maintenance personnel is improved.

Description

Alarm analysis method and device
Technical Field
The present invention relates to the field of alarm technologies, and in particular, to an alarm analysis method and apparatus.
Background
With the wide adoption of distributed architectures, call chain monitoring for online transactions has come into wide use. In a distributed system, a user's online transaction request is processed by different service nodes in sequence, and the processed request is then returned to the user; the sequence of services called in this process forms the call chain.
Monitoring the call chain makes it possible to learn the state of a transaction in time. However, the existing approach raises an alarm as soon as a transaction becomes abnormal, which increases the number of alarms raised by the system. Operation and maintenance personnel must therefore devote a great deal of attention to the system, which increases their workload and reduces their working efficiency.
Disclosure of Invention
In view of this, the present invention provides an alarm analysis method and apparatus, which can effectively reduce the number of alarms raised by the system, avoid false alarms, reduce the attention and workload required of operation and maintenance personnel, and improve their working efficiency.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
a first aspect of the present application discloses an alarm analysis method, including:
determining a calling chain corresponding to a transaction alarm under the condition that the system is monitored to generate the transaction alarm;
analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm or not;
when the alarm type of the transaction alarm is determined to be a calling chain abnormal root cause alarm, sending alarm prompt information corresponding to the transaction alarm to operation and maintenance personnel;
when the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, determining whether the alarm frequency of the transaction alarm continuously generated by the system is greater than or equal to a preset alarm threshold value, and when the alarm frequency of the transaction alarm continuously generated by the system is determined to be greater than or equal to the alarm threshold value, sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel.
Optionally, the determining a call chain corresponding to the transaction alarm includes:
analyzing the transaction alarm to obtain a transaction identifier in the transaction alarm;
determining a service corresponding to the transaction identification in the system;
and determining the calling chain of the service as the calling chain corresponding to the transaction alarm.
Optionally, the analyzing the transaction alarm based on the abnormal index information of the call chain to determine whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm includes:
determining each monitoring index of the call chain based on the abnormal index information;
acquiring alarm information in the transaction alarm;
comparing the alarm information with each monitoring index to determine whether the monitoring index corresponding to the alarm information exists in each monitoring index;
if the monitoring indexes corresponding to the alarm information exist in the monitoring indexes, determining the alarm type of the transaction alarm as a calling chain abnormal root cause alarm;
and if the monitoring indexes corresponding to the alarm information do not exist in the monitoring indexes, determining that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm.
Optionally, the method for determining whether the number of times of the continuous transaction alarm of the system is greater than or equal to a preset alarm threshold includes:
determining an alarm record in the system, and determining the alarm frequency of the transaction alarm continuously generated by the system based on the alarm record;
and comparing the alarm times with the alarm threshold value to judge whether the alarm times are greater than or equal to the alarm threshold value.
The above method, optionally, further includes:
and when the alarm times are less than the alarm threshold value, prohibiting sending the alarm information corresponding to the transaction alarm to the operation and maintenance personnel.
The second aspect of the present application discloses an alarm analysis device, including:
the determining unit is used for determining a calling chain corresponding to a transaction alarm under the condition that the system is monitored to generate the transaction alarm;
the analysis unit is used for analyzing the transaction alarm based on the abnormal index information of the calling chain and determining whether the alarm type of the transaction alarm is a calling chain abnormal root cause alarm or not;
the first sending unit is used for sending alarm prompt information corresponding to the transaction alarm to operation and maintenance personnel when the alarm type of the transaction alarm is determined to be a calling chain abnormal root cause alarm;
and the second sending unit is used for determining whether the alarm frequency of the transaction alarm continuously generated by the system is greater than or equal to a preset alarm threshold value or not when the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, and sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel when the alarm frequency of the transaction alarm continuously generated by the system is determined to be greater than or equal to the alarm threshold value.
The above apparatus, optionally, the determining unit includes:
the analysis subunit is used for analyzing the transaction alarm to acquire a transaction identifier in the transaction alarm;
a first determining subunit, configured to determine, in the system, a service corresponding to the transaction identifier;
and the second determining subunit is used for determining the calling chain of the service as the calling chain corresponding to the transaction alarm.
The above apparatus, optionally, the analysis unit includes:
a third determining subunit, configured to determine, based on the abnormal index information, each monitoring index of the call chain;
the acquisition subunit is used for acquiring the alarm information in the transaction alarm;
the comparison subunit is configured to compare the alarm information with each monitoring index, and determine whether a monitoring index corresponding to the alarm information exists in each monitoring index;
a fourth determining subunit, configured to determine that the alarm type of the transaction alarm is a call chain abnormal root cause alarm if a monitoring index corresponding to the alarm information exists in each monitoring index;
and the fifth determining subunit is configured to determine that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm if the monitoring index corresponding to the alarm information does not exist in the monitoring indexes.
The above apparatus, optionally, the second sending unit includes:
the sixth determining subunit is used for determining an alarm record in the system and determining the alarm frequency of the transaction alarm continuously generated by the system based on the alarm record;
and the judging subunit is used for comparing the alarm frequency with the alarm threshold value so as to judge whether the alarm frequency is greater than or equal to the alarm threshold value.
The above apparatus, optionally, further comprises:
and the forbidding unit is used for prohibiting sending the alarm information corresponding to the transaction alarm to the operation and maintenance personnel when the alarm frequency is less than the alarm threshold value.
Compared with the prior art, the invention has the following advantages:
the invention provides an alarm analysis method and device, wherein the method comprises the following steps: determining a calling chain corresponding to a transaction alarm under the condition that the system is monitored to generate the transaction alarm; analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm or not; when the alarm type of the transaction alarm is determined to be a calling chain abnormal root cause alarm, alarm information corresponding to the transaction alarm is sent to operation and maintenance personnel; when the alarm type of the transaction alarm is determined not to be the calling chain abnormal root cause alarm, determining whether the alarm frequency of the system for continuously generating the transaction alarm is greater than or equal to a preset alarm threshold value, and when the alarm frequency of the system for continuously generating the transaction alarm is determined to be greater than or equal to the alarm threshold value, sending alarm information corresponding to the transaction alarm to operation and maintenance personnel. 
The invention analyzes the transaction alarm, and sends alarm information to operation and maintenance personnel when the alarm type of the transaction alarm is a preset alarm type; when the alarm type of the transaction alarm is not the corresponding alarm type and the alarm frequency of the system for continuously generating the transaction alarm is greater than or equal to the preset alarm threshold value, the alarm information is sent to the operation and maintenance personnel, so that the frequency of the system for alarming the operation and maintenance personnel can be effectively reduced, the attention of the operation and maintenance personnel to the system can be reduced, the workload of the operation and maintenance personnel is reduced, and the working efficiency of the operation and maintenance personnel is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flowchart of an alarm analysis method according to an embodiment of the present invention;
FIG. 2 is another flowchart of an alarm analysis method according to an embodiment of the present invention;
FIG. 3 is another flowchart of an alarm analysis method according to an embodiment of the present invention;
FIG. 4 is another flowchart of an alarm analysis method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an alarm analysis apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic medium according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The invention is operational with numerous general-purpose or special-purpose computing device environments or configurations, for example: personal computers, server computers, hand-held or portable devices, tablet devices, multi-processor apparatus, distributed computing environments that include any of the above devices or equipment, and the like. The execution subject of the present invention may be a processor or a server in a distributed system; specifically, the distributed system in the present invention is a bank system.
Referring to fig. 1, a method flowchart of an alarm analysis method according to an embodiment of the present invention is specifically described as follows:
S101, determining a call chain corresponding to a transaction alarm under the condition that the system is monitored to generate the transaction alarm.
The distributed system is provided with a monitoring module, which monitors in real time whether the system generates a transaction alarm.
The system provides a plurality of services, and different services correspond to different call chains, so the system comprises a plurality of call chains. When a transaction alarm is detected to occur in the system, a call chain corresponding to the transaction alarm needs to be determined.
Referring to fig. 2, a flowchart of a method for determining a call chain corresponding to a transaction alarm according to another embodiment of the present invention is specifically described as follows:
S201, analyzing the transaction alarm to obtain a transaction identifier in the transaction alarm.
The transaction identifier is unique. In addition to the transaction identifier, the transaction alarm includes the alarm time, transaction duration, transaction request time, transaction response time, and the like. The transaction identifier is used for determining the specific service to which the transaction alarm belongs.
S202, determining the service corresponding to the transaction identification in the system.
The service identifier of each service in the system is determined and compared with the transaction identifier. The service whose service identifier is consistent with the transaction identifier is determined as the service corresponding to the transaction identifier; in other words, it is the service corresponding to the transaction alarm.
S203, determining the calling chain of the service as the calling chain corresponding to the transaction alarm.
Each service has a corresponding call chain, and the call chain of the service is determined as the call chain corresponding to the transaction alarm.
In the method provided by the embodiment of the invention, the business corresponding to the transaction alarm is determined by using the transaction identifier, and the call chain of the business is used as the call chain corresponding to the transaction alarm; by using the transaction identifier, the calling chain with the alarm can be accurately determined in the massive calling chains of the system, a convenient and quick mode is provided for the system to determine the calling chain with the alarm, and the working efficiency of the system is improved.
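Steps S201 to S203 can be illustrated with a minimal Python sketch. The patent does not specify any data format, so the `Service` class, the `transaction_id` and `alarm_info` field names, and the sample identifiers are all hypothetical; the sketch only assumes that a service identifier can be compared to a transaction identifier for equality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    service_id: str              # hypothetical: compared against the transaction identifier (S202)
    call_chain: tuple[str, ...]  # hypothetical: service nodes invoked in sequence


def find_call_chain(alarm: dict, services: list[Service]) -> Optional[tuple[str, ...]]:
    """Return the call chain of the service matching the alarm's transaction identifier."""
    transaction_id = alarm["transaction_id"]       # S201: obtain the transaction identifier
    for service in services:                       # S202: compare each service identifier
        if service.service_id == transaction_id:
            return service.call_chain              # S203: that service's call chain
    return None                                    # no matching service in the system


# Illustrative data only; none of these names come from the patent.
services = [
    Service("PAY-001", ("gateway", "payment", "ledger")),
    Service("QRY-002", ("gateway", "account-query")),
]
alarm = {"transaction_id": "PAY-001", "alarm_info": "transaction response timeout"}
chain = find_call_chain(alarm, services)
```

A linear scan is used here for clarity; a system with a very large number of call chains would more plausibly keep a dictionary keyed by service identifier.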
S102, analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm or not; if the alarm type of the transaction alarm is determined to be the calling chain abnormal root cause alarm, S103 is executed; and if the alarm type of the transaction alarm is determined not to be the call chain abnormal root cause alarm, executing S104.
It should be noted that the alarm types of the transaction alarm may be divided into two types, one is a call chain abnormal root cause alarm, and the other is a non-call chain abnormal root cause alarm, where an alarm caused by a call chain abnormal root cause may be referred to as a call chain abnormal root cause alarm, and further, a root cause causing an abnormality in a call chain is referred to as a call chain abnormal root cause.
The contents of the abnormal index information of different call chains are different, and the abnormal index information contains the information of the abnormal root of the call chain.
Referring to fig. 3, a flowchart of a method for analyzing a transaction alarm based on abnormal index information and determining whether an alarm type of the transaction alarm is a call chain abnormal root cause alarm according to another embodiment of the present invention is specifically described as follows:
S301, determining each monitoring index of the call chain based on the abnormal index information.
The abnormal index information is analyzed to obtain each monitoring index of the call chain. Each monitoring index is an index of a call chain abnormal root cause, and the monitoring indexes can be used to determine whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm.
The monitoring indexes include, but are not limited to, system service indexes, business logic indexes, and the like. The system service indexes include indexes of transaction errors caused by abnormality of the system service itself, and the business logic indexes include indexes of transaction errors caused by business logic errors. Preferably, the system success rate, the service success rate and the like can be determined according to the system service index: the system success rate decreases when a transaction error occurs due to an abnormality of the system service, and the service success rate decreases when a transaction is unsuccessful due to business logic. An example of a transaction that is unsuccessful due to business logic is a payment transaction that fails because of insufficient balance.
The monitoring indexes are composed of information of call chain abnormal root causes, the monitoring indexes correspond to the call chain abnormal root causes one by one, and each monitoring index contains the information of the call chain abnormal root cause corresponding to the monitoring index.
S302, acquiring alarm information in the transaction alarm.
The transaction alarm is analyzed to obtain the alarm information it contains. The alarm information includes the reason for the transaction alarm, for example a transaction response timeout, a transaction processing timeout, or a transaction logic error. The alarm information is generated according to the specific abnormal condition of the transaction.
S303, comparing the alarm information with each monitoring index, and determining whether the monitoring index corresponding to the alarm information exists in each monitoring index; if the monitoring indexes corresponding to the alarm information exist in the monitoring indexes, executing S304; if there is no monitoring index corresponding to the alarm information in each monitoring index, S305 is executed.
Further, when a monitoring index corresponding to the alarm information exists among the monitoring indexes, this indicates that the alarm information contains the information of the call chain abnormal root cause corresponding to that monitoring index, and the transaction alarm is therefore determined to be an alarm caused by a call chain abnormal root cause.
S304, determining the alarm type of the transaction alarm as a calling chain abnormal root cause alarm.
S305, determining that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm.
In the method provided by the embodiment of the invention, the alarm type of the transaction alarm can be accurately determined based on each monitoring index of the call chain, and corresponding operation is executed based on the alarm type.
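The comparison in S301 to S305 might be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the abnormal index information is modelled as a hypothetical dictionary grouping monitoring indexes by category, and "a monitoring index corresponding to the alarm information" is assumed to mean an exact string match.

```python
# Hypothetical abnormal index information for one call chain. The category
# names and index strings are illustrative; the patent fixes neither.
ABNORMAL_INDEX_INFO = {
    "system_service": ["transaction response timeout", "transaction processing timeout"],
    "business_logic": ["transaction logic error"],
}


def is_root_cause_alarm(alarm_info: str, abnormal_index_info: dict[str, list[str]]) -> bool:
    """Return True if the alarm is a call chain abnormal root cause alarm."""
    # S301: derive every monitoring index of the call chain.
    monitoring_indexes = [
        index for group in abnormal_index_info.values() for index in group
    ]
    # S303: check whether any monitoring index corresponds to the alarm
    # information (assumed here to be exact equality).
    # S304 / S305: a match means a root cause alarm, no match means it is not.
    return alarm_info in monitoring_indexes


root_cause = is_root_cause_alarm("transaction logic error", ABNORMAL_INDEX_INFO)
not_root_cause = is_root_cause_alarm("network interruption", ABNORMAL_INDEX_INFO)
```

With this data the first call classifies the alarm as a root cause alarm (leading to S103), while the second falls through to the consecutive-alarm check (S104).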
S103, sending alarm prompt information corresponding to the transaction alarm to operation and maintenance personnel.
When the alarm type of the transaction alarm is determined to be a calling chain abnormal root cause alarm, alarm prompt information corresponding to the transaction alarm can be directly sent to the operation and maintenance personnel so as to alarm the operation and maintenance personnel.
Furthermore, when the alarm type of the transaction alarm is a call chain abnormal root cause alarm, it indicates that the basic service of the call chain is abnormal, and at this time, the operation and maintenance personnel needs to perform corresponding maintenance, so once the call chain abnormal root cause alarm occurs, the operation and maintenance personnel must be immediately warned.
S104, determining whether the alarm frequency of the system for continuously generating transaction alarms is greater than or equal to a preset alarm threshold value; when the alarm frequency of the system for continuously generating the transaction alarm is determined to be greater than or equal to the alarm threshold value, S103 is executed; when it is determined that the number of times of alarm for the system to continuously generate transaction alarms is not greater than or equal to the alarm threshold, S105 is performed.
When the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, it indicates that the basic service of the call chain is not abnormal and that the alarm is caused by other environmental factors, such as a network interruption.
Furthermore, the operation and maintenance personnel are alerted only when the number of transaction alarms continuously generated by the system is greater than or equal to the alarm threshold value. Suppressing alarms in this way effectively reduces false alarms caused by momentary abnormal fluctuations of the system, and thus reduces the number of alarms.
Referring to fig. 4, a flowchart of a method for determining whether the number of times of continuously generating transaction alarms by the system is greater than or equal to a preset alarm threshold according to another embodiment of the present invention is specifically described as follows:
S401, determining an alarm record in the system, and determining the alarm frequency of the system for continuously generating transaction alarms based on the alarm record.
The alarm record is the global alarm record of the system. It comprises a plurality of logs: the system generates a corresponding log every time an alarm occurs, and the logs in the alarm record are arranged in the time order of the alarms. To determine from the alarm record the alarm frequency, that is, the number of transaction alarms continuously generated by the system, a log queue corresponding to the current transaction alarm is determined in the alarm record. The log queue comprises at least one log, including the log of the current transaction alarm; all logs in the log queue are arranged consecutively in the alarm record, and the identifier of the alarm to which each log in the log queue belongs is the same as the identifier of the current transaction alarm. The number of logs in the log queue is the number of transaction alarms continuously generated by the system.
S402, comparing the alarm frequency with an alarm threshold value to judge whether the alarm frequency is greater than or equal to the alarm threshold value.
By comparing the alarm frequency with the alarm threshold value, it can be determined whether the alarm frequency is greater than or equal to the alarm threshold value. The alarm threshold value can be set according to actual requirements; for example, if the alarm threshold value is set to N, N can be any positive integer.
The invention can determine the alarm times of the continuous alarm of the system through the alarm records of the system, and further can determine whether alarm prompt information needs to be sent to operation and maintenance personnel based on the alarm times.
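The log-queue counting of S401 and the comparison of S402 can be sketched in Python. The sketch assumes, beyond what the text states, that the log of the current transaction alarm is the most recent entry in the time-ordered alarm record, so the log queue is the run of same-identifier logs at the end of the record. The `alarm_id` field name is hypothetical.

```python
def consecutive_alarm_count(alarm_record: list[dict], current_alarm_id: str) -> int:
    """S401: count the contiguous logs at the end of the time-ordered alarm
    record whose alarm identifier equals that of the current transaction alarm."""
    count = 0
    for log in reversed(alarm_record):         # walk back from the newest log
        if log["alarm_id"] != current_alarm_id:
            break                              # the run of consecutive alarms ends here
        count += 1
    return count


def reaches_threshold(alarm_record: list[dict], current_alarm_id: str, threshold: int) -> bool:
    """S402: compare the consecutive alarm count with the alarm threshold N."""
    return consecutive_alarm_count(alarm_record, current_alarm_id) >= threshold


# Illustrative alarm record, oldest log first.
record = [
    {"alarm_id": "A"},
    {"alarm_id": "B"},
    {"alarm_id": "A"},
    {"alarm_id": "A"},  # log of the current transaction alarm
]
count_a = consecutive_alarm_count(record, "A")
```

Here only the last two logs form the log queue for alarm "A", because the log for "B" interrupts the run; the earlier "A" log is not counted.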
S105, prohibiting sending the alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel.
In the method provided by the embodiment of the invention, under the condition that the system is monitored to generate a transaction alarm, a call chain corresponding to the transaction alarm is determined; analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm or not; when the alarm type of the transaction alarm is determined to be a calling chain abnormal root cause alarm, alarm information corresponding to the transaction alarm is sent to operation and maintenance personnel; when the alarm type of the transaction alarm is determined not to be the calling chain abnormal root cause alarm, determining whether the alarm frequency of the system for continuously generating the transaction alarm is greater than or equal to a preset alarm threshold value, and when the alarm frequency of the system for continuously generating the transaction alarm is determined to be greater than or equal to the alarm threshold value, sending alarm information corresponding to the transaction alarm to operation and maintenance personnel. 
The invention analyzes the transaction alarm, and sends alarm information to operation and maintenance personnel when the alarm type of the transaction alarm is a preset alarm type; when the alarm type of the transaction alarm is not the corresponding alarm type and the alarm frequency of the system for continuously generating the transaction alarm is greater than or equal to the preset alarm threshold value, the alarm information is sent to the operation and maintenance personnel, so that the frequency of the system for alarming the operation and maintenance personnel can be effectively reduced, the attention of the operation and maintenance personnel to the system can be reduced, the workload of the operation and maintenance personnel is reduced, and the working efficiency of the operation and maintenance personnel is improved.
The method is based on call chain root cause analysis. For an alarm that is not a call chain abnormal root cause alarm, the operation and maintenance personnel are alerted when the number of consecutive alarms reaches the preset threshold, which effectively reduces the number of alarms. The method also avoids raising alarms at both the front end and the back end of the call chain, which solves the repeated-alarm problem that the traditional alarm mode causes by alarming at both ends. As a result, the manual attention required from the operation and maintenance personnel is reduced, their workload is reduced, and their working efficiency is improved.
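The decision flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `should_notify` and the parameters `is_root_cause`, `consecutive_count`, and `threshold` are hypothetical, and the threshold value is whatever the operator presets.

```python
def should_notify(is_root_cause: bool, consecutive_count: int, threshold: int) -> bool:
    """Decide whether alarm information is sent to operation and maintenance personnel.

    All names here are illustrative assumptions, not terms defined by the embodiment.
    """
    # A call chain abnormal root cause alarm is always forwarded.
    if is_root_cause:
        return True
    # Any other alarm is forwarded once it has been generated
    # consecutively at least `threshold` times.
    return consecutive_count >= threshold
```

With a preset threshold of 3, for example, a non-root-cause alarm that has occurred twice in a row is held back, while the third consecutive occurrence is forwarded.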
Corresponding to the method shown in fig. 1, an embodiment of the present invention further provides an alarm analysis device, which may be applied to a system with a distributed architecture to support the practical application of the method shown in fig. 1. A schematic structural diagram of the alarm analysis device is shown in fig. 5, and the device is described as follows:
a determining unit 501, configured to determine the call chain corresponding to a transaction alarm when the system is detected to have generated the transaction alarm;
an analyzing unit 502, configured to analyze the transaction alarm based on the abnormal index information of the call chain, and determine whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
a first sending unit 503, configured to send alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel when it is determined that the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
a second sending unit 504, configured to determine, when it is determined that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm, whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold, and to send alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel when that number is greater than or equal to the alarm threshold.
In the device provided by the embodiment of the invention, when the system is detected to have generated a transaction alarm, the call chain corresponding to the transaction alarm is determined. The transaction alarm is analyzed based on the abnormal index information of the call chain to determine whether its alarm type is a call chain abnormal root cause alarm. When the alarm type is determined to be a call chain abnormal root cause alarm, alarm information corresponding to the transaction alarm is sent to the operation and maintenance personnel. When the alarm type is determined not to be a call chain abnormal root cause alarm, it is determined whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold, and when that number is greater than or equal to the alarm threshold, alarm information corresponding to the transaction alarm is sent to the operation and maintenance personnel.
The invention analyzes the transaction alarm and sends alarm information to the operation and maintenance personnel when the alarm type of the transaction alarm is the preset alarm type, namely a call chain abnormal root cause alarm. When the alarm type is not the preset alarm type, alarm information is sent once the number of times the system has consecutively generated the transaction alarm is greater than or equal to the preset alarm threshold. This effectively reduces how often the system alerts the operation and maintenance personnel and how much attention they must devote to the system, which reduces their workload and improves their working efficiency.
In the apparatus provided in the embodiment of the present invention, the determining unit 501 may comprise:
an analysis subunit, configured to parse the transaction alarm to obtain the transaction identifier in the transaction alarm;
a first determining subunit, configured to determine, in the system, the service corresponding to the transaction identifier;
and a second determining subunit, configured to determine the call chain of the service as the call chain corresponding to the transaction alarm.
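The three subunits above amount to two lookups after the alarm is parsed. The following sketch assumes, purely for illustration, that the alarm is a dictionary with a `transaction_id` field and that the service and call chain mappings are available as in-memory tables; the names `SERVICE_BY_TRANSACTION`, `CALL_CHAIN_BY_SERVICE`, and `resolve_call_chain`, and all sample values, are hypothetical.

```python
from typing import Dict, List

# Hypothetical lookup tables. A real system would presumably query its
# service registry and tracing data instead of hard-coded dictionaries.
SERVICE_BY_TRANSACTION: Dict[str, str] = {"TX1001": "payment-service"}
CALL_CHAIN_BY_SERVICE: Dict[str, List[str]] = {
    "payment-service": ["gateway", "payment-service", "account-service", "database"],
}


def resolve_call_chain(transaction_alarm: dict) -> List[str]:
    """Return the call chain corresponding to a transaction alarm."""
    # Parse the alarm to obtain the transaction identifier.
    transaction_id = transaction_alarm["transaction_id"]
    # Determine the service that corresponds to the identifier.
    service = SERVICE_BY_TRANSACTION[transaction_id]
    # The call chain of that service is the call chain of the alarm.
    return CALL_CHAIN_BY_SERVICE[service]
```

An unknown identifier raises `KeyError` in this sketch; how the embodiment handles that case is not stated in the text.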
In the apparatus provided in the embodiment of the present invention, the analysis unit 502 may comprise:
a third determining subunit, configured to determine, based on the abnormal index information, each monitoring index of the call chain;
an acquisition subunit, configured to acquire the alarm information in the transaction alarm;
a comparison subunit, configured to compare the alarm information with each monitoring index, and determine whether a monitoring index corresponding to the alarm information exists among the monitoring indexes;
a fourth determining subunit, configured to determine that the alarm type of the transaction alarm is a call chain abnormal root cause alarm if a monitoring index corresponding to the alarm information exists among the monitoring indexes;
and a fifth determining subunit, configured to determine that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm if no monitoring index corresponding to the alarm information exists among the monitoring indexes.
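The comparison performed by these subunits can be illustrated as a membership test. The text does not define what it means for a monitoring index to "correspond" to the alarm information, so this sketch assumes an exact match on an index name; the field name `index`, the function name `is_root_cause_alarm`, and the sample index names are all hypothetical.

```python
from typing import Iterable


def is_root_cause_alarm(alarm_info: dict, abnormal_index_info: Iterable[dict]) -> bool:
    """Classify a transaction alarm as a call chain abnormal root cause alarm or not."""
    # Determine the monitoring indexes of the call chain from the
    # abnormal index information.
    monitoring_indexes = {entry["index"] for entry in abnormal_index_info}
    # Assumption: "corresponds" is modeled as an exact index-name match.
    return alarm_info["index"] in monitoring_indexes
```

For example, if the abnormal index information lists `response_time` and `error_rate`, an alarm whose information names `error_rate` is classified as a root cause alarm, while one that names `cpu_usage` is not.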
In the apparatus provided in the embodiment of the present invention, the second sending unit 504 may comprise:
a sixth determining subunit, configured to determine the alarm record in the system and determine, based on the alarm record, the number of times the system has consecutively generated the transaction alarm;
and a judging subunit, configured to compare that number of times with the alarm threshold to judge whether it is greater than or equal to the alarm threshold.
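One way to derive the consecutive count from an alarm record is to count the most recent unbroken run of alarms. The record format is not specified in the text, so this sketch assumes a chronological sequence of booleans, one per monitoring cycle, where `True` means the transaction alarm was generated in that cycle; the function names are hypothetical.

```python
from typing import Sequence


def consecutive_alarm_count(alarm_record: Sequence[bool]) -> int:
    """Count the current run of consecutive alarms at the end of the record."""
    count = 0
    # Walk backwards from the most recent cycle until the run is broken.
    for alarmed in reversed(alarm_record):
        if not alarmed:
            break
        count += 1
    return count


def reaches_threshold(alarm_record: Sequence[bool], threshold: int) -> bool:
    """Judge whether the consecutive count is greater than or equal to the threshold."""
    return consecutive_alarm_count(alarm_record) >= threshold
```

Under this assumption a single cycle without the alarm resets the count, so intermittent alarms never reach the threshold on their own.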
In the apparatus provided in the embodiment of the present invention, the apparatus may further comprise:
a prohibiting unit, configured to prohibit sending the alarm information corresponding to the transaction alarm to the operation and maintenance personnel when the number of consecutive alarms is less than the alarm threshold.
An embodiment of the present invention further provides a storage medium that includes stored instructions, where the instructions, when run, control the device on which the storage medium is located to perform the following operations:
determining the call chain corresponding to a transaction alarm when the system is detected to have generated the transaction alarm;
analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
when the alarm type of the transaction alarm is determined to be a call chain abnormal root cause alarm, sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel;
when the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, determining whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold, and when that number is determined to be greater than or equal to the alarm threshold, sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel.
An embodiment of the present invention further provides an electronic device, whose structure is shown in fig. 6. The electronic device includes a memory 601 and one or more instructions 602, where the one or more instructions 602 are stored in the memory 601 and are configured to be executed by one or more processors 603 to perform the following operations:
determining the call chain corresponding to a transaction alarm when the system is detected to have generated the transaction alarm;
analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
when the alarm type of the transaction alarm is determined to be a call chain abnormal root cause alarm, sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel;
when the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, determining whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold, and when that number is determined to be greater than or equal to the alarm threshold, sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel.
The specific implementations of the above embodiments, and their derivatives, fall within the scope of protection of the present invention.
The embodiments in this specification are described in a progressive manner. Parts that are the same or similar among the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the corresponding parts of the method embodiments may be consulted for the related points. The apparatus and system embodiments described above are only illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those skilled in the art will further appreciate that the various illustrative units and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An alarm analysis method, comprising:
determining the call chain corresponding to a transaction alarm when the system is detected to have generated the transaction alarm;
analyzing the transaction alarm based on the abnormal index information of the call chain, and determining whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
when the alarm type of the transaction alarm is determined to be a call chain abnormal root cause alarm, sending alarm prompt information corresponding to the transaction alarm to operation and maintenance personnel;
when the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, determining whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold, and when that number is determined to be greater than or equal to the alarm threshold, sending alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel.
2. The method of claim 1, wherein determining the call chain corresponding to the transaction alarm comprises:
parsing the transaction alarm to obtain the transaction identifier in the transaction alarm;
determining, in the system, the service corresponding to the transaction identifier;
and determining the call chain of the service as the call chain corresponding to the transaction alarm.
3. The method according to claim 1, wherein the analyzing the transaction alarm based on the abnormal index information of the call chain to determine whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm comprises:
determining each monitoring index of the call chain based on the abnormal index information;
acquiring alarm information in the transaction alarm;
comparing the alarm information with each monitoring index to determine whether a monitoring index corresponding to the alarm information exists among the monitoring indexes;
if a monitoring index corresponding to the alarm information exists among the monitoring indexes, determining that the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
and if no monitoring index corresponding to the alarm information exists among the monitoring indexes, determining that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm.
4. The method of claim 1, wherein the determining whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold comprises:
determining the alarm record in the system, and determining, based on the alarm record, the number of times the system has consecutively generated the transaction alarm;
and comparing that number of times with the alarm threshold to judge whether it is greater than or equal to the alarm threshold.
5. The method of claim 4, further comprising:
and when the number of consecutive alarms is less than the alarm threshold, prohibiting sending the alarm information corresponding to the transaction alarm to the operation and maintenance personnel.
6. An alarm analysis apparatus, comprising:
a determining unit, configured to determine the call chain corresponding to a transaction alarm when the system is detected to have generated the transaction alarm;
an analysis unit, configured to analyze the transaction alarm based on the abnormal index information of the call chain, and determine whether the alarm type of the transaction alarm is a call chain abnormal root cause alarm;
a first sending unit, configured to send alarm prompt information corresponding to the transaction alarm to operation and maintenance personnel when the alarm type of the transaction alarm is determined to be a call chain abnormal root cause alarm;
and a second sending unit, configured to determine, when the alarm type of the transaction alarm is determined not to be a call chain abnormal root cause alarm, whether the number of times the system has consecutively generated the transaction alarm is greater than or equal to a preset alarm threshold, and to send alarm prompt information corresponding to the transaction alarm to the operation and maintenance personnel when that number is determined to be greater than or equal to the alarm threshold.
7. The apparatus of claim 6, wherein the determining unit comprises:
an analysis subunit, configured to parse the transaction alarm to obtain the transaction identifier in the transaction alarm;
a first determining subunit, configured to determine, in the system, the service corresponding to the transaction identifier;
and a second determining subunit, configured to determine the call chain of the service as the call chain corresponding to the transaction alarm.
8. The apparatus of claim 6, wherein the analysis unit comprises:
a third determining subunit, configured to determine, based on the abnormal index information, each monitoring index of the call chain;
the acquisition subunit is used for acquiring the alarm information in the transaction alarm;
the comparison subunit is configured to compare the alarm information with each monitoring index, and determine whether a monitoring index corresponding to the alarm information exists in each monitoring index;
a fourth determining subunit, configured to determine that the alarm type of the transaction alarm is a call chain abnormal root cause alarm if a monitoring index corresponding to the alarm information exists in each monitoring index;
and the fifth determining subunit is configured to determine that the alarm type of the transaction alarm is not a call chain abnormal root cause alarm if the monitoring index corresponding to the alarm information does not exist in the monitoring indexes.
9. The apparatus of claim 6, wherein the second sending unit comprises:
a sixth determining subunit, configured to determine the alarm record in the system and determine, based on the alarm record, the number of times the system has consecutively generated the transaction alarm;
and a judging subunit, configured to compare that number of times with the alarm threshold to judge whether it is greater than or equal to the alarm threshold.
10. The apparatus of claim 9, further comprising:
and a prohibiting unit, configured to prohibit sending the alarm information corresponding to the transaction alarm to the operation and maintenance personnel when the number of consecutive alarms is less than the alarm threshold.
CN202111134371.5A 2021-09-27 2021-09-27 Alarm analysis method and device Pending CN113821413A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111134371.5A CN113821413A (en) 2021-09-27 2021-09-27 Alarm analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111134371.5A CN113821413A (en) 2021-09-27 2021-09-27 Alarm analysis method and device

Publications (1)

Publication Number Publication Date
CN113821413A true CN113821413A (en) 2021-12-21

Family

ID=78915613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111134371.5A Pending CN113821413A (en) 2021-09-27 2021-09-27 Alarm analysis method and device

Country Status (1)

Country Link
CN (1) CN113821413A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110096408A (en) * 2019-03-11 2019-08-06 中国平安人寿保险股份有限公司 Alarm-monitor method, apparatus, electronic equipment and computer readable storage medium
CN111459695A (en) * 2020-03-12 2020-07-28 平安科技(深圳)有限公司 Root cause positioning method and device, computer equipment and storage medium
CN111555921A (en) * 2020-04-29 2020-08-18 平安科技(深圳)有限公司 Method and device for positioning alarm root cause, computer equipment and storage medium
CN111858123A (en) * 2020-07-29 2020-10-30 中国工商银行股份有限公司 Fault root cause analysis method and device based on directed graph network
CN112308455A (en) * 2020-11-20 2021-02-02 深圳前海微众银行股份有限公司 Root cause positioning method, device, equipment and computer storage medium
WO2021139252A1 (en) * 2020-07-31 2021-07-15 平安科技(深圳)有限公司 Operation and maintenance fault root cause identification method and apparatus, computer device, and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115514619A (en) * 2022-09-20 2022-12-23 建信金融科技有限责任公司 Alarm convergence method and system
CN115514619B (en) * 2022-09-20 2023-06-16 建信金融科技有限责任公司 Alarm convergence method and system

Similar Documents

Publication Publication Date Title
CN112162878B (en) Database fault discovery method and device, electronic equipment and storage medium
CN109688188A (en) Monitoring alarm method, apparatus, equipment and computer readable storage medium
CN104778111A (en) Alarm method and alarm device
JP2007502467A5 (en)
CN105549508B (en) A kind of alarm method and device merged based on information
CN110348718B (en) Service index monitoring method and device and electronic equipment
CN110708316A (en) Method and system architecture for enterprise network security operation management
CN108880845A (en) A kind of method and relevant apparatus of information alert
CN113704018A (en) Application operation and maintenance data processing method and device, computer equipment and storage medium
CN113821413A (en) Alarm analysis method and device
CN114844768A (en) Information analysis method and device and electronic equipment
CN114816917A (en) Monitoring data processing method, device, equipment and storage medium
CN110609761B (en) Method and device for determining fault source, storage medium and electronic equipment
CN116455725A (en) Network fault alarm method, system, terminal and storage medium
CN115102838B (en) Emergency processing method and device for server downtime risk and electronic equipment
CN113742169B (en) Service monitoring alarm method, device, equipment and storage medium
CN115277479A (en) Method and system for realizing system operation condition monitoring based on monitoring assistant
KR101288535B1 (en) Method for monitoring communication system and apparatus therefor
CN113391611B (en) Early warning method, device and system for power environment monitoring system
CN112508207A (en) Fault detection method, device, equipment and storage medium
CN111404740A (en) Fault analysis method and device, electronic equipment and computer readable storage medium
CN111581062A (en) Service fault processing method and server
CN104852810A (en) Method and equipment for determining abnormity of business platform
CN112433915B (en) Data monitoring method and related device based on distributed performance monitoring tool
CN113965486B (en) Line detection method and device for vertically positioning faults

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination