CN115981952A - Service monitoring alarm method and device - Google Patents
- Publication number
- CN115981952A (application CN202211590601.3A)
- Authority
- CN
- China
- Prior art keywords
- abnormal
- abnormality
- severity
- alarm
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The invention provides a service monitoring and alarming method and device. The method comprises the following steps: monitoring a target business service in real time; when the target business service is detected to be abnormal, checking whether a pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality; if so, assembling the abnormal information into a character string in a text data exchange format, wherein the character string comprises the abnormality name, abnormality detailed information, abnormal score, number of occurrences, alarm threshold, alarm mode, and alarm information receiver of the abnormality; calculating the abnormality score of the abnormality from the abnormal score and the number of occurrences; determining the severity of the abnormality according to the abnormality score, and obtaining the monitoring alarm information of the abnormality; and sending the monitoring alarm information to the alarm information receiver according to the severity of the abnormality and the alarm mode. The method ensures the accuracy of monitoring alarm information, intelligently analyzes the causes of problems, and reduces the cost of handling them.
Description
Technical Field
The invention relates to the technical field of software, in particular to a service monitoring and alarming method and device.
Background
With the rapid development of the company's business, monitoring and alarming on business reliability have become increasingly important. Because of the nature of the company, a large number of upstream and downstream businesses are involved, so monitoring only the abnormal information of a service is not enough to meet the company's needs.
In the related art, monitoring targets interface availability, that is, interface return values and interface response times, and cannot report abnormalities that occur inside a service. Moreover, current monitoring schemes mainly present static data and lack analysis of abnormalities. In addition, monitoring and alarm messages mainly reproduce the abnormal information from the log, which is hard to read, does little to reduce the cost of manually locating problems, and can increase communication cost.
Therefore, existing monitoring approaches cannot deliver alarm information accurately and in time, cannot intelligently analyze the causes of common problems, and cannot shorten the troubleshooting path or reduce the time it takes.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for service monitoring and alarming, so as to achieve the purposes of ensuring accuracy of monitoring and alarming information, intelligently analyzing causes of problems, and reducing cost of problem processing.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
the first aspect of the embodiment of the invention discloses a service monitoring and alarming method, which comprises the following steps:
monitoring a target business service in real time;
when the target business service is detected to be abnormal, checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality;
if so, splicing the abnormal information into a character string in a text data exchange format, wherein the character string comprises an abnormal name, abnormal detailed information, an abnormal score, occurrence times, an alarm threshold value, an alarm mode and an alarm information receiver of the abnormality;
calculating the abnormality score of the abnormality according to the abnormal score and the number of occurrences of the abnormality;
determining the severity of the abnormality according to the abnormality score, and obtaining monitoring alarm information of the abnormality;
and sending the abnormal monitoring alarm information to the alarm information receiver according to the severity of the abnormality and the alarm mode, wherein the monitoring alarm information comprises the name of the abnormality, the detailed information of the abnormality, the reason of the error, the severity, the alarm mode and the alarm information receiver.
Optionally, when it is detected that the target service is abnormal, checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality includes:
detecting a log of the target business service in real time;
if any one of the logs hits an abnormal keyword, determining that the target business service is abnormal, and checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality.
Optionally, the method further includes:
if not, determining the abnormal score of the abnormality as 0.
Optionally, the determining the severity of the abnormality according to the abnormality score includes:
if the abnormality score is at a first abnormality level, determining the severity of the abnormality as a low-risk level;
if the anomaly score is at a second anomaly level, determining the severity of the anomaly as a general severity;
determining the severity of the abnormality as moderate severity if the abnormality score is at a third abnormality level;
and if the abnormality score is at a fourth abnormality level, determining the severity of the abnormality as high severity.
Optionally, the sending the abnormal monitoring alarm information to the alarm information receiver according to the severity of the abnormality and the alarm mode includes:
calling an interface corresponding to the alarm mode according to the alarm mode;
and sending the abnormal monitoring alarm information to the alarm information receiver based on the interface and the severity of the abnormality.
The second aspect of the embodiment of the invention discloses a service monitoring and alarming device, which comprises:
the monitoring module is used for monitoring the target business service in real time;
the anomaly detection module is used for checking whether a pre-established basic information configuration library is configured with anomaly information corresponding to the anomaly or not when the target business service is detected to be abnormal; if so, splicing the abnormal information into a character string in a text data exchange format, wherein the character string comprises an abnormal name, abnormal detailed information, an abnormal score, occurrence times, an alarm threshold value, an alarm mode and an alarm information receiver of the abnormality;
the anomaly analysis module is used for calculating the anomaly score of the anomaly according to the anomaly score and the occurrence frequency of the anomaly; determining the severity of the abnormality according to the abnormality score, and obtaining monitoring alarm information of the abnormality;
and the sending alarm module is used for sending the abnormal monitoring alarm information to the alarm information receiver according to the abnormal severity and the alarm mode, wherein the monitoring alarm information comprises the abnormal name, the abnormal detailed information, the error reason, the severity, the alarm mode and the alarm information receiver.
Optionally, the anomaly detection module, configured to check whether a pre-established basic information configuration library configures anomaly information corresponding to an anomaly when the target service is detected to be abnormal, includes:
the detection unit is used for detecting the logs of the target business service in real time;
and the checking unit is used for determining that the target business service is abnormal if any one of the logs hits an abnormal keyword, and checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality.
Optionally, the anomaly detection module is further configured to:
if not, determining the abnormal score of the abnormality as 0.
Optionally, the abnormality analysis module for determining the severity of the abnormality according to the abnormality score includes:
a first determining unit, configured to determine the severity of the abnormality as a low-risk level if the abnormality score is at a first abnormality level;
a second determining unit, configured to determine the severity of the abnormality as general severity if the abnormality score is at a second abnormality level;
a third determining unit, configured to determine the severity of the abnormality as moderate severity if the abnormality score is at a third abnormality level;
and a fourth determining unit, configured to determine the severity of the abnormality as high severity if the abnormality score is at a fourth abnormality level.
Optionally, the sending alarm module includes:
the calling unit is used for calling an interface corresponding to the alarm mode according to the alarm mode;
and the sending alarm unit is used for sending the abnormal monitoring alarm information to the alarm information receiver based on the interface and the severity of the abnormality.
Based on the above method and apparatus for service monitoring and alarming provided by the embodiments of the present invention, the method includes: monitoring a target business service in real time; when the target business service is detected to be abnormal, checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality; if so, splicing the abnormal information into a character string in a text data exchange format, wherein the character string comprises an abnormal name, abnormal detailed information, an abnormal score, occurrence times, an alarm threshold value, an alarm mode and an alarm information receiver of the abnormality; calculating the abnormal score of the abnormality according to the abnormal score and the occurrence frequency of the abnormality; determining the severity of the abnormality according to the abnormality score, and obtaining monitoring alarm information of the abnormality; and sending the abnormal monitoring alarm information to the alarm information receiver according to the severity of the abnormality and the alarm mode, wherein the monitoring alarm information comprises the name of the abnormality, the detailed information of the abnormality, the reason for the error, the severity, the alarm mode and the alarm information receiver. In the scheme, if the target business service is detected to be abnormal, abnormal information corresponding to the abnormality is configured in the basic information configuration library, the abnormal information is assembled into a character string, the severity of the abnormality is determined according to the calculated abnormal score, and abnormal monitoring alarm information is sent according to the severity of the abnormality and the alarm mode, so that the accuracy of the monitoring alarm information is guaranteed, the problem causes are intelligently analyzed, and the problem processing cost is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic flow chart of a service monitoring and alarming method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of detecting whether a target business service is abnormal according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of determining the severity of an abnormality according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of sending the monitoring alarm information of an abnormality to an alarm information receiver according to an embodiment of the present invention;
Fig. 5 is a schematic block diagram of service monitoring and alarming according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a service monitoring and alarming device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrases "comprising a," "...," or "comprising" does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
The terms "first," "second," "third," "fourth," and the like in the description and claims of this application and in the above-described drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be implemented in other sequences than those illustrated or described herein.
As noted in the Background, existing monitoring approaches cannot deliver alarm information accurately and in time, cannot intelligently analyze the causes of common problems, and cannot shorten the troubleshooting path or reduce the time it takes.
Therefore, the embodiment of the invention provides a service monitoring and alarming method and a device, in the scheme, if the target service is detected to be abnormal, abnormal information corresponding to the abnormality is configured in a basic information configuration library, the abnormal information is assembled into a character string, the severity of the abnormality is determined according to the calculated abnormal score, and abnormal monitoring and alarming information is sent according to the severity of the abnormality and the alarming mode, so that the accuracy of the monitoring and alarming information is ensured, the problem reason is intelligently analyzed, and the problem processing cost is reduced.
As shown in fig. 1, a schematic flow chart of a service monitoring and alarming method provided in an embodiment of the present invention is shown, where the method mainly includes the following steps:
Step S101: monitoring the target business service in real time.
In the embodiment of the invention, the target business service is file rendering.
In practical application, to monitor the target business service, a monitoring alarm function is added to file rendering. The function targets file rendering abnormalities: through an alarm strategy it raises a timely alarm for requests whose calls fail 100% of the time, intelligently analyzes the cause of the problem from the error information, and provides readable analysis information that helps the user and the service provider quickly troubleshoot and locate the problem.
Step S102: judging whether the target business service is detected to be abnormal; if so, executing step S103; otherwise, ending the operation.
In the process of implementing step S102, the target business service is detected in real time. If an abnormality is detected, the abnormality needs to be handled and step S103 is executed; otherwise, the target business service is in a normal state and can provide service normally, and the operation ends directly.
In the embodiment of the invention, the abnormity is captured when the abnormity of the target business service is detected.
Step S103: checking whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality; if so, executing step S104; if not, executing step S108.
In the process of implementing step S103, it is checked whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality. If so, step S104 is executed; if not, the abnormality and its corresponding abnormal information were not found in the basic information configuration library, and step S108 is executed.
Optionally, Fig. 2 shows a schematic flow chart, provided in an embodiment of the present invention, of judging whether the target business service is abnormal in step S102 and checking the basic information configuration library in step S103. The process mainly includes the following steps:
Step S201: detecting the logs of the target business service in real time.
In step S201, the log may be denoted as log.
Step S202: judging whether any entry in the log hits the abnormal keyword; if so, executing step S203; if not, ending the operation.
In step S202, the abnormal keyword is exception.
In the process of implementing step S202, it is judged whether any entry in the log hits the abnormal keyword, that is, whether any entry contains exception. If so, step S203 is executed; if not, the log shows normal data, the target business service is in a normal state and can provide service normally, and the operation ends directly.
Step S203: determining that the target business service is abnormal, and checking whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality.
In the process of implementing step S203, when any entry in the log hits the abnormal keyword, it is determined that the target business service is abnormal, and it is checked whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality. If so, step S104 is executed; if not, step S108 is executed.
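As an illustration only, steps S201 to S203 can be sketched in Python as follows. The dictionary `CONFIG_LIBRARY`, its key `FileRenderException`, the field names, and the receiver address are hypothetical stand-ins for the basic information configuration library; the embodiment does not specify how the library is stored or how a log entry is matched to a configured abnormality.

```python
from typing import Optional

# Hypothetical stand-in for the basic information configuration library.
# Keys, field names, and values are illustrative assumptions.
CONFIG_LIBRARY = {
    "FileRenderException": {
        "name": "file rendering exception",
        "detail": "the local file does not exist",
        "score": 1,
        "alarm_threshold": 100,
        "alarm_mode": "mail",
        "receiver": "ops@example.com",
    },
}

ABNORMAL_KEYWORD = "exception"


def hits_keyword(log_entry: str) -> bool:
    """Step S202: report whether a log entry hits the abnormal keyword."""
    return ABNORMAL_KEYWORD in log_entry.lower()


def lookup_abnormality(log_entry: str) -> Optional[dict]:
    """Step S203: return the configured entry for the abnormality, or None.

    Matching by substring on the library key is an assumption of this sketch.
    """
    lowered = log_entry.lower()
    for key, entry in CONFIG_LIBRARY.items():
        if key.lower() in lowered:
            return entry
    return None
```

A log entry that hits the keyword but has no configured entry returns `None` from `lookup_abnormality`, which corresponds to the branch leading to step S108.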
Step S104: assembling the abnormal information into a character string in a text data exchange format.
In step S104, the character string includes the abnormality name, abnormality detailed information, abnormal score, number of occurrences, alarm threshold, alarm mode, and alarm information receiver of the abnormality.
The alarm modes include but are not limited to short message, mail, and enterprise WeChat.
In the embodiment of the invention, the character string in the text data exchange format is a json string.
It should be noted that json (JavaScript Object Notation) is a lightweight data exchange format. Based on a subset of ECMAScript (the js specification set by the European Computer Manufacturers Association), it stores and represents data in a text format that is completely independent of any programming language.
In the process of implementing step S104, when the check finds that the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality, the abnormal information is converted into an expression that the business side can easily understand; that is, the abnormal information is assembled into a json string.
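A minimal sketch of the assembly in step S104, using Python's standard `json` module. The field names in the payload are assumptions chosen to mirror the seven items listed for the character string; the embodiment does not fix the key names.

```python
import json


def build_abnormality_json(entry: dict, occurrences: int) -> str:
    """Assemble the configured abnormal information into a json string."""
    payload = {
        "name": entry["name"],                        # abnormality name
        "detail": entry["detail"],                    # abnormality detailed information
        "score": entry["score"],                      # abnormal score
        "occurrences": occurrences,                   # number of occurrences
        "alarm_threshold": entry["alarm_threshold"],  # alarm threshold
        "alarm_mode": entry["alarm_mode"],            # alarm mode
        "receiver": entry["receiver"],                # alarm information receiver
    }
    # ensure_ascii=False keeps non-ASCII text readable in the output string.
    return json.dumps(payload, ensure_ascii=False)


# Hypothetical configured entry, for illustration only.
example_entry = {
    "name": "file rendering exception",
    "detail": "the local file does not exist",
    "score": 1,
    "alarm_threshold": 100,
    "alarm_mode": "mail",
    "receiver": "ops@example.com",
}
example_json = build_abnormality_json(example_entry, occurrences=3)
```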
Step S105: calculating the abnormality score of the abnormality from the abnormal score and the number of occurrences of the abnormality.
In the process of implementing step S105, the abnormality score is calculated from the abnormal score carried in the character string, the number of occurrences of the abnormality, and a preset scoring criterion.
In practical application, the scoring criterion is formulated according to the characteristics of the target business service, as follows:
Each occurrence of the abnormality adds 1 point to the abnormality score. If the abnormality occurs 3 or more times in one day, each occurrence from the 4th onward adds 3 points. When the abnormality has occurred a cumulative 10 times in one week, the 10th occurrence adds 10 points. If the abnormality occurs 2 or more times within one hour, 5 points are added starting from the 3rd occurrence. If the abnormality occurs 3 or more times within one minute, the 3rd occurrence adds 100 points. When the score reaches the threshold of 100, an alarm is raised for the abnormality once every 10 minutes until no new occurrence has been added for 30 consecutive minutes, after which the alarm is automatically cancelled. The scores are cleared monthly.
It should be noted that the above criteria, score conditions, clearing period, and threshold may all be made configurable.
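The scoring criterion leaves several details open, so the following Python sketch is one possible reading rather than the embodiment's implementation. It assumes that the bonuses are added on top of the base point of each occurrence, that each time window is a sliding window ending at the occurrence being scored, and that the hourly bonus applies to every occurrence from the 3rd onward. Monthly clearing and configurability are not modeled. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List

MINUTE = timedelta(minutes=1)
HOUR = timedelta(hours=1)
DAY = timedelta(days=1)
WEEK = timedelta(weeks=1)


def count_in_window(times: List[datetime], index: int, window: timedelta) -> int:
    """Count occurrences up to times[index] that fall within `window` of it."""
    current = times[index]
    return sum(1 for t in times[: index + 1] if current - t < window)


def abnormality_score(timestamps: Iterable[datetime]) -> int:
    """Accumulate the score over all occurrences under the stated assumptions."""
    times = sorted(timestamps)
    total = 0
    for i in range(len(times)):
        points = 1  # every occurrence adds 1 point
        if count_in_window(times, i, DAY) >= 4:
            points += 3    # 4th and later occurrence within one day
        if count_in_window(times, i, WEEK) == 10:
            points += 10   # 10th occurrence within one week
        if count_in_window(times, i, HOUR) >= 3:
            points += 5    # 3rd and later occurrence within one hour
        if count_in_window(times, i, MINUTE) == 3:
            points += 100  # 3rd occurrence within one minute
        total += points
    return total
```

Under these assumptions, three occurrences within one minute score 1 + 1 + (1 + 5 + 100) = 108, which already exceeds the threshold of 100.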
Step S106: determining the severity of the abnormality according to the abnormality score, and obtaining the monitoring alarm information of the abnormality.
In the process of implementing step S106, the severity of the abnormality is determined from the abnormality level in which the abnormality score falls, and the monitoring alarm information of the abnormality is obtained according to that severity.
Optionally, Fig. 3 shows a schematic flow chart, provided in an embodiment of the present invention, of determining the severity of the abnormality according to the abnormality score in step S106. The process mainly includes the following steps:
Step S301: judging whether the abnormality score is at the first abnormality level; if so, executing step S302; if not, executing step S303.
In the process of implementing step S301, it is judged whether the abnormality score is at the first abnormality level. If so, step S302 is executed; if not, the abnormality score may be at the second, third, or fourth abnormality level, and step S303 is executed.
In the embodiment of the present invention, the first abnormality level is a score greater than or equal to 0 and less than 30; the second abnormality level is a score greater than or equal to 30 and less than 70; the third abnormality level is a score greater than or equal to 70 and less than 100; the fourth abnormality level is a score greater than or equal to 100.
Step S302: the severity of the abnormality is determined as a low risk level.
In the process of specifically implementing step S302, in the case where it is determined that the abnormality score is at the first abnormality level, that is, in the case where it is determined that the abnormality score is 0 or more and less than 30, the severity of the abnormality is determined as the low-risk degree.
Step S303: judging whether the abnormality score is at the second abnormality level; if so, executing step S304; if not, executing step S305.
In the process of implementing step S303, it is judged whether the abnormality score is at the second abnormality level. If so, step S304 is executed; if not, the abnormality score may be at the third or fourth abnormality level, and step S305 is executed.
Step S304: the severity of the anomaly is determined as the general severity.
In the process of implementing step S304 in detail, in the case where it is determined that the abnormality score is at the second abnormality level, that is, in the case where it is determined that the abnormality score is equal to or greater than 30 and less than 70, the severity of the abnormality is determined as the general severity.
Step S305: judging whether the abnormality score is at the third abnormality level; if so, executing step S306; if not, executing step S307.
In the process of implementing step S305, it is judged whether the abnormality score is at the third abnormality level. If so, step S306 is executed; if not, the abnormality score may be at the fourth abnormality level, and step S307 is executed.
Step S306: the severity of the abnormality is determined as moderate severity.
In the process of implementing step S306, when the abnormality score is at the third abnormality level, that is, greater than or equal to 70 and less than 100, the severity of the abnormality is determined as moderate severity.
Step S307: judging whether the abnormality score is at the fourth abnormality level; if so, executing step S308; if not, ending the operation.
In the process of implementing step S307, it is judged whether the abnormality score is at the fourth abnormality level. If so, step S308 is executed; if not, the abnormality score is less than 0, and the operation ends.
Step S308: the severity of the abnormality is determined as high severity.
In the process of implementing step S308, when the abnormality score is at the fourth abnormality level, that is, greater than or equal to 100, the severity of the abnormality is determined as high severity.
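The cascade of steps S301 to S308 amounts to a range lookup on the abnormality score. A minimal Python sketch follows; the function name and the returned labels are hypothetical, while the boundaries 0, 30, 70, and 100 are those given for the four abnormality levels.

```python
from typing import Optional


def severity_for_score(score: float) -> Optional[str]:
    """Map an abnormality score to a severity label.

    Returns None for a negative score, where the flow simply ends.
    """
    if score < 0:
        return None
    if score < 30:
        return "low-risk"   # first abnormality level: 0 <= score < 30
    if score < 70:
        return "general"    # second abnormality level: 30 <= score < 70
    if score < 100:
        return "moderate"   # third abnormality level: 70 <= score < 100
    return "high"           # fourth abnormality level: score >= 100
```

Checking the ranges in ascending order means each branch only needs an upper bound, which mirrors the way each step in the flow falls through to the next level.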
Step S107: sending the monitoring alarm information of the abnormality to the alarm information receiver according to the severity of the abnormality and the alarm mode.
In step S107, the monitoring alarm information includes the abnormality name, abnormality detailed information, cause of the error, severity, alarm mode, and alarm information receiver.
In the process of implementing step S107, if the alarm mode is mail, the obtained monitoring alarm information is sent to the alarm information receiver by mail according to the severity of the abnormality.
If the alarm mode is short message, the obtained monitoring alarm information is sent to the alarm information receiver by short message according to the severity of the abnormality.
If the alarm mode is enterprise WeChat, the obtained monitoring alarm information is sent to the alarm information receiver through enterprise WeChat according to the severity of the abnormality.
Optionally, Fig. 4 shows a schematic flow chart, provided in an embodiment of the present invention, of sending the monitoring alarm information of the abnormality to the alarm information receiver according to the severity and the alarm mode in step S107. The process mainly includes the following steps:
Step S401: calling the interface corresponding to the alarm mode.
Step S402: sending the monitoring alarm information of the abnormality to the alarm information receiver based on the interface and the severity of the abnormality.
In the process of implementing step S401 and step S402, if the alarm mode is mail, a mail sending interface is called, and the obtained monitoring alarm information is sent to the alarm information receiver based on the mail sending interface and the severity of the abnormality.
If the alarm mode is short message, a short message sending interface is called, and the obtained monitoring alarm information is sent to the alarm information receiver based on the short message sending interface and the severity of the abnormality.
If the alarm mode is enterprise WeChat, an enterprise WeChat sending interface is called, and the obtained monitoring alarm information is sent to the alarm information receiver based on the enterprise WeChat sending interface and the severity of the abnormality.
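Steps S401 and S402 can be sketched as a lookup table from alarm mode to sending interface. In the Python sketch below, the three sender functions are hypothetical placeholders that only record what would be sent; real mail, short message, and enterprise WeChat interfaces are not specified in the embodiment and are not modeled here.

```python
from typing import Callable, Dict, List, Tuple

# Records what would have been sent; real senders would call external services.
OUTBOX: List[Tuple[str, str, str]] = []


def send_mail(receiver: str, message: str) -> None:
    OUTBOX.append(("mail", receiver, message))


def send_sms(receiver: str, message: str) -> None:
    OUTBOX.append(("sms", receiver, message))


def send_enterprise_wechat(receiver: str, message: str) -> None:
    OUTBOX.append(("enterprise_wechat", receiver, message))


# Step S401: select the interface that corresponds to the alarm mode.
SENDERS: Dict[str, Callable[[str, str], None]] = {
    "mail": send_mail,
    "sms": send_sms,
    "enterprise_wechat": send_enterprise_wechat,
}


def dispatch_alert(alert: dict) -> None:
    """Step S402: send the alarm through the selected interface."""
    sender = SENDERS.get(alert["alarm_mode"])
    if sender is None:
        raise ValueError(f"unsupported alarm mode: {alert['alarm_mode']}")
    message = f"[{alert['severity']}] {alert['name']}: {alert['detail']}"
    sender(alert["receiver"], message)
```

Because the alarm modes are "not limited to" these three, a table keeps the dispatch open to extension: a new mode only needs one more entry in `SENDERS`.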
Step S108: the abnormality score of the abnormality is determined to be 0.
In the process of implementing step S108, when the check finds that the pre-established basic information configuration library is not configured with abnormal information corresponding to the abnormality, the abnormality score of the abnormality is determined to be 0.
For better understanding of the above description, fig. 5 is a schematic block diagram of a service monitoring alarm according to an embodiment of the present invention.
In fig. 5, the anomaly detection module is provided with a basic information configuration library for defining a scoring mechanism, a threshold value and an alarm strategy.
The anomaly analysis module is used for calculating the abnormality score and selecting an alarm strategy, and is provided with a grading system, a scoring system, a timing alarm queue, a configuration module, and an exception library.
The grading system defines the abnormality levels; the scoring system calculates the abnormality score of the abnormality; the timing alarm queue polls the alarm queue at regular intervals and raises an alarm once every preset period; the configuration module configures the basic information of abnormalities; and the exception library records the abnormal information of abnormalities.
The sending alarm module is used for distributing and processing alarms.
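The behavior of the timing alarm queue, combined with the rule that an alarm repeats every 10 minutes until 30 consecutive minutes pass without a new occurrence, can be sketched as a pure function over timestamps. This is only an illustration: it assumes the first alarm fires at the moment the threshold is reached and that the quiet period is measured from the most recent occurrence; neither detail is stated in the embodiment, and the names below are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

ALERT_INTERVAL = timedelta(minutes=10)  # alarm once every 10 minutes
QUIET_PERIOD = timedelta(minutes=30)    # cancel after 30 minutes without new occurrences


def alert_times(threshold_reached: datetime, later_occurrences: List[datetime]) -> List[datetime]:
    """Return the times at which the repeating alarm fires."""
    pending = sorted(t for t in later_occurrences if t >= threshold_reached)
    last_occurrence = threshold_reached
    tick = threshold_reached
    alerts: List[datetime] = []
    while tick - last_occurrence < QUIET_PERIOD:
        alerts.append(tick)
        tick += ALERT_INTERVAL
        # Account for occurrences that arrived before the next poll.
        while pending and pending[0] <= tick:
            last_occurrence = pending.pop(0)
    return alerts
```

With no further occurrences the alarm fires three times (at 0, 10, and 20 minutes) and is then cancelled; a new occurrence at 25 minutes extends it to six alarms.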
In a specific implementation, after detecting an exception keyword in a log, the anomaly detection module checks whether the basic information configuration library is configured for the captured exception. If so, it assembles the exception information into a JSON string and sends the string to the anomaly analysis module; if not, the anomaly score of the exception is determined to be 0.
After receiving the JSON string from the anomaly detection module, the anomaly analysis module calculates the anomaly score from the configured score and the number of occurrences carried in the JSON string, determines the severity from that score, and obtains the monitoring alarm information of the exception. It then sends the monitoring alarm information to the sending alarm module according to the severity and the corresponding alarm mode.
The sending alarm module receives the monitoring alarm information, calls the interface of the alarm mode specified in it, and sends the monitoring alarm information out.
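The detection step described above can be sketched as follows. This is a minimal illustration and not the implementation of the embodiment: the keyword list, the contents of `CONFIG_LIBRARY`, the field names and the receiver address are all hypothetical.

```python
import json

# Hypothetical exception keywords searched for in each log line.
EXCEPTION_KEYWORDS = ["FileRenderException", "TimeoutException"]

# Hypothetical basic information configuration library: keyword -> exception info.
CONFIG_LIBRARY = {
    "FileRenderException": {
        "name": "file rendering exception",
        "score": 1,
        "threshold": 100,
        "alarm_mode": "email",
        "receivers": ["ops@example.com"],
    },
}


def detect(log_line, occurrences):
    """Return the detection result for one log line, or None if no keyword is hit."""
    for keyword in EXCEPTION_KEYWORDS:
        if keyword not in log_line:
            continue
        info = CONFIG_LIBRARY.get(keyword)
        if info is None:
            # Exception hit but not configured: its score is determined to be 0.
            return {"keyword": keyword, "score": 0, "payload": None}
        # Configured exception: assemble the information into a JSON string.
        payload = json.dumps(
            dict(info, detail=log_line, occurrences=occurrences),
            ensure_ascii=False,
        )
        return {"keyword": keyword, "score": info["score"], "payload": payload}
    return None
```

In this sketch the `payload` string is what would be handed to the anomaly analysis module; an unconfigured exception carries no payload and a score of 0.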
The above is illustrated below by way of example.
First, the exception scenes are refined and the information of each exception is defined. An example of an exception scene is a file rendering exception.
Specifically, based on the actual exception hit in the log, the monitoring alarm information gives information that is unique to that exception scene. The error code and the error information returned by the interface are unchanged, but when the retry mechanism still fails, the following information is returned in the monitoring:
Event name: file rendering exception
Occurrence time: 2022-11-1919
Application name: ccbscf-file-server
Environment information: tX
Tenant identification: XXXX
traceId: f7xxxxxx108
Business flow number: 2211 xxx-xxx-7188220e8
Event description: rendering time 2022-01-1919
Next, the exception information is packaged, and a highly readable error analysis result is defined based on the characteristics of the exception information.
Specifically, the captured exception information is converted, with reference to the problems that frequently occur online, into an expression that the business party can easily understand.
For example, the detailed reason may read: the local file does not exist; the file number is e037a722-xxxxx8xxx-8ba9adeca.
Next, a grade, weight, and score are defined for each anomaly, with an initial score of 0.
Then, according to the service characteristics, a scoring standard is established.
Specifically, each occurrence of the exception adds 1 point. If the exception occurs 3 or more times in one day, 3 points are added for each occurrence from the 4th onward. If the exception occurs 10 times in one week, 10 points are added at the 10th occurrence. If the exception occurs 2 or more times within one hour, 5 points are added for each occurrence from the 3rd onward. If the exception occurs 3 or more times within one minute, the third occurrence raises the score to 100.
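The scoring standard above can be sketched in code. The text leaves several points open, so this sketch makes assumptions: the periods are treated as rolling windows rather than calendar days or weeks, the highest applicable increment is used for each occurrence instead of summing the increments, and the one-minute rule raises the score to at least 100. All function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

ONE_MINUTE_SCORE = 100  # score reached when 3 occurrences fall within one minute


def count_within(history, now, window):
    """Count occurrences in the rolling window (now - window, now]."""
    return sum(1 for t in history if now - window < t <= now)


def points_for(history, now):
    """Points added by the occurrence at `now`; `history` already contains it."""
    points = 1  # every occurrence adds at least 1 point
    if count_within(history, now, timedelta(days=1)) >= 4:
        points = max(points, 3)   # 4th or later occurrence within a day
    if count_within(history, now, timedelta(hours=1)) >= 3:
        points = max(points, 5)   # 3rd or later occurrence within an hour
    if count_within(history, now, timedelta(weeks=1)) == 10:
        points = max(points, 10)  # 10th occurrence within a week
    return points


def anomaly_score(timestamps):
    """Accumulate the score over all occurrences, processed in time order."""
    history, total = [], 0
    for now in sorted(timestamps):
        history.append(now)
        total += points_for(history, now)
        if count_within(history, now, timedelta(minutes=1)) >= 3:
            total = max(total, ONE_MINUTE_SCORE)
    return total
```

Under these assumptions, four occurrences spaced two hours apart score 1 + 1 + 1 + 3 = 6, while three occurrences within the same minute score 100.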
When the score reaches the threshold value of 100, an alarm is raised for the exception once every 10 minutes until no new occurrence of the exception has been added for 30 consecutive minutes, at which point the alarm is cancelled automatically. The scores are cleared monthly.
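The repeat-and-cancel behaviour of the timed alarm queue can be sketched as a single decision function. The constants mirror the values in the text; the function name and its parameters are hypothetical, and the monthly clearing of scores is left out.

```python
from datetime import datetime, timedelta

THRESHOLD = 100
ALARM_INTERVAL = timedelta(minutes=10)  # repeat the alarm every 10 minutes
QUIET_PERIOD = timedelta(minutes=30)    # cancel after 30 minutes without new occurrences


def alarm_due(score, now, last_occurrence, last_alarm=None):
    """Decide whether the timed queue should raise an alarm at `now`."""
    if score < THRESHOLD:
        return False  # threshold not reached
    if now - last_occurrence >= QUIET_PERIOD:
        return False  # no new occurrence for 30 minutes: cancelled
    return last_alarm is None or now - last_alarm >= ALARM_INTERVAL
```

A poller would call `alarm_due` at each tick and record the time of every alarm it sends as the next `last_alarm`.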
The above standard, the scoring conditions, the clearing time and the threshold information can all be implemented as configuration, so that the configuration file can be modified conveniently as the business changes. Because the configuration file is read first each time an exception alarm is triggered, the frequency of the alarm information can be adjusted at any time through the data in the configuration file.
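Reading the configuration on every trigger could look like the sketch below. The JSON file format, the key names and the default values other than the threshold of 100, the 10-minute interval and the 30-minute quiet period are assumptions for illustration.

```python
import json
from pathlib import Path

# Defaults taken from the example values in the text; key names are hypothetical.
DEFAULTS = {"threshold": 100, "alarm_interval_minutes": 10, "quiet_minutes": 30}


def load_alarm_config(path):
    """Read the configuration file on every call so edits apply to the next alarm."""
    config = dict(DEFAULTS)
    config_file = Path(path)
    if config_file.exists():
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    return config
```

Because nothing is cached, a change to the file takes effect the next time an alarm is triggered, at the cost of one file read per trigger.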
The business characteristics include, but are not limited to, whether the business directly affects users and how strongly the company is affected when the process has a problem.
Then, after a file rendering exception occurs, the monitoring information shows the exception that raised the alarm, the exception analysis, the score and the number of occurrences on the current day. This helps the business party decide how to respond based on the severity and frequency of the exception.
The frequency of the exception is calculated from the number of occurrences of the same exception.
The severity of the exception is determined from the score. The threshold can be configured by the business party in the configuration file and defaults to 100 points; if the score >= 100 points, the exception is judged to be severe.
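A mapping from score to the four severity levels of the method might be sketched as follows. Only the top boundary (score >= threshold, default 100) comes from the text; the lower cut-offs at 50% and 20% of the threshold are placeholders chosen for illustration.

```python
def severity(score, threshold=100):
    """Map an anomaly score to a severity level.

    Only `score >= threshold` is taken from the description; the 0.5 and 0.2
    cut-offs below are assumed placeholders, not values from the embodiment.
    """
    if score >= threshold:
        return "high"
    if score >= threshold * 0.5:
        return "moderate"
    if score >= threshold * 0.2:
        return "general"
    return "low-risk"
```

Passing `threshold` as a parameter reflects that the business party can change it in the configuration file.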
With the service monitoring alarm method provided by the embodiment of the present invention, the target business service is monitored in real time. When the target business service is detected to be abnormal, it is checked whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality. If so, the abnormal information is spliced into a character string in a text data exchange format, the character string comprising the abnormality name, the abnormality detailed information, the abnormality score, the number of occurrences, the alarm threshold value, the alarm mode and the alarm information receiver. The abnormality score of the abnormality is calculated from the configured score and the number of occurrences; the severity of the abnormality is determined from the abnormality score, and the monitoring alarm information of the abnormality is obtained. The monitoring alarm information is then sent to the alarm information receiver according to the severity of the abnormality and the alarm mode; it comprises the abnormality name, the abnormality detailed information, the error reason, the severity, the alarm mode and the alarm information receiver. In this scheme, if the target business service is detected to be abnormal and the basic information configuration library is configured with abnormal information corresponding to the abnormality, the abnormal information is assembled into a character string, the severity of the abnormality is determined from the calculated abnormality score, and the monitoring alarm information is sent according to the severity and the alarm mode. The accuracy of the monitoring alarm information is thereby ensured, the cause of the problem is analyzed intelligently, and the cost of handling the problem is reduced.
Corresponding to the service monitoring and alarming method shown in fig. 1 in the embodiment of the present invention, an embodiment of the present invention further provides a service monitoring and alarming device, as shown in fig. 6, where the device includes: a monitoring module 601, an anomaly detection module 602, an anomaly analysis module 603 and a sending alarm module 604.
The monitoring module 601 is configured to monitor the target business service in real time.
The anomaly detection module 602 is configured to check, when the target business service is detected to be abnormal, whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality, and if so, to assemble the abnormal information into a character string in a text data exchange format.
The character string comprises the abnormality name, the abnormality detailed information, the abnormality score, the number of occurrences, the alarm threshold value, the alarm mode and the alarm information receiver of the abnormality.
The anomaly analysis module 603 is configured to calculate the abnormality score of the abnormality from the configured score and the number of occurrences of the abnormality, to determine the severity of the abnormality from the abnormality score, and to obtain the monitoring alarm information of the abnormality.
The sending alarm module 604 is configured to send the monitoring alarm information of the abnormality to the alarm information receiver according to the severity of the abnormality and the alarm mode.
The monitoring alarm information comprises the abnormality name, the abnormality detailed information, the error reason, the severity, the alarm mode and the alarm information receiver.
Optionally, in the anomaly detection module 602 shown in fig. 6, the part configured to check, when the target business service is detected to be abnormal, whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality includes:
a detection unit, configured to detect the log of the target business service in real time; and
a checking unit, configured to determine that the target business service is abnormal if the log hits any abnormal keyword, and to check whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality.
Optionally, the anomaly detection module 602 shown in fig. 6 is further configured to:
if not, determine the abnormality score of the abnormality to be 0.
Optionally, in the anomaly analysis module 603 shown in fig. 6, the part configured to determine the severity of the abnormality according to the abnormality score includes:
a first determining unit, configured to determine the severity of the abnormality as low-risk if the abnormality score is at a first abnormality level;
a second determining unit, configured to determine the severity of the abnormality as general severity if the abnormality score is at a second abnormality level;
a third determining unit, configured to determine the severity of the abnormality as moderate severity if the abnormality score is at a third abnormality level; and
a fourth determining unit, configured to determine the severity of the abnormality as high severity if the abnormality score is at a fourth abnormality level.
Optionally, the sending alarm module 604 shown in fig. 6 includes:
a calling unit, configured to call the interface corresponding to the alarm mode according to the alarm mode; and
a sending alarm unit, configured to send the monitoring alarm information of the abnormality to the alarm information receiver based on the interface and the severity of the abnormality.
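The calling unit and the sending alarm unit can be sketched as a registry that maps each alarm mode to an interface. The mode names `email` and `sms`, the message layout and the dictionary keys are hypothetical, and the senders only return strings instead of contacting a real service.

```python
SENDERS = {}  # alarm mode -> sending interface


def sender(mode):
    """Register a function as the interface for one alarm mode."""
    def register(fn):
        SENDERS[mode] = fn
        return fn
    return register


@sender("email")
def send_email(receiver, message):
    return f"email to {receiver}: {message}"


@sender("sms")
def send_sms(receiver, message):
    return f"sms to {receiver}: {message}"


def send_alarm(alarm):
    """Call the interface matching the alarm mode for every receiver."""
    send = SENDERS.get(alarm["alarm_mode"])
    if send is None:
        raise ValueError(f"unknown alarm mode: {alarm['alarm_mode']}")
    text = f"[{alarm['severity']}] {alarm['name']}: {alarm['detail']}"
    return [send(receiver, text) for receiver in alarm["receivers"]]
```

Adding a new alarm mode in this design means registering one more function, without changing `send_alarm`.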
It should be noted that, the specific principle and the execution process of each module in the service monitoring and alarming device disclosed in the embodiment of the present invention are the same as the service monitoring and alarming method implemented by the present invention, and reference may be made to corresponding parts in the service monitoring and alarming method disclosed in the embodiment of the present invention, which are not described herein again.
The service monitoring alarm device provided by the embodiment of the present invention monitors the target business service in real time. When the target business service is detected to be abnormal, it checks whether the pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality. If so, the abnormal information is spliced into a character string in a text data exchange format, the character string comprising the abnormality name, the abnormality detailed information, the abnormality score, the number of occurrences, the alarm threshold value, the alarm mode and the alarm information receiver. The abnormality score of the abnormality is calculated from the configured score and the number of occurrences; the severity of the abnormality is determined from the abnormality score, and the monitoring alarm information of the abnormality is obtained. The monitoring alarm information is then sent to the alarm information receiver according to the severity of the abnormality and the alarm mode; it comprises the abnormality name, the abnormality detailed information, the error reason, the severity, the alarm mode and the alarm information receiver. In this scheme, if the target business service is detected to be abnormal and the basic information configuration library is configured with abnormal information corresponding to the abnormality, the abnormal information is assembled into a character string, the severity of the abnormality is determined from the calculated abnormality score, and the monitoring alarm information is sent according to the severity and the alarm mode. The accuracy of the monitoring alarm information is thereby ensured, the cause of the problem is analyzed intelligently, and the cost of handling the problem is reduced.
All the embodiments in the present specification are described in a progressive manner; for the parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, the device and system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for the relevant points, reference may be made to the description of the method embodiments. The device and system embodiments described above are only illustrative: the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A service monitoring alarm method is characterized by comprising the following steps:
monitoring a target business service in real time;
when the target business service is detected to be abnormal, checking whether a pre-established basic information configuration library is configured with abnormal information corresponding to the abnormality;
if so, splicing the abnormal information into a character string in a text data exchange format, wherein the character string comprises an abnormal name, abnormal detailed information, an abnormal score, occurrence times, an alarm threshold value, an alarm mode and an alarm information receiver of the abnormality;
calculating the abnormal score of the abnormality according to the abnormal score and the occurrence frequency of the abnormality;
determining the severity of the abnormality according to the abnormality score, and obtaining monitoring alarm information of the abnormality;
and sending the abnormal monitoring alarm information to the alarm information receiver according to the severity of the abnormality and the alarm mode, wherein the monitoring alarm information comprises the name of the abnormality, the detailed information of the abnormality, the reason for the error, the severity, the alarm mode and the alarm information receiver.
2. The method according to claim 1, wherein when it is detected that the target business service is abnormal, checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality comprises:
detecting a log of the target business service in real time;
if any one of the logs hits an abnormal keyword, determining that the target business service is abnormal, and checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality.
3. The method of claim 1, further comprising:
if not, determining the abnormal score of the abnormality as 0.
4. The method of claim 1, wherein said determining a severity of said abnormality based on said abnormality score comprises:
if the abnormality score is at a first abnormality level, determining the severity of the abnormality as a low-risk level;
if the anomaly score is at a second anomaly level, determining the severity of the anomaly as a general severity;
determining the severity of the abnormality as moderate severity if the abnormality score is at a third abnormality level;
and if the abnormality score is in a fourth abnormality level, determining the severity of the abnormality as high severity.
5. The method according to claim 1, wherein the sending the monitoring alarm information of the abnormality to the alarm information receiver according to the severity of the abnormality and the alarm manner comprises:
calling an interface corresponding to the alarm mode according to the alarm mode;
and sending the abnormal monitoring alarm information to the alarm information receiver based on the interface and the severity of the abnormality.
6. A service monitoring alarm device, the device comprising:
the monitoring module is used for monitoring the target business service in real time;
the anomaly detection module is used for checking whether a pre-established basic information configuration library is configured with anomaly information corresponding to the anomaly or not when the target business service is detected to be abnormal; if so, splicing the abnormal information into a character string in a text data exchange format, wherein the character string comprises an abnormal name, abnormal detailed information, an abnormal score, occurrence times, an alarm threshold value, an alarm mode and an alarm information receiver of the abnormality;
the anomaly analysis module is used for calculating the anomaly score of the anomaly according to the anomaly score and the occurrence frequency of the anomaly; determining the severity of the abnormality according to the abnormality score, and obtaining monitoring alarm information of the abnormality;
and the sending alarm module is used for sending the abnormal monitoring alarm information to the alarm information receiver according to the abnormal severity and the alarm mode, wherein the monitoring alarm information comprises the abnormal name, the abnormal detailed information, the error reason, the severity, the alarm mode and the alarm information receiver.
7. The apparatus according to claim 6, wherein the anomaly detection module, when detecting that the target business service is abnormal, for checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality, comprises:
the detection unit is used for detecting the logs of the target business service in real time;
and the checking unit is used for determining that the target business service is abnormal if the log hits any abnormal keyword, and checking whether a pre-established basic information configuration library configures abnormal information corresponding to the abnormality.
8. The apparatus of claim 6, wherein the anomaly detection module is further configured to:
if not, determining the abnormal score of the abnormality as 0.
9. The apparatus of claim 6, wherein the anomaly analysis module for determining the severity of the anomaly based on the anomaly score comprises:
a first determining unit, configured to determine a severity of the abnormality as a low risk degree if the abnormality score is at a first abnormality level;
a second determining unit, configured to determine the severity of the anomaly as a general severity if the anomaly score is at a second anomaly level;
a third determining unit, configured to determine the severity of the anomaly as a medium severity if the anomaly score is at a third anomaly level;
and a fourth determining unit, configured to determine the severity of the anomaly as a high severity if the anomaly score is at a fourth anomaly level.
10. The apparatus of claim 6, wherein the transmit alarm module comprises:
the calling unit is used for calling an interface corresponding to the alarm mode according to the alarm mode;
and the sending alarm unit is used for sending the abnormal monitoring alarm information to the alarm information receiver based on the interface and the severity of the abnormality.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211590601.3A CN115981952A (en) | 2022-12-12 | 2022-12-12 | Service monitoring alarm method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211590601.3A CN115981952A (en) | 2022-12-12 | 2022-12-12 | Service monitoring alarm method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115981952A true CN115981952A (en) | 2023-04-18 |
Family
ID=85958746
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211590601.3A Pending CN115981952A (en) | 2022-12-12 | 2022-12-12 | Service monitoring alarm method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115981952A (en) |
2022
- 2022-12-12 CN CN202211590601.3A patent/CN115981952A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7051244B2 (en) | Method and apparatus for managing incident reports | |
US7712083B2 (en) | Method and apparatus for monitoring and updating system software | |
KR100718023B1 (en) | System for intrusion detection and vulnerability analysis in a telecommunications signaling network | |
US20220321440A1 (en) | Interface Service Function Monitoring Method and System Based on Data Acquisition | |
US9049105B1 (en) | Systems and methods for tracking and managing event records associated with network incidents | |
EP1638253A1 (en) | Method of monitoring wireless network performance | |
CN103490917B (en) | The detection method of troubleshooting situation and device | |
US8040231B2 (en) | Method for processing alarm data to generate security reports | |
CN112152823B (en) | Website operation error monitoring method and device and computer storage medium | |
CN112395156A (en) | Fault warning method and device, storage medium and electronic equipment | |
US20030093516A1 (en) | Enterprise management event message format | |
US7757122B2 (en) | Remote maintenance system, mail connect confirmation method, mail connect confirmation program and mail transmission environment diagnosis program | |
CN113495820A (en) | Method and device for collecting and processing abnormal information and abnormal monitoring system | |
CN107968727A (en) | A kind of detection method, device and the medium of CIFS services | |
CN113381884B (en) | Full link monitoring method and device for monitoring alarm system | |
CN111143162A (en) | Method for detecting whether application system based on multilayer architecture normally operates | |
CN110955581A (en) | Online software abnormity warning method and device, electronic equipment and storage medium | |
US6665822B1 (en) | Field availability monitoring | |
CN115981952A (en) | Service monitoring alarm method and device | |
CN110633165B (en) | Fault processing method, device, system server and computer readable storage medium | |
US8897713B2 (en) | System, method, and computer program product for wireless network monitoring | |
JP2006331026A (en) | Message analysis system and message analysis program | |
CN113556671B (en) | Fault positioning method, device and storage medium | |
CN115083030A (en) | Service inspection method and device and electronic equipment | |
CN112835780B (en) | Service detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||