CN116501551B - Data alarm generation and recovery processing method - Google Patents

Publication number
CN116501551B
Authority
CN
China
Prior art keywords
data
alarm
observation
observation step
alarming
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310735465.0A
Other languages
Chinese (zh)
Other versions
CN116501551A (en
Inventor
赵建云
李善宝
王文民
洛佳明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yuanqiao Information Technology Co ltd
Original Assignee
Shandong Yuanqiao Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yuanqiao Information Technology Co ltd filed Critical Shandong Yuanqiao Information Technology Co ltd
Priority to CN202310735465.0A priority Critical patent/CN116501551B/en
Publication of CN116501551A publication Critical patent/CN116501551A/en
Application granted granted Critical
Publication of CN116501551B publication Critical patent/CN116501551B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Maintenance And Management Of Digital Transmission (AREA)
  • Monitoring And Testing Of Transmission In General (AREA)

Abstract

The invention relates to the technical field of information monitoring and provides a data alarm generation and recovery processing method comprising the steps of acquiring monitoring data, data abnormality judgment, activation judgment, data accumulation judgment, first data observation, alarm judgment, alarming, second data observation, alarm release judgment, alarm release and the like. The invention improves alarm accuracy and is able to alarm on jittering data. The observation duration and the first threshold can be adjusted for different situations, so the method is compatible with different devices, different index-data monitoring frequencies and different data quality, which improves compatibility. In addition, the method calls few parameters during calculation, which reduces computational difficulty and improves the timeliness of the alarm response.

Description

Data alarm generation and recovery processing method
Technical Field
The invention relates to the technical field of information monitoring, in particular to a data alarm generation and recovery processing method.
Background
With the continuous advancement of informatization, the way people work and live has changed greatly. In the process, many organizations have accumulated a large number of information assets, including but not limited to servers, switches, routers, firewalls, storage devices, end hosts, databases, middleware and business systems. These assets must be monitored continuously so that their health state is known and the reliability, security and stability of services are guaranteed.
In the early days, for lack of the necessary automated monitoring means, the operating parameters of devices and systems were mostly checked manually. This was time-consuming and laborious, made it difficult to grasp the running state of all hardware and business systems comprehensively and promptly, and left operations work reactive. Operation and maintenance monitoring of information assets is now gradually moving to active monitoring, automated operation and maintenance, and even intelligent operation and maintenance based on public protocols or private interfaces. In this iterative upgrade, sound data acquisition and monitoring and timely alarming are essential.
With current data acquisition and monitoring means, alarm generation and recovery is roughly implemented by the following logic: 1. set an alarm threshold; 2. acquire the monitored data; 3. judge whether the monitored data exceeds the set threshold. If the monitored data exceeds the set threshold, an alarm is raised; if it does not, the alarm is released.
However, because data is generated at inconsistent frequencies, data jitter objectively exists. For example, CPU utilization rises quickly while an APP is starting and drops considerably once the APP is running stably. If only an alarm threshold is set, it is very easy for an alarm to be raised when the APP starts and released as soon as the APP runs stably.
Because such an alarm strategy mixes in a large amount of false alarm information, it greatly increases the workload of monitoring personnel.
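The flapping behaviour of the conventional logic can be reproduced with a minimal sketch. The function name, the sample readings and the 90 % threshold below are illustrative assumptions, not values taken from the patent.

```python
def naive_alarm_states(samples, threshold):
    """Return one boolean per sample: True while the alarm is raised."""
    # The alarm state simply mirrors the latest comparison; nothing is remembered.
    return [value > threshold for value in samples]


# Hypothetical CPU-usage readings (percent) that jitter around the threshold.
cpu_usage = [50, 95, 60, 92, 55, 91, 58]
states = naive_alarm_states(cpu_usage, threshold=90)

# Count how often the alarm switches between raised and released.
transitions = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
print(states)
print(transitions)
```

With these readings every sample flips the alarm state, which is the kind of raise-then-release noise the background section describes.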
Disclosure of Invention
In order to improve the accuracy of the alarm and reduce the labor burden of monitoring personnel, the invention provides a data alarm generation and recovery processing method.
The invention provides a data alarm generation and recovery processing method, which adopts the following technical scheme:
A data alarm generation and recovery processing method comprises the following steps:
acquiring monitoring data: continuously acquiring monitored data information by a monitoring means;
judging data abnormality: judging whether the acquired instantaneous data is abnormal; if the data is abnormal, executing the activation judging step; if the data is normal, executing the data accumulation judging step;
activation judgment: if neither the first data observation step nor the second data observation step is in an activated state, activating the first data observation step; if the first data observation step is in an activated state, sending information of data abnormality to the first data observation step; if the second data observation step is in an activated state, sending information of data abnormality to the second data observation step;
data accumulation judgment: if neither the first data observation step nor the second data observation step is in an activated state, taking no action; if the first data observation step is in an activated state, sending information of normal data to the first data observation step; if the second data observation step is in an activated state, sending information of normal data to the second data observation step;
first data observation: setting a first observation duration, during which the first data observation step is in an activated state; counting the number of data anomalies during the time the first data observation step is activated;
alarm judgment: if the number of data anomalies in the first data observation step is greater than or equal to a first threshold, executing the alarming step and clearing the data information in the first data observation step; otherwise, executing the data abnormality judging step and clearing the data information in the first data observation step;
alarming: raising an alarm to indicate that the monitoring data is abnormal, and executing the second data observation step;
second data observation: setting a second observation duration, during which the second data observation step is in an activated state; counting the number of consecutive normal data during the time the second data observation step is activated;
alarm release judgment: if the number of consecutive normal data in the second data observation step is smaller than a second threshold and the latest data in the second data observation step is abnormal, executing the alarming step again; if the number of consecutive normal data in the second data observation step is equal to the second threshold, executing the alarm release step; otherwise, taking no action;
alarm release: stopping the alarm to notify monitoring personnel that the data has returned to normal; at the same time, executing the data abnormality judging step and deleting the cache of the second data observation step.
By adopting the technical scheme, when the monitoring data is abnormal the alarm is not raised immediately; it is raised only when the number of data anomalies within the first observation duration is greater than or equal to the first threshold, so that false alarms caused by a single data anomaly are eliminated and alarm accuracy is improved. When the data jitters continuously, the data at some time nodes is abnormal while the jitter persists, so the method is also able to alarm on jittering data. The observation durations and the thresholds can be adjusted for different situations, so the method is compatible with different devices, different index-data monitoring frequencies and different data quality, which improves compatibility. In addition, the method calls few parameters during calculation, which reduces computational difficulty and improves the timeliness of the alarm response.
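The steps above can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the patented implementation: the class and method names are hypothetical, the first observation duration is measured in samples rather than seconds, the alarm judgment runs when the first window is full, and the second observation duration is not modelled separately because any abnormal sample restarts the second observation.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"      # neither observation step is active
    FIRST = "first"    # first data observation (no alarm yet)
    SECOND = "second"  # second data observation (alarm is raised)


class AlarmMonitor:
    def __init__(self, first_window, first_threshold, second_threshold):
        self.first_window = first_window          # samples per first observation
        self.first_threshold = first_threshold    # anomalies needed to alarm
        self.second_threshold = second_threshold  # consecutive normals to release
        self.state = State.IDLE
        self.window = []        # abnormal flags seen by the first observation
        self.normal_streak = 0  # consecutive normal samples in the second one
        self.alarm_count = 0

    def _alarm(self):
        # Alarming step: count the alarm and (re)start the second observation.
        self.alarm_count += 1
        self.state = State.SECOND
        self.normal_streak = 0
        return "primary alarm" if self.alarm_count == 1 else "continuous alarm"

    def feed(self, abnormal):
        """Process one judged sample; return an event string or None."""
        if self.state is State.IDLE:
            if not abnormal:
                return None              # data accumulation judgment: no action
            self.state = State.FIRST     # activation judgment
            self.window = []
        if self.state is State.FIRST:
            self.window.append(abnormal)
            if len(self.window) < self.first_window:
                return None
            anomalies = sum(self.window)
            self.window = []             # clear the first observation data
            if anomalies >= self.first_threshold:
                return self._alarm()
            self.state = State.IDLE
            return None
        # Second observation: alarm release judgment.
        if abnormal:
            return self._alarm()
        self.normal_streak += 1
        if self.normal_streak == self.second_threshold:
            self.state = State.IDLE
            self.alarm_count = 0
            self.normal_streak = 0
            return "alarm released"
        return None


monitor = AlarmMonitor(first_window=5, first_threshold=3, second_threshold=3)
samples = [True, False, True, True, False, False, True, False, False, False]
events = [monitor.feed(s) for s in samples]
print([e for e in events if e])
```

In this run the alarm is raised once the first window holds three anomalies, repeated when an abnormal sample interrupts the recovery, and released only after three consecutive normal samples.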
Optionally, the data abnormality judging step includes a plurality of abnormality judgment conditions, and the data is determined to be abnormal if any one of the conditions is satisfied or, alternatively, only if all of them are satisfied;
or, in the data abnormality judging step, weights are set for a plurality of data items, and the data is determined to be abnormal if the sum of the weights of all abnormal data items is greater than or equal to a third threshold.
By adopting the technical scheme, the method can judge several kinds of data at the same time and cover three cases. Where an abnormality in any single item indicates system instability, the data is determined to be abnormal as soon as one item is abnormal. Where only the simultaneous abnormality of several items indicates instability, the data is determined to be abnormal only when all of those items are abnormal at once. Where abnormalities in some of the items indicate instability, the data is determined to be abnormal when the sum of the weights of the abnormal items reaches the third threshold. The requirements for data abnormality under different conditions can thus be met, which improves the compatibility of the method.
Optionally, the alarming step includes an alarm counting step, an alarm counting judging step, a primary alarming step and a continuous alarming step:
alarm counting: each time the alarming step is executed, the number of alarms is incremented by one, and the alarm counting judging step is executed after counting is completed;
alarm counting judgment: if the number of alarms is equal to 1, executing the primary alarming step; otherwise, executing the continuous alarming step;
primary alarm: raising an alarm for the first time to indicate that the monitoring data has become abnormal;
continuous alarm: raising a further alarm to indicate that the monitoring data continues to be abnormal;
in the alarm release step, the alarm count is cleared.
By adopting the technical scheme, monitoring personnel can judge from the alarm type how long the data abnormality has persisted, which makes it easier to judge its severity. If only a primary alarm has been raised, the data abnormality is not serious; if a continuous alarm has been raised, the data abnormality is serious and its cause needs to be analyzed promptly. The alarm conditions are thus further screened, and the workload of monitoring personnel is reduced.
Optionally, the alarming step further includes a history recording step;
history record: recording time information and alarm type information of alarms, and further forming an alarm form;
and executing the history recording step when the primary alarming step and the continuous alarming step are executed.
By adopting the technical scheme, after the alarm is released, the monitoring personnel can still know the alarm condition by calling the alarm form so as to judge whether to process the history alarm.
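Alarm counting, the primary/continuous distinction and the history record can be sketched together as follows. The `AlarmLog` class, its field names and the timestamps are hypothetical; the patent only states that the alarm form holds alarm time and alarm type information.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class AlarmLog:
    count: int = 0
    # The "alarm form": one (time, type) row per executed alarm.
    form: list = field(default_factory=list)

    def alarm(self, when):
        """Alarm counting, alarm counting judgment and history record."""
        self.count += 1
        kind = "primary" if self.count == 1 else "continuous"
        self.form.append((when, kind))
        return kind

    def release(self):
        """Alarm release clears the count; the history is kept for later review."""
        self.count = 0


log = AlarmLog()
log.alarm(datetime(2023, 6, 1, 12, 0, 10))   # first alarm of an episode
log.alarm(datetime(2023, 6, 1, 12, 0, 13))   # abnormality persists
log.release()
log.alarm(datetime(2023, 6, 1, 12, 5, 0))    # a new episode starts again at "primary"
for when, kind in log.form:
    print(when.isoformat(), kind)
```

Because the count is cleared on release while the form is not, a later episode is again reported as a primary alarm, and the earlier episode remains queryable.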
Optionally, the data abnormality judging step further judges whether the data can be acquired, and if the data cannot be acquired, the data is determined to be abnormal.
By adopting the technical scheme, in cases where the failure to acquire data is itself an abnormal phenomenon, the judgment result is biased towards alarming when data cannot be acquired, which improves alarm accuracy.
Optionally, the data abnormality judging step further judges whether the data can be acquired, and if the data cannot be acquired, the data is determined to be normal.
By adopting the technical scheme, in cases where the failure to acquire data, or the absence of data transmission, is a normal phenomenon, the judgment result is biased towards not alarming when data cannot be acquired, which reduces the probability of false alarms.
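The two alternatives above differ only in how a missing reading is classified. A minimal sketch, assuming that `None` stands for data that could not be acquired and that "abnormal" means exceeding an upper limit (both are assumptions made for illustration):

```python
from enum import Enum


class Missing(Enum):
    AS_ABNORMAL = "abnormal"  # bias towards alarming
    AS_NORMAL = "normal"      # bias towards not alarming


def is_abnormal(value, limit, missing_policy):
    """Classify one sample; None means the data could not be acquired."""
    if value is None:
        return missing_policy is Missing.AS_ABNORMAL
    return value > limit


readings = [42, None, 97]  # hypothetical readings with one gap
as_abnormal = [is_abnormal(v, 90, Missing.AS_ABNORMAL) for v in readings]
as_normal = [is_abnormal(v, 90, Missing.AS_NORMAL) for v in readings]
print(as_abnormal)
print(as_normal)
```

Making the policy an explicit parameter keeps the rest of the alarm logic unchanged whichever alternative is chosen for a given device.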
Optionally, the data abnormality judging step further judges whether the data can be acquired, and if the data cannot be acquired, the data acquisition judging step is executed;
data acquisition judgment: if neither the first data observation step nor the second data observation step is in an activated state, taking no action; if the first data observation step is in an activated state, sending information of data loss to the first data observation step; if the second data observation step is in an activated state, sending information of data loss to the second data observation step;
alarm judgment: if the number of data anomalies in the first data observation step is greater than or equal to the first threshold, executing the alarming step and clearing the data information in the first data observation step; if the number of data losses in the first data observation step is greater than or equal to a fourth threshold, executing the data error reporting step and clearing the data information in the first data observation step; otherwise, executing the data abnormality judging step and clearing the data information in the first data observation step;
data error reporting: raising a data error alarm to indicate that the monitoring data cannot be acquired and that the data transmission should be checked promptly;
alarm release judgment: if the number of data losses in the second data observation step is greater than or equal to the fourth threshold, executing the exit alarming step and extending the second data observation duration; if the number of consecutive normal data in the second data observation step is smaller than the second threshold and the latest data in the second data observation step is abnormal, executing the alarming step again; otherwise, executing the alarm release step;
exit alarm: temporarily suspending the alarm.
By adopting the technical scheme, in cases where the failure to acquire data, or the absence of data transmission, occurs frequently, it cannot be judged from this alone whether the data is abnormal. Therefore, when data cannot be acquired for a long time while an alarm is active, the alarm is first exited to await judgment of subsequent data; if the subsequent data is abnormal, the continuous alarming step is executed again, which improves the accuracy of the alarm category.
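The extended alarm judgment, which separates data loss from data abnormality, might look like the sketch below. The `Sample` enum, the function name and the returned strings are hypothetical; the order of the checks (anomalies first, then losses) follows the order in which the text lists them.

```python
from enum import Enum


class Sample(Enum):
    NORMAL = 0
    ABNORMAL = 1
    LOST = 2  # data could not be acquired


def alarm_judgment(window, first_threshold, fourth_threshold):
    """Decide what follows one completed first observation window."""
    anomalies = sum(1 for s in window if s is Sample.ABNORMAL)
    losses = sum(1 for s in window if s is Sample.LOST)
    if anomalies >= first_threshold:
        return "alarm"
    if losses >= fourth_threshold:
        return "data error"
    return "resume judging"  # back to the data abnormality judging step


N, A, L = Sample.NORMAL, Sample.ABNORMAL, Sample.LOST
print(alarm_judgment([A, A, L, A, N], first_threshold=3, fourth_threshold=3))
print(alarm_judgment([A, L, L, N, L], first_threshold=3, fourth_threshold=3))
print(alarm_judgment([A, N, L, N, N], first_threshold=3, fourth_threshold=3))
```

In all three outcomes the window is cleared afterwards, as the alarm judgment step requires; that bookkeeping is left to the caller here.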
In summary, the present invention includes at least one of the following beneficial technical effects:
1. Through the arrangement of the first data observation step and the second data observation step, false alarms caused by a single data anomaly are eliminated and alarm accuracy is improved. When the data jitters continuously, the data at some time nodes is abnormal while the jitter persists, so the method is also able to alarm on jittering data. The observation duration and the first threshold can be adjusted for different situations, so the method is compatible with different devices, different index-data monitoring frequencies and different data quality, which improves compatibility. In addition, the method calls few parameters during calculation, which reduces computational difficulty and improves the timeliness of the alarm response.
2. Through the arrangement of the alarming step, monitoring personnel can judge from the alarm type how long the data abnormality has persisted, which makes it easier to judge its severity; the alarm conditions are further screened, and the workload of monitoring personnel is reduced.
3. By judging whether the data can be acquired and by providing the data error reporting and exit alarming steps, alarm accuracy is improved and the probability of false alarms is reduced.
Drawings
FIG. 1 is a system diagram of Example 1;
FIG. 2 is a system diagram of Example 2;
FIG. 3 is a system diagram of Example 3.
Detailed Description
The invention is described in further detail below in connection with fig. 1-3.
Example 1:
This embodiment discloses a data alarm generation and recovery processing method. Referring to FIG. 1, the method includes the following steps:
S1: Acquiring monitoring data: the monitored data information is continuously acquired through a monitoring means, which may be a public protocol, a private interface, or any other means capable of acquiring the monitored data.
S2: Judging data abnormality: judging whether the data can be acquired and whether the acquired instantaneous data is abnormal; if the data cannot be acquired or the acquired data is abnormal, executing the activation judging step S31; if the acquired data is normal, executing the data accumulation judging step S32.
When data can be acquired, the acquired data may be a single datum or a data packet containing a plurality of data. When the acquired data is a single datum, it is judged directly against a preset value: if it conforms to the preset value, the data is normal; otherwise the data is abnormal. When the acquired data is a data packet, the person configuring the system can set a judgment rule according to actual requirements.
For example, the data may be determined to be abnormal if any one of the plurality of abnormality judgment conditions is satisfied (the data packet may include data A1, data A2 and data A3; if any one of them does not conform to its preset value, the data is abnormal; otherwise the data is normal). Alternatively, the data may be determined to be abnormal only if all of the abnormality judgment conditions are satisfied (the data packet may include data A1, data A2 and data A3; the data is abnormal only if none of them conforms to its preset value; otherwise the data is normal). As a further alternative, weights are set for the plurality of data in the data abnormality judging step, and the data is determined to be abnormal if the sum of the weights of all abnormal data is greater than or equal to a third threshold (the data packet may include data A1, data A2 and data A3, each with a weight of 1, and the third threshold is set to 2; when any two or more of data A1, data A2 and data A3 do not conform to their preset values, the sum of the weights of the abnormal data is greater than or equal to the third threshold and the data is abnormal; otherwise the data is normal).
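The three judgment rules for a data packet can be sketched as follows. The weights of 1 and the third threshold of 2 come from the example above; the function names, the upper-limit interpretation of "preset value" and the numeric readings are assumptions made for illustration.

```python
def abnormal_flags(packet, limits):
    """Map each item name to True if it exceeds its preset upper limit."""
    return {name: packet[name] > limits[name] for name in limits}


def judge_any(flags):
    # Abnormal as soon as one condition is met.
    return any(flags.values())


def judge_all(flags):
    # Abnormal only when every condition is met.
    return all(flags.values())


def judge_weighted(flags, weights, third_threshold):
    # Abnormal when the weights of the abnormal items add up to the threshold.
    total = sum(weights[name] for name, bad in flags.items() if bad)
    return total >= third_threshold


limits = {"A1": 90, "A2": 80, "A3": 70}   # hypothetical preset values
weights = {"A1": 1, "A2": 1, "A3": 1}     # weights from the example
packet = {"A1": 95, "A2": 85, "A3": 10}   # A1 and A2 are out of range

flags = abnormal_flags(packet, limits)
print(judge_any(flags), judge_all(flags), judge_weighted(flags, weights, 2))
```

For this packet the "any" and weighted rules report an abnormality while the "all" rule does not, which shows how the choice of rule changes the sensitivity of step S2.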
S31: Activation judgment: if neither the first data observation step S4 nor the second data observation step S7 is in an activated state, activating the first data observation step S4; if the first data observation step S4 is in an activated state, sending information of data abnormality to the first data observation step S4; if the second data observation step S7 is in an activated state, sending information of data abnormality to the second data observation step S7.
S32: Data accumulation judgment: if neither the first data observation step S4 nor the second data observation step S7 is in an activated state, taking no action; if the first data observation step S4 is in an activated state, sending information of normal data to the first data observation step S4; if the second data observation step S7 is in an activated state, sending information of normal data to the second data observation step S7.
S4: First data observation: setting a first observation duration, during which the first data observation step S4 is in an activated state; counting the number of data anomalies during the time the first data observation step S4 is activated.
S5: Alarm judgment: if the number of data anomalies in the first data observation step S4 is greater than or equal to a first threshold, executing the alarming step S6 and clearing the data information in the first data observation step S4; otherwise, executing the data abnormality judging step S2 and clearing the data information in the first data observation step S4.
For example: the first observation duration is 5 s; the data acquired at the 1st s is B1, the data acquired at the 2nd s is B2, and so on, and the data acquired at the 5th s is B5; the first threshold is 3. When data B1 is abnormal, the first data observation step S4 is activated; the first data observation step S4 then retains data B1, data B2, data B3, data B4 and data B5.
If 3 or more of data B1, data B2, data B3, data B4 and data B5 are abnormal, the alarming step S6 is executed, and the caches of data B1, data B2, data B3, data B4 and data B5 in the first data observation step S4 are deleted.
If at most two of data B1, data B2, data B3, data B4 and data B5 are abnormal, the data abnormality judging step S2 is executed again, so that the new data B6 is judged for abnormality, and the caches of data B1, data B2, data B3, data B4 and data B5 in the first data observation step S4 are deleted.
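The B1 to B5 example can be checked with a short sketch. The function name and the `(second, abnormal)` sample representation are assumptions; the 5 s duration and the first threshold of 3 are taken from the example.

```python
def first_observation(samples, duration, first_threshold):
    """samples: (second, abnormal) pairs in time order.

    Returns (start_second, alarm) for the first observation window,
    or None if no sample is abnormal.
    """
    for i, (t, abnormal) in enumerate(samples):
        if abnormal:
            # The window opens at the first abnormal sample and spans `duration` seconds.
            window = [a for (u, a) in samples[i:] if u < t + duration]
            return t, sum(window) >= first_threshold
    return None


# B1..B5 arrive at seconds 1..5; B6 at second 6 falls outside the window.
stream_a = [(1, True), (2, False), (3, True), (4, True), (5, False), (6, True)]
stream_b = [(1, True), (2, False), (3, False), (4, True), (5, False), (6, True)]
result_a = first_observation(stream_a, duration=5, first_threshold=3)
result_b = first_observation(stream_b, duration=5, first_threshold=3)
print(result_a)
print(result_b)
```

The first stream has three anomalies within the window and leads to the alarming step; the second has only two, so judgment resumes with B6 in a fresh window.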
S6: Alarming: the alarming step S6 comprises an alarm counting step S61, an alarm counting judging step S62, a primary alarming step S63, a continuous alarming step S64 and a history recording step S65;
S61: Alarm counting: each time the alarming step S6 is executed, the number of alarms is incremented by one, and after counting is completed the alarm counting judging step S62 is executed;
S62: Alarm counting judgment: if the number of alarms is equal to 1, executing the primary alarming step S63; otherwise, executing the continuous alarming step S64;
S63: Primary alarm: raising an alarm for the first time to indicate that the monitoring data has become abnormal; executing the second data observation step S7 and the history recording step S65;
S64: Continuous alarm: raising a further alarm to indicate that the monitoring data continues to be abnormal; executing the second data observation step S7 and the history recording step S65;
S65: History record: each time the history recording step is executed, one record is written, thereby forming an alarm form that contains alarm type information and alarm time information.
S7: Second data observation: setting a second observation duration, during which the second data observation step S7 is in an activated state; counting the number of consecutive normal data during the time the second data observation step S7 is activated.
S8: Alarm release judgment: if, in the second data observation step S7, the number of consecutive normal data is smaller than the second threshold and the latest data is abnormal, executing the alarming step S6 again; if, in the second data observation step S7, the number of consecutive normal data is equal to the second threshold, executing the alarm release step S9; otherwise, taking no action.
S9: Alarm release: stopping the alarm to notify monitoring personnel that the data has returned to normal; at the same time, executing the data abnormality judging step S2 and deleting the cache of the second data observation step S7.
For example: the second observation duration is 3 s; the data acquired at the 11th s is C1, the data acquired at the 12th s is C2, and so on, and the data acquired at the 15th s is C5; the second threshold is 3. When the primary alarm or the continuous alarm is raised at the 10th s, the second data observation step S7 is activated; the second data observation step S7 then retains data C1, data C2 and data C3 of the 11th s to the 13th s.
If data C1, data C2 and data C3 of the 11th s to the 13th s are all normal, the alarm release step S9 is executed. If data C1 and data C2 of the 11th s and 12th s are normal and data C3 of the 13th s is abnormal, the alarming step S6 is executed again, after which the second data observation step S7 is executed again, so that the second data observation step S7 is refreshed (that is, at the 13th s the second data observation step S7 starts re-observing the data of the 14th s to the 18th s, and the data of the 11th s to the 13th s is deleted).
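The alarm release judgment for the data of the 11th to 13th s can be sketched as below. The function name and the returned strings are hypothetical, and the comparison uses `>=` rather than strict equality only so that the function stays correct if it is called after more samples than the threshold have accumulated.

```python
def release_judgment(observed, second_threshold):
    """observed: abnormal flags seen since the second observation (re)started.

    Returns "alarm", "release" or "wait".
    """
    if observed and observed[-1]:
        return "alarm"  # latest data is abnormal: run the alarming step again
    # Count the trailing run of normal samples.
    streak = 0
    for abnormal in reversed(observed):
        if abnormal:
            break
        streak += 1
    if streak >= second_threshold:
        return "release"
    return "wait"  # neither condition met: take no action yet


print(release_judgment([False, False, False], second_threshold=3))
print(release_judgment([False, False, True], second_threshold=3))
print(release_judgment([False, False], second_threshold=3))
```

After an "alarm" result the caller would discard the observed samples and restart the second observation, which corresponds to the refresh described in the example.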
The implementation principle of the data alarm generation and recovery processing method in this embodiment is as follows:
When data is first found to be abnormal, the alarm is not raised immediately; it is raised only when the number of data anomalies within the first observation duration is greater than or equal to the first threshold, so that false alarms caused by a single data anomaly are eliminated and alarm accuracy is improved. After the alarm, the subsequent data is observed a second time, and the alarm is released when the data in the second observation meets the alarm release criterion; therefore, if the data is jittering, the alarm is not stopped immediately, and the method is able to cope with jittering data.
In addition, the alarming step of the method comprises two kinds of alarm, the primary alarm and the continuous alarm. They help monitoring personnel judge how long the data abnormality has persisted and therefore how severe it is: if only a primary alarm has been raised, the data abnormality is not serious; if a continuous alarm has been raised, the data abnormality is serious and its cause needs to be analyzed promptly. The alarm conditions are thus further screened, and the workload of monitoring personnel is reduced.
Meanwhile, in the history recording step of the method, the alarms are compiled into an alarm form that monitoring personnel can query in order to decide whether to handle historical alarms.
The observation durations and the thresholds can be adjusted for different situations, so the method is compatible with different devices, different index-data monitoring frequencies and different data quality, which improves compatibility. In addition, the method calls few parameters during calculation, which reduces computational difficulty and improves the timeliness of the alarm response.
Example 2:
This embodiment discloses a data alarm generation and recovery processing method. Referring to FIG. 2, the steps of this embodiment are substantially the same as those of Example 1, the difference being:
S2: Judging data abnormality: judging whether the data can be acquired and whether the acquired instantaneous data is abnormal; if the acquired data is abnormal, executing the activation judging step S31; if the data cannot be acquired or the acquired data is normal, executing the data accumulation judging step S32.
The implementation principle of the data alarm generation and recovery processing method of this embodiment is substantially the same as that of Example 1, the difference being:
In some cases the failure to acquire data, or the absence of data transmission, is a normal phenomenon. In the data abnormality judging step S2, therefore, the judgment result is biased towards not alarming when the data cannot be acquired, which reduces the probability of false alarms.
Example 3:
This embodiment discloses a data alarm generation and recovery processing method. Referring to Fig. 3, the method disclosed in this embodiment includes the following steps:
S1: the same as S1 in Embodiment 1;
S2: Data abnormality judgment: judging whether the data can be acquired; if the data cannot be acquired, executing the data acquisition judgment step S33; if the instantaneous data can be acquired, judging whether the acquired instantaneous data is abnormal; if the acquired data is abnormal, executing the activation judgment step S31; if the acquired data is normal, executing the data accumulation judgment step S32.
S31: the same as S31 in Embodiment 1;
S32: the same as S32 in Embodiment 1;
S33: Data acquisition judgment: if neither the first data observation step S4 nor the second data observation step S7 is in an activated state, no action is taken; if the first data observation step S4 is in an activated state, sending data-loss information to the first data observation step S4; if the second data observation step S7 is in an activated state, sending data-loss information to the second data observation step S7.
S4: First data observation: setting a first observation duration, during which the first data observation step S4 is in an activated state; counting the number of data anomalies and the number of data losses while the first data observation step S4 is activated.
S5: Alarm judgment: if the number of data anomalies in the first data observation step S4 is greater than or equal to a first threshold, executing the alarm step S6 and clearing the data information in the first data observation step S4; if the number of data losses in the first data observation step S4 is greater than or equal to a fourth threshold, executing the data error reporting step S10 and clearing the data information in the first data observation step S4; otherwise, executing the data abnormality judgment step S2 and clearing the data information in the first data observation step S4.
For example: the first observation duration is 5 s; the data acquired at 1 s is D1, the data acquired at 2 s is D2, ..., and the data acquired at 5 s is D5; the first threshold is 3 and the fourth threshold is 3. When the data D1 is abnormal, the first data observation step S4 is activated; the first data observation step S4 then retains the data D1, D2, D3, D4, and D5.
If 3 or more of the data D2, D3, D4, and D5 are lost, the data error reporting step S10 is executed, and the caches of the data D1, D2, D3, D4, and D5 in the first data observation step S4 are deleted.
S6: the same as S6 in example 1;
S7: Second data observation: setting a second observation duration, during which the second data observation step S7 is in an activated state; counting the number of consecutive normal data and the number of data losses while the second data observation step S7 is activated.
S8: Alarm release judgment: if the number of data losses in the second data observation step S7 is greater than or equal to the fourth threshold, executing the exit alarm step S11 and extending the time period of the second data observation step S7 forward; if the number of consecutive normal data in the second data observation step S7 is smaller than the second threshold and the latest data in the second data observation step S7 is abnormal, continuing to execute the alarm step S6; if the number of consecutive normal data in the second data observation step S7 is equal to the second threshold, executing the alarm release step S9; otherwise, no action is taken;
S9: the same as S9 in Embodiment 1;
S10: Data error reporting: issuing a data error alarm to indicate that the monitoring data cannot be acquired and to prompt a timely check of whether the data transmission has a problem;
S11: Exit alarm: temporarily issuing no alarm.
For example: the second observation duration is 5 s; the data acquired at 11 s is E1, the data acquired at 12 s is E2, ..., and the data acquired at 15 s is E5; the second threshold is 3 and the fourth threshold is 3. When the primary alarm or the continuous alarm is issued at the 10th second, the second data observation step S7 is activated; the second data observation step S7 then retains the data E1, E2, E3, E4, and E5 from the 11th second to the 15th second.
If 3 or more of the data E1, E2, E3, E4, and E5 are lost, the exit alarm step S11 is executed and the time period of the second data observation step S7 is extended forward. Here, "forward" means that, while the exit alarm step S11 is executed, the second data observation step S7 is extended by 1 s at its later end and the data acquired in its first 1 s is deleted. For example, if the data E1, E3, and E5 are lost, then at 15 s the exit alarm step S11 is executed, the data E2, E3, E4, E5, and E6 from 12 s to 16 s are retained, and the data E1 at 11 s is deleted.
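The alarm release judgment S8 and the one-second slide of the observation window can be sketched as below. This is an illustrative sketch only, under stated assumptions: sample states are plain strings, the "number of consecutive normal data" is taken to be the run of normal samples at the newest end of the window, a lost sample is assumed to break that run, and "equal to the second threshold" is implemented as "has reached". All function and variable names are hypothetical.

```python
from collections import deque
from typing import Deque

NORMAL, ABNORMAL, LOST = "normal", "abnormal", "lost"


def trailing_normal_run(window: Deque[str]) -> int:
    """Count consecutive normal samples at the newest end of the window."""
    run = 0
    for state in reversed(window):
        if state != NORMAL:
            break
        run += 1
    return run


def release_judgment(window: Deque[str], second_threshold: int,
                     fourth_threshold: int) -> str:
    """Return the step chosen by S8 for the current second observation window."""
    if sum(1 for state in window if state == LOST) >= fourth_threshold:
        # Too many lost samples: exit the alarm and slide the window by one
        # sample. The oldest sample is dropped here; the caller appends the next.
        window.popleft()
        return "S11"
    if trailing_normal_run(window) >= second_threshold:
        return "S9"    # enough consecutive normal data: release the alarm
    if window and window[-1] == ABNORMAL:
        return "S6"    # latest data is still abnormal: keep alarming
    return "none"      # no action


# Worked example from the text: E1, E3 and E5 are lost; both thresholds are 3.
window = deque([LOST, NORMAL, LOST, NORMAL, LOST])  # E1..E5
step = release_judgment(window, second_threshold=3, fourth_threshold=3)
window.append(NORMAL)  # E6, assumed normal purely for illustration
```

With these inputs the judgment selects S11, E1 is dropped, and the window then holds E2 to E6, matching the 12 s to 16 s window of the example.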
The implementation principle of the data alarm generation and recovery processing method in this embodiment is as follows:
In some cases, failing to acquire data, or having no data transmitted, occurs frequently and indicates neither abnormal data nor normal data. Therefore, when a large amount of data is lost in the first data observation step S4, a data error alarm is issued to indicate that the monitoring data cannot be acquired, so whether the data is abnormal cannot be judged, and monitoring personnel should promptly check whether the data transmission has a problem.
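The alarm judgment S5 over the first observation window, including the data-loss branch, can be sketched as follows. It is a minimal sketch under assumptions: the three sample states are modeled as an `Enum`, the window is a plain list, and the checks are applied in the order the steps list them. The names `Sample` and `alarm_judgment` are hypothetical.

```python
from enum import Enum
from typing import List


class Sample(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"
    LOST = "lost"


def alarm_judgment(window: List[Sample], first_threshold: int,
                   fourth_threshold: int) -> str:
    """Return the step chosen by S5 once the first observation window ends."""
    abnormal = window.count(Sample.ABNORMAL)
    lost = window.count(Sample.LOST)
    if abnormal >= first_threshold:
        return "S6"   # enough anomalies: raise the alarm
    if lost >= fourth_threshold:
        return "S10"  # too many lost samples: report a data error instead
    return "S2"       # neither threshold reached: resume normal judgment


# Example values from the text: D1 abnormal, then three of D2..D5 lost.
d1_to_d5 = [Sample.ABNORMAL, Sample.LOST, Sample.LOST, Sample.LOST, Sample.NORMAL]
decision = alarm_judgment(d1_to_d5, first_threshold=3, fourth_threshold=3)
```

In every branch the window is cleared afterwards, so a caller would discard `d1_to_d5` once the decision has been read.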
Likewise, when a large amount of data is lost in the second data observation step S7, whether the data is abnormal cannot be judged. The alarm is therefore exited first, and whether to release the alarm or to continue alarming is then decided from the subsequent data state, which improves the accuracy of the alarm category.
The above embodiments are not intended to limit the scope of protection of the present invention. Therefore, all equivalent changes made according to the structure, shape, and principle of the invention shall fall within the scope of protection of the invention.

Claims (7)

1. A data alarm generation and recovery processing method, characterized in that the method comprises the following steps:
acquiring monitoring data: continuously acquiring the monitored data information by a monitoring means;
data abnormality judgment: judging whether the acquired instantaneous data is abnormal; if the data is abnormal, executing the activation judgment step; if the data is normal, executing the data accumulation judgment step;
activation judgment: if neither the first data observation step nor the second data observation step is in an activated state, activating the first data observation step; if the first data observation step is in an activated state, sending data-abnormality information to the first data observation step; if the second data observation step is in an activated state, sending data-abnormality information to the second data observation step;
data accumulation judgment: if neither the first data observation step nor the second data observation step is in an activated state, taking no action; if the first data observation step is in an activated state, sending data-normal information to the first data observation step; if the second data observation step is in an activated state, sending data-normal information to the second data observation step;
first data observation: setting a first observation duration, during which the first data observation step is in an activated state; counting the number of data anomalies while the first data observation step is activated;
alarm judgment: if the number of data anomalies in the first data observation step is greater than or equal to a first threshold, executing the alarm step and clearing the data information in the first data observation step; otherwise, executing the data abnormality judgment step and clearing the data information in the first data observation step;
alarm: issuing an alarm to indicate that the monitoring data is abnormal, and executing the second data observation step;
second data observation: setting a second observation duration, during which the second data observation step is in an activated state; counting the number of consecutive normal data while the second data observation step is activated;
alarm release judgment: if the number of consecutive normal data in the second data observation step is smaller than a second threshold and the latest data in the second data observation step is abnormal, continuing to execute the alarm step; if the number of consecutive normal data in the second data observation step is equal to the second threshold, executing the alarm release step; otherwise, taking no action;
alarm release: stopping the alarm to inform the inspector that the data has returned to normal; simultaneously executing the data abnormality judgment step and deleting the cache of the second data observation step.
2. The data alarm generation and recovery processing method according to claim 1, wherein: the data abnormality judgment step comprises a plurality of abnormality judgment conditions, and the data is determined to be abnormal if any one or all of the plurality of abnormality judgment conditions are met;
or, in the data abnormality judgment step, weight proportions are set for a plurality of data, and the data is determined to be abnormal if the sum of the weights of all abnormal data is greater than or equal to a third threshold.
3. The data alarm generation and recovery processing method according to claim 1 or 2, wherein: the alarm step comprises an alarm counting step, an alarm count judgment step, a primary alarm step, and a continuous alarm step,
alarm counting: each time the alarm step is executed, incrementing the alarm count by one, and executing the alarm count judgment step after counting is completed;
alarm count judgment: if the alarm count is equal to 1, executing the primary alarm step; otherwise, executing the continuous alarm step;
primary alarm: issuing an alarm for the first time to indicate that the monitoring data has become abnormal for the first time;
continuous alarm: continuing to issue the alarm to indicate that the monitoring data continues to be abnormal;
in the alarm release step, the alarm count is cleared.
4. The data alarm generation and recovery processing method according to claim 3, wherein: the alarm step further comprises a history recording step;
history recording: recording the time information and the alarm type information of each alarm, thereby forming an alarm form;
the history recording step is executed whenever the primary alarm step or the continuous alarm step is executed.
5. The data alarm generation and recovery processing method according to claim 3, wherein: in the data abnormality judgment step, whether the data can be acquired is also judged, and if the data cannot be acquired, the data is judged to be abnormal.
6. The data alarm generation and recovery processing method according to claim 3, wherein: in the data abnormality judgment step, whether the data can be acquired is also judged, and if the data cannot be acquired, the data is judged to be normal.
7. The data alarm generation and recovery processing method according to claim 3, wherein: in the data abnormality judgment step, whether the data can be acquired is also judged, and if the data cannot be acquired, the data acquisition judgment step is executed;
data acquisition judgment: if neither the first data observation step nor the second data observation step is in an activated state, taking no action; if the first data observation step is in an activated state, sending data-loss information to the first data observation step; if the second data observation step is in an activated state, sending data-loss information to the second data observation step;
alarm judgment: if the number of data anomalies in the first data observation step is greater than or equal to the first threshold, executing the alarm step and clearing the data information in the first data observation step; if the number of data losses in the first data observation step is greater than or equal to a fourth threshold, executing the data error reporting step and clearing the data information in the first data observation step; otherwise, executing the data abnormality judgment step and clearing the data information in the first data observation step;
data error reporting: issuing a data error alarm to indicate that the monitoring data cannot be acquired and to prompt a timely check of whether the data transmission has a problem;
alarm release judgment: if the number of data losses in the second data observation step is greater than or equal to the fourth threshold, executing the exit alarm step and extending the time period of the second data observation step forward; if the number of consecutive normal data in the second data observation step is smaller than the second threshold and the latest data in the second data observation step is abnormal, continuing to execute the alarm step; otherwise, executing the alarm release step;
exit alarm: temporarily issuing no alarm.
CN202310735465.0A 2023-06-21 2023-06-21 Data alarm generation and recovery processing method Active CN116501551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310735465.0A CN116501551B (en) 2023-06-21 2023-06-21 Data alarm generation and recovery processing method


Publications (2)

Publication Number Publication Date
CN116501551A CN116501551A (en) 2023-07-28
CN116501551B true CN116501551B (en) 2023-09-15

Family

ID=87320468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310735465.0A Active CN116501551B (en) 2023-06-21 2023-06-21 Data alarm generation and recovery processing method

Country Status (1)

Country Link
CN (1) CN116501551B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102566475A (en) * 2010-12-17 2012-07-11 北京北方微电子基地设备工艺研究中心有限责任公司 Method and device for processing monitoring alarm and plasma processing device
CN105957314A (en) * 2016-04-29 2016-09-21 北京奇虎科技有限公司 Monitoring alarming method and system
CN106407077A (en) * 2016-09-21 2017-02-15 广州华多网络科技有限公司 A real-time alarm method and system
WO2017133589A1 (en) * 2016-02-05 2017-08-10 中兴通讯股份有限公司 Safety monitoring method and device for terminal user
CN109560963A (en) * 2018-11-23 2019-04-02 北京车和家信息技术有限公司 Monitoring alarm method, system and computer readable storage medium
CN113572654A (en) * 2020-04-29 2021-10-29 华为技术有限公司 Network performance monitoring method, network device and storage medium
CN115080356A (en) * 2022-07-21 2022-09-20 支付宝(杭州)信息技术有限公司 Abnormity warning method and device
CN116206427A (en) * 2023-05-06 2023-06-02 安徽智寰科技有限公司 Hierarchical alarm method based on universal index self-adaptive threshold


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
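The Design and Application of Data Acquisition and Monitoring System for Laboratory; Duan, Rongxia, et al.; Proceedings of the 7th International Conference on Management, Education, Information and Control (MEICI 2017); full text *
SNMP-based meteorological equipment monitoring system (基于SNMP的气象设备监控系统); 索开华; 《信息技术与信息化》; full text *
Research on network-management alarm correlation based on a counting algorithm (基于计数算法的网管告警关联问题研究); 邵臻磊; 电脑知识与技术 (04); full text *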

Also Published As

Publication number Publication date
CN116501551A (en) 2023-07-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant