CN107204879B - Adaptive failure detection method for distributed systems based on exponential moving average
- Publication number: CN107204879B
- Application number: CN201710413817.5A
- Authority: CN (China)
- Prior art keywords: heartbeat, time, delay, value, delayed
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H04L43/10: Electricity > Electric communication technique > Transmission of digital information, e.g. telegraphic communication > Arrangements for monitoring or testing data switching networks (H04L43/00) > Active monitoring, e.g. heartbeat, ping or trace-route
- H04L41/0677: Electricity > Electric communication technique > Transmission of digital information, e.g. telegraphic communication > Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks (H04L41/00) > Management of faults, events, alarms or notifications (H04L41/06) > Localisation of faults
- H04L43/0852: Electricity > Electric communication technique > Transmission of digital information, e.g. telegraphic communication > Arrangements for monitoring or testing data switching networks (H04L43/00) > Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters (H04L43/08) > Delays
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Environmental & Geological Engineering (AREA)
- Health & Medical Sciences (AREA)
- Cardiology (AREA)
- General Health & Medical Sciences (AREA)
- Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses an adaptive failure detection method for distributed systems based on the exponential moving average. The method comprises four steps: time-series data collection, heartbeat prediction, diagnostic value output, and fault determination. It can be used for failure detection in a distributed system to discover latent faults promptly and reduce the risk of system failure. Using the historical heartbeat sequence, the method outputs a diagnostic value that accumulates dynamically over time and judges whether a node in the system has failed according to a threshold set at system initialization. When the heartbeat prediction is computed, the influence weight of each historical heartbeat is calculated with the exponential moving average, so that the weight decays exponentially as the heartbeat recedes into the past, while a variance ratio reduces the influence weight of abruptly deviating historical heartbeats.
Description
Technical field
The invention belongs to the technical field of distributed systems, and in particular relates to an adaptive failure detection method for distributed systems based on the exponential moving average.
Background art
With the development of distributed computing technology, distributed systems are being applied to every aspect of daily life. Industries such as e-commerce, cloud storage, network communication, banking and securities all build their core business on distributed systems in order to provide customers with fast, stable and secure services. Failure detection is a basic building block of a distributed system and one of the necessary means of ensuring that the system runs reliably and stably. As system scale and complexity keep growing, failure detection becomes increasingly difficult.
An adaptive failure detector dynamically adjusts its detection parameters, such as the heartbeat timeout, according to the state of the system or network, and therefore detects failures better than a traditional detector with fixed parameters. Research on adaptive failure detection is relatively mature, and many heartbeat-based adaptive failure detectors have been proposed. They fall roughly into two classes. The first class uses the historical heartbeat sequence to compute a predicted value for the next heartbeat with some algorithm and sets the detection timeout according to that prediction; the output of such a detector is binary, either faulty or normal. The second class separates failure monitoring from failure interpretation. Such a detector also uses heartbeats, but it outputs an accumulated decision value that changes over time, and the user judges whether a failure has occurred by setting a threshold. A detector of this kind can give different applications in the same system different detection behaviour and is therefore more flexible.
Summary of the invention
The present invention provides an adaptive failure detection method for distributed systems based on the exponential moving average. It reduces failure detection time while maintaining detection accuracy, improves detection efficiency, and is widely applicable.
An adaptive failure detection method for distributed systems based on the exponential moving average includes the following steps:
(1) send a heartbeat message to the monitored node in the system at a fixed interval and receive the response message it returns, so as to maintain and update a heartbeat delay sequence of specified length $n$, where $n$ is a natural number greater than 1;
(2) from the heartbeat delay sequence, calculate the predicted value $EIA_0$ of the next heartbeat delay at the arrival time of the most recent heartbeat response;
(3) from the predicted value $EIA_0$ of the next heartbeat delay, calculate a diagnostic value that accumulates over time, and determine from that diagnostic value whether the monitored node has failed.
The heartbeat delay sequence in step (1) consists of $n$ heartbeat delays $IA_1$ to $IA_n$ arranged in time order from most recent to oldest. Each heartbeat delay in the sequence equals the arrival time of its heartbeat response minus the arrival time of the preceding heartbeat response. If the sequence is full, the oldest heartbeat delay is removed when the newest one is stored.
The predicted value $EIA_0$ of the next heartbeat delay in step (2) is calculated as follows:
2.1 for each heartbeat delay $IA_i$ in the heartbeat delay sequence, calculate its influence weight $\phi_i$ on the next heartbeat delay using the exponential moving average method;
2.2 adjust the influence weight $\phi_i$ with the variance ratio method to obtain the final influence weight $\theta_i$ of heartbeat delay $IA_i$ on the next heartbeat delay;
2.3 take the mean error between the heartbeat delays in the sequence and their predicted values as the safety margin $\alpha$ for the next prediction, and calculate the predicted value $EIA_0$ of the next heartbeat delay from the final influence weights $\theta_i$.
The expression for the exponential moving average method in step 2.1 is as follows:
where $\lceil\cdot\rceil$ denotes rounding up, and $i$ is a natural number with $1 \le i \le n$.
In step 2.2 the influence weight $\phi_i$ is adjusted with the variance ratio method as follows:
where $\mu$ and $\delta$ are the mean and the standard deviation of the heartbeat delay sequence, $v_i = IA_i - \mu$, and $\psi(v_i)$ is the variance ratio corresponding to heartbeat delay $IA_i$.
In step 2.3 the safety margin $\alpha$ for the next prediction is calculated by the following formula:
where $EIA_i$ is the predicted value of heartbeat delay $IA_i$.
In step 2.3 the predicted value $EIA_0$ of the next heartbeat delay is calculated by the following formula:
In step (3) the diagnostic value is calculated by the following formula:
where $T_{last}$ is the arrival time of the most recent heartbeat response and $t$ is the time.
Compared with existing failure detection techniques for distributed systems, the method of the present invention predicts the next heartbeat delay with the exponential moving average method and, taking this prediction as input, outputs a diagnostic value that accumulates over time. It reduces failure detection time while maintaining detection accuracy and thus improves detection efficiency. In particular, in environments where the network message loss rate stays within a tolerable range, the method is widely applicable, discovers latent system faults promptly, and reduces the risk of system failure.
Brief description of the drawings
Fig. 1 is a flow diagram of the distributed system failure detection method of the present invention.
Detailed description of the embodiments
To describe the present invention more specifically, its technical solution is described in detail below with reference to the accompanying drawing and a specific embodiment.
In this embodiment the distributed system is abstracted as a cluster of two nodes {p, q}, where p is the detecting node and q is the detected node. As shown in Fig. 1, the adaptive failure detection method comprises the following steps:
(1) Time-series data collection.
Detecting node p sends a heartbeat message to detected node q every interval $\eta$, and q returns a response message when it receives the heartbeat message. Node p maintains an internal heartbeat time series of specified length $n$ and updates the data in this local time series whenever it receives a heartbeat response from q.
The time series stores the $n$ most recent heartbeat delays $\{IA_1, IA_2, \ldots, IA_i, \ldots, IA_n\}$, where $IA_1$ is the most recent heartbeat delay and $IA_n$ is the oldest heartbeat delay in the sequence. Assuming that the $i$-th heartbeat response arrives at time $T_i$, then:
$$IA_i = T_i - T_{i-1}$$
When the local heartbeat time series of node p is full, the oldest entry is removed before the newest heartbeat delay is stored, which keeps the data current.
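The window maintenance in step (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: the class name `HeartbeatWindow` and its method `record_arrival` are hypothetical, and arrival times are assumed to be plain numbers in seconds.

```python
from collections import deque


class HeartbeatWindow:
    """Fixed-length window of heartbeat delays, newest first (hypothetical helper)."""

    def __init__(self, n):
        if n <= 1:
            raise ValueError("n must be a natural number greater than 1")
        # A deque with maxlen=n drops the item at the far end once it is full,
        # so appendleft() discards the oldest delay automatically.
        self.delays = deque(maxlen=n)
        self.last_arrival = None  # T_last: arrival time of the latest response

    def record_arrival(self, t):
        """Record a heartbeat response that arrived at time t."""
        if self.last_arrival is not None:
            # Delay = this arrival time minus the previous arrival time.
            # appendleft() keeps index 0 as the most recent delay (IA_1).
            self.delays.appendleft(t - self.last_arrival)
        self.last_arrival = t


window = HeartbeatWindow(n=3)
for arrival in [0.0, 1.0, 2.1, 3.0, 4.2]:
    window.record_arrival(arrival)
print(list(window.delays))  # three most recent delays, newest first
```

Storing the newest delay at index 0 mirrors the ordering in the text, where $IA_1$ is the most recent delay and $IA_n$ the oldest.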
(2) Heartbeat prediction.
Heartbeat prediction is further divided into three sub-steps: exponential moving average weight calculation, variance ratio weight adjustment, and calculation of the safety margin $\alpha$.
2-1 Exponential moving average weight calculation: the influence of a historical heartbeat delay on the next heartbeat delay decays exponentially with its age. The closer a point is to the present, the greater its influence; the farther away it is, the smaller its influence. For the heartbeat delay sequence $\{IA_1, IA_2, \ldots, IA_i, \ldots, IA_n\}$, the influence weights are $\{\phi_1(\beta), \phi_2(\beta), \ldots, \phi_i(\beta), \ldots, \phi_n(\beta)\}$, where $\beta$ is a constant between 0 and 1 that tunes the weights.
The influence weight $\phi_i(\beta)$ is defined as:
$$\phi_i(\beta) = \beta (1 - \beta)^{i-1}, \quad 1 \le i \le n$$
It follows that $0 < \phi_i(\beta) < 1$, and that the influence weights of the heartbeat delays in the time series decay exponentially with age.
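As a short numerical illustration of the weight definition $\phi_i(\beta) = \beta(1-\beta)^{i-1}$, the sketch below computes the weights for a small window. The function name `ema_weights` and the values $n = 5$, $\beta = 0.5$ are chosen only for this example.

```python
def ema_weights(n, beta):
    """Return [phi_1, ..., phi_n] with phi_i = beta * (1 - beta) ** (i - 1)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return [beta * (1.0 - beta) ** (i - 1) for i in range(1, n + 1)]


phi = ema_weights(n=5, beta=0.5)
print(phi)       # [0.5, 0.25, 0.125, 0.0625, 0.03125]
print(sum(phi))  # 0.96875, i.e. 1 - (1 - beta) ** n
```

Because the weights form a finite geometric series, they sum to $1 - (1-\beta)^n$, which is slightly less than 1. A weighted average built on them would therefore need to divide by the sum of the weights if a normalised result is wanted.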
2-2 Variance ratio weight adjustment: the weights $\phi_i(\beta)$ computed by the exponential moving average method are further optimized. Let the variables $\mu$ and $\delta$ denote the mean and the standard deviation of the heartbeat delays in the time series, that is:
The variance ratio $\psi(v_i)$ is then defined as:
where $v_i = IA_i - \mu$. The larger the fluctuation of a historical heartbeat delay, the smaller $\psi(v_i)$, and $\psi(v_i) \le 1$.
The final influence weight $\theta_i$ of each historical heartbeat delay is:
$$\theta_i = \phi_i(\beta)\,\psi(v_i), \quad 1 \le i \le n$$
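The combination $\theta_i = \phi_i(\beta)\,\psi(v_i)$ can be sketched as below. The defining formula for $\psi$ is not available in this text, so `placeholder_psi` is purely an assumption: a Gaussian-shaped stand-in that only shares the stated properties ($\psi \le 1$, smaller for larger deviations). The use of the population standard deviation for $\delta$ is likewise an assumption, and all function names are hypothetical.

```python
import math
import statistics


def placeholder_psi(v, delta):
    """Stand-in for psi(v); NOT the patent's formula (assumption for illustration)."""
    if delta == 0.0:
        return 1.0  # no spread in the window, so no delay is penalised
    return math.exp(-(v * v) / (2.0 * delta * delta))


def adjusted_weights(delays, beta, psi=placeholder_psi):
    """Return theta_i = phi_i(beta) * psi(v_i) for delays ordered newest first."""
    mu = statistics.fmean(delays)
    delta = statistics.pstdev(delays)  # population standard deviation (assumption)
    thetas = []
    for i, ia in enumerate(delays, start=1):
        phi_i = beta * (1.0 - beta) ** (i - 1)
        thetas.append(phi_i * psi(ia - mu, delta))
    return thetas


delays = [1.0, 1.1, 5.0, 0.9, 1.0]  # the third delay is an abrupt outlier
theta = adjusted_weights(delays, beta=0.3)
phi = [0.3 * 0.7 ** k for k in range(len(delays))]
for phi_i, theta_i in zip(phi, theta):
    print(f"phi={phi_i:.4f}  theta={theta_i:.4f}")
```

Passing `psi` as a parameter keeps the sketch usable with whatever variance ratio definition is actually intended; only the placeholder would need to be swapped out.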
2-3 Safety margin $\alpha$ calculation: the mean error of the heartbeat delay predictions in the time series is used as the safety margin for the next prediction, calculated as follows:
Finally, the predicted value of the next heartbeat delay is calculated as follows:
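One possible reading of step 2-3 is sketched below. Neither formula is available in this text, so both forms are assumptions made only for illustration: the safety margin is taken as the mean absolute error between past delays and their predictions, and the prediction as a $\theta$-weighted average of the stored delays plus $\alpha$. The function names and sample numbers are hypothetical.

```python
def safety_margin(delays, predictions):
    """Assumed form of alpha: mean absolute error between IA_i and EIA_i."""
    errors = [abs(ia - eia) for ia, eia in zip(delays, predictions)]
    return sum(errors) / len(errors)


def predict_next_delay(delays, thetas, alpha):
    """Assumed form of EIA_0: normalised theta-weighted average plus alpha."""
    total = sum(thetas)
    weighted = sum(theta_i * ia for theta_i, ia in zip(thetas, delays))
    return weighted / total + alpha


delays = [1.0, 1.2, 0.8]       # IA_1..IA_3, newest first
predictions = [1.1, 1.0, 1.0]  # EIA_1..EIA_3 made earlier for those delays
thetas = [0.5, 0.25, 0.125]    # final influence weights theta_1..theta_3

alpha = safety_margin(delays, predictions)
eia0 = predict_next_delay(delays, thetas, alpha)
print(round(alpha, 4), round(eia0, 4))
```

Adding the margin on top of the weighted average makes the prediction deliberately conservative, so that ordinary jitter in heartbeat arrival is less likely to be mistaken for a failure.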
(3) Diagnostic value output.
Taking the predicted heartbeat delay $EIA_0$ and the current time $t$ as input, the diagnostic value is calculated as follows:
where $T_{last}$ is the arrival time of the most recent heartbeat response, and the diagnostic value ranges from 0 to $+\infty$.
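(4) Fault determination.
At time $t$, node p calculates the diagnostic value and compares it with the threshold set at system initialization. If the diagnostic value satisfies the threshold condition, node q is deemed normal; otherwise node q is deemed faulty.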
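The decision in step (4) can be sketched as a threshold test on an accumulating diagnostic value. The patent's diagnostic formula and the exact comparison are not available in this text, so `placeholder_diagnostic` is an assumption: it measures the time elapsed since $T_{last}$ in units of the predicted delay $EIA_0$, which at least matches the stated range of 0 to $+\infty$. Treating values at or above the threshold as faulty is also an assumption, and all names are hypothetical.

```python
def placeholder_diagnostic(t, t_last, eia0):
    """Stand-in diagnostic value; NOT the patent's formula (assumption).

    Grows from 0 without bound as time passes since the last heartbeat response.
    """
    return max(0.0, t - t_last) / eia0


def is_suspected(t, t_last, eia0, threshold, diagnostic=placeholder_diagnostic):
    """Return True if node q is judged faulty at time t (assumed comparison)."""
    return diagnostic(t, t_last, eia0) >= threshold


t_last = 10.0    # arrival time of the most recent heartbeat response
eia0 = 1.0       # predicted next heartbeat delay
threshold = 3.0  # chosen at system initialization

for t in [10.5, 11.0, 12.9, 13.0, 14.0]:
    state = "faulty" if is_suspected(t, t_last, eia0, threshold) else "normal"
    print(f"t={t}: {state}")
```

Because the detector emits a continuous value and leaves the threshold to the caller, different applications can apply stricter or looser thresholds to the same stream of heartbeats.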
The above description of the embodiment is provided so that those of ordinary skill in the art can understand and apply the invention. Those skilled in the art can readily make various modifications to the above embodiment and apply the general principles described here to other embodiments without creative effort. The present invention is therefore not limited to the above embodiment; improvements and modifications made by those skilled in the art on the basis of this disclosure shall all fall within the protection scope of the present invention.
Claims (1)
1. An adaptive failure detection method for distributed systems based on the exponential moving average, comprising the following steps:
(1) sending a heartbeat message to the monitored node in the system at a fixed interval and receiving the response message it returns, thereby maintaining and updating a heartbeat delay sequence of specified length $n$, where $n$ is a natural number greater than 1;
the heartbeat delay sequence consists of $n$ heartbeat delays $IA_1$ to $IA_n$ arranged in time order from most recent to oldest, each heartbeat delay in the sequence being equal to the arrival time of its heartbeat response minus the arrival time of the preceding heartbeat response; if the heartbeat delay sequence is full, the oldest heartbeat delay is removed when the newest heartbeat delay is stored;
(2) from the heartbeat delay sequence, calculating the predicted value $EIA_0$ of the next heartbeat delay at the arrival time of the most recent heartbeat response, in detail as follows:
2.1 for each heartbeat delay $IA_i$ in the heartbeat delay sequence, calculating its influence weight $\phi_i$ on the next heartbeat delay using the exponential moving average method, with the following expression:
where $\lceil\cdot\rceil$ denotes rounding up, and $i$ is a natural number with $1 \le i \le n$;
2.2 adjusting the influence weight $\phi_i$ with the variance ratio method to obtain the final influence weight $\theta_i$ of heartbeat delay $IA_i$ on the next heartbeat delay, with the following expression:
where $\mu$ and $\delta$ are the mean and the standard deviation of the heartbeat delay sequence, $v_i = IA_i - \mu$, and $\psi(v_i)$ is the variance ratio corresponding to heartbeat delay $IA_i$;
2.3 taking the mean error between the heartbeat delays in the heartbeat delay sequence and their predicted values as the safety margin $\alpha$ for the next prediction, with the following expression:
where $EIA_i$ is the predicted value of heartbeat delay $IA_i$;
and then calculating the predicted value $EIA_0$ of the next heartbeat delay from the final influence weights $\theta_i$ by the following formula;
(3) from the predicted value $EIA_0$ of the next heartbeat delay, calculating by the following formula a diagnostic value that accumulates over time, and determining from the diagnostic value whether the monitored node has failed;
where $T_{last}$ is the arrival time of the most recent heartbeat response and $t$ is the time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710413817.5A CN107204879B (en) | 2017-06-05 | 2017-06-05 | Adaptive failure detection method for distributed systems based on exponential moving average |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107204879A CN107204879A (en) | 2017-09-26 |
CN107204879B true CN107204879B (en) | 2019-09-20 |
Family
ID=59906687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710413817.5A Active CN107204879B (en) | Adaptive failure detection method for distributed systems based on exponential moving average | 2017-06-05 | 2017-06-05 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107204879B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019178714A1 (en) * | 2018-03-19 | 2019-09-26 | 华为技术有限公司 | Fault detection method, apparatus, and system |
CN115190051B (en) * | 2021-04-01 | 2023-09-05 | 中国移动通信集团河南有限公司 | Heartbeat data identification method and electronic device |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009062090A1 (en) * | 2007-11-08 | 2009-05-14 | Genetic Finance Holdings Limited | Distributed network for performing complex algorithms |
CN103117901B (en) * | 2013-02-01 | 2016-06-15 | 华为技术有限公司 | A kind of distributed heartbeat detection method, Apparatus and system |
CN107133478A (en) * | 2017-05-10 | 2017-09-05 | 南京航空航天大学 | A kind of high speed incremental formula aero-engine method for detecting abnormality |
Also Published As
Publication number | Publication date |
---|---|
CN107204879A (en) | 2017-09-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||