CN111869163A - Fault detection method, apparatus, and system - Google Patents
Fault detection method, apparatus, and system
- Publication number
- CN111869163A
- Authority
- CN
- China
- Prior art keywords
- node
- nodes
- delay data
- heartbeat
- evaluation values
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/0757—Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/20—Arrangements for detecting or preventing errors in the information received using signal quality detector
- H04L1/205—Arrangements for detecting or preventing errors in the information received using signal quality detector jitter monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0805—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
- H04L43/0817—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0852—Delays
- H04L43/0864—Round trip delays
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/805—Real-time
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/81—Threshold
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0823—Errors, e.g. transmission errors
- H04L43/0829—Packet loss
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0852—Delays
- H04L43/087—Jitter
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Theoretical Computer Science (AREA)
- Environmental & Geological Engineering (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Cardiology (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
Abstract
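A fault detection method applied to a distributed node cluster that comprises a plurality of nodes. The method is performed by any one of the plurality of nodes, referred to as the first node, and comprises: the first node determines whether a health evaluation trigger condition is met; when the trigger condition is met, the first node evaluates the health of each of the other nodes in the node cluster based on the heartbeat delay data between the first node and those other nodes, and obtains an evaluation result of the health of the other nodes in the cluster.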
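The flow in the abstract (check a trigger condition, then score every other node from heartbeat delay data) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the abstract does not define the trigger condition, the evaluation function, or any thresholds, so `MIN_SAMPLES`, `DELAY_LIMIT_MS`, the `FirstNode` class, and the three result labels are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional

# Hypothetical values; the abstract specifies no thresholds.
MIN_SAMPLES = 5          # assumed trigger: enough heartbeat samples per peer
DELAY_LIMIT_MS = 200.0   # assumed mean-delay limit for a "healthy" peer


@dataclass
class FirstNode:
    """Any node in the cluster; it evaluates every other node it tracks."""
    node_id: str
    # peer id -> heartbeat delays in ms; None marks a heartbeat that was lost
    delays: Dict[str, List[Optional[float]]] = field(default_factory=dict)

    def record_heartbeat(self, peer: str, delay_ms: Optional[float]) -> None:
        self.delays.setdefault(peer, []).append(delay_ms)

    def trigger_met(self) -> bool:
        # Assumed trigger condition: every tracked peer has MIN_SAMPLES samples.
        return bool(self.delays) and all(
            len(samples) >= MIN_SAMPLES for samples in self.delays.values()
        )

    def evaluate_peers(self) -> Dict[str, str]:
        """Return one evaluation result per peer, or {} if not triggered."""
        if not self.trigger_met():
            return {}
        results: Dict[str, str] = {}
        for peer, samples in self.delays.items():
            received = [d for d in samples if d is not None]
            if not received:
                results[peer] = "faulty"        # no heartbeat arrived at all
            elif mean(received) > DELAY_LIMIT_MS:
                results[peer] = "sub-healthy"   # reachable but slow
            else:
                results[peer] = "healthy"
        return results
```

Because each node runs the same evaluation against its own heartbeat records, no central monitor is needed in this sketch; how the per-node results would be combined into a cluster-wide decision is not stated in the abstract and is left out here.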
Description
PCT domestic application; the description has been published.
Claims (34)
- PCT domestic application; the claims have been published.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/079422 WO2019178714A1 (zh) | 2018-03-19 | 2018-03-19 | Fault detection method, apparatus, and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111869163A (zh) | 2020-10-30 |
CN111869163B (zh) | 2022-05-24 |
Family
ID=67988268
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880091411.2A Active CN111869163B (zh) | Fault detection method, apparatus, and system | 2018-03-19 | 2018-03-19 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20210006484A1 (zh) |
EP (1) | EP3761559A4 (zh) |
CN (1) | CN111869163B (zh) |
WO (1) | WO2019178714A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
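CN113312234A (zh) * | 2021-05-18 | 2021-08-27 | 福建天泉教育科技有限公司 | Health check optimization method and terminal |
CN114285602A (zh) * | 2021-11-26 | 2022-04-05 | 成都安恒信息技术有限公司 | Distributed service security detection method |
CN115225775A (zh) * | 2022-09-19 | 2022-10-21 | 苏州华兴源创科技股份有限公司 | Multi-channel delay correction method, apparatus, and computer device |
CN115348157A (zh) * | 2021-05-14 | 2022-11-15 | 中国移动通信集团浙江有限公司 | Fault localization method, apparatus, device, and storage medium for a distributed storage cluster |
CN115550144A (zh) * | 2022-11-30 | 2022-12-30 | 季华实验室 | Distributed faulty-node prediction method, apparatus, electronic device, and storage medium |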
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT201900010362A1 (it) * | 2019-06-28 | 2020-12-28 | Telecom Italia Spa | Enabling round-trip packet loss measurement in a packet-switched communication network |
CN111556345B (zh) * | 2020-03-19 | 2023-08-29 | 视联动力信息技术股份有限公司 | Network quality detection method, apparatus, electronic device, and storage medium |
US11811641B1 (en) * | 2020-03-20 | 2023-11-07 | Juniper Networks, Inc. | Secure network topology |
WO2022085260A1 (ja) * | 2020-10-22 | 2022-04-28 | パナソニックIpマネジメント株式会社 | Anomaly detection device, anomaly detection method, and program |
US11584382B2 (en) * | 2021-02-12 | 2023-02-21 | Fca Us Llc | System and method for malfuncton operation machine stability determination |
CN112988463B (zh) * | 2021-02-23 | 2022-08-30 | 新华三大数据技术有限公司 | Faulty node isolation method and apparatus |
CN112804113A (zh) * | 2021-04-15 | 2021-05-14 | 北京全路通信信号研究设计院集团有限公司 | Fault determination method and system |
CN113760592B (zh) * | 2021-07-30 | 2024-02-27 | 郑州云海信息技术有限公司 | Node kernel detection method and related apparatus |
CN116127149B (zh) * | 2023-04-14 | 2023-07-04 | 杭州悦数科技有限公司 | Method and system for quantifying graph database cluster health |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
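JP2012173996A (ja) * | 2011-02-22 | 2012-09-10 | Nec Corp | Cluster system, cluster management method, and cluster management program |
CN103023716A (zh) * | 2012-11-26 | 2013-04-03 | 中怡(苏州)科技有限公司 | Network quality monitoring system and monitoring method with zero traffic consumption |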
US20140297845A1 (en) * | 2013-03-29 | 2014-10-02 | Fujitsu Limited | Information processing system, computer-readable recording medium having stored therein control program for information processing device, and control method of information processing system |
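WO2017008698A1 (zh) * | 2015-07-10 | 2017-01-19 | 努比亚技术有限公司 | Multi-channel routing method and apparatus |
CN106998302A (zh) * | 2016-01-26 | 2017-08-01 | 华为技术有限公司 | Service traffic allocation method and apparatus |
CN107204879A (zh) * | 2017-06-05 | 2017-09-26 | 浙江大学 | Adaptive fault detection method for distributed systems based on exponential moving average |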
US20170366436A1 (en) * | 2016-06-16 | 2017-12-21 | Hitachi, Ltd. | Computer system and method of controlling computer system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
PL1627316T3 (pl) * | 2003-05-27 | 2018-10-31 | Vringo Infrastructure Inc. | Data collection in a computer cluster |
US7284147B2 (en) * | 2003-08-27 | 2007-10-16 | International Business Machines Corporation | Reliable fault resolution in a cluster |
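CN101795234B (zh) * | 2010-03-10 | 2012-02-01 | 北京航空航天大学 | Streaming media transmission scheme based on an application-layer multicast algorithm |
CN102355369B (zh) * | 2011-09-27 | 2014-01-08 | 华为技术有限公司 | Virtualized cluster system and processing method and device thereof |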
-
2018
- 2018-03-19 CN CN201880091411.2A patent/CN111869163B/zh active Active
- 2018-03-19 WO PCT/CN2018/079422 patent/WO2019178714A1/zh unknown
- 2018-03-19 EP EP18910654.5A patent/EP3761559A4/en active Pending
-
2020
- 2020-09-18 US US17/025,805 patent/US20210006484A1/en not_active Abandoned
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
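CN115348157A (zh) * | 2021-05-14 | 2022-11-15 | 中国移动通信集团浙江有限公司 | Fault localization method, apparatus, device, and storage medium for a distributed storage cluster |
CN115348157B (zh) * | 2021-05-14 | 2023-09-05 | 中国移动通信集团浙江有限公司 | Fault localization method, apparatus, device, and storage medium for a distributed storage cluster |
CN113312234A (zh) * | 2021-05-18 | 2021-08-27 | 福建天泉教育科技有限公司 | Health check optimization method and terminal |
CN114285602A (zh) * | 2021-11-26 | 2022-04-05 | 成都安恒信息技术有限公司 | Distributed service security detection method |
CN114285602B (zh) * | 2021-11-26 | 2024-02-02 | 成都安恒信息技术有限公司 | Distributed service security detection method |
CN115225775A (zh) * | 2022-09-19 | 2022-10-21 | 苏州华兴源创科技股份有限公司 | Multi-channel delay correction method, apparatus, and computer device |
CN115225775B (zh) * | 2022-09-19 | 2022-12-09 | 苏州华兴源创科技股份有限公司 | Multi-channel delay correction method, apparatus, and computer device |
CN115550144A (zh) * | 2022-11-30 | 2022-12-30 | 季华实验室 | Distributed faulty-node prediction method, apparatus, electronic device, and storage medium |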
Also Published As
Publication number | Publication date |
---|---|
CN111869163B (zh) | 2022-05-24 |
EP3761559A1 (en) | 2021-01-06 |
US20210006484A1 (en) | 2021-01-07 |
WO2019178714A1 (zh) | 2019-09-26 |
EP3761559A4 (en) | 2021-03-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111869163B (zh) | Fault detection method, apparatus, and system | |
US10447561B2 (en) | BFD method and apparatus | |
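KR101881409B1 (ko) | Multi-master selection in a software-defined network | |
CN1794651B (zh) | System and method for problem resolution in a communication network | |
CN108076019B (zh) | Abnormal traffic detection method and apparatus based on traffic mirroring | |
CN102404170B (zh) | Packet loss detection method, apparatus, and system | |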
WO2002046928A1 (en) | Fault detection and prediction for management of computer networks | |
US9253029B2 (en) | Communication monitor, occurrence prediction method, and recording medium | |
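JP4857226B2 (ja) | Fault monitoring apparatus and fault monitoring method for a radio base station | |
CN106302001B (zh) | Service fault detection method in a data communication network, related apparatus, and system | |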
EP2432193A2 (en) | Method of data replication in a distributed data storage system and corresponding device | |
WO2011154024A1 (en) | Enhancing accuracy of service level agreements in ethernet networks | |
US8971871B2 (en) | Radio base station, control apparatus, and abnormality detection method | |
US11652682B2 (en) | Operations management apparatus, operations management system, and operations management method | |
CN113543246B (zh) | Network switching method and device | |
US8788735B2 (en) | Interrupt control apparatus, interrupt control system, interrupt control method, and interrupt control program | |
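CN110475244B (zh) | Terminal management method, system, apparatus, terminal, and storage medium | |
CN114172796A (zh) | Fault localization method for a communication network and related apparatus | |
CN110138657B (zh) | Method, apparatus, device, and storage medium for switching aggregated links between switches | |
CN115242610A (zh) | Link quality monitoring method, apparatus, electronic device, and computer-readable storage medium | |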
US20160234344A1 (en) | Message log removal apparatus and message log removal method | |
JP5937955B2 (ja) | Packet transfer delay measurement apparatus, method, and program | |
JP2021120827A (ja) | Control system and control method | |
JP2021120827A5 (zh) | ||
CN111200520A (zh) | Network monitoring method, server, and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||