WO2011157111A2 - Method, node, and system for determining a fault indication state - Google Patents

Method, node, and system for determining a fault indication state

Info

Publication number
WO2011157111A2
Authority
WO
WIPO (PCT)
Prior art keywords
node
service
service node
fault indication
detection result
Prior art date
Application number
PCT/CN2011/074888
Other versions
WO2011157111A3 (zh)
Inventor
朱智勇
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP11795077.4A (patent EP2704356B1, en)
Priority to PCT/CN2011/074888 (patent WO2011157111A2, zh)
Priority to JP2014513022A (patent JP5802829B2, ja)
Priority to CN201180000645.XA (patent CN102918802B, zh)
Publication of WO2011157111A2 (patent WO2011157111A2, zh)
Publication of WO2011157111A3 (patent WO2011157111A3, zh)
Priority to US14/087,880 (patent US9471408B2, en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/008 Reliability or availability analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking functioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/04 Arrangements for maintaining operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/08 Testing, supervising or monitoring using real traffic

Definitions

  • The present invention relates to the field of communications, and in particular to a method, a node, and a system for determining a fault indication state.
  • Background
  • In a communication network, to increase the reliability of network communication or the processing capability of network nodes, multiple communication nodes are usually deployed at the same level of the network plane along the communication path. When one of these nodes fails, failover and network resource preemption are triggered on the other nodes at the same level. How to detect a communication node failure is therefore an important problem to solve.
  • In a GPRS (General Packet Radio Service) or UMTS (Universal Mobile Telecommunications System) network, GGSNs (Gateway GPRS Support Nodes) probe each other's fault state with Hello messages. If no response message is received within the specified time, the sending GGSN considers the peer GGSN faulty, which triggers service switchover and network resource preemption.
  • GPRS: General Packet Radio Service
  • UMTS: Universal Mobile Telecommunications System
  • Embodiments of the present invention provide a method, a node, and a system for determining a fault indication state.
  • The technical solutions are as follows:
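  • A method for determining a fault indication state includes:
  • receiving a detection result, sent by a service request node, indicating whether each service node in a service node pool has failed; and determining, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
  • A method for determining a fault indication state includes:
  • extending an Echo Request message so that it carries the detection result, obtained by this service request node in the period preceding the current period, of whether each service node in the service node pool has failed;
  • sending the Echo Request message to each service node in the service node pool at the beginning of the current period, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.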
  • A service node includes:
  • a receiver, configured to receive a detection result, sent by a service request node, indicating whether each service node in the service node pool has failed;
  • a determiner, configured to determine, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
  • A service request node includes:
  • an extender, configured to extend an Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
  • a sender, configured to send the Echo Request message to each service node in the service node pool at the beginning of the current period, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
  • A system for determining a fault indication state includes a service request node and the service nodes in a service node pool. The service request node includes an extender and a sender:
  • the extender is configured to extend an Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
  • the sender is configured to send the Echo Request message to each service node in the service node pool at the beginning of the current period, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
  • Each service node includes a receiver and a determiner:
  • the receiver is configured to receive the Echo Request message sent by the service request node, where the Echo Request message carries the service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
  • the determiner is configured to determine, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
  • The beneficial effect of the technical solutions provided by the embodiments of the present invention is as follows: the service request node detects whether each service node is faulty, and each service node determines its own fault indication state and those of the other service nodes from the service request node's detection results, which improves the reliability of service node fault detection.
  • FIG. 1 is a schematic diagram of an NxM interconnection network architecture according to an embodiment of the present invention.
  • FIG. 2-a is a flowchart of a method for determining a fault indication state according to Embodiment 1 of the present invention.
  • FIG. 2-b is a flowchart of a method for determining a fault indication state according to Embodiment 1 of the present invention.
  • FIG. 3 is a flowchart of a method for determining a fault indication state according to Embodiment 2 of the present invention.
  • FIG. 4 is a schematic flowchart of determining a fault indication state according to Embodiment 2 of the present invention.
  • FIG. 5 is a schematic diagram of behavior triggering and behavior timing control according to Embodiment 2 of the present invention.
  • FIG. 6 is a schematic structural diagram of a service node according to Embodiment 3 of the present invention.
  • FIG. 7 is a schematic structural diagram of a service request node according to Embodiment 4 of the present invention.
  • FIG. 8 is a schematic structural diagram of a system for determining a fault indication state according to Embodiment 5 of the present invention.
  • Detailed description
  • The technical solutions provided by the embodiments apply to an NxM interconnection network architecture composed of N service request nodes and M service nodes, where N is greater than or equal to 1 and M is greater than or equal to 1.
  • The M service nodes form a service node pool.
  • The N service request nodes form a service request node pool.
  • Each service node in the service node pool is fully interconnected with every service request node in the service request node pool through the IP backbone network.
  • All service nodes in the service node pool have equal network status and the same functions. Depending on the operator's configuration, they sometimes share network resources, such as IP address pool resources, and sometimes back up each other's services.
  • When one service node fails, the other service nodes that are running normally preempt the network resources of the failed service node or take over the services it carried.
  • In a GPRS or UMTS network, the SGSN (Serving GPRS Support Node) is the service requester relative to the GGSN, and the GGSN is the service provider relative to the SGSN. In such a network, therefore, the service node in the embodiments is a GGSN and the service request node is an SGSN.
  • SAE: System Architecture Evolution
  • MME: Mobility Management Entity
  • Serving GW: Serving Gateway
  • PDN GW: Packet Data Network Gateway
  • In the LTE (Long Term Evolution)-SAE network architecture, services are always initiated from the MME toward the Serving GW, so the MME is the service request node and the Serving GW is the service node.
  • Between the Serving GW and the PDN GW, the Serving GW is the service request node and the PDN GW is the service node.
  • The technical solution of the present invention is described below based on the NxM interconnection network architecture.
  • Embodiment 1
  • This embodiment provides a method for determining a fault indication state. The method may be performed by a service node and includes:
  • receiving a detection result, sent by a service request node, indicating whether each service node in the service node pool has failed; and determining, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node. This improves the reliability of service node fault detection.
  • The method may also be performed by a service request node, in which case it includes:
  • extending the Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period; and sending the Echo Request message to each service node in the service node pool at the beginning of the current period.
  • The Echo Request message enables each service node to determine, according to the detection result, its own fault indication state and the fault indication states of the other service nodes, which improves the reliability of service node fault detection.
  • This embodiment provides a method for determining a fault indication state.
  • The method includes:
  • 200: Select a fault detection and state synchronization period (referred to as the period and denoted T), and set the same period T on every service request node in the service request node pool and every service node in the service node pool.
  • Steps 201-204 are performed by each service request node in the service request node pool:
  • 201: The service request node extends the Echo Request message so that it carries the node's detection result, from the period preceding the current period (denoted T1), of whether each service node in the service node pool has failed. If the current period is the first period, the Echo Request message may instead carry an initialized state for each service node in the service node pool; for example, every service node may be initialized as faulty.
  • The service request node detects whether each service node in the service node pool failed in the period preceding the current period as follows:
  • At the beginning of period T1, the service request node sends an Echo Request message to all service nodes in the service node pool. If it receives the Echo Response message returned by a service node before period T1 ends, it marks that service node as normal; otherwise it marks that service node as faulty.
  • The Echo Request message sent in period T1 may itself carry the service request node's detection results from the period preceding T1.
  • The service request node extends the Echo Request message as follows:
  • The service request node uses idle bits of the Echo Request message or adds new bits, with each bit indicating whether one service node is faulty.
  • For example, in Echo Request (GGSN_states: 0000 0111), bit 0 represents the state of GGSN-1, bit 1 the state of GGSN-2, and bit 2 the state of GGSN-3.
  • A bit value of 1 means the corresponding GGSN was detected as normal, and a bit value of 0 means it was detected as faulty.
  • 202: At the beginning of the current period (denoted T2), the service request node sends the extended Echo Request message to each service node in the service node pool.
  • 203: After receiving the Echo Request message sent by the service request node, the service node records the detection result and returns an Echo Response message to the service request node.
  • 205: At the end of the current period, each service node determines, according to the detection results sent by the service request nodes, its own fault indication state and the fault indication states of the other service nodes.
  • Specifically, the service node determines the fault indication states of the service nodes other than itself, covering at least one of the following cases:
  • If the detection result sent by at least one service request node in the current period indicates that service node A among the other service nodes is normal, the fault indication state of service node A is set to normal. If detection results are received from all service request nodes in the current period and all of them indicate that service node B among the other service nodes is faulty, the fault indication state of service node B is set to fault. If detection results are received from only some of the service request nodes in the current period and all of those results indicate that service node C among the other service nodes is faulty, the fault indication state of service node C is set to uncertain.
  • Service node A, service node B, and service node C in this embodiment do not refer to specific nodes, but to classes of nodes whose detection results match the characteristics defined in this embodiment.
  • Based on the detection results sent by the service request nodes, the service node determines its own fault indication state, covering at least one of the following cases:
  • For example, SGSN-1: x1y1z1 indicates that the fault detection results of service request node SGSN-1 for service nodes GGSN-x, GGSN-y, and GGSN-z
  • are x1, y1, and z1 respectively;
  • SGSN-2: x2y2z2 indicates that the fault detection results of service request node SGSN-2 for service nodes GGSN-x, GGSN-y, and GGSN-z are x2, y2, and z2 respectively; and SGSN-3: x3y3z3 indicates that the fault detection results of service request node SGSN-3 for service nodes GGSN-x, GGSN-y, and GGSN-z are x3, y3, and z3 respectively.
  • SGSN-1: x1y1z1, SGSN-2: x2y2z2, and SGSN-3: x3y3z3 are each sent to GGSN-x, GGSN-y, and GGSN-z. Each of GGSN-x, GGSN-y, and GGSN-z then derives its own fault indication state and those of the other service nodes from SGSN-1: x1y1z1, SGSN-2: x2y2z2, and SGSN-3: x3y3z3.
  • x: N/A indicates the fault indication state of GGSN-x
  • y: N/F/U indicates the fault indication state of GGSN-y
  • z: N/F/U indicates the fault indication state of GGSN-z
  • N means normal
  • A means abnormal
  • F means fault
  • U means uncertain.
  • Further, step 206 can also be performed:
  • 206: Each service node treats its own fault indication state as the primary state and the fault indication states of the other service nodes as secondary states, and triggers acquisition and/or release of network resources accordingly.
  • This step specifically includes:
  • the operation of acquiring the network resource of the serving node is triggered after the preset first protection duration (set to Te l); when the fault status of the serving node is indicated
  • the operation of releasing the network resource of the serving node is triggered, and the network resource of the serving node is released in the preset second duration (set to Tr1);
  • the first protection duration is greater than the fourth duration, and the third protection duration is greater than the second duration.
  • Service node D and service node E in this embodiment do not refer to specific nodes, but to classes of nodes whose fault indication states match the characteristics defined in this embodiment.
  • In this embodiment, the service request node detects whether a service node is faulty and notifies the service node of the detection result. Compared with having a service node detect whether its peer is faulty, this eliminates the possibility of state misjudgment and improves the reliability of service node fault detection.
  • For example, if the communication link between two GGSNs is interrupted, under the prior art both GGSNs simultaneously consider the peer faulty.
  • This embodiment instead relies on the service request node to judge whether a GGSN can still provide service.
  • If the GGSN can provide service, it is concluded to be normal; if it cannot, it is concluded to be faulty. The technical solution of this embodiment is therefore more reliable.
  • In addition, a service node can combine the detection results of multiple service request nodes when determining its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection.
  • This embodiment also defines complete behavior triggering and timing control logic, which avoids conflicts between network behaviors such as acquiring and releasing network resources.
  • This embodiment provides a service node, including:
  • a receiver 301, configured to receive a detection result, sent by a service request node, indicating whether each service node in the service node pool has failed;
  • a determiner 302, configured to determine, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
  • The receiver 301 is specifically configured to:
  • receive, in the current period, the Echo Request message sent by the service request node, where the Echo Request message carries the service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period.
  • When determining, according to the detection result, the fault indication states of the service nodes other than this service node, the determiner 302 covers at least one of the following cases:
  • When determining, according to the detection result, the fault indication state of this service node, the determiner 302 covers at least one of the following cases:
  • The service node further includes:
  • a trigger, configured to, after the determiner has run, treat the fault indication state of this service node as the primary state and the fault indication states of the other service nodes as secondary states, and trigger acquisition and/or release of network resources accordingly.
  • The trigger is configured to implement at least one of the following cases:
  • the operation of acquiring the network resource of the serving node is triggered after the preset first protection duration; when the fault indication state of the serving node changes from normal to abnormal The operation of releasing the network resource of the serving node is triggered, and the network resource of the serving node is released within a preset second time period;
  • the first protection duration is greater than the fourth duration, and the third protection duration is greater than the second duration.
  • The service node provided in this embodiment is based on the same concept as the service node in the method embodiments. For details of its specific process, see the method embodiments; they are not repeated here.
  • This embodiment receives the detection result, sent by the service request node, indicating whether each service node in the service node pool has failed, and determines from it the fault indication state of this service node and the fault indication states of the other service nodes, which improves the reliability of service node fault detection.
  • In addition, the service node can combine the detection results of multiple service request nodes when determining its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection.
  • This embodiment also defines complete behavior triggering and timing control logic, which avoids network behavior conflicts.
  • This embodiment provides a service request node, including:
  • an extender 401, configured to extend an Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
  • a sender 402, configured to send the Echo Request message to each service node in the service node pool at the beginning of the current period, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
  • The extender 401 is specifically configured to:
  • use idle bits of the Echo Request message, with each idle bit indicating whether one service node in the service node pool is faulty; or
  • add new bits to the Echo Request message, with each new bit indicating whether one service node in the service node pool is faulty.
  • The service request node provided in this embodiment is based on the same concept as the service request node in the method embodiments. For details, see the method embodiments; they are not repeated here.
  • This embodiment extends the Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period, and sends the message to each service node in the service node pool.
  • The Echo Request message enables each service node to determine, according to the detection result, its own fault indication state and the fault indication states of the other service nodes, which improves the reliability of service node fault detection.
  • In addition, a service node can combine the detection results of multiple service request nodes when determining its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection.
  • This embodiment provides a system for determining a fault indication state, including a service request node 501 and the service nodes 502 in a service node pool.
  • The service request node 501 includes an extender 401 and a sender 402:
  • the extender 401 is configured to extend an Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
  • the sender 402 is configured to send the Echo Request message to each service node in the service node pool at the beginning of the current period, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
  • Each service node 502 includes a receiver 301 and a determiner 302:
  • the receiver 301 is configured to receive the Echo Request message sent by the service request node, where the Echo Request message carries the service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
  • the determiner 302 is configured to determine, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
  • The service request node and the service node provided in this embodiment are based on the same concept as the service request node and the service node in the method embodiments. For details, see the method embodiments; they are not repeated here.
  • This embodiment extends the Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period, and sends the message to each service node in the service node pool.
  • The Echo Request message enables each service node to determine, according to the detection result, its own fault indication state and the fault indication states of the other service nodes, which improves the reliability of service node fault detection.
  • In addition, a service node can combine the detection results of multiple service request nodes when determining its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection.
  • A person skilled in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium.
  • The storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • Environmental & Geological Engineering (AREA)
  • Mathematical Physics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Telephonic Communication Services (AREA)
  • Maintenance And Management Of Digital Transmission (AREA)

Abstract

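Embodiments of the present invention provide a method, a node, and a system for determining a fault indication state, relating to the field of communications. The method includes: receiving a detection result, sent by a service request node, indicating whether each service node in a service node pool has failed; and determining, according to the detection result, the fault indication states of this service node and the other service nodes. Alternatively, the method includes: extending an Echo Request message so that it carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period; and, at the beginning of the current period, sending the Echo Request message to each service node so that each service node determines, according to the detection result, the fault indication states of itself and the other service nodes. Embodiments of the present invention further include a service request node and a service node, and a system composed of the two. The above solutions improve the reliability of service node fault detection.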

Description

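Method, node, and system for determining a fault indication state
Technical Field
The present invention relates to the field of communications, and in particular to a method, a node, and a system for determining a fault indication state.
Background
In a communication network, to increase the reliability of network communication or the processing capability of network nodes, multiple communication nodes are usually deployed at the same level of the network plane along the communication path. When one of these nodes fails, failover and network resource preemption are triggered on the other nodes at the same level. How to detect a communication node failure is therefore an important problem to solve.
In a GPRS (General Packet Radio Service) or UMTS (Universal Mobile Telecommunications System) network, GGSNs (Gateway GPRS Support Nodes) probe each other's fault state with Hello messages. If no response message is received within the specified time, the sending GGSN considers the peer GGSN faulty, which triggers service switchover and network resource preemption.
In the course of implementing the present invention, the inventor found that the prior art has at least the following problem:
With the existing fault detection method, if the communication link between two GGSNs is interrupted, both GGSNs simultaneously consider the peer faulty, although both may in fact be normal. The existing fault detection method can therefore misjudge.
Summary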
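To improve the reliability of fault detection, embodiments of the present invention provide a method, a node, and a system for determining a fault indication state. The technical solutions are as follows:
A method for determining a fault indication state includes:
receiving a detection result, sent by a service request node, indicating whether each service node in a service node pool has failed; and determining, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
A method for determining a fault indication state includes:
extending an Echo Request message, where the Echo Request message carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period; and
at the beginning of the current period, sending the Echo Request message to each service node in the service node pool, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
A service node includes:
a receiver, configured to receive a detection result, sent by a service request node, indicating whether each service node in the service node pool has failed; and
a determiner, configured to determine, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
A service request node includes:
an extender, configured to extend an Echo Request message, where the Echo Request message carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period; and
a sender, configured to send, at the beginning of the current period, the Echo Request message to each service node in the service node pool, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
A system for determining a fault indication state includes a service request node and the service nodes in a service node pool. The service request node includes an extender and a sender:
the extender is configured to extend an Echo Request message, where the Echo Request message carries this service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
the sender is configured to send, at the beginning of the current period, the Echo Request message to each service node in the service node pool, so that each service node determines, according to the detection result, its own fault indication state and the fault indication states of the other service nodes.
Each service node includes a receiver and a determiner:
the receiver is configured to receive the Echo Request message sent by the service request node, where the Echo Request message carries the service request node's detection result of whether each service node in the service node pool failed in the period preceding the current period;
the determiner is configured to determine, according to the detection result, the fault indication state of this service node and the fault indication states of the service nodes other than this service node.
The beneficial effect of the technical solutions provided by the embodiments of the present invention is as follows: the service request node detects whether each service node is faulty, and each service node determines its own fault indication state and those of the other service nodes from the service request node's detection results, which improves the reliability of service node fault detection.
Brief Description of the Drawings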
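FIG. 1 is a schematic diagram of an NxM interconnection network architecture according to an embodiment of the present invention;
FIG. 2-a is a flowchart of a method for determining a fault indication state according to Embodiment 1 of the present invention;
FIG. 2-b is a flowchart of a method for determining a fault indication state according to Embodiment 1 of the present invention;
FIG. 3 is a flowchart of a method for determining a fault indication state according to Embodiment 2 of the present invention;
FIG. 4 is a schematic flowchart of determining a fault indication state according to Embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of behavior triggering and behavior timing control according to Embodiment 2 of the present invention;
FIG. 6 is a schematic structural diagram of a service node according to Embodiment 3 of the present invention;
FIG. 7 is a schematic structural diagram of a service request node according to Embodiment 4 of the present invention;
FIG. 8 is a schematic structural diagram of a system for determining a fault indication state according to Embodiment 5 of the present invention.
Detailed Description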
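To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
The technical solutions provided by the embodiments apply to an NxM interconnection network architecture composed of N service request nodes and M service nodes, where N is greater than or equal to 1 and M is greater than or equal to 1. As shown in the schematic diagram of the NxM interconnection network architecture in FIG. 1, the M service nodes form a service node pool, the N service request nodes form a service request node pool, and each service node in the service node pool is fully interconnected with every service request node in the service request node pool through the IP backbone network. All service nodes in the service node pool have equal network status and the same functions. Depending on the operator's configuration, they sometimes share network resources, such as IP address pool resources, and sometimes back up each other's services. When one service node fails, the other service nodes that are running normally preempt the network resources of the failed service node or take over the services it carried. In a GPRS or UMTS network, the SGSN (Serving GPRS Support Node) is the service requester relative to the GGSN, and the GGSN is the service provider relative to the SGSN. In such a network, therefore, the service node in the embodiments is a GGSN and the service request node is an SGSN. In the LTE (Long Term Evolution)-SAE (System Architecture Evolution) network architecture, services are always initiated from the MME (Mobility Management Entity) toward the Serving GW (Serving Gateway), so the MME is the service request node and the Serving GW is the service node. Between the Serving GW and the PDN GW (Packet Data Network Gateway), the Serving GW is the service request node and the PDN GW is the service node.
The technical solution of the present invention is described below based on the NxM interconnection network architecture.
Embodiment 1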
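Referring to FIG. 2-a, this embodiment provides a method for determining a fault indication state. The method may be performed by a service node and includes:
101a: Receive detection results, sent by a service request node, indicating whether each service node in the service node pool is faulty;
102a: According to the detection results, determine the fault indication state of the local service node and the fault indication states of the other service nodes.
By receiving the service request node's detection results for every service node in the service node pool and determining from them the fault indication state of the local service node and of the other service nodes, this embodiment improves the reliability of service node fault detection.
Referring to FIG. 2-b, the method may also be performed by a service request node and includes:
101b: Extend an echo request message so that it carries the detection results obtained by this service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
102b: At the start of the current period, send the echo request message to each service node in the service node pool, so that each service node determines, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
By extending the echo request message to carry this service request node's detection results for the preceding period and sending it to every service node in the service node pool, so that each service node can determine its own fault indication state and those of the other service nodes, this embodiment improves the reliability of service node fault detection.
Embodiment 2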
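This embodiment provides a method for determining a fault indication state. Referring to FIG. 3, the method includes:
200: Select a fault detection and state synchronization period (the "period" for short, denoted T), and configure the same period T on every service request node in the service request node pool and on every service node in the service node pool.
Every service request node in the service request node pool performs steps 201-204:
201: The service request node extends the Echo Request message so that it carries the detection results obtained by this service request node on whether each service node in the service node pool was faulty in the period preceding the current period (denoted period T1).
Further, if the current period is the first period, the echo request message may carry initialized states for the service nodes in the service node pool; for example, the state of every service node may be initialized to faulty.
The service request node detects whether each service node in the service node pool was faulty in the preceding period as follows:
At the start of period T1, the service request node sends an echo request message to all service nodes in the service node pool. If it receives the Echo Response message returned by a service node before period T1 ends, it marks that service node as normal; otherwise, if no echo response from that service node has arrived by the end of period T1, it marks that service node as faulty. The echo request message sent in period T1 may itself carry this service request node's detection results for the period preceding T1.
The service request node extends the Echo Request message as follows:
The service request node uses spare bits of the echo request message, or adds new bits, with each bit indicating whether one service node is faulty. For example, in Echo Request (GGSN_states: 0000 0111), bit 0 represents the state of GGSN-1, bit 1 the state of GGSN-2, and bit 2 the state of GGSN-3. A bit value of 1 means the corresponding GGSN was detected as normal; a bit value of 0 means it was detected as faulty.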
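The bit-per-node encoding described above can be sketched as follows. This is a minimal illustration, not the patent's wire format: the function names `encode_states` and `decode_states` are hypothetical, and the sketch assumes that the position of a node in the list is its bit index (GGSN-1 at bit 0, GGSN-2 at bit 1, and so on).

```python
def encode_states(states):
    """Pack probe results into an integer bitmask.

    states[i] is True if node i answered in the previous period (normal)
    and False if it did not (faulty). Bit i of the result is 1 for normal.
    """
    mask = 0
    for index, is_normal in enumerate(states):
        if is_normal:
            mask |= 1 << index
    return mask


def decode_states(mask, node_count):
    """Unpack a bitmask into a list of booleans, one per service node."""
    return [bool((mask >> index) & 1) for index in range(node_count)]


# GGSN-1, GGSN-2 and GGSN-3 all detected as normal gives 0000 0111.
ggsn_states = encode_states([True, True, True])
```

A receiver that knows the pool size can call `decode_states(ggsn_states, 3)` to recover the three per-node results.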
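202: At the start of the current period (denoted period T2), the service request node sends the echo request message to each service node in the service node pool.
203: On receiving the echo request message from the service request node, the service node records the detection results and returns an echo response message to the service request node.
204: If the service request node receives the echo response message returned by a service node before the current period ends, it marks that service node as normal; otherwise, if no echo response message from that service node has arrived by the end of the current period, it marks that service node as faulty.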
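Steps 201 to 204 amount to a small per-period bookkeeping loop on the service request node. The sketch below simulates that loop without any networking; the class `Prober` and its method names are hypothetical, and the payload is modeled as a plain dictionary rather than bits in a real Echo Request.

```python
class Prober:
    """Hypothetical service request node that probes a pool once per period."""

    def __init__(self, node_ids):
        # First period: no earlier results exist, so every node starts as
        # faulty (False), following the initialization the text suggests.
        self.previous = {node_id: False for node_id in node_ids}
        self.responded = set()

    def start_period(self):
        """Steps 201-202: build the payload carried by this period's echo requests."""
        self.responded.clear()
        return dict(self.previous)

    def on_echo_response(self, node_id):
        """Step 204, first half: an echo response arrived before the period ended."""
        self.responded.add(node_id)

    def end_period(self):
        """Step 204, second half: nodes that stayed silent are marked faulty."""
        self.previous = {n: (n in self.responded) for n in self.previous}
        return dict(self.previous)
```

The results returned by `end_period` become the payload of the next call to `start_period`, which is how the results for one period travel in the echo requests of the following period.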
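205: At the end of the current period, each service node determines, according to the detection results sent by the service request nodes, the fault indication state of the local service node and the fault indication states of the other service nodes.
Specifically, when the service node determines the fault indication states of the other service nodes from the detection results sent by the service request nodes, at least one of the following cases applies:
If, within the current period, the detection result from at least one service request node indicates that service node A among the other service nodes is normal, the fault indication state of service node A is set to normal;
If, within the current period, detection results are received from all service request nodes and all of them indicate that service node B among the other service nodes is faulty, the fault indication state of service node B is set to faulty;
If, within the current period, detection results are received from only some of the service request nodes and all of those results indicate that service node C among the other service nodes is faulty, the fault indication state of service node C is set to uncertain.
Note that service node A, service node B, and service node C in this embodiment do not refer to particular nodes; each stands for the class of nodes whose detection results match the condition stated for it.
Specifically, when the service node determines its own fault indication state from the detection results sent by the service request nodes, at least one of the following cases applies:
If, within the current period, the detection result from at least one service request node indicates that the local service node is normal, the fault indication state of the local service node is set to normal;
If no detection result is received from any service request node within the current period, or if all detection results received within the current period indicate that the local service node is faulty, the fault indication state of the local service node is set to abnormal.
To illustrate this step more concretely, see the schematic diagram of the fault indication state determination process in FIG. 4. SGSN-1: x1y1z1 means that the fault detection results of service request node SGSN-1 for service nodes GGSN-x, GGSN-y, and GGSN-z are x1, y1, and z1 respectively; SGSN-2: x2y2z2 means that the results of SGSN-2 for GGSN-x, GGSN-y, and GGSN-z are x2, y2, and z2; and SGSN-3: x3y3z3 means that the results of SGSN-3 for GGSN-x, GGSN-y, and GGSN-z are x3, y3, and z3. SGSN-1: x1y1z1, SGSN-2: x2y2z2, and SGSN-3: x3y3z3 are each sent to GGSN-x, GGSN-y, and GGSN-z, and each of GGSN-x, GGSN-y, and GGSN-z derives the fault indication state of the local service node and of the other service nodes from them. Here x: N/A denotes the fault indication state of GGSN-x, y: N/F/U that of GGSN-y, and z: N/F/U that of GGSN-z, where N means normal, A means abnormal, F means faulty, and U means uncertain.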
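The decision rules of step 205 can be written down directly. The following is a sketch under stated assumptions: each report is modeled as a dictionary from node name to `True` (detected normal) or `False` (detected faulty), the function names are hypothetical, and the behavior of `peer_status` when no report at all arrived is a choice made here, since the text does not define that case for other nodes.

```python
NORMAL, ABNORMAL, FAULTY, UNCERTAIN = "N", "A", "F", "U"


def own_status(reports, me):
    """State of the local node: normal if any requester saw it as normal."""
    if any(report[me] for report in reports):
        return NORMAL
    # No reports received, or every received report says faulty.
    return ABNORMAL


def peer_status(reports, peer, total_requesters):
    """State of another node in the pool, from this period's reports."""
    if any(report[peer] for report in reports):
        return NORMAL
    if reports and len(reports) == total_requesters:
        return FAULTY  # every requester reported, and all said faulty
    # Only some requesters reported and all of them said faulty.
    # Assumption: the zero-report case is also treated as uncertain.
    return UNCERTAIN


# Reports from SGSN-1, SGSN-2, SGSN-3 about GGSN-x, GGSN-y, GGSN-z.
r1 = {"x": True, "y": False, "z": False}
r2 = {"x": False, "y": False, "z": True}
r3 = {"x": False, "y": False, "z": False}
```

Seen from GGSN-x with all three reports, `own_status` gives N, GGSN-y comes out as F, and GGSN-z as N; if the report from SGSN-2 is lost, GGSN-y drops to U because only part of the requesters were heard.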
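Optionally, step 206 may be performed after step 205:
206: Each service node takes its own fault indication state as the primary state and the fault indication states of the other service nodes as secondary states, and triggers the acquisition and/or release of network resources accordingly.
Referring to the schematic diagram of behavior triggering and behavior timing control in FIG. 5, this step specifically includes:
When the fault indication state of the local service node changes from abnormal to normal, the operation of acquiring the local service node's network resources is triggered after a preset first protection duration (denoted Tc1). When the fault indication state of the local service node changes from normal to abnormal, the operation of releasing the local service node's network resources is triggered, and the release is completed within a preset second duration (denoted Tr1);
While the fault indication state of the local service node is normal, if the fault indication state of service node D among the other service nodes changes from normal to faulty, the operation of acquiring service node D's network resources is triggered after a preset third protection duration (denoted Tc2);
While the fault indication state of the local service node is normal, if the fault indication state of service node E among the other service nodes changes from faulty to normal or uncertain, the operation of releasing service node E's network resources is triggered, and the release is completed within a preset fourth duration (denoted Tr2);
The first protection duration is greater than the fourth duration, and the third protection duration is greater than the second duration.
Note that service node D and service node E in this embodiment do not refer to particular nodes; each stands for the class of nodes whose fault indication state matches the condition stated for it.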
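The trigger rules of step 206 can be sketched as a pure function from state transitions to actions. The names `timers_are_safe` and `plan_actions` are hypothetical, states use the letters N, A, F, U from FIG. 4, and timers are returned as labels only; no real scheduling is shown. One plausible reading of the constraint Tc1 > Tr2 and Tc2 > Tr1 is that whoever acquires a resource waits longer than the current holder needs to release it, so the two never hold it at once.

```python
def timers_are_safe(tc1, tr1, tc2, tr2):
    """Check the ordering the text requires: Tc1 > Tr2 and Tc2 > Tr1."""
    return tc1 > tr2 and tc2 > tr1


def plan_actions(own_prev, own_now, peers_prev, peers_now):
    """Return (action, target, timer) tuples for one period boundary."""
    actions = []
    if own_prev == "A" and own_now == "N":
        actions.append(("acquire", "own", "Tc1"))  # wait Tc1, then acquire
    elif own_prev == "N" and own_now == "A":
        actions.append(("release", "own", "Tr1"))  # finish within Tr1
    if own_now == "N":
        # Peer states only matter while the local (primary) state is normal.
        for peer, now in peers_now.items():
            before = peers_prev.get(peer)
            if before == "N" and now == "F":
                actions.append(("acquire", peer, "Tc2"))
            elif before == "F" and now in ("N", "U"):
                actions.append(("release", peer, "Tr2"))
    return actions
```

For instance, a normal node that sees one peer go from N to F and another go from F to U would plan to take over the first peer's resources after Tc2 and to hand back the second peer's resources within Tr2.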
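Because "faulty" means that a node can no longer provide service, whether a service node is faulty as seen by a service request node is semantically the same as whether that service node can still serve the service request node. This embodiment therefore has the service request nodes detect whether the service nodes are faulty and report the detection results to the service nodes. Compared with service nodes probing one another, this removes the possibility of state misjudgment (in other words, it removes the ambiguity in what "faulty" means) and improves the reliability of service node fault detection. For example, if the communication link between two GGSNs is broken, under the prior-art scheme each GGSN concludes that its peer is faulty. Under the scheme of this embodiment, the service request node checks whether a GGSN can still provide service: if it can, the GGSN is judged normal; if it cannot, the GGSN is judged faulty. The fault detection of this embodiment is therefore more reliable. When there are multiple service request nodes, a service node can combine the detection results of all of them to determine its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection. In addition, this embodiment defines complete behavior triggering and behavior timing control logic, which avoids conflicts between network behaviors such as the acquisition and/or release of network resources.
Embodiment 3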
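Referring to FIG. 6, this embodiment provides a service node, including:
a receiver 301, configured to receive detection results, sent by a service request node, indicating whether each service node in the service node pool is faulty;
a determiner 302, configured to determine, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
The receiver 301 is configured to receive, within the current period, the echo request message sent by the service request node, where the echo request message carries the detection results obtained by the service request node on whether each service node in the service node pool was faulty in the period preceding the current period.
When the determiner 302 determines the fault indication states of the other service nodes according to the detection results, at least one of the following cases applies:
If, within the current period, the detection result from at least one service request node indicates that service node A among the other service nodes is normal, the fault indication state of service node A is set to normal;
If, within the current period, detection results are received from all service request nodes and all of them indicate that service node B among the other service nodes is faulty, the fault indication state of service node B is set to faulty;
If, within the current period, detection results are received from only some of the service request nodes and all of those results indicate that service node C among the other service nodes is faulty, the fault indication state of service node C is set to uncertain.
When the determiner 302 determines the fault indication state of the local service node according to the detection results, at least one of the following cases applies:
If, within the current period, the detection result from at least one service request node indicates that the local service node is normal, the fault indication state of the local service node is set to normal;
If no detection result is received from any service request node within the current period, or if all detection results received within the current period indicate that the local service node is faulty, the fault indication state of the local service node is set to abnormal.
The service node further includes:
a trigger, configured to, after the determiner has run, take the fault indication state of the local service node as the primary state and the fault indication states of the other service nodes as secondary states, and trigger the acquisition and/or release of network resources.
The trigger is configured to implement at least one of the following cases:
When the fault indication state of the local service node changes from abnormal to normal, the operation of acquiring the local service node's network resources is triggered after a preset first protection duration; when the fault indication state of the local service node changes from normal to abnormal, the operation of releasing the local service node's network resources is triggered, and the release is completed within a preset second duration;
While the fault indication state of the local service node is normal, if the fault indication state of service node D among the other service nodes changes from normal to faulty, the operation of acquiring service node D's network resources is triggered after a preset third protection duration;
While the fault indication state of the local service node is normal, if the fault indication state of service node E among the other service nodes changes from faulty to normal or uncertain, the operation of releasing service node E's network resources is triggered, and the release is completed within a preset fourth duration;
The first protection duration is greater than the fourth duration, and the third protection duration is greater than the second duration.
The service node provided by this embodiment follows the same concept as the service node in the method embodiments; for the detailed process, see the method embodiments, which is not repeated here.
By receiving the service request node's detection results for every service node in the service node pool and determining from them the fault indication state of the local service node and of the other service nodes, this embodiment improves the reliability of service node fault detection. When there are multiple service request nodes, the service node can combine the detection results of all of them to determine its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection. In addition, this embodiment defines complete behavior triggering and behavior timing control logic, which avoids conflicts between network behaviors.
Embodiment 4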
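Referring to FIG. 7, this embodiment provides a service request node, including:
an extender 401, configured to extend an echo request message, where the echo request message carries the detection results obtained by this service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
a sender 402, configured to send, at the start of the current period, the echo request message to each service node in the service node pool, so that each service node determines, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
The extender 401 is configured to:
use spare bits of the echo request message, with each spare bit indicating whether one service node in the service node pool is faulty;
or add new bits to the echo request message, with each new bit indicating whether one service node in the service node pool is faulty.
The service request node provided by this embodiment follows the same concept as the service request node in the method embodiments; for the detailed process, see the method embodiments, which is not repeated here.
By extending the echo request message to carry this service request node's detection results for the preceding period and sending it to every service node in the service node pool, so that each service node can determine its own fault indication state and those of the other service nodes, this embodiment improves the reliability of service node fault detection. When there are multiple service request nodes, a service node can combine the detection results of all of them to determine its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection.
Embodiment 5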
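Referring to FIG. 8, this embodiment provides a system for determining a fault indication state, including: a service request node 501 and service nodes 502 in a service node pool;
the service request node 501 includes an extender 401 and a sender 402;
the extender 401 is configured to extend an echo request message, where the echo request message carries the detection results obtained by this service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
the sender 402 is configured to send, at the start of the current period, the echo request message to each service node in the service node pool, so that each service node determines, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes;
each service node 502 includes a receiver 301 and a determiner 302;
the receiver 301 is configured to receive the echo request message sent by the service request node, where the echo request message carries the detection results obtained by the service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
the determiner 302 is configured to determine, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
The service request node and service node provided by this embodiment follow the same concept as the service request node and service node in the method embodiments; for the detailed process, see the method embodiments, which is not repeated here.
By extending the echo request message to carry this service request node's detection results for the preceding period and sending it to every service node in the service node pool, so that each service node can determine its own fault indication state and those of the other service nodes, this embodiment improves the reliability of service node fault detection. When there are multiple service request nodes, a service node can combine the detection results of all of them to determine its own fault indication state and those of the other service nodes, which further improves the reliability of service node fault detection.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.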

Claims
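1. A method for determining a fault indication state, comprising:
receiving detection results, sent by a service request node, indicating whether each service node in a service node pool is faulty;
determining, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
2. The method according to claim 1, wherein receiving the detection results, sent by the service request node, indicating whether each service node in the service node pool is faulty comprises:
receiving, within the current period, an echo request message sent by the service request node, wherein the echo request message carries the detection results obtained by the service request node on whether each service node in the service node pool was faulty in the period preceding the current period.
3. The method according to claim 1 or 2, wherein determining, according to the detection results, the fault indication states of the other service nodes comprises at least one of the following cases:
if, within the current period, the detection result from at least one service request node indicates that service node A among the other service nodes is normal, setting the fault indication state of service node A to normal;
if, within the current period, detection results are received from all service request nodes and all of them indicate that service node B among the other service nodes is faulty, setting the fault indication state of service node B to faulty;
if, within the current period, detection results are received from only some of the service request nodes and all of those results indicate that service node C among the other service nodes is faulty, setting the fault indication state of service node C to uncertain.
4. The method according to claim 1 or 2, wherein determining, according to the detection results, the fault indication state of the local service node comprises at least one of the following cases:
if, within the current period, the detection result from at least one service request node indicates that the local service node is normal, setting the fault indication state of the local service node to normal;
if no detection result is received from any service request node within the current period, or if all detection results received within the current period indicate that the local service node is faulty, setting the fault indication state of the local service node to abnormal.
5. The method according to claim 1, wherein, after determining, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes, the method comprises:
taking the fault indication state of the local service node as the primary state and the fault indication states of the other service nodes as secondary states, and triggering the acquisition and/or release of network resources.
6. The method according to claim 5, wherein taking the fault indication state of the local service node as the primary state and the fault indication states of the other service nodes as secondary states, and triggering the acquisition and/or release of network resources, comprises at least one of the following cases:
when the fault indication state of the local service node changes from abnormal to normal, triggering the operation of acquiring the local service node's network resources after a preset first protection duration; when the fault indication state of the local service node changes from normal to abnormal, triggering the operation of releasing the local service node's network resources, and completing the release within a preset second duration;
while the fault indication state of the local service node is normal, if the fault indication state of service node D among the other service nodes changes from normal to faulty, triggering the operation of acquiring service node D's network resources after a preset third protection duration;
while the fault indication state of the local service node is normal, if the fault indication state of service node E among the other service nodes changes from faulty to normal or uncertain, triggering the operation of releasing service node E's network resources, and completing the release within a preset fourth duration;
wherein the first protection duration is greater than the fourth duration, and the third protection duration is greater than the second duration.
7. The method according to claim 1, wherein, before receiving the detection results, sent by the service request node, indicating whether each service node in the service node pool is faulty, the method comprises:
extending, by the service request node, an echo request message, wherein the echo request message carries the detection results obtained by the service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
sending, by the service request node at the start of the current period, the echo request message to each service node in the service node pool.
8. A method for determining a fault indication state, comprising:
extending an echo request message, wherein the echo request message carries the detection results obtained by this service request node on whether each service node in a service node pool was faulty in the period preceding the current period;
at the start of the current period, sending the echo request message to each service node in the service node pool, so that each service node determines, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
9. The method according to claim 8, wherein extending the echo request message comprises:
using spare bits of the echo request message, with each spare bit indicating whether one service node in the service node pool is faulty;
or adding new bits to the echo request message, with each new bit indicating whether one service node in the service node pool is faulty.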
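10. A service node, comprising:
a receiver, configured to receive detection results, sent by a service request node, indicating whether each service node in a service node pool is faulty;
a determiner, configured to determine, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
11. The service node according to claim 10, wherein the receiver is configured to:
receive, within the current period, an echo request message sent by the service request node, wherein the echo request message carries the detection results obtained by the service request node on whether each service node in the service node pool was faulty in the period preceding the current period.
12. The service node according to claim 10 or 11, wherein, when the determiner determines the fault indication states of the other service nodes according to the detection results, at least one of the following cases applies:
if, within the current period, the detection result from at least one service request node indicates that service node A among the other service nodes is normal, the fault indication state of service node A is set to normal;
if, within the current period, detection results are received from all service request nodes and all of them indicate that service node B among the other service nodes is faulty, the fault indication state of service node B is set to faulty;
if, within the current period, detection results are received from only some of the service request nodes and all of those results indicate that service node C among the other service nodes is faulty, the fault indication state of service node C is set to uncertain.
13. The service node according to claim 10 or 11, wherein, when the determiner determines the fault indication state of the local service node according to the detection results, at least one of the following cases applies:
if, within the current period, the detection result from at least one service request node indicates that the local service node is normal, the fault indication state of the local service node is set to normal;
if no detection result is received from any service request node within the current period, or if all detection results received within the current period indicate that the local service node is faulty, the fault indication state of the local service node is set to abnormal.
14. The service node according to claim 10, wherein the service node further comprises:
a trigger, configured to, after the determiner has run, take the fault indication state of the local service node as the primary state and the fault indication states of the other service nodes as secondary states, and trigger the acquisition and/or release of network resources.
15. The service node according to claim 14, wherein the trigger is configured to implement at least one of the following cases:
when the fault indication state of the local service node changes from abnormal to normal, the operation of acquiring the local service node's network resources is triggered after a preset first protection duration; when the fault indication state of the local service node changes from normal to abnormal, the operation of releasing the local service node's network resources is triggered, and the release is completed within a preset second duration;
while the fault indication state of the local service node is normal, if the fault indication state of service node D among the other service nodes changes from normal to faulty, the operation of acquiring service node D's network resources is triggered after a preset third protection duration;
while the fault indication state of the local service node is normal, if the fault indication state of service node E among the other service nodes changes from faulty to normal or uncertain, the operation of releasing service node E's network resources is triggered, and the release is completed within a preset fourth duration;
wherein the first protection duration is greater than the fourth duration, and the third protection duration is greater than the second duration.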
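16. A service request node, comprising:
an extender, configured to extend an echo request message, wherein the echo request message carries the detection results obtained by this service request node on whether each service node in a service node pool was faulty in the period preceding the current period;
a sender, configured to send, at the start of the current period, the echo request message to each service node in the service node pool, so that each service node determines, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.
17. The service request node according to claim 16, wherein the extender is configured to:
use spare bits of the echo request message, with each spare bit indicating whether one service node in the service node pool is faulty;
or add new bits to the echo request message, with each new bit indicating whether one service node in the service node pool is faulty.
18. A system for determining a fault indication state, comprising: a service request node and service nodes in a service node pool;
the service request node comprises an extender and a sender;
the extender is configured to extend an echo request message, wherein the echo request message carries the detection results obtained by this service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
the sender is configured to send, at the start of the current period, the echo request message to each service node in the service node pool, so that each service node determines, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes;
each service node comprises a receiver and a determiner;
the receiver is configured to receive the echo request message sent by the service request node, wherein the echo request message carries the detection results obtained by the service request node on whether each service node in the service node pool was faulty in the period preceding the current period;
the determiner is configured to determine, according to the detection results, the fault indication state of the local service node and the fault indication states of the other service nodes.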
PCT/CN2011/074888 2011-05-30 2011-05-30 确定故障指示状态的方法、节点和系统 WO2011157111A2 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP11795077.4A EP2704356B1 (en) 2011-05-30 2011-05-30 Method and service node for determining fault state
PCT/CN2011/074888 WO2011157111A2 (zh) 2011-05-30 2011-05-30 确定故障指示状态的方法、节点和系统
JP2014513022A JP5802829B2 (ja) 2011-05-30 2011-05-30 障害表示状態を判定する方法、ノード、及びシステム
CN201180000645.XA CN102918802B (zh) 2011-05-30 2011-05-30 确定故障指示状态的方法、节点和系统
US14/087,880 US9471408B2 (en) 2011-05-30 2013-11-22 Method, node and system for determining fault indication state

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/074888 WO2011157111A2 (zh) 2011-05-30 2011-05-30 确定故障指示状态的方法、节点和系统

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/087,880 Continuation US9471408B2 (en) 2011-05-30 2013-11-22 Method, node and system for determining fault indication state

Publications (2)

Publication Number Publication Date
WO2011157111A2 true WO2011157111A2 (zh) 2011-12-22
WO2011157111A3 WO2011157111A3 (zh) 2012-05-03

Family

ID=45348606

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/074888 WO2011157111A2 (zh) 2011-05-30 2011-05-30 确定故障指示状态的方法、节点和系统

Country Status (5)

Country Link
US (1) US9471408B2 (zh)
EP (1) EP2704356B1 (zh)
JP (1) JP5802829B2 (zh)
CN (1) CN102918802B (zh)
WO (1) WO2011157111A2 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3086509A4 (en) * 2013-12-19 2016-10-26 Zte Corp METHOD, DEVICE AND SYSTEM FOR TROUBLESHOOTING FOR NETWORK SERVICE NODE

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106656580B (zh) * 2016-11-29 2020-06-26 华为技术有限公司 一种业务状态的迁移方法及装置
CN110162424B (zh) * 2019-05-23 2022-03-22 腾讯科技(深圳)有限公司 故障处理方法、系统、装置及存储介质
CN113489608A (zh) * 2021-06-30 2021-10-08 四川虹美智能科技有限公司 业务异常的处理方法和装置

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61131933A (ja) * 1984-11-30 1986-06-19 Nec Eng Ltd 分散処理形交換装置のヘルスチエツク方式
US6161196A (en) * 1998-06-19 2000-12-12 Lucent Technologies Inc. Fault tolerance via N-modular software redundancy using indirect instrumentation
JP3983138B2 (ja) * 2002-08-29 2007-09-26 富士通株式会社 障害情報収集プログラムおよび障害情報収集装置
JP4457581B2 (ja) * 2003-05-28 2010-04-28 日本電気株式会社 耐障害システム、プログラム並列実行方法、耐障害システムの障害検出装置およびプログラム
US7284147B2 (en) * 2003-08-27 2007-10-16 International Business Machines Corporation Reliable fault resolution in a cluster
US7228460B2 (en) * 2004-01-23 2007-06-05 Hewlett-Packard Development Company, L.P. Multi-state status reporting for high-availability cluster nodes
CN1728658A (zh) * 2004-07-29 2006-02-01 华为技术有限公司 一种检测网关服务节点和计费网关之间连通性的方法
CN100431314C (zh) * 2005-05-08 2008-11-05 中兴通讯股份有限公司 自动交换光网络中维护控制平面可达性信息的方法
CN100579025C (zh) * 2006-03-30 2010-01-06 中兴通讯股份有限公司 一种自动交换光网络的路由信息维护方法
CN101207408B (zh) 2006-12-22 2012-07-11 中兴通讯股份有限公司 一种用于主备倒换的综合故障检测装置和方法
US8671151B2 (en) * 2007-01-24 2014-03-11 Oracle International Corporation Maintaining item-to-node mapping information in a distributed system
CN101420335B (zh) * 2007-10-26 2011-09-14 华为技术有限公司 对等网络节点故障检测/处理方法及装置
CN101459549B (zh) * 2007-12-14 2011-09-21 华为技术有限公司 链路故障处理方法及数据转发装置
US8774010B2 (en) * 2010-11-02 2014-07-08 Cisco Technology, Inc. System and method for providing proactive fault monitoring in a network environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
None
See also references of EP2704356A4

Also Published As

Publication number Publication date
CN102918802A (zh) 2013-02-06
EP2704356B1 (en) 2019-09-04
US9471408B2 (en) 2016-10-18
EP2704356A4 (en) 2014-06-25
US20140082432A1 (en) 2014-03-20
JP2014522593A (ja) 2014-09-04
EP2704356A2 (en) 2014-03-05
JP5802829B2 (ja) 2015-11-04
WO2011157111A3 (zh) 2012-05-03
CN102918802B (zh) 2015-03-11

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201180000645.X

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11795077

Country of ref document: EP

Kind code of ref document: A2

ENP Entry into the national phase

Ref document number: 2014513022

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE