WO2018058618A1 - Fault processing method and device - Google Patents

Fault processing method and device

Info

Publication number
WO2018058618A1
Authority
WO
WIPO (PCT)
Prior art keywords
network element
plane network
user plane
control plane
fault
Prior art date
Application number
PCT/CN2016/101284
Other languages
French (fr)
Chinese (zh)
Inventor
刘会勇
曾侃
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2016/101284
Publication of WO2018058618A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/04 Arrangements for maintaining operational condition

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a fault processing method and device.
  • S-GW Serving GateWay
  • P-GW Packet Data Network GateWay
  • 5G fifth-generation mobile
  • the fifth-generation mobile communication system (5G) requires support for multiple service slices. Therefore, distributed deployment of S-GW/P-GW gateways moved closer to the network edge (gateway sinking) is becoming the deployment trend.
  • 3GPP 3rd Generation Partnership Project
  • SGW-C Control Plane S-GW
  • SGW-U User Plane S-GW
  • PGW-C Control Plane P-GW (control plane packet data network gateway)
  • PGW-U User Plane P-GW (user plane packet data network gateway)
  • the Gateway General Packet Radio Service Support Node (GGSN) function is included in the P-GW and is likewise split into a Control Plane Gateway General Packet Radio Service Support Node (Control Plane GGSN, GGSN-C) and a User Plane Gateway General Packet Radio Service Support Node (User Plane GGSN, GGSN-U), which are not described separately.
  • GGSN-C Control Plane GGSN (Control Plane Gateway General Packet Radio Service Support Node)
  • GGSN-U User Plane GGSN (User Plane Gateway General Packet Radio Service Support Node)
  • the separation of the CP/UP causes the link between the CP and the UP inside the conventional S-GW/P-GW gateway to become a standard external 3GPP Sx interface.
  • the Sx interface includes an Sxa interface between the SGW-C and the SGW-U, and an Sxb interface between the PGW-C and the PGW-U.
  • the DGW can be regarded as including the functions of the SGW-U and the functions of the PGW-U, and the GGSN-C can be regarded as being combined with the PGW-C.
  • the GGSN-U can be regarded as being combined with the PGW-U.
  • CP and UP can detect each other. For the CP, whether the problem is a UP fault or a fault on the link between the CP and the UP, the CP treats it as a UP fault and handles it accordingly; for example, the CP reselects a UP and reactivates the user. For the UP, whether the problem is a CP fault or a fault on the link between the CP and the UP, the UP treats it as a CP fault and handles it accordingly; for example, the UP releases the local service.
  • when the fault is actually a link fault, handling it as a UP fault (on the CP side) or as a CP fault (on the UP side) interrupts the service of a UP that is not faulty and degrades the user's service experience. The current detection result is therefore not accurate enough and may cause UP services to be interrupted unnecessarily.
  • the embodiment of the invention provides a fault processing method and device for reducing the possibility of service interruption of the UP.
  • in a first aspect, a fault processing method is provided, which is performed by a signaling processing network element, for example an MME.
  • the method includes: the signaling processing network element receives the first detection information sent by the first control plane network element, where the first detection information is used to indicate the state of the first user plane network element obtained by the first control plane network element.
  • the signaling processing network element receives the second detection information sent by the second user plane network element, where the second detection information is used to indicate the status of the first user plane network element obtained by the second user plane network element.
  • the signaling processing network element determines the fault type according to the first probe information and the second probe information, where the fault type includes the first user plane network element fault, or a link fault between the first control plane network element and the first user plane network element.
  • the signaling processing network element can receive the information sent by the multi-party network element, and therefore the task of performing the fault determination is handed over to the signaling processing network element.
  • by combining the received first probe information and second probe information, the signaling processing network element can comprehensively determine whether the fault is a user plane network element fault, a control plane network element fault, or a link fault between the control plane network element and the user plane network element.
  • because information from multiple sources is considered in the fault judgment, rather than the information of a single network element only, the accuracy of the judgment result is improved. A network element fault can then be handled as a network element fault, and a link fault can be handled as a link fault.
  • service interruption of a fault-free UP can thus be avoided, ensuring service continuity; when the UP is faulty, the UP service can be restored as quickly as possible, minimizing the impact on the user's service experience.
  • the signaling processing network element determines the fault type according to the first probe information and the second probe information in the following manner:
  • if the first detection information and the second detection information both indicate that the first user plane network element is faulty, the signaling processing network element determines that the fault type is a first user plane network element fault; or, if the first detection information indicates that the first user plane network element is faulty and the second detection information indicates that the first user plane network element is normal, the signaling processing network element determines that the first user plane network element is normal and that the link between the first control plane network element and the first user plane network element is faulty.
  • by considering the first probe information and the second probe information together, the signaling processing network element can effectively distinguish a first user plane network element fault from a link fault between the first control plane network element and the first user plane network element, so that different faults can be handled differently. This avoids, as far as possible, a poor experience such as service interruption caused by handling a link fault as a network element fault.
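The decision rule of the first aspect can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `FaultType` and `classify_up_fault` are hypothetical, each report is reduced to a boolean, and the two combinations that the text does not cover (`NONE`, `UNDETERMINED`) are assumptions added so that the function is total.

```python
from enum import Enum


class FaultType(Enum):
    # The first two values come from the text; the last two are assumptions.
    USER_PLANE_NE = "first user plane network element fault"
    CP_UP_LINK = "link fault between first control plane NE and first user plane NE"
    NONE = "no fault reported (assumed case)"
    UNDETERMINED = "combination not covered by the text (assumed case)"


def classify_up_fault(cp_reports_faulty: bool, peer_up_reports_faulty: bool) -> FaultType:
    """Combine two views of the first user plane network element.

    cp_reports_faulty:      first probe information (from the first control plane NE)
    peer_up_reports_faulty: second probe information (from the second user plane NE)
    """
    if cp_reports_faulty and peer_up_reports_faulty:
        # Both observers see the fault: the network element itself is down.
        return FaultType.USER_PLANE_NE
    if cp_reports_faulty and not peer_up_reports_faulty:
        # Only the control plane loses contact: the CP-UP link is the problem.
        return FaultType.CP_UP_LINK
    if not cp_reports_faulty and not peer_up_reports_faulty:
        return FaultType.NONE
    return FaultType.UNDETERMINED
```

For example, `classify_up_fault(True, False)` yields `FaultType.CP_UP_LINK`, which is the case in which the service of the user plane network element should not be released.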
  • after the signaling processing network element determines that the fault type is a link fault between the first control plane network element and the first user plane network element, the signaling processing network element can perform fault recovery processing.
  • the fault recovery processing includes, but is not limited to, the following two methods: the signaling processing network element reselects a control plane network element for the first user plane network element and sends the identifier of the first user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the first user plane network element; or the signaling processing network element instructs the first control plane network element to wait for link recovery.
  • in the first method, in addition to the identifier of the first user plane network element, the signaling processing network element may also send a fault indication to the reselected control plane network element, that is, an indication of exactly where the fault occurred.
  • after receiving the information sent by the signaling processing network element, the reselected control plane network element knows exactly where the fault occurred and can promptly establish a link with the first user plane network element according to the identifier of the first user plane network element, so that the network returns to normal as soon as possible.
  • in the second method, the signaling processing network element instructs the first control plane network element to wait for the link to be restored, and no other processing is required. Once the link is restored, the first control plane network element can continue to be used, which improves the utilization of the network element.
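The two recovery options for a link fault can be sketched as below. This is an illustration under stated assumptions: `RecoveryAction`, `recover_from_link_fault`, the `prefer_reselect` switch, and the rule of picking the first alternative control plane network element are all hypothetical; the text does not say how the choice between the two methods, or among candidates, is made.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecoveryAction:
    kind: str                               # "reselect" or "wait"
    target_cp: str                          # control plane NE that receives the instruction
    up_id: Optional[str] = None             # identifier of the first user plane NE (reselect only)
    fault_indication: Optional[str] = None  # optional hint about where the fault occurred


def recover_from_link_fault(first_cp: str, first_up: str,
                            candidate_cps: List[str],
                            prefer_reselect: bool = True) -> RecoveryAction:
    """Pick one of the two recovery methods described for a CP-UP link fault."""
    alternatives = [cp for cp in candidate_cps if cp != first_cp]
    if prefer_reselect and alternatives:
        # Method 1: hand the user plane NE over to another control plane NE.
        return RecoveryAction(
            kind="reselect",
            target_cp=alternatives[0],  # assumed policy: first available candidate
            up_id=first_up,
            fault_indication=f"link fault between {first_cp} and {first_up}",
        )
    # Method 2: tell the original control plane NE to wait for link recovery.
    return RecoveryAction(kind="wait", target_cp=first_cp)
```

In both branches the user plane network element keeps its services; only the managing control plane network element changes, or nothing changes at all.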
  • the first control plane network element is a control plane serving gateway, the second user plane network element is a base station, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane packet data network gateway, the second user plane network element is a user plane serving gateway, the first user plane network element is a user plane packet data network gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane serving gateway, the second user plane network element is a user plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity.
  • the network elements may thus take different forms in different application scenarios, and the embodiments of the present invention are applicable to each of these scenarios.
  • in a second aspect, a fault processing method is provided, which is performed by a signaling processing network element, for example an MME.
  • the method includes: the signaling processing network element obtains the first detection information, where the first detection information is used to indicate the state of the first control plane network element.
  • the signaling processing network element receives the second detection information sent by the first user plane network element, where the second detection information is used to indicate the status of the second user plane network element obtained by the first user plane network element.
  • the signaling processing network element determines the fault type according to the first probe information and the second probe information, where the fault type includes only the first control plane network element being faulty, only the second user plane network element being faulty, or both the first control plane network element and the second user plane network element being faulty.
  • if both the first control plane network element and the second user plane network element are faulty, the signaling processing network element releases the service associated with the first control plane network element and/or the second user plane network element.
  • that is, when the signaling processing network element determines that both the first control plane network element and the second user plane network element managed by it are faulty, the signaling processing network element can locally release the associated service so that this part of the service can be restored on other network elements, minimizing the duration of the service interruption.
  • the signaling processing network element determines the fault type according to the first detection information and the second detection information as follows: if the first detection information indicates that the first control plane network element is faulty, the signaling processing network element determines that the first control plane network element is faulty; or, if the second detection information indicates that the second user plane network element is faulty, the signaling processing network element determines that the second user plane network element is faulty.
  • the signaling processing network element thus determines whether the first control plane network element and the second user plane network element are faulty directly from the first probe information and the second probe information; the determination is straightforward, and the faulty network element can be located relatively quickly.
  • in a second possible implementation manner of the second aspect, after the signaling processing network element determines the fault type, if only the first control plane network element is faulty, the signaling processing network element reselects a control plane network element for the second user plane network element and sends the identifier of the second user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the second user plane network element.
  • optionally, the signaling processing network element may also send a fault indication to the reselected control plane network element, that is, an indication of exactly where the fault occurred.
  • the reselected control plane network element then knows exactly where the fault occurred and can establish a link with the second user plane network element according to the identifier of the second user plane network element, so that the network returns to normal as soon as possible.
  • because the second user plane network element is not faulty, its service can continue, which minimizes the possibility of service interruption.
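The second-aspect judgment and the follow-up actions stated above can be sketched together. This is a hedged illustration: the function name `decide_second_aspect` and the string labels are hypothetical, and the action for a fault of only the second user plane network element is returned as `None` because this section does not state one.

```python
from typing import Optional, Tuple


def decide_second_aspect(cp_faulty: bool, up_faulty: bool) -> Tuple[str, Optional[str]]:
    """Return (fault type, follow-up action) for the second aspect.

    cp_faulty: first probe information says the first control plane NE is faulty
    up_faulty: second probe information says the second user plane NE is faulty
    """
    if cp_faulty and up_faulty:
        # Both are down: release the associated services locally so that
        # they can be restored on other network elements.
        return "both_faulty", "release_associated_services"
    if cp_faulty:
        # The user plane NE is healthy: keep its services and give it a new
        # control plane NE, passing on the user plane NE identifier.
        return "cp_only", "reselect_cp_and_send_up_identifier"
    if up_faulty:
        # Handling is not spelled out in this section, so no action is assumed.
        return "up_only", None
    return "no_fault", None
```

The `cp_only` branch is the one that preserves service continuity: the second user plane network element is never released, it is only re-homed.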
  • the signaling processing network element obtains the first detection information in either of the following ways: the signaling processing network element receives the first detection information sent by the second control plane network element, where the first detection information is used to indicate the state of the first control plane network element obtained by the second control plane network element; or the signaling processing network element detects the first control plane network element and generates the first detection information according to the detection result.
  • that is, the signaling processing network element may directly detect the first control plane network element to obtain the first detection information, or the second control plane network element may detect the first control plane network element and send the first detection information to the signaling processing network element.
  • the signaling processing network element obtains the first detection information in a flexible manner. In practical applications, an appropriate manner can be selected according to the difference of the network elements that need to be detected.
  • the signaling processing network element is a mobility management entity, the first control plane network element is a control plane serving gateway, the first user plane network element is a base station, and the second user plane network element is a user plane serving gateway; or, the signaling processing network element is a mobility management entity, the second control plane network element is a control plane serving gateway, the first control plane network element is a control plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the second user plane network element is a user plane packet data network gateway.
  • the network elements may thus take different forms in different application scenarios, and the embodiments of the present invention are applicable to each of these scenarios.
  • in a third aspect, a fault processing method is provided, which is implemented by an SDN controller.
  • the method includes: the SDN controller detects the first switch, and obtains the first probe information.
  • the SDN controller receives the second probe information sent by the second switch, where the second probe information is used to indicate the status of the first switch obtained by the second switch.
  • the SDN controller determines the fault type according to the first probe information and the second probe information, where the fault type includes a first switch fault or a link fault between the SDN controller and the first switch.
  • the SDN controller can both detect the first switch itself and receive the second probe information sent by the second switch, so it can combine the first probe information and the second probe information to determine the fault type comprehensively.
  • because information from multiple sources is considered in the fault judgment, rather than the information of a single network element only, the accuracy of the judgment result is improved. A switch fault can then be handled as a network element fault, and a link fault can be handled as a link fault.
  • service interruption of a fault-free switch can thus be avoided, ensuring service continuity; when the switch is faulty, its services can be restored as quickly as possible, minimizing the impact on the user's service experience.
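A sketch of how the SDN controller might combine its own probe with the second switch's report is given below. The two fault types come from the text, but the mapping from report combinations to fault types is an assumption made by analogy with the first aspect, since this section does not spell out the rule for the SDN case. The names `SwitchReport` and `classify_switch_fault` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchReport:
    observer: str          # who made the observation, e.g. "controller" or a switch id
    target_switch: str     # the switch being observed (the first switch)
    target_reachable: bool


def classify_switch_fault(own_probe: SwitchReport, peer_probe: SwitchReport) -> str:
    """Combine the controller's own probe with the second switch's report."""
    if own_probe.target_switch != peer_probe.target_switch:
        raise ValueError("reports describe different switches")
    if not own_probe.target_reachable and not peer_probe.target_reachable:
        # Nobody can reach the switch: treat as a fault of the switch itself.
        return "first switch fault"
    if not own_probe.target_reachable and peer_probe.target_reachable:
        # A neighbour still sees the switch: only the controller's link is broken.
        return "link fault between SDN controller and first switch"
    if own_probe.target_reachable and peer_probe.target_reachable:
        return "no fault reported"
    return "undetermined"
```

Keeping the observer in each report makes it possible to log which side lost contact, which is the information that distinguishes a link fault from a switch fault.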
  • in a fourth aspect, a fault processing method is provided, the method being performed by a first network element.
  • the method includes: the first network element detects that the state of the second network element is faulty; the first network element generates detection information according to the detection of the second network element and sends the detection information to the signaling processing network element, where the detection information carries the identifier of the second network element and is used to determine the fault type.
  • the first network element is a control plane network element or a user plane network element
  • the second network element is a control plane network element or a user plane network element.
  • the first network element, which is a control plane network element or a user plane network element, can detect the state of the second network element. If it learns that the second network element is faulty, the first network element can temporarily refrain from fault processing (for example, it does not locally release the service information associated with the second network element) and instead send detection information (which may be the first detection information or the second detection information described in the foregoing aspects) to the signaling processing network element. The detection information may carry the identifier of the second network element. After receiving the detection information, the signaling processing network element can determine, according to the identifier of the second network element carried in it, that the state of the second network element is faulty.
  • the signaling processing network element can comprehensively determine the fault type according to the probe information sent by multiple network elements, which improves the accuracy of the fault judgment and makes it possible, as far as possible, to distinguish a network element fault from a link fault. A network element fault can then be handled as a network element fault, and a link fault as a link fault. Service interruption of a fault-free UP can be avoided, ensuring service continuity; when the UP is faulty, the UP service can be restored as quickly as possible, minimizing the impact on the user's service experience.
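The behaviour of the first network element in the fourth aspect (report the fault, but do not release services locally) can be sketched as follows. The class and field names are hypothetical, and the `reporter_id` field is an assumption: the text only requires that the detection information carry the identifier of the second network element.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ProbeInfo:
    reporter_id: str   # assumed field: which network element made the observation
    target_id: str     # identifier of the second network element (required by the text)
    target_state: str  # "fault" in this sketch


class FirstNetworkElement:
    def __init__(self, ne_id: str, send_to_signaling_ne: Callable[[ProbeInfo], None]):
        self.ne_id = ne_id
        self._send = send_to_signaling_ne
        # Services held locally, keyed by the peer NE they are associated with.
        self.local_services: Dict[str, List[str]] = {}

    def on_peer_fault_detected(self, peer_id: str) -> ProbeInfo:
        """Report the fault and defer local fault processing.

        The services associated with the peer are deliberately left untouched;
        the signaling processing NE decides whether this is a network element
        fault or only a link fault.
        """
        probe = ProbeInfo(reporter_id=self.ne_id, target_id=peer_id, target_state="fault")
        self._send(probe)
        return probe
```

A caller would pass any delivery function as `send_to_signaling_ne`, for instance the `append` method of a list in a test, or a real transport in a deployment.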
  • in a fifth aspect, a signaling processing network element is provided, including a receiver and a processor.
  • the receiver is configured to receive the first probe information sent by the first control plane network element and the second probe information sent by the second user plane network element.
  • the first probe information is used to indicate the state of the first user plane network element obtained by the first control plane network element, and the second probe information is used to indicate the state of the first user plane network element obtained by the second user plane network element.
  • the processor is configured to determine a fault type according to the first probe information and the second probe information, where the fault type includes a first user plane network element fault, or a link fault between the first control plane network element and the first user plane network element.
  • the processor is configured to determine a fault type according to the first probe information and the second probe information, including: if the first probe information and the second probe information both indicate that the first user plane network element is faulty, determining that the fault type is a first user plane network element fault; or, if the first probe information indicates that the first user plane network element is faulty and the second probe information indicates that the first user plane network element is normal, determining that the first user plane network element is normal and that the link between the first control plane network element and the first user plane network element is faulty.
  • the signaling processing network element further includes a transmitter.
  • the processor is further configured to: after determining that the fault type is a link fault between the first control plane network element and the first user plane network element, reselect a control plane network element for the first user plane network element and send, by using the transmitter, the identifier of the first user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the first user plane network element; or, after determining that the fault type is a link fault between the first control plane network element and the first user plane network element, instruct the first control plane network element to wait for link recovery.
  • the first control plane network element is a control plane serving gateway, the second user plane network element is a base station, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane packet data network gateway, the second user plane network element is a user plane serving gateway, the first user plane network element is a user plane packet data network gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane serving gateway, the second user plane network element is a user plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity.
  • in a sixth aspect, a signaling processing network element is provided, including a processor and a receiver.
  • the processor is configured to obtain first probe information, where the first probe information is used to indicate the state of the first control plane network element.
  • the receiver is configured to receive the second probe information that is sent by the first user plane network element, where the second probe information is used to indicate the state of the second user plane network element obtained by the first user plane network element.
  • the processor is further configured to determine a fault type according to the first probe information and the second probe information, where the fault type includes only the first control plane network element being faulty, only the second user plane network element being faulty, or both the first control plane network element and the second user plane network element being faulty.
  • if both are faulty, the processor releases the service associated with the first control plane network element and/or the second user plane network element.
  • the processor is configured to determine a fault type according to the first probe information and the second probe information, including: if the first probe information indicates that the first control plane network element is faulty, determining that the first control plane network element is faulty; or, if the second probe information indicates that the second user plane network element is faulty, determining that the second user plane network element is faulty.
  • the signaling processing network element further includes a transmitter.
  • the processor is further configured to: after determining the fault type, if the fault type is only the first control plane network element being faulty, reselect a control plane network element for the second user plane network element and send, by using the transmitter, the identifier of the second user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the second user plane network element.
  • the processor is configured to obtain the first detection information, including: acquiring the first detection information that is sent by the second control plane network element and received by the receiving unit, where the first detection information is used to indicate the state of the first control plane network element obtained by the second control plane network element; or, detecting the first control plane network element and generating the first detection information according to the detection result.
  • the signaling processing network element is a mobility management entity, the first control plane network element is a control plane serving gateway, the first user plane network element is a base station, and the second user plane network element is a user plane serving gateway; or, the signaling processing network element is a mobility management entity, the second control plane network element is a control plane serving gateway, the first control plane network element is a control plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the second user plane network element is a user plane packet data network gateway.
  • in a seventh aspect, an SDN controller is provided, comprising a processor and a receiver.
  • the processor is configured to detect the first switch to obtain the first probe information.
  • the receiver is configured to receive the second probe information sent by the second switch, where the second probe information is used to indicate the status of the first switch obtained by the second switch.
  • the processor is further configured to determine a fault type according to the first probe information and the second probe information, where the fault type includes a first switch fault, or a link fault between the SDN controller and the first switch.
  • in an eighth aspect, a network element is provided, including a processor and a transmitter.
  • the processor is configured to detect that the state of the second network element is faulty, and to generate the probe information according to the detection of the second network element.
  • the transmitter is configured to send the probe information to the signaling processing network element, where the probe information carries the identifier of the second network element, and the probe information is used to determine the fault type.
  • the network element is a control plane network element or a user plane network element
  • the second network element is a control plane network element or a user plane network element.
  • in a ninth aspect, a signaling processing network element is provided, comprising functional units for performing the method provided by the first aspect or any possible implementation of the first aspect.
  • in a tenth aspect, a signaling processing network element is provided, comprising functional units for performing the method provided by the second aspect or any possible implementation of the second aspect.
  • in an eleventh aspect, an SDN controller is provided, comprising functional units for performing the method provided by the third aspect or any possible implementation of the third aspect.
  • in a twelfth aspect, a network element is provided, the network element being a first network element and comprising functional units for performing the method provided by the fourth aspect or any possible implementation of the fourth aspect.
  • in a thirteenth aspect, a computer storage medium is provided for storing computer software instructions used by the above signaling processing network element, including a program designed for the signaling processing network element to perform the first aspect or any possible implementation of the first aspect.
  • in a fourteenth aspect, a computer storage medium is provided for storing computer software instructions used by the above signaling processing network element, including a program designed for the signaling processing network element to perform the second aspect or any possible implementation of the second aspect.
  • in a fifteenth aspect, a computer storage medium is provided for storing computer software instructions used by the above SDN controller, including a program designed for the SDN controller to perform the third aspect or any possible implementation of the third aspect.
  • in a sixteenth aspect, a computer storage medium is provided for storing computer software instructions used by the above first network element, including a program designed for the first network element to perform the fourth aspect or any possible implementation of the fourth aspect.
  • the signaling processing network element considers information from multiple sources when diagnosing a fault, rather than the information of a single network element only, which improves the accuracy of the judgment result. A network element fault can then be handled as a network element fault, and a link fault as a link fault. Service interruption of a fault-free UP can be avoided, ensuring service continuity; when the UP is faulty, the UP service can be restored as quickly as possible, minimizing the impact on the user's service experience.
  • FIG. 1 is a schematic diagram of a network architecture applied to an embodiment of the present invention
  • FIG. 2 is a flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a network architecture according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of a fault processing method according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a signaling processing network element according to an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of an SDN controller according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of a first network element according to an embodiment of the present invention.
  • LTE Long Term Evolution
  • 5G fifth-generation mobile communication system
  • 3GPP 3rd Generation Partnership Project
  • Non-3GPP non-3GPP
  • UE User Equipment
  • RAN Radio Access Network
  • The user equipment may also be referred to as a wireless terminal device, a mobile terminal device, a subscriber unit, a subscriber station, a mobile station, a remote station, an access point (AP), a remote terminal, an access terminal, a user terminal, a user agent, or a user device.
  • The user equipment may include a mobile telephone (or "cellular" telephone), a computer with a mobile terminal device, a dedicated terminal device in NB-IoT, or a portable, pocket-sized, handheld, computer built-in, or vehicle-mounted mobile device.
  • PCS Personal Communication Service
  • SIP Session Initiation Protocol
  • WLL Wireless Local Loop
  • PDA Personal Digital Assistants
  • Network devices are also known as network elements.
  • the network device includes a control plane network element, a user plane network element, or a signaling processing network element.
  • the network elements in the embodiment of the present invention may be physical devices or logical devices.
  • The S-GW is split into the SGW-C and the SGW-U, and the P-GW is split into the PGW-C and the PGW-U.
  • The SGW-C and the PGW-C, as the control plane, may be components of the CGW.
  • The SGW-U and the PGW-U, as the user plane, may be components of the DGW.
  • One CGW can manage multiple DGWs, and one DGW can belong to multiple CGWs.
  • CGW and CP can be understood as the same concept
  • DGW and UP can be understood as the same concept.
  • The control plane network element in the embodiment of the present invention may include a CGW. Alternatively, in a future communication system (for example, a 5G system), the current Mobility Management Entity (MME) and the CGW (and possibly other devices) may be merged to form a new control plane network element, or the new control plane network element may include other possible network devices for implementing the functions of the control plane.
  • the user plane network element in the embodiment of the present invention may include a DGW, or may also include other possible network devices for implementing functions of the user plane.
  • the user plane network element may further include an access network element, such as a base station (for example, an access point).
  • a base station may specifically refer to a device in an access network that communicates with a wireless terminal over one or more sectors over an air interface.
  • The base station can convert between received air frames and Internet Protocol (IP) packets, acting as a router between the wireless terminal device and the rest of the access network, where the rest of the access network can include an IP network.
  • the base station can also coordinate attribute management of the air interface.
  • The base station may be an evolved base station (NodeB, eNB, or e-NodeB; evolutional Node B) in a system such as Long Term Evolution (LTE) or LTE-Advanced (LTE-A).
  • the signaling processing network element mainly performs the fault determination work, and the signaling processing network element may be implemented by using the MME, or may also be implemented by other network devices.
  • the signaling processing network element in the embodiment of the present invention may also be implemented by a controller (Controller) in a Software Defined Network (SDN), which is hereinafter referred to as an SDN controller.
  • the user plane network element may include a switch (Switch) in the SDN.
  • The concept of "User Bearer Context" may be a subordinate concept of the concept of "service". For example, if a service is interrupted, the user session context of that service may be interrupted or lost.
  • The terms "system" and "network" in the embodiments of the present invention may be used interchangeably.
  • "Multiple" means two or more.
  • Unless otherwise specified, the character "/" generally indicates an "or" relationship between the objects before and after it.
  • the network architecture applied in the embodiment of the present invention is an architecture in which the control plane and the user plane are separated, and is described below with reference to the accompanying drawings.
  • FIG. 1 is a schematic diagram of a network architecture according to an embodiment of the present invention.
  • the signaling processing network element is implemented by using an MME as an example.
  • the user equipment is connected to the base station through the Uu interface, and the base station is connected to the SGW-U in the DGW through the S1-U interface.
  • The SGW-U is connected to the PGW-U in the same DGW through the S5/S8-U interface, and the PGW-U is connected to the Internet through the SGi interface. The connection between the PGW-U and the Internet is not shown in FIG. 1.
  • the base station is connected to the MME through the S1-MME interface
  • the MME is connected to the SGW-C in the CGW through the S11 interface
  • The SGW-C is connected to the PGW-C in the same CGW through the S5/S8-C interface.
  • The SGW-C can also be connected to a PGW-C in a different CGW, which is not shown in FIG. 1.
  • the SGW-C connects to the SGW-U through the Sxa interface
  • the PGW-C connects to the PGW-U through the Sxb interface.
  • the CGW in FIG. 1 manages the DGW, specifically, the SGW-C manages the SGW-U, and the PGW-C manages the PGW-U.
  • The interface names and network element names used in the embodiments of the present invention do not limit the devices themselves; in practical applications, the interfaces and network elements may have other names.
  • heartbeat messages are generally exchanged between the CP and the UP.
  • the CP periodically sends a heartbeat message to the UP.
  • After receiving the heartbeat message, the UP sends a response message to the CP.
  • Likewise, after receiving a heartbeat message from the UP, the CP sends a response message to the UP.
  • If the CP determines that the UP is faulty, the CP enters the UP fault handling procedure. For example, the CP reselects a UP for user reactivation.
  • If the UP determines that the CP is faulty, the UP enters the CP fault handling procedure. For example, the UP releases its local services.
  • However, if the CP has not received a heartbeat message or response message from the UP for a long time, or the UP has not received a heartbeat message or response message from the CP for a long time, the cause may be a UP fault, a CP fault, or a fault on the link between the CP and the UP.
  • As can be seen, whether the UP is faulty or the link between the CP and the UP is faulty, the CP currently treats the situation as a UP fault and handles it accordingly. Likewise, whether the CP is faulty or the link between the CP and the UP is faulty, the UP treats the situation as a CP fault and handles it accordingly.
  • If the fault is actually a link fault, handling it as a UP fault (on the CP) or as a CP fault (on the UP) causes the services of a UP that is not faulty to be interrupted or lost. Service recovery then takes longer, which greatly affects the user's service experience. The detection result of the CP or the UP alone is therefore not accurate enough and may cause UP services to be interrupted or lost.
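The ambiguity described above can be illustrated with a minimal sketch. The class `HeartbeatMonitor`, the function `cp_declares_up_faulty`, the 3-second timeout, and the timestamps are illustrative assumptions and are not part of the embodiment. The sketch only shows that a timeout observed by the CP alone gives the same result for a UP fault and for a CP-UP link fault.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper: remembers when each peer was last heard from."""
    timeout_s: float  # assumed threshold for "a long time"
    last_seen: Dict[str, float] = field(default_factory=dict)

    def on_message(self, peer_id: str, now: float) -> None:
        # Any heartbeat or response message from the peer refreshes its timer.
        self.last_seen[peer_id] = now

    def is_unreachable(self, peer_id: str, now: float) -> bool:
        # A peer that was never heard from is also treated as unreachable.
        last = self.last_seen.get(peer_id)
        return last is None or (now - last) > self.timeout_s


def cp_declares_up_faulty(up_alive: bool, link_ok: bool) -> bool:
    """Return True if the CP, using only its own view, declares the UP faulty."""
    monitor = HeartbeatMonitor(timeout_s=3.0)
    monitor.on_message("UP-1", now=0.0)
    # A later response reaches the CP only if the UP is alive AND the link works.
    if up_alive and link_ok:
        monitor.on_message("UP-1", now=8.0)
    return monitor.is_unreachable("UP-1", now=10.0)
```

Under these assumptions, `cp_declares_up_faulty(False, True)` and `cp_declares_up_faulty(True, False)` return the same value, which is why a second, independent observation is needed to tell the two cases apart.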
  • In the embodiments of the present invention, the signaling processing network element can receive information sent by multiple network elements, so the task of fault determination is handed over to the signaling processing network element.
  • By combining the received first probe information and second probe information, the signaling processing network element can determine whether the fault is a user plane network element fault, a control plane network element fault, or a fault on the link between the control plane network element and the user plane network element. Because information from multiple sources is considered, not just the information of a single network element, the accuracy of the result is improved. If a user plane or control plane network element has failed, the fault can be handled as a network element fault; if the link has failed, the fault can be handled as a link fault. In the link fault case, service interruption on a fault-free UP can be avoided as far as possible, which ensures service continuity. In the UP fault case, the services of the UP can be restored as quickly as possible, which minimizes the impact on the user experience.
  • the network architecture and the device provided by the embodiment of the present invention are described above.
  • the method provided by the embodiment of the present invention is described below with reference to the accompanying drawings.
  • The network architecture shown in FIG. 1 is taken as an example. From the above description, those skilled in the art will understand that the application scenarios of the embodiments of the present invention are not limited to this architecture.
  • An embodiment of the invention provides a fault processing method.
  • The first control plane network element learns that the state of the first user plane network element is faulty, generates first probe information based on its detection of the first user plane network element, and sends the first probe information to the signaling processing network element.
  • The second user plane network element learns that the state of the first user plane network element is faulty, generates second probe information based on its detection of the first user plane network element, and also sends the second probe information to the signaling processing network element.
  • The signaling processing network element determines the fault type according to the first probe information and the second probe information.
  • The technical solution provided by the embodiment of the present invention can effectively determine whether a fault is a user plane network element fault or a link fault.
  • the first control plane network element is SGW-C
  • the first user plane network element is SGW-U
  • the second user plane network element is a base station
  • the signaling processing network element is an MME, as shown in FIG. 2.
  • S21: The SGW-C learns that the state of the SGW-U is faulty.
  • The SGW-C can detect the state of the SGW-U through heartbeat messages or other signaling messages on the Sx interface. For example, the SGW-C periodically sends a heartbeat message to the SGW-U, and the SGW-U sends a response message to the SGW-C after receiving it; or the SGW-U periodically sends a heartbeat message to the SGW-C, and the SGW-C sends a response message to the SGW-U after receiving it. If the SGW-C does not receive a response message or a heartbeat message from the SGW-U for a long time, the SGW-C determines that the SGW-U is faulty.
  • After the SGW-C learns that the state of the SGW-U is faulty, the SGW-C still keeps its local services associated with the SGW-U running normally.
  • That is, after determining that the SGW-U is faulty, the SGW-C can mark the SGW-U as faulty but still keep the services associated with the SGW-U running normally. In this way, users are kept from going offline and services continue as far as possible.
  • S22: The SGW-C sends the first probe information to the MME, where the first probe information is generated by the SGW-C according to the detection result, that is, the first probe information indicates the state of the SGW-U as detected by the SGW-C.
  • the identifier of the faulty SGW-U may be carried in the first probe information.
  • the identifier of the SGW-U may be the IP address of the forwarding plane of the SGW-U, or other identifiers used to identify the identity of the SGW-U.
  • If the SGW-C determines that the SGW-U is fault-free, the SGW-C may omit the identifier of a faulty SGW-U from the first probe information.
  • The MME may set a detection period and determine which SGW-Us are faulty according to the identifiers of faulty SGW-Us carried in the first probe information received from the SGW-C within the detection period.
  • Alternatively, the SGW-C may not send the first probe information to the MME at all. If the MME does not receive first probe information from the SGW-C within the detection period, it assumes by default that the SGW-C's detection result for the SGW-U is normal. The MME can thus learn the detection result of the SGW-C.
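The detection-period rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the tuple layout of a probe report, the half-open time window, and the function names `faulty_ups_in_period` and `reported_state` are hypothetical and are not defined by the embodiment.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed report layout: (arrival time in seconds, reporter id,
# identifiers of the user plane elements the reporter considers faulty).
ProbeReport = Tuple[float, str, Set[str]]


def faulty_ups_in_period(reports: Iterable[ProbeReport],
                         period_start: float,
                         period_end: float) -> Dict[str, Set[str]]:
    """Collect, per reporter, the faulty identifiers received in the period."""
    seen: Dict[str, Set[str]] = {}
    for arrival, reporter, faulty_ids in reports:
        if period_start <= arrival < period_end:
            seen.setdefault(reporter, set()).update(faulty_ids)
    return seen


def reported_state(seen: Dict[str, Set[str]], reporter: str, up_id: str) -> str:
    # No report in the period, or a report that omits this identifier,
    # is interpreted as "normal" by default.
    return "faulty" if up_id in seen.get(reporter, set()) else "normal"
```

For example, a report that arrives after the period ends does not count toward that period, so the corresponding SGW-U is treated as normal for that period.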
  • the SGW-C sends the first probe information to the MME through the extended S11 interface message.
  • The extended S11 interface message is, for example, an Echo Request message, or may be a newly added fault handling message or the like.
  • In S21, the SGW-C learns that the state of the SGW-U is faulty. Because the SGW-C and the SGW-U detect each other, the SGW-U can likewise learn that the state of the SGW-C is faulty.
  • After the SGW-U learns that the state of the SGW-C is faulty, the SGW-U can still keep its local services associated with the SGW-C running normally.
  • That is, after determining that the SGW-C is faulty, the SGW-U can mark the SGW-C as faulty but still keep the services associated with the SGW-C running normally. In this way, users are kept from going offline and services continue as far as possible.
  • S23: The base station learns that the state of the SGW-U is faulty.
  • The base station may detect the state of the SGW-U based on the data transmission channel of the user plane, thereby determining whether the SGW-U is faulty.
  • S24: The base station sends the second probe information to the MME, where the second probe information is generated by the base station according to the detection result, that is, the second probe information indicates the state of the SGW-U as detected by the base station.
  • The second probe information may carry the identifier of the faulty SGW-U. If the base station determines that the SGW-U is fault-free, the base station may omit the identifier of a faulty SGW-U from the second probe information.
  • The MME may set a detection period and determine which SGW-Us are faulty according to the identifiers of faulty SGW-Us carried in the second probe information received from the base station within the detection period. Alternatively, the base station may not send the second probe information to the MME at all. If the MME does not receive second probe information from the base station within the detection period, it assumes by default that the base station's detection result for the SGW-U is normal, that is, that the user plane data transmission channel is normal. The MME can thus learn the detection result of the base station.
  • After learning that the SGW-U is faulty, the base station may handle its local services associated with the SGW-U in different ways.
  • Accordingly, the base station sends the second probe information to the MME in different ways, as briefly described below.
  • In one manner, after a UP (for example, the base station) learns that the state of another UP (for example, the SGW-U) is faulty, the UP locally releases the services associated with the other UP. The UP may then carry the second probe information in the message that it sends to the MME as a result of the service release.
  • For example, the S1 UE Context Release Request message sent by the base station to the MME may carry the second probe information. If the base station learns that the state of the SGW-U is faulty, the base station may send an S1 UE Context Release Request message to the MME for every service associated with the SGW-U, with the cause indication in the message set to S1-U Failure.
  • The S1 UE Context Release Request message may also carry the identifier of the faulty SGW-U. In this way, after receiving the S1 UE Context Release Request message, the MME can learn the detection result of the base station. In addition, the MME may send an S1 UE Context Release Command message to the base station to instruct the base station to release the air interface and local services.
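How the MME might derive the base station's detection result from such release requests can be sketched as follows. The `UeContextReleaseRequest` class, its field names, and the fallback to a UE-to-SGW-U mapping taken from the user session context are illustrative assumptions; they do not reproduce the actual S1 message encoding.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set

S1_U_FAILURE = "S1-U Failure"


@dataclass(frozen=True)
class UeContextReleaseRequest:
    """Simplified stand-in for the S1 message (field names are assumed)."""
    ue_id: str
    cause: str
    faulty_sgw_u_id: Optional[str] = None  # optional identifier, if carried


def faulty_sgw_us_from_releases(requests: Iterable[UeContextReleaseRequest],
                                session_context: Dict[str, str]) -> Set[str]:
    """Return the SGW-U identifiers that the base station reports as faulty.

    `session_context` maps a UE id to the SGW-U serving it and is used only
    when the request does not carry the identifier itself.
    """
    faulty: Set[str] = set()
    for req in requests:
        if req.cause != S1_U_FAILURE:
            continue  # releases for other causes say nothing about the SGW-U
        up_id = req.faulty_sgw_u_id or session_context.get(req.ue_id)
        if up_id is not None:
            faulty.add(up_id)
    return faulty
```

The resulting set plays the role of the second probe information when the MME later compares the base station's view with the SGW-C's view.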
  • In another manner, after a UP (for example, the base station) learns that the state of another UP (for example, the SGW-U) is faulty, the UP can continue to keep its local services associated with the other UP running normally. This keeps users from going offline and allows services to continue.
  • the UP may send the second probe information to the MME by using an existing message or a newly added fault processing message or the like. For example, if the base station learns that the state of the SGW-U is a fault, the base station continues to maintain the SGW-U-related service locally, and the base station sends the second probe information to the MME through the extended S1-MME interface message.
  • S21-S22 and S23-S24 can be regarded as two parts, and the execution order of these two parts can be arbitrary.
  • the MME determines a fault type according to the first probe information and the second probe information, where the fault type includes an SGW-U fault, or a link fault between the SGW-C and the SGW-U.
  • The MME combines the identifier of the faulty SGW-U carried in the first probe information with either the identifier of the faulty SGW-U carried in the second probe information or the identifier of the SGW-U associated with the users whose cause indication is S1-U Failure, and determines whether the fault is an SGW-U fault or a fault on the link between the SGW-C and the SGW-U.
  • If the identifiers of faulty SGW-Us carried in the second probe information include the identifier of the faulty SGW-U carried in the first probe information, or the identifiers of the SGW-Us associated with the users whose cause indication in the second probe information is S1-U Failure include that identifier, then both the first probe information and the second probe information indicate that the SGW-U is faulty, and the MME determines that the SGW-U is faulty.
  • If the identifiers of faulty SGW-Us carried in the second probe information do not include the identifier of the faulty SGW-U carried in the first probe information, or the identifiers of the SGW-Us associated with the users whose cause indication in the second probe information is S1-U Failure do not include that identifier, or the MME does not receive the second probe information within the detection period, then the first probe information indicates that the SGW-U is faulty while the second probe information indicates that the SGW-U is normal. In this case the MME determines that the SGW-U is normal and that the link between the SGW-C and the SGW-U is faulty.
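The decision rule above amounts to a set membership check, as the following minimal sketch shows. The names `FaultType` and `classify` are hypothetical, and the sketch covers only the SGW-U identifiers reported as faulty by the SGW-C, because those are the only cases the rule above addresses.

```python
from enum import Enum
from typing import Dict, Optional, Set


class FaultType(Enum):
    UP_FAULT = "user plane network element fault"
    LINK_FAULT = "fault on the link between control plane and user plane"


def classify(first_probe: Set[str],
             second_probe: Optional[Set[str]]) -> Dict[str, FaultType]:
    """Classify every SGW-U that the SGW-C reported as faulty.

    first_probe:  faulty SGW-U identifiers reported by the SGW-C.
    second_probe: faulty SGW-U identifiers reported by the base station,
                  or None if nothing arrived within the detection period.
    """
    confirmed = second_probe or set()
    return {
        up_id: FaultType.UP_FAULT if up_id in confirmed else FaultType.LINK_FAULT
        for up_id in first_probe
    }
```

An SGW-U that both sides report as faulty is classified as an SGW-U fault; one that only the SGW-C reports is classified as a link fault, including when the base station sent nothing in the period.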
  • the MME performs fault recovery processing.
  • the MME may perform link failure recovery processing in combination with a predefined link failure processing policy.
  • the link fault handling policy may be predefined by the operator or may be predefined by a protocol or a standard.
  • the link fault handling strategy in the embodiment of the present invention includes but is not limited to the following:
  • The MME may determine, based on the user session context, the SGW-C to which the faulty SGW-U belongs, that is, the SGW-C that manages the faulty SGW-U.
  • the MME sends an extended S11 interface message to the SGW-C, instructing the SGW-C to initiate fault processing of the SGW-U.
  • the extended S11 interface message sent by the MME to the SGW-C may carry the identifier of the faulty SGW-U, the fault service processing indication, and the like.
  • the fault handling mode of the user plane network element is related to the deployment mode of the user plane network element.
  • the embodiments of the present invention provide several types of UP deployment modes. The following describes how to implement the fault processing of the UP in different deployment modes.
  • Deployment mode 1: One CP manages multiple UPs, and the UPs managed by one CP are deployed in load-balancing mode. If a UP fails, the CP can release the services corresponding to the failed UP, and the user equipment served by the failed UP must reactivate and be assigned to another UP. In this deployment mode, handling a faulty UP is simple. However, the services corresponding to the faulty UP may be lost, service recovery takes a long time, and the user experience may be poor.
  • Deployment mode 2: The UPs managed by one CP are deployed in N+1 backup mode, that is, they include N primary UPs and one standby UP. If a primary UP fails, the CP loads the user session contexts of the failed primary UP onto the standby UP, based on the copy of those contexts saved locally by the CP, thereby restoring the users' services. In this deployment mode, most services corresponding to the faulty UP can be recovered. However, because the user session contexts of a UP are backed up to the CP periodically, contexts that were not yet backed up when the fault occurred may still be lost. The overall service recovery time is long, services may still be interrupted, and the user experience is average.
  • Deployment mode 3: The UPs managed by one CP are deployed in N-way redundancy mode.
  • All UPs are active, but each UP reserves some redundant resources. If a UP fails, the CP distributes the user session contexts of the failed UP to the other UPs, based on the copy of those contexts saved locally by the CP, thereby restoring the users' services.
  • In this deployment mode, the service recovery time and service integrity for a faulty UP are similar to those of deployment mode 2.
  • Deployment mode 4: The UPs managed by one CP are deployed in 1+1 backup mode, that is, each primary UP has one corresponding standby UP. In this deployment mode, the primary UP and the standby UP can detect each other. If a primary UP fails, its standby UP can switch over to become the primary UP and continue to provide services. In this deployment mode, user service recovery is faster, service interruption is shorter, and the user experience is better.
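The four deployment modes can be contrasted in a short sketch that maps each session of a failed UP to the UP that takes it over. The names `DeploymentMode` and `plan_recovery` are hypothetical, and the round-robin distribution used for the N-way mode is an assumption; the text above only says that the contexts are distributed to the other UPs.

```python
from enum import Enum
from typing import Dict, List


class DeploymentMode(Enum):
    LOAD_BALANCING = 1  # mode 1: release services, UEs reactivate elsewhere
    N_PLUS_ONE = 2      # mode 2: N primary UPs share one standby UP
    N_WAY = 3           # mode 3: all UPs active, each with reserved capacity
    ONE_PLUS_ONE = 4    # mode 4: each primary UP has its own standby UP


def plan_recovery(mode: DeploymentMode,
                  sessions: List[str],
                  targets: List[str]) -> Dict[str, str]:
    """Map each session of the failed UP to the UP that takes it over.

    `targets` holds the standby UP (modes 2 and 4) or the surviving UPs
    (mode 3). An empty result means the sessions are released and the
    user equipment must reactivate.
    """
    if mode is DeploymentMode.LOAD_BALANCING:
        return {}
    if mode in (DeploymentMode.N_PLUS_ONE, DeploymentMode.ONE_PLUS_ONE):
        # In mode 4 the standby UP switches over by itself; the mapping
        # here only records where the sessions end up.
        return {session: targets[0] for session in sessions}
    # Mode 3 (assumed policy): spread sessions round-robin over survivors.
    return {s: targets[i % len(targets)] for i, s in enumerate(sessions)}
```

For instance, with three sessions and two surviving UPs in N-way mode, the first and third sessions land on the first survivor and the second session on the other.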
  • The MME can reselect a CP and re-establish the link between the reselected CP and the UP, so that service processing no longer has to go through the faulty link and services can continue as soon as possible.
  • the MME may reselect the CP. After the CP is reselected, the MME may send a Modify Bearer Request message to the reselected CP.
  • the Modify Bearer Request message may be extended to carry the Sx link recovery indication and may carry the UP identifier.
  • the Modify Bearer Request message may also carry the address of the original CP, so that the reselected CP can obtain the service information of the UP from the original CP.
  • The reselected CP may trigger an Sx Session Modification message according to the Sx link recovery indication carried in the Modify Bearer Request message, to restore the link with the UP.
  • the CP may send a Modify Bearer Response message to the MME, where the Modify Bearer Response message may be extended to indicate that the Sx link has returned to normal.
  • For example, the MME may reselect an SGW-C and re-establish the link between the reselected SGW-C and the SGW-U, so that service processing no longer has to go through the faulty link and services can continue as soon as possible.
  • If the MME determines that the link between the SGW-C and the SGW-U is faulty, the MME can reselect an SGW-C.
  • the MME may send a Modify Bearer Request message to the reselected SGW-C.
  • the Modify Bearer Request message may be extended to carry the Sx link recovery indication and may carry the identifier of the SGW-U.
  • The address of the original SGW-C may also be carried in the Modify Bearer Request message, so that the reselected SGW-C can obtain the service information of the SGW-U from the original SGW-C.
  • The reselected SGW-C may trigger an Sx Session Modification message according to the Sx link recovery indication carried in the Modify Bearer Request message, to restore the link with the SGW-U.
  • The SGW-C may send a Modify Bearer Response message to the MME, which may be extended to indicate that the Sx link has returned to normal.
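The exchange with the reselected SGW-C can be sketched as follows. The two dataclasses are simplified stand-ins for the extended messages, and their field names, as well as the injected callables `fetch_service_info` and `restore_sx_link`, are illustrative assumptions rather than actual message encodings or interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ModifyBearerRequest:
    """Simplified stand-in for the extended request (field names assumed)."""
    sgw_u_id: str
    sx_link_recovery: bool = False             # extended indication
    original_sgw_c_addr: Optional[str] = None  # optional extension


@dataclass(frozen=True)
class ModifyBearerResponse:
    sx_link_restored: bool  # extended indication in the response


def handle_modify_bearer_request(
    req: ModifyBearerRequest,
    fetch_service_info: Callable[[str, str], None],
    restore_sx_link: Callable[[str], bool],
) -> ModifyBearerResponse:
    """Processing on the reselected SGW-C, as a sketch."""
    if not req.sx_link_recovery:
        return ModifyBearerResponse(sx_link_restored=False)
    if req.original_sgw_c_addr is not None:
        # Obtain the SGW-U's service information from the original SGW-C.
        fetch_service_info(req.original_sgw_c_addr, req.sgw_u_id)
    # Stands in for triggering Sx Session Modification toward the SGW-U.
    return ModifyBearerResponse(sx_link_restored=restore_sx_link(req.sgw_u_id))
```

Passing the two operations in as callables keeps the sketch independent of any particular transport; a real implementation would replace them with the corresponding interface procedures.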
  • the MME may indicate that the original CP does not perform processing and wait for link recovery between the CP and the UP.
  • the MME does not reselect the CP, but can send an extended S11 interface message, such as an Echo Request message or a newly added fault handling message, to the original CP.
  • the extended S11 interface message may carry the identifier of the failed UP, and may carry an indication of waiting for the Sx link to recover.
  • According to the indication of waiting for Sx link recovery carried in the message, the CP performs no processing and waits for the link between the CP and the UP to recover.
  • the MME can locally release all services associated with the faulty network element.
  • The MME may perform no processing and wait for the faulty network element to recover. In this way, after recovery, the faulty network element becomes a normal network element again and can continue to be used, and the original services can continue without being handed over to another network element, which reduces the possibility of losing user session information and improves the utilization of network elements.
  • the MME may indicate that the original SGW-C does not process, and waits for link recovery between the SGW-C and the SGW-U.
  • If the MME determines that the link between the SGW-C and the SGW-U is faulty, the MME does not reselect an SGW-C but sends an extended S11 interface message, such as an Echo Request message or a newly added fault handling message, to the original SGW-C.
  • The extended S11 interface message may carry the identifier of the faulty SGW-U and may carry an indication of waiting for the Sx link to recover.
  • According to the indication of waiting for Sx link recovery carried in the message, the SGW-C performs no processing and waits for the link between the SGW-C and the SGW-U to recover.
  • The above link fault handling strategies are only examples. Other link fault handling strategies may be used in practical applications, all of which fall within the protection scope of the embodiments of the present invention.
  • The MME may adopt the first type of link fault handling policy described above, that is, the UP fault handling policy, to perform link fault recovery.
  • link fault recovery can be performed in different ways for different deployment modes of the UP. It is flexible and conforms to the actual network.
  • Alternatively, the MME may adopt the second or the third type of link fault handling policy described above, that is, the policy of reselecting the control plane network element or the policy of waiting for link recovery, to perform link fault recovery.
  • With the second link fault handling policy, recovery is faster and services can continue as soon as possible. With the third link fault handling policy, once the link between the SGW-C and the SGW-U is restored, the original SGW-C can continue to be used and the original services can continue without being handed over to another network element, which reduces the possibility of losing user session context and improves the utilization of network elements.
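The three link fault handling policies differ mainly in which message the MME sends and to which SGW-C. The sketch below assumes that the policy has already been chosen (for example, predefined by the operator); the names `LinkFaultPolicy` and `mme_message` and the dictionary keys are hypothetical and do not correspond to actual information elements.

```python
from enum import Enum
from typing import Dict, Tuple


class LinkFaultPolicy(Enum):
    HANDLE_AS_UP_FAULT = 1      # first policy: UP fault handling
    RESELECT_CONTROL_PLANE = 2  # second policy: reselect the SGW-C
    WAIT_FOR_LINK_RECOVERY = 3  # third policy: wait for the Sx link


def mme_message(policy: LinkFaultPolicy,
                sgw_u_id: str,
                original_sgw_c: str) -> Tuple[str, str, Dict[str, object]]:
    """Return (recipient, message name, carried indications) for the MME."""
    if policy is LinkFaultPolicy.HANDLE_AS_UP_FAULT:
        return ("original SGW-C", "extended S11 message",
                {"faulty_sgw_u": sgw_u_id, "fault_service_processing": True})
    if policy is LinkFaultPolicy.RESELECT_CONTROL_PLANE:
        return ("reselected SGW-C", "Modify Bearer Request",
                {"sgw_u": sgw_u_id, "sx_link_recovery": True,
                 "original_sgw_c": original_sgw_c})
    return ("original SGW-C", "extended S11 message",
            {"faulty_sgw_u": sgw_u_id, "wait_for_sx_link_recovery": True})
```

Only the second policy involves a new SGW-C; the first and third both address the original SGW-C and differ in the indication carried.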
  • In the embodiment of the present invention, after detecting that the peer end is faulty, the SGW-C and the SGW-U may refrain from initiating fault service processing and instead report the detected fault state of the peer end in probe information.
  • By combining the user plane probe information sent by the base station with the control plane probe information sent by the SGW-C, the MME can determine relatively accurately whether the fault is an SGW-U fault or an Sx link fault.
  • Link faults caused by the network are more probable than SGW-U faults, and SGW-U fault service processing is costly.
  • Distinguishing link faults from SGW-U faults and handling each accordingly can therefore greatly improve the service experience of the network.
  • An embodiment of the present invention provides a fault processing method.
  • the first control plane network element is PGW-C
  • the first user plane network element is PGW-U
  • the second user plane network element is SGW-U
  • the signaling processing network element is an MME, as described below.
  • S31: The PGW-C learns that the state of the PGW-U is faulty.
  • The PGW-C can determine whether the state of the PGW-U is faulty through a mechanism such as heartbeat detection; details are not repeated here.
  • After the PGW-C learns that the state of the PGW-U is faulty, the PGW-C still keeps its local services associated with the PGW-U running normally.
  • That is, after learning that the PGW-U is faulty, the PGW-C can mark the PGW-U as faulty but still keep the services associated with the PGW-U running normally. In this way, users are kept from going offline and services continue as far as possible.
  • the PGW-C sends the first probe information to the MME, where the first probe information is generated by the PGW-C according to the detection result, that is, the first probe information is used to indicate the status of the PGW-U obtained by the PGW-C.
  • The first probe information may carry the identifier of the faulty PGW-U, for example the IP address of the forwarding plane of the PGW-U, or any other identifier that identifies the PGW-U. The first probe information may also carry the identifier of the PGW-C. If the PGW-C determines that the PGW-U is not faulty, the PGW-C may omit the identifier of a faulty PGW-U from the first probe information. The MME may set a detection period and determine which PGW-Us are faulty according to the identifiers of faulty PGW-Us carried in the first probe information received from the PGW-C within the detection period.
  • Alternatively, the PGW-C may not send the first probe information to the MME at all; if the MME receives no first probe information from the PGW-C within the detection period, it assumes by default that the PGW-C detected the PGW-U as normal. The MME can thus learn the detection result of the PGW-C either way.
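  • The detection-period behaviour on the MME side can be sketched as follows. This is an illustration under assumptions: the class and method names are hypothetical, and identifiers are shown as plain strings such as forwarding-plane IP addresses.

```python
from typing import Dict, Set


class DetectionWindow:
    """Hypothetical store of probe reports received in one detection period."""

    def __init__(self) -> None:
        # reporter identifier -> identifiers of the network elements it reported faulty
        self.reports: Dict[str, Set[str]] = {}

    def receive(self, reporter: str, faulty_ids: Set[str]) -> None:
        # Probe information without faulty identifiers still counts as a report.
        self.reports.setdefault(reporter, set()).update(faulty_ids)

    def faulty_according_to(self, reporter: str) -> Set[str]:
        # No report within the period is treated as "detected as normal".
        return set(self.reports.get(reporter, set()))
```

  • For example, after `receive("pgw-c-1", {"10.0.0.7"})`, the window reports `10.0.0.7` as faulty according to `pgw-c-1`, while a reporter that stayed silent yields an empty set.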
  • The PGW-C sends the first probe information to the SGW-C through an extended S5/S8 interface message, and the SGW-C forwards the first probe information to the MME through an S11 interface message. For example, the first probe information is carried in an extended Echo Request message or in a newly added fault handling message.
  • Because the PGW-C and the PGW-U probe each other, just as the PGW-C can learn that the state of the PGW-U is faulty, the PGW-U can also learn that the state of the PGW-C is faulty.
  • After the PGW-U learns that the state of the PGW-C is faulty, the PGW-U can still keep the services associated with the PGW-C running normally.
  • That is, after determining that the PGW-C is faulty, the PGW-U can mark the PGW-C as faulty while still keeping the services associated with the PGW-C running normally. In this way, users are kept from going offline and services continue.
  • S33. The SGW-U learns that the state of the PGW-U is faulty.
  • the SGW-U can detect the state of the PGW-U based on the data transmission channel of the user plane, thereby determining whether the PGW-U is faulty.
  • the SGW-U sends the second probe information to the MME, where the second probe information is generated by the SGW-U according to the detection result, that is, the second probe information is used to indicate the status of the PGW-U obtained by the SGW-U.
  • The second probe information may carry the identifier of the faulty PGW-U. If the SGW-U determines that the PGW-U is not faulty, the SGW-U may omit the identifier of a faulty PGW-U from the second probe information. The MME may set a detection period and determine which PGW-Us are faulty according to the identifiers of faulty PGW-Us carried in the second probe information received from the SGW-U within the detection period.
  • Alternatively, the SGW-U may not send the second probe information to the MME at all; if the MME receives no second probe information from the SGW-U within the detection period, it assumes by default that the SGW-U detected the PGW-U as normal. The MME can thus learn the detection result of the SGW-U either way.
  • After learning that the PGW-U is faulty, the SGW-U may handle the local services associated with the PGW-U in different ways, and the way in which the SGW-U sends the second probe information to the MME differs accordingly. The two modes are briefly introduced below.
  • In the first mode, if a UP (for example, the SGW-U) learns that the state of another UP (for example, the PGW-U) is faulty, the UP releases the services associated with the other UP, and may carry the second probe information in the message that the service release triggers toward the MME.
  • For example, the SGW-U can trigger the release of the user service bearers associated with the PGW-U: the SGW-U sends a message for performing fault processing, for example a UPlane Session Delete Request message, to the SGW-C through the Sx interface, and the SGW-C sends a message for performing fault processing to the MME through the S11 interface.
  • The SGW-U may carry the identifier of the faulty PGW-U in an extended UPlane Session Delete Request message.
  • Similarly, the identifier of the faulty PGW-U may be carried in an extended Delete Bearer Request message.
  • SGW-C is not shown in FIG.
  • In the second mode, if a UP (for example, the SGW-U) learns that the state of another UP (for example, the PGW-U) is faulty, the UP can continue to maintain the services associated with the other UP. Users are kept from going offline, so services can continue.
  • In this mode, the UP may send the second probe information to the MME in an existing message, a newly added fault processing message, or the like. For example, if the SGW-U learns that the PGW-U is faulty, the SGW-U continues to maintain the PGW-U-related services locally and can send the second probe information to the MME through an extended S1-MME interface message.
  • S31-S32 and S33-S34 can be regarded as two parts, and the execution order of these two parts can be arbitrary.
  • The MME determines a fault type according to the first probe information and the second probe information, where the fault type is either a PGW-U fault or a fault of the link between the PGW-C and the PGW-U.
  • Specifically, the MME combines the identifier of the faulty PGW-U carried in the first probe information with the identifier of the faulty PGW-U carried in the second probe information to determine whether the PGW-U is faulty or the link between the PGW-C and the PGW-U is faulty.
  • If the identifiers of faulty PGW-Us carried in the second probe information include the identifier of the faulty PGW-U carried in the first probe information, that is, both the first probe information and the second probe information indicate that the PGW-U is faulty, the MME determines that the PGW-U is faulty.
  • If the identifiers of faulty PGW-Us carried in the second probe information do not include the identifier carried in the first probe information, that is, the first probe information indicates that the PGW-U is faulty while the second probe information indicates that it is normal, the MME determines that the PGW-U is normal and that the link between the PGW-C and the PGW-U is faulty.
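  • The decision made by the MME in this step can be sketched as a comparison of the two identifier sets. The sketch assumes that a PGW-U named only by the control plane (first probe information) indicates a link fault, which is how this embodiment is read here; the enum and function names are illustrative, and the case in which the control plane reports nothing is not covered by the text.

```python
from enum import Enum
from typing import Set


class FaultType(Enum):
    NOT_REPORTED = "control plane reported no fault (case not covered by the text)"
    USER_PLANE_FAULT = "PGW-U fault"
    LINK_FAULT = "fault of the link between PGW-C and PGW-U"


def classify(pgw_u_id: str, first_probe_ids: Set[str], second_probe_ids: Set[str]) -> FaultType:
    """first_probe_ids come from the PGW-C, second_probe_ids from the SGW-U."""
    if pgw_u_id not in first_probe_ids:
        return FaultType.NOT_REPORTED
    if pgw_u_id in second_probe_ids:
        # Both the control plane and the user plane see the PGW-U as faulty.
        return FaultType.USER_PLANE_FAULT
    # Only the PGW-C lost contact: the PGW-U itself still forwards traffic.
    return FaultType.LINK_FAULT
```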
  • the MME performs fault recovery processing.
  • The MME may adopt the first link fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform link fault recovery.
  • The MME may adopt the second link fault handling policy or the third link fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform link fault recovery.
  • The PGW-C and the PGW-U may refrain from initiating fault service processing and instead send the peer fault status obtained by probing to the MME in the probe information. The MME then combines the user plane probe information sent by the SGW-U with the control plane probe information sent by the PGW-C to determine relatively accurately whether the PGW-U is faulty or the Sx link is faulty.
  • the probability of link failure caused by the network is relatively high relative to the PGW-U failure, and the cost of the PGW-U fault service processing is high.
  • An embodiment of the invention provides a fault processing method. The following description uses an example in which the first control plane network element is the SGW-C, the first user plane network element is the SGW-U, the second user plane network element is a base station and/or a PGW-U, and the signaling processing network element is an MME.
  • the SGW-C learns that the state of the SGW-U is a fault.
  • the SGW-C can determine whether the state of the SGW-U is a fault through a mechanism such as heartbeat detection, and no further description is provided.
  • After the SGW-C learns that the SGW-U is faulty, the SGW-C still keeps the services associated with the SGW-U running normally locally.
  • That is, after determining that the SGW-U is faulty, the SGW-C can mark the SGW-U as faulty while still keeping the services associated with the SGW-U running normally. In this way, users are kept from going offline and services continue as far as possible.
  • the SGW-C sends the first probe information to the MME, where the first probe information is generated by the SGW-C according to the detection result, that is, the first probe information is used to indicate the status of the SGW-U obtained by the SGW-C.
  • The first probe information may carry the identifier of the faulty SGW-U, for example the IP address of the forwarding plane of the SGW-U, or any other identifier that identifies the SGW-U. If the SGW-C determines that the SGW-U is not faulty, the SGW-C may omit the identifier of a faulty SGW-U from the first probe information, or may not send the first probe information to the MME at all; the MME can learn the detection result of the SGW-C either way.
  • the SGW-C sends the first probe information to the MME through the extended S11 interface message.
  • the PGW-U knows that the state of the SGW-U is a fault.
  • the PGW-U can determine whether the state of the SGW-U is a fault through a mechanism such as heartbeat detection, and no further description is provided.
  • After the PGW-U learns that the SGW-U is faulty, the PGW-U still keeps the services associated with the SGW-U running normally.
  • That is, after learning that the SGW-U is faulty, the PGW-U can mark the SGW-U as faulty while still keeping the services associated with the SGW-U running normally. In this way, users are kept from going offline and services continue as far as possible.
  • The PGW-U sends the second probe information to the MME, where the second probe information is generated by the PGW-U according to the detection result, that is, the second probe information indicates the status of the SGW-U obtained by the PGW-U.
  • The second probe information may carry the identifier of the faulty SGW-U. If the PGW-U determines that the SGW-U is not faulty, the PGW-U may omit the identifier of a faulty SGW-U from the second probe information. The MME may set a detection period and determine which SGW-Us are faulty according to the identifiers of faulty SGW-Us carried in the second probe information received from the PGW-U within the detection period.
  • Alternatively, the PGW-U may not send the second probe information to the MME at all; if the MME receives no second probe information from the PGW-U within the detection period, it assumes by default that the PGW-U detected the SGW-U as normal. The MME can thus learn the detection result of the PGW-U either way.
  • The PGW-U sends the second probe information to the PGW-C, the PGW-C sends it to the SGW-C in an extended S5/S8 interface message, and the SGW-C forwards it to the MME in an extended S11 interface message. PGW-C is not shown in FIG.
  • S41-S42 and S43-S44 can be regarded as two parts, and the execution order of these two parts can be arbitrary.
  • The MME determines a fault type according to the first probe information and the second probe information, where the fault type is either an SGW-U fault or a fault of the link between the SGW-C and the SGW-U.
  • Specifically, the MME combines the identifier of the faulty SGW-U carried in the first probe information with the identifiers of the faulty SGW-U carried in the two pieces of second probe information to determine whether the SGW-U is faulty or the link between the SGW-C and the SGW-U is faulty.
  • If the identifiers of faulty SGW-Us carried in the second probe information sent by the base station to the MME include the identifier of the faulty SGW-U carried in the first probe information, and the identifiers carried in the second probe information sent by the PGW-U to the MME also include it, that is, both the first probe information and the second probe information indicate that the SGW-U is faulty, the MME determines that the SGW-U is faulty and that the link between the SGW-C and the SGW-U is normal.
  • If the identifiers of faulty SGW-Us carried in the second probe information sent to the MME do not include the identifier of the faulty SGW-U carried in the first probe information, that is, the first probe information indicates that the SGW-U is faulty while the second probe information indicates that it is normal, the MME determines that the SGW-U is normal and that the link between the SGW-C and the SGW-U is faulty.
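  • With two user plane reporters (the base station and the PGW-U), the comparison can be sketched as below. The function name and the string results are illustrative. The text covers only the cases in which the reporters agree; the "undetermined" result for disagreeing reporters is an assumption added here so that the function is total.

```python
from typing import Dict, Set


def classify_sgw_u(sgw_u_id: str,
                   first_probe_ids: Set[str],
                   second_probes: Dict[str, Set[str]]) -> str:
    """second_probes maps each user plane reporter to the faulty ids it reported.

    A reporter that sent nothing in the detection period maps to an empty set.
    """
    if sgw_u_id not in first_probe_ids:
        return "not-reported"
    votes = [sgw_u_id in ids for ids in second_probes.values()]
    if votes and all(votes):
        return "sgw-u-fault"      # control plane and every user plane reporter agree
    if not any(votes):
        return "link-fault"       # only the SGW-C sees the SGW-U as faulty
    return "undetermined"         # reporters disagree: not covered by the text
```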
  • the MME performs fault recovery processing.
  • The MME may adopt the first link fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform link fault recovery.
  • The MME may adopt the second link fault handling policy or the third link fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform link fault recovery.
  • The base station where the user forwarding plane is located may refrain from initiating release of the user service bearers associated with the SGW-U, so that the MME and the SGW-C have an opportunity to restore the user services associated with the faulty SGW-U based on the predefined link fault handling policy and the redundancy mechanism provided by the current UP deployment mode.
  • By combining the probe information acquired by the base station through the user forwarding plane and/or the probe information acquired by the PGW-U with the fault probe information acquired by the SGW-C through the control plane, the MME can determine relatively accurately whether the SGW-U is faulty or the Sx link is faulty.
  • the SGW-U fails, the associated user service is deactivated, and the user's service recovery time is longer.
  • the redundant resources provided by the SGW-U-based deployment can quickly recover the services of the faulty SGW-U.
  • the service recovery time of the users is shorter and the service experience of the users is better.
  • The heavyweight process of releasing and reactivating user services after an SGW-U fault can be avoided; instead, the lightweight approach of recovering the faulty user services with the redundant resources provided by the SGW-U deployment is selected. The user service recovery time is short, the user service experience is good, and the impact on the network of a signaling storm caused by bearer deactivation/reactivation for a large number of users is avoided.
  • An embodiment of the invention further provides a fault processing method.
  • The signaling processing network element probes the first control plane network element and obtains the first probe information. If the signaling processing network element determines that the first control plane network element is faulty, the first probe information indicates that the first control plane network element is faulty.
  • The first user plane network element learns by probing that the state of the second user plane network element is faulty.
  • The first user plane network element generates second probe information according to its probing of the second user plane network element, and sends the second probe information to the signaling processing network element.
  • The signaling processing network element determines the fault type according to the first probe information and the second probe information.
  • The following description uses an example in which the first control plane network element is the SGW-C, the first user plane network element is the base station, the second user plane network element is the SGW-U, and the signaling processing network element is the MME, as shown in Figure 5.
  • The MME determines that the SGW-C is faulty and generates the first probe information, that is, the first probe information indicates the state of the SGW-C obtained by the MME.
  • The MME probes the SGW-C and generates the first probe information according to the result; the first probe information may indicate whether the SGW-C is faulty.
  • After the MME determines that the SGW-C is faulty, the MME still keeps the services associated with the SGW-C running normally.
  • That is, after determining that the SGW-C is faulty, the MME may mark the SGW-C as faulty while still keeping the services associated with the SGW-C running normally. Because other information is needed to determine whether the SGW-C itself or a link is faulty, the MME keeps the SGW-C-associated services running in the meantime, so that services continue, sudden interruptions are avoided, and the user experience is improved.
  • the SGW-U learns that the state of the SGW-C is a fault.
  • the SGW-U can learn whether the status of the SGW-C is faulty through a mechanism such as heartbeat detection.
  • After the SGW-U determines that the SGW-C is faulty, the SGW-U still keeps the services associated with the SGW-U running normally.
  • That is, after determining that the SGW-C is faulty, the SGW-U can mark the SGW-C as faulty while still keeping the services associated with the SGW-U running normally. Because whether the SGW-C itself or a link is faulty is determined by the MME, the SGW-U can keep its services running while the cause is still uncertain. If the fault turns out to be a link fault, the SGW-U is not prevented from continuing to process services, so services continue, sudden interruptions are avoided, and the user experience is improved.
  • the base station learns that the state of the SGW-U is a fault.
  • The base station can probe and sense the state of the SGW-U based on the forwarding channel of the user plane, and the SGW-U can likewise probe and sense the state of the PGW-U. If the PGW-U is faulty, the SGW-U may detect this directly and release the user service bearers associated with the SGW-U; the SGW-U may send the fault information of the PGW-U to the MME through the SGW-C. Similarly, an SGW-U fault may cause the base station or the PGW-U to detect it and release the locally associated user service bearers.
  • the base station sends the second probe information to the MME, where the second probe information is generated by the base station according to the detection result, that is, the second probe information is used to indicate the status of the SGW-U obtained by the base station.
  • The second probe information may carry the identifier of the faulty SGW-U. If the base station determines that the SGW-U is not faulty, the base station may omit the identifier of a faulty SGW-U from the second probe information, or may not send the second probe information to the MME at all; the MME can learn the detection result of the base station either way.
  • If a UP (for example, the base station) learns that the state of another UP (for example, the SGW-U) is faulty, the UP locally releases the services associated with the other UP, and may carry the second probe information in the message that the service release triggers toward the MME.
  • For example, the base station determines that the SGW-U is faulty and sends an S1 UE Context Release Request message to the MME to release the air interface and the local services related to the SGW-U. That is, the second probe information can be implemented by the S1 UE Context Release Request message, from which the MME learns the detection result of the base station.
  • The MME may send an S1 UE Context Release Command message to the base station to instruct the base station to release the air interface and the local services.
  • S51, S52 and S53-S54 can be regarded as three parts, and the order of execution of these three parts can be arbitrary.
  • The MME determines a fault type according to the first probe information and the second probe information, where the fault type includes an SGW-C fault, or a fault of both the SGW-C and the SGW-U; that is, the MME determines whether the SGW-U is normal or faulty when the SGW-C is faulty.
  • Specifically, the MME combines the identifier of the faulty SGW-C carried in the first probe information with the identifier of the faulty SGW-U carried in the second probe information to determine whether only the SGW-C is faulty, only the SGW-U is faulty, or both the SGW-C and the SGW-U are faulty.
  • the MME determines that the SGW-C corresponding to the identifier of the faulty SGW-C carried in the first probe information is a fault.
  • the MME determines that the SGW-U corresponding to the identifier of the faulty SGW-U carried in the second probe information is a fault.
  • the MME performs fault recovery processing.
  • The MME may adopt the second fault handling policy or the fifth fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform fault recovery processing.
  • The MME may adopt the first fault handling policy or the fifth fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform fault recovery processing.
  • If the MME determines that both an SGW-C and an SGW-U are faulty, and the faulty SGW-C and the faulty SGW-U are not associated with each other, the MME can handle them as an SGW-C fault and an SGW-U fault respectively, in the manner described above.
  • The MME may adopt the fourth fault handling policy described in S26 in the embodiment shown in FIG. 2, that is, the service release policy, to perform fault recovery processing.
  • The MME may locally release all services associated with the faulty SGW-C and/or SGW-U that have the association relationship.
  • the SGW-U associated with the SGW-C may include the SGW-U managed by the SGW-C.
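  • The handling above depends on whether a faulty SGW-C and a faulty SGW-U are associated. The following sketch reads the text as applying the service release policy to associated pairs and handling unassociated faults separately; that reading, the action names, and the `managed_by` mapping (SGW-U identifier to managing SGW-C identifier) are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def plan_recovery(faulty_cp: Set[str],
                  faulty_up: Set[str],
                  managed_by: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (action, network element id) pairs for the MME to carry out."""
    # A faulty SGW-U whose managing SGW-C is also faulty forms an associated pair.
    associated_up = {u for u in faulty_up if managed_by.get(u) in faulty_cp}
    associated_cp = {managed_by[u] for u in associated_up}

    actions: List[Tuple[str, str]] = []
    for cp in sorted(faulty_cp):
        action = "release-services" if cp in associated_cp else "handle-control-plane-fault"
        actions.append((action, cp))
    for up in sorted(faulty_up):
        action = "release-services" if up in associated_up else "handle-user-plane-fault"
        actions.append((action, up))
    return actions
```

  • Sorting the identifiers only makes the output deterministic; it does not reflect any ordering required by the method.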
  • The MME may temporarily refrain from starting fault service processing, and combines the second probe information sent by the base station to determine whether the SGW-C is faulty, the SGW-U is faulty, or both are faulty.
  • An embodiment of the invention further provides a fault processing method.
  • The second control plane network element learns by probing that the state of the first control plane network element is faulty, generates the first probe information according to its probing of the first control plane network element, and sends the first probe information to the signaling processing network element.
  • The first user plane network element learns that the state of the second user plane network element is faulty, generates second probe information according to its probing of the second user plane network element, and sends the second probe information to the signaling processing network element as well.
  • The signaling processing network element determines the fault type based on the first probe information and the second probe information.
  • The following description uses an example in which the first control plane network element is the PGW-C, the first user plane network element is the SGW-U, the second user plane network element is the PGW-U, the second control plane network element is the SGW-C, and the signaling processing network element is an MME, as shown in FIG. 6.
  • S61. The SGW-C learns that the state of the PGW-C is faulty.
  • The SGW-C can learn whether the PGW-C is faulty through a mechanism such as heartbeat detection.
  • After the SGW-C determines that the PGW-C is faulty, the SGW-C still keeps the PGW-C-associated services running normally.
  • That is, the SGW-C can mark the PGW-C as faulty while still keeping the services associated with the PGW-C running normally. Because whether the PGW-C itself or a link is faulty is determined by the MME, the SGW-C can keep the PGW-C-associated services running while the cause is still uncertain. If the fault turns out to be a link fault, the PGW-C is generally not prevented from continuing to process services, so services continue, sudden interruptions are avoided, and the user experience is improved.
  • the SGW-C sends the first probe information to the MME, where the first probe information is generated by the SGW-C according to the detection result, that is, the first probe information is used to indicate the status of the PGW-C obtained by the SGW-C.
  • The first probe information may carry the identifier of the faulty PGW-C. If the SGW-C determines that the PGW-C is not faulty, the SGW-C may omit the identifier of a faulty PGW-C from the first probe information. The MME may set a detection period and determine which PGW-Cs are faulty according to the identifiers of faulty PGW-Cs carried in the first probe information received from the SGW-C within the detection period.
  • Alternatively, the SGW-C may not send the first probe information to the MME at all; if the MME receives no first probe information from the SGW-C within the detection period, it assumes by default that the SGW-C detected the PGW-C as normal. The MME can thus learn the detection result of the SGW-C either way.
  • the SGW-C sends the first probe information to the MME through the extended S11 interface message.
  • the extended S11 interface message is, for example, an Echo Request message or a newly added fault handling message.
  • S63. The PGW-U learns that the state of the PGW-C is faulty.
  • The PGW-U can learn whether the PGW-C is faulty through a mechanism such as heartbeat detection; details are not described here.
  • After the PGW-U determines that the PGW-C is faulty, the PGW-U still keeps the services associated with the PGW-U running normally locally.
  • That is, after determining that the PGW-C is faulty, the PGW-U can mark the PGW-C as faulty while still keeping the PGW-U-related services running normally. Because whether the PGW-C itself or a link is faulty is determined by the MME, the PGW-U can keep its services running while the cause is still uncertain. If the fault turns out to be a link fault, the PGW-U is generally not prevented from continuing to process services, so services continue, sudden interruptions are avoided, and the user experience is improved.
  • S64. The SGW-U learns that the state of the PGW-U is faulty.
  • the SGW-U can learn whether the status of the PGW-U is faulty through a mechanism such as heartbeat detection.
  • the SGW-U sends the second probe information to the MME, where the second probe information is generated by the SGW-U according to the detection result, that is, the second probe information is used to indicate the status of the PGW-U obtained by the SGW-U.
  • The second probe information may carry the identifier of the faulty PGW-U, and may also carry the identifier of the PGW-C associated with the faulty PGW-U. If the SGW-U determines that the PGW-U is not faulty, the SGW-U may omit both the identifier of a faulty PGW-U and the identifier of the associated PGW-C from the second probe information. The MME may set a detection period and determine which PGW-Us are faulty according to the identifiers of faulty PGW-Us carried in the second probe information received from the SGW-U within the detection period.
  • the SGW-U may not send the second probe information to the MME, and if the MME does not receive the second probe information sent by the SGW-U within the detection period, the default SGW-U detection result for the PGW-U is normal. Therefore, the MME can learn the detection result of the SGW-U accordingly.
  • The SGW-C determines the identifier of the PGW-C associated with the faulty PGW-U based on a locally constructed association relationship between PGW-Cs and PGW-Us. For example, the SGW-C may construct the association relationship from the PGW-U information exchanged between the SGW-C and the PGW-C when the user plane channels between the SGW-U and the PGW-U are created, or the PGW-C may transmit the association relationship between the PGW-C and the PGW-U to the SGW-C. The SGW-C may also send the association relationship to the MME.
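  • A locally constructed association relationship can be as simple as a lookup table that is filled in while user plane channels are created. The sketch below is illustrative; the class name, method names, and the assumption that each PGW-U is recorded against a single PGW-C are not taken from the text.

```python
from typing import Dict, Optional


class AssociationTable:
    """Hypothetical PGW-U to PGW-C association kept by the SGW-C."""

    def __init__(self) -> None:
        self._pgw_c_of: Dict[str, str] = {}

    def learn(self, pgw_c_id: str, pgw_u_id: str) -> None:
        # Called when PGW-U information is exchanged during channel creation,
        # or when the PGW-C transmits the association itself.
        self._pgw_c_of[pgw_u_id] = pgw_c_id

    def pgw_c_for(self, pgw_u_id: str) -> Optional[str]:
        # Returns None when no association has been learned for this PGW-U.
        return self._pgw_c_of.get(pgw_u_id)
```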
  • After learning that the PGW-U is faulty, the SGW-U may handle the local services associated with the PGW-U in different ways, and the way in which the SGW-U sends the second probe information to the MME differs accordingly. The two modes are briefly introduced below.
  • In the first mode, if a UP (for example, the SGW-U) learns that the state of another UP (for example, the PGW-U) is faulty, the UP locally releases the services associated with the other UP, and may carry the second probe information in the message that the service release triggers toward the MME.
  • the SGW-U can trigger the release process of the user service bearer associated with the PGW-U. For example, the SGW-U sends a message for performing fault processing to the SGW-C through the Sxa interface, for example, a UPlane Session Delete Request message, and the SGW-C sends a message for performing fault processing to the MME through the S11 interface, for example, a Delete Bearer Request message. And sending a message for performing fault processing, such as a Delete Bearer Request message, to the PGW-C through the S5/S8 interface, thereby releasing the user service bearer associated with the faulty PGW-U.
  • The SGW-U may carry the identifier of the faulty PGW-U in an extended UPlane Session Delete Request message.
  • Similarly, the identifier of the faulty PGW-U may be carried in an extended Delete Bearer Request message.
  • In the second mode, if a UP (for example, the SGW-U) learns that the state of another UP (for example, the PGW-U) is faulty, the UP can continue to maintain the services associated with the other UP.
  • In this mode, the UP may send the second probe information to the MME in an existing message, a newly added fault processing message, or the like. For example, if the SGW-U learns that the PGW-U is faulty, the SGW-U continues to maintain the PGW-U-related services locally and sends the second probe information to the MME through the SGW-C.
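  • The two reporting modes differ in which messages carry the second probe information. The sketch below lists the hops for each mode as data. For the release mode the interfaces and message names follow the example given above; for the keep-services mode the text only says that the report goes through the SGW-C, so the interfaces shown for that mode are assumptions, as are the type and function names.

```python
from typing import List, NamedTuple


class Hop(NamedTuple):
    sender: str
    receiver: str
    interface: str
    message: str


def report_path(release_services: bool) -> List[Hop]:
    """Hops that carry the identifier of the faulty PGW-U."""
    if release_services:
        # First mode: the bearer release messages double as the probe report.
        return [
            Hop("SGW-U", "SGW-C", "Sxa", "UPlane Session Delete Request"),
            Hop("SGW-C", "MME", "S11", "Delete Bearer Request"),
            Hop("SGW-C", "PGW-C", "S5/S8", "Delete Bearer Request"),
        ]
    # Second mode: services are kept; the interfaces here are assumed.
    return [
        Hop("SGW-U", "SGW-C", "Sxa", "probe information"),
        Hop("SGW-C", "MME", "S11", "probe information"),
    ]
```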
  • S61-S62, S63, S64-S65 can be regarded as three parts, and the order of execution of these three parts can be arbitrary.
  • The MME determines a fault type according to the first probe information and the second probe information, where the fault type is that only the PGW-C is faulty, only the PGW-U is faulty, or both the PGW-C and the PGW-U are faulty.
  • Specifically, the MME combines the identifier of the faulty PGW-C carried in the first probe information with the identifier of the faulty PGW-U carried in the second probe information to determine whether the PGW-C is faulty, the PGW-U is faulty, or both the PGW-C and the PGW-U are faulty.
  • the MME determines that the PGW-C corresponding to the identifier of the faulty PGW-C carried in the first probe information is a fault.
  • the MME determines that the PGW-U corresponding to the identifier of the faulty PGW-U carried in the second probe information is a fault.
  • the MME performs fault recovery processing.
  • The MME may adopt the second fault handling policy or the fifth fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform fault recovery processing.
  • The MME may adopt the first fault handling policy or the fifth fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform fault recovery processing.
  • If the MME determines that both a PGW-C and a PGW-U are faulty, and the faulty PGW-C and the faulty PGW-U are not associated with each other, the MME can handle them as a PGW-C fault and a PGW-U fault respectively, in the manner described above.
  • The MME may adopt the fourth fault handling policy described in S26 in the embodiment shown in FIG. 2, that is, the service release policy, to perform fault recovery processing.
  • the MME may locally release all services associated with the faulty PGW-C and/or PGW-U of the associated relationship.
  • the PGW-U associated with the PGW-C may include the PGW-U managed by the PGW-C.
  • the SGW-C after detecting the PGW-C failure, the SGW-C may not start the fault service processing temporarily.
  • the PGW-U may not start the fault service processing temporarily.
  • the MME combines the detection information sent by the SGW-C and the PGW-U to distinguish among a PGW-C fault, a PGW-U fault, faults of a PGW-C and a PGW-U that are not associated, and faults of a PGW-C and a PGW-U that are associated.
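  • The four-way distinction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `classify_pgw_faults`, the `PgwFault` enumeration, and the `managed_by` mapping (PGW-U identifier to the identifier of the PGW-C that manages it) are hypothetical names introduced here, and the sketch assumes each probe report carries at most one faulty identifier.

```python
from enum import Enum
from typing import Dict, Optional


class PgwFault(Enum):
    NONE = "no fault reported"
    PGW_C_ONLY = "PGW-C fault only"
    PGW_U_ONLY = "PGW-U fault only"
    BOTH_UNASSOCIATED = "PGW-C and PGW-U faults, not associated"
    BOTH_ASSOCIATED = "PGW-C and PGW-U faults, associated"


def classify_pgw_faults(
    faulty_pgw_c: Optional[str],
    faulty_pgw_u: Optional[str],
    managed_by: Dict[str, str],
) -> PgwFault:
    """Combine the identifiers carried in the two probe reports.

    faulty_pgw_c: identifier from the first probe information, or None.
    faulty_pgw_u: identifier from the second probe information, or None.
    managed_by:   hypothetical association table, PGW-U id -> PGW-C id.
    """
    if faulty_pgw_c is None and faulty_pgw_u is None:
        return PgwFault.NONE
    if faulty_pgw_u is None:
        return PgwFault.PGW_C_ONLY
    if faulty_pgw_c is None:
        return PgwFault.PGW_U_ONLY
    # Both are faulty: check whether the faulty PGW-U is managed by the faulty PGW-C.
    if managed_by.get(faulty_pgw_u) == faulty_pgw_c:
        return PgwFault.BOTH_ASSOCIATED
    return PgwFault.BOTH_UNASSOCIATED
```

  • Under these assumptions, the unassociated case can then be handled as two independent faults, while the associated case maps to the release-service policy mentioned in the text.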
  • FIG. 7 is a schematic diagram of a network architecture according to an embodiment of the present invention. It can be seen that FIG. 7 shows the network architecture of the SDN, including the SDN controller and multiple switches. In FIG. 7, three switches are taken as an example. In practical applications, the number of switches can be set according to the situation. In FIG. 7, the signaling processing network element is implemented by using an SDN controller as an example.
  • An embodiment of the invention provides a fault determination and processing method.
  • the signaling processing network element determines the first user plane network element fault by the probe, and the signaling processing network element generates the first probe information according to the detection of the first user plane network element.
  • the second user plane network element learns that the state of the first user plane network element is faulty, and the second user plane network element generates second detection information according to the detection of the first user plane network element, and the second detection information is also Send to the signaling processing network element.
  • the signaling processing network element determines the fault type according to the first probe information and the second probe information.
  • the first user plane network element and the second user plane network element are both switches and the signaling processing network element is an SDN controller as an example.
  • switch 71 is the first user plane network element in the embodiment of the present invention
  • switch 72 is the second user plane network element in the embodiment of the present invention. Therefore, it should be clear that the same type of device is given different reference numerals in FIG. 7 for the convenience of description, and does not mean that the types of these devices are different. See Figure 8.
  • the SDN controller determines that the switch 71 is faulty, and generates first probe information, that is, the first probe information is used to indicate the status of the switch 71 obtained by the SDN controller.
  • the SDN controller detects the switch 71 and can determine whether the switch 71 is faulty.
  • the SDN controller may generate the first probe information according to the detection result of the switch 71.
  • the first probe information may indicate whether the switch 71 is normal or faulty.
  • the fault of the switch 71 is taken as an example. It can be understood that the SDN controller detects the switch 71, and detects the control plane connection state between the SDN controller and the switch 71, that is, detects the link state between the SDN controller and the switch 71. It can be understood that the first probe information can indicate whether the signaling connection status between the SDN controller and the switch 71 is normal or faulty.
  • after the SDN controller determines that the switch 71 is faulty, the SDN controller locally keeps the services associated with the switch 71 continuing normally.
  • the SDN controller can identify that the switch 71 is faulty but still keep the services associated with the switch 71 operating normally. Because whether the switch 71 itself or only the link is faulty can be determined later in combination with other information, the SDN controller keeps these services running for the time being, so that services continue, sudden interruptions are avoided, and the user experience is improved.
  • the switch 72 learns that the state of the switch 71 is a fault.
  • the switch 72 detects the switch 71 and can determine whether the switch 71 is faulty. It can be understood that when the switch 72 detects the switch 71, it detects the user plane connection status between the switch 72 and the switch 71.
  • the switch 72 sends the second probe information to the SDN controller, where the second probe information is generated by the switch 72 according to the detection result, that is, the second probe information is used to indicate the status of the switch 71 obtained by the switch 72.
  • the switch 72 can generate the second probe information according to the detection result of the switch 71.
  • the second probe information can indicate whether the switch 71 is normal or faulty.
  • the fault of the switch 71 is taken as an example. It can be understood that the second probe information can indicate whether the user plane connection status between the switch 71 and the switch 72 is normal or faulty.
  • the switch 72 can detect other switches in addition to the switch 71.
  • for example, the switch 72 can also detect the switch 73. Therefore, the second probe information can indicate whether the switch 71 is normal or faulty, and can also indicate whether the switch 73 is normal or faulty, which is not limited in the embodiment of the present invention.
  • S81, and S82-S83 can be regarded as two parts, and the execution order of these two parts can be arbitrary.
  • the SDN controller determines the fault type according to the first probe information and the second probe information, where the fault type includes a fault of the switch 71, or a link fault between the SDN controller and the switch 71.
  • the SDN controller determines whether the switch 71 is faulty or the link between the switch 71 and the SDN controller is faulty by combining the identifier of the faulty switch carried in the first probe information with the identifier of the faulty switch carried in the second probe information.
  • if the identifiers of faulty switches carried in the second probe information include the identifier of the faulty switch carried in the first probe information, that is, the first probe information carries the identifier of the faulty switch 71 and the second probe information also carries the identifier of the faulty switch 71, the SDN controller can determine that the switch 71 is faulty.
  • the first probe information carries the identifier of the faulty switch 71.
  • the second probe information does not carry the identifier of the switch 71, and the SDN controller can determine that the switch 71 is normal, and the link between the switch 71 and the SDN controller is faulty.
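  • The identifier comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `classify_switch_fault`, the string labels it returns, and the representation of each probe report as a set of faulty-switch identifiers are all hypothetical and are not defined by the source.

```python
from typing import Set


def classify_switch_fault(
    switch_id: str,
    first_probe_faulty: Set[str],
    second_probe_faulty: Set[str],
) -> str:
    """Decide between a switch fault and a control-link fault.

    first_probe_faulty:  faulty-switch ids in the controller's own probe result.
    second_probe_faulty: faulty-switch ids reported by a neighbouring switch.
    """
    if switch_id not in first_probe_faulty:
        # The controller itself saw no problem; the text does not cover this case.
        return "not reported by controller"
    if switch_id in second_probe_faulty:
        # Both the control plane and the user plane probes agree.
        return "switch fault"
    # Only the controller lost contact; the neighbour still reaches the switch.
    return "control link fault"
```

  • For example, `classify_switch_fault("sw71", {"sw71"}, set())` yields `"control link fault"`, matching the case in which the second probe information does not carry the identifier of the switch 71.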
  • S85 SDN controller performs fault recovery processing.
  • the SDN controller may select a new switch to replace the faulty switch 71, for example, the switch 73. The forwarding table of the faulty switch 71 needs to be installed on the switch 73, and the forwarding tables of the switches upstream and downstream of the switch 71 need to be modified so that they forward through the reselected switch 73.
  • the SDN controller can wait for the SDN control connection interface to recover and let user plane data continue to be transmitted through the switch 71.
  • the technical solution provided by the embodiment of the present invention can effectively identify whether a network element or a link is faulty, so that different measures can be taken for each case. This keeps user plane services running as far as possible, reduces the possibility of sudden service interruption, and improves network performance.
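  • The two recovery paths for the SDN case can be sketched as follows. This is a simplified illustration only: forwarding tables are modelled as plain dictionaries (destination to next-hop switch), and the function name `recover`, the fault labels, and the table layout are assumptions introduced for the example; a real controller would program flow entries through its southbound protocol.

```python
from typing import Dict

# Hypothetical model: switch id -> {destination -> next-hop switch id}
ForwardingTables = Dict[str, Dict[str, str]]


def recover(
    tables: ForwardingTables,
    failed: str,
    replacement: str,
    fault_type: str,
) -> ForwardingTables:
    """Return the forwarding state after fault recovery.

    fault_type is "switch fault" or "control link fault" (assumed labels).
    """
    if fault_type != "switch fault":
        # Link fault: leave the user plane untouched and wait for the
        # control connection to recover.
        return {sw: dict(table) for sw, table in tables.items()}

    # Drop the failed switch and copy every other table.
    new_tables = {sw: dict(t) for sw, t in tables.items() if sw != failed}

    # Move the failed switch's entries onto the replacement switch.
    merged = dict(tables.get(replacement, {}))
    merged.update(tables.get(failed, {}))
    new_tables[replacement] = merged

    # Re-point neighbours that used the failed switch as next hop.
    for table in new_tables.values():
        for dest, hop in list(table.items()):
            if hop == failed:
                table[dest] = replacement
    return new_tables
```

  • The sketch ignores corner cases such as the replacement switch already pointing at the failed switch; it is meant only to show that a switch fault triggers table migration while a link fault leaves forwarding state unchanged.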
  • FIG. 9 is a schematic diagram of a computer device 100 according to an embodiment of the present invention.
  • the computer device 100 includes at least one processor 101, a communication bus 102, a memory 103, and at least one communication interface 104.
  • the signaling processing network element or the first network element and the like can be implemented by the computer device 100 shown in FIG.
  • the first network element may be a control plane network element (such as a first control plane network element or a second control plane network element), or a user plane network element (such as a first user plane network element or a second user plane network element)
  • the second network element may also be a control plane network element or a user plane network element.
  • the first network element obtains the detection information according to the detection of the second network element by detecting the state of the second network element, and the detection information may be The first probe information or the second probe information is included, and the first network element sends the generated probe information to the signaling processing network element.
  • the first network element and the second network element are, reference may be made to the method provided by any one of FIG. 2 to FIG. 6 or FIG.
  • the processor 101 can be a general purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of the present invention.
  • Communication bus 102 can include a path for communicating information between the components described above.
  • Communication interface 104 using any type of transceiver, for communicating with other devices or communication networks, such as Ethernet, Radio Access Network (RAN), Wireless Local Area Networks (WLAN), etc.
  • the memory 103 can be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 103 can be independently present and connected to the processor 101 via a bus.
  • the memory 103 can also be integrated with the processor 101.
  • the memory 103 is used to store application code for executing the solution of the present invention, and is controlled by the processor 101 for execution.
  • the processor 101 is configured to execute application code stored in the memory 103. If the signaling processing network element, the control plane network element, or the user plane network element is implemented by the computer device 100, the memory 103 of that network element may store one or more software modules. The signaling processing network element, the control plane network element, or the user plane network element may run the stored software modules through the processor 101 and the program code in the memory 103 to implement the determination or processing of the fault.
  • processor 101 may include one or more CPUs, such as CPU0 and CPU1 in FIG.
  • the computer device 100 may include a plurality of processors 101, such as the first processor 1011 and the second processor 1012 in FIG. 9, where the first processor 1011 and the second processor 1012 have different names and reference numerals only to distinguish the plurality of processors 101.
  • each of these processors 101 may be a single-CPU processor 101 or a multi-CPU processor 101.
  • Processor 101 herein may refer to one or more devices, circuits, and/or processing cores for processing data, such as computer program instructions.
  • the computer device 100 described above may be a general purpose computer device or a special purpose computer device.
  • the computer device 100 may be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, a communication device, an embedded device, or a device having a structure similar to that shown in the figure.
  • Embodiments of the invention do not limit the type of computer device 100.
  • an embodiment of the present invention provides a signaling processing network element, where the signaling processing network element includes a receiving unit 1001 and a processing unit 1002.
  • the signaling processing network element may further include a sending unit 1003, which is shown together in FIG.
  • the sending unit 1003 is an optional functional unit. In order to distinguish it from the required functional unit, it is drawn in the form of a dotted line in FIG.
  • the physical device corresponding to the receiving unit 1001 and the sending unit 1003 may include the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1002 may be the processor 101 in FIG. It can be considered that in the communication interface 104 in FIG. 9, some communication interfaces 104 implement the functions of the receiving unit 1001, and some communication interfaces 104 can implement the functions of the transmitting unit 1003, or can be considered as being in the communication interface 104 in FIG. It is possible that each communication interface 104 can implement both the function of the receiving unit 1001 and the function of the transmitting unit 1003.
  • the signaling processing network element may be used to perform the method provided by the embodiment shown in any one of the above Figures 2 to 4, for example, may be a signaling processing network element as described above. Therefore, for the functions and the like implemented by the units in the signaling processing network element, reference may be made to the description of the previous method part, and details are not described herein.
  • an embodiment of the present invention provides a signaling processing network element, where the signaling processing network element includes a receiving unit 1101 and a processing unit 1102.
  • the signaling processing network element may further include a sending unit 1103, which is shown together in FIG.
  • the sending unit 1103 is an optional functional unit, which is drawn in the form of a broken line in FIG. 11 in order to distinguish it from the required functional unit.
  • the physical device corresponding to the receiving unit 1101 and the sending unit 1103 may include the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1102 may be the processor in FIG. 101. It can be considered that in the communication interface 104 in FIG. 9, some communication interfaces 104 implement the functions of the receiving unit 1101, and some communication interfaces 104 can implement the functions of the transmitting unit 1103, or can be considered as being in the communication interface 104 in FIG. It is possible that each communication interface 104 can implement both the function of the receiving unit 1101 and the function of the transmitting unit 1103.
  • the signaling processing network element may be used to perform the method provided by the embodiment shown in any of the above-mentioned Figures 5-6, and may be, for example, a signaling processing network element as described above. Therefore, for the functions and the like implemented by the units in the signaling processing network element, reference may be made to the description of the previous method part, and details are not described herein.
  • an embodiment of the present invention provides an SDN controller, where the SDN controller includes a receiving unit 1201 and a processing unit 1202.
  • the physical device corresponding to the receiving unit 1201 may be the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1202 may be the processor 101 in FIG. It can be considered that in the communication interface 104 in FIG. 9, some communication interfaces 104 implement the function of the receiving unit 1201, and some communication interfaces 104 can implement the function of transmitting data, or can be considered that in the communication interface 104 in FIG. It is possible that each communication interface 104 can implement both the function of the receiving unit 1201 and the function of transmitting data.
  • the signaling processing network element may be used to perform the method provided by the embodiment shown in FIG. 8 above, and may be, for example, the SDN controller as described in the embodiment shown in FIG. 7 or FIG. 8. Therefore, for the functions and the like implemented by the units in the signaling processing network element, reference may be made to the description of the previous method part, and details are not described herein.
  • an embodiment of the present invention provides a network element, where the network element is a first network element, where the network element includes a sending unit 1301 and a processing unit 1302.
  • the physical device corresponding to the sending unit 1301 may be the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1302 may be the processor 101 in FIG. It can be considered that, in the communication interface 104 in FIG. 9, some communication interfaces 104 implement the functions of the transmitting unit 1301, and some communication interfaces 104 can implement the function of receiving data, or it can be considered that in the communication interface 104 in FIG. It is possible that each communication interface 104 can implement both the function of the transmitting unit 1301 and the function of receiving data.
  • the network element may be used to perform the method provided by the embodiment shown in any one of FIG. 2 to FIG. 6 and FIG. 8 above; for example, it may be the first control plane network element, the second control plane network element, the first user plane network element, or the second user plane network element described above.
  • the signaling processing network element can receive the information sent by the multi-party network element, and therefore the task of performing the fault determination is handed over to the signaling processing network element.
  • the signaling processing network element can combine the received first probe information and second probe information to determine whether the fault is a user plane network element fault, a control plane network element fault, or a link fault between the control plane network element and the user plane network element. Because information from multiple sources is considered in the fault determination, rather than information from a single network element only, the accuracy of the determination is improved. A network element fault can then be handled as a network element fault, and a link fault can be handled as a link fault. This avoids, as far as possible, interrupting the service of a fault-free UP and ensures service continuity; in the case of an UP fault, the UP service can also be restored as quickly as possible, minimizing the impact on the user's service experience.
  • the disclosed apparatus and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a division by logical function.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical or otherwise.
  • the embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program includes some or all of the steps of any one of the bandwidth adjustment methods in the video communication process described in the foregoing method embodiments.
  • the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may also be an independent physical module.
  • the integrated unit, if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the technical solution of the present invention may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (such as a personal computer, a server, or a network device) or a processor to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes: a universal serial bus flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, and the like, which can store program codes.

Abstract

A fault processing method and device for reducing the possibility of service interruption of a UP (user plane). In the embodiments of the present invention, since a signalling processing network element may receive information sent by multi-party network elements, a fault determination task is handed over to the signalling processing network element. The signalling processing network element may comprehensively determine whether a user plane network element fails, a control plane network element fails, or a link between the control plane network element and the user plane network element fails by combining received first detection information and second detection information. Since information about multiple aspects may be comprehensively considered during fault determination, instead of only taking information about a single network element into consideration, the accuracy rate of a determination result is improved, so that subsequently, if a network element fails, processing may be performed according to the network element fault, and if the link fails, processing may be performed according to the link fault, thereby preventing service interruption of a fault-free UP as far as possible and guaranteeing the continuity of a service.

Description

Fault processing method and device

Technical field
The present invention relates to the field of communications technologies, and in particular, to a fault processing method and device.
Background
Currently, centralized deployment of the serving gateway (Serving GateWay, S-GW) and the packet data network gateway (Packet Data Network GateWay, P-GW) at regional or provincial centers can no longer keep up with the continuous growth in capacity and performance requirements. Given also the demand of the future fifth-generation mobile communication system (5G) for diverse service slicing, moving S-GW/P-GW gateways down toward the network edge in a distributed deployment has become the deployment trend. Following this trend, the 3rd Generation Partnership Project (3GPP) has started a study on separating the control plane (Control Plane, CP) and the user plane (User Plane, UP) of S-GW/P-GW gateway nodes.
Under the existing System Architecture Evolution (SAE) architecture, following the CP/UP separation concept, the S-GW and the P-GW are split into a control plane serving gateway (Control Plane S-GW, SGW-C), a user plane serving gateway (User Plane S-GW, SGW-U), a control plane packet data network gateway (Control Plane P-GW, PGW-C), and a user plane packet data network gateway (User Plane P-GW, PGW-U). The gateway general packet radio service support node (Gateway General Packet Radio Service Support Node, GGSN) function is included in the P-GW and, along with the PGW, is likewise split into a control plane GGSN (Control Plane GGSN, GGSN-C) and a user plane GGSN (User Plane GGSN, GGSN-U), which are not described separately below.
CP/UP separation turns the link between the CP and the UP, formerly internal to the conventional S-GW/P-GW gateway, into a standard external 3GPP Sx interface. The Sx interface includes the Sxa interface between the SGW-C and the SGW-U and the Sxb interface between the PGW-C and the PGW-U. After CP/UP separation, the CP is deployed centrally with the control plane gateway (Control Plane Gateway, CGW), while the UP is moved down and deployed in a distributed manner with the distributed gateway (Distributed Gateway, DGW), so the probability of a link fault between the CP and the UP increases. The CGW can be regarded as including the functions of the SGW-C and the PGW-C, and the DGW as including the functions of the SGW-U and the PGW-U; the GGSN-C can be regarded as co-located with the PGW-C, and the GGSN-U as co-located with the PGW-U.
If the CP and the UP are both normal and only the link between them is faulty, the UP can in fact continue to provide user service access. At present, the CP and the UP can probe each other. From the CP's point of view, whether the UP is faulty or the link between the CP and the UP is faulty, the CP concludes that the UP is faulty and handles the situation directly as an UP fault; for example, the CP reselects an UP and reactivates users. From the UP's point of view, whether the CP is faulty or the link between the CP and the UP is faulty, the UP concludes that the CP is faulty and handles the situation directly as a CP fault; for example, the UP releases local services. If the actual fault is in the link between the CP and the UP, handling it as an UP fault (by the CP) or as a CP fault (by the UP) interrupts the services of an UP that has not failed and degrades the user's service experience. It can be seen that, because current detection results are not accurate enough, UP services are likely to be interrupted.
Summary of the invention
Embodiments of the present invention provide a fault processing method and device for reducing the possibility of UP service interruption.
According to a first aspect, a fault processing method is provided. The method is performed by a signaling processing network element, which is implemented, for example, by an MME. The method includes the following. The signaling processing network element receives first probe information sent by a first control plane network element, where the first probe information indicates the state of a first user plane network element as obtained by the first control plane network element. The signaling processing network element receives second probe information sent by a second user plane network element, where the second probe information indicates the state of the first user plane network element as obtained by the second user plane network element. The signaling processing network element determines a fault type according to the first probe information and the second probe information, where the fault type includes a fault of the first user plane network element, or a fault of the link between the first control plane network element and the first user plane network element.
In the embodiments of the present invention, because the signaling processing network element can receive information sent by multiple network elements, the task of fault determination is handed over to the signaling processing network element. The signaling processing network element can combine the received first probe information and second probe information to determine whether the fault is a user plane network element fault, a control plane network element fault, or a fault of the link between the control plane network element and the user plane network element. Because information from multiple sources is considered during fault determination, rather than information from a single network element only, the accuracy of the determination is improved. Subsequently, a network element fault can be handled as a network element fault and a link fault as a link fault. This avoids, as far as possible, interrupting the service of a fault-free UP and ensures service continuity; in the case of an UP fault, the UP service can also be restored as quickly as possible, minimizing the impact on the user's service experience.
With reference to the first aspect, in a first possible implementation of the first aspect, the signaling processing network element determines the fault type according to the first probe information and the second probe information as follows: if both the first probe information and the second probe information indicate that the first user plane network element is faulty, the signaling processing network element determines that the fault type is a fault of the first user plane network element; or, if the first probe information indicates that the first user plane network element is faulty while the second probe information indicates that the first user plane network element is normal, the signaling processing network element determines that the first user plane network element is normal and that the fault type is a fault of the link between the first control plane network element and the first user plane network element.
That is, the signaling processing network element considers both the first probe information and the second probe information when determining the fault type. In this way it can effectively determine whether the first user plane network element is faulty or the link between the first control plane network element and the first user plane network element is faulty. Because network element faults and link faults are distinguished, each can be handled differently, which avoids, as far as possible, poor experiences such as the service interruption caused by treating a link fault as a network element fault.
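The decision rule of this first possible implementation can be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiments: the names `Status`, `FaultType` and `classify_first_aspect` are hypothetical, and returning `UNDETERMINED` for combinations that the text does not cover is an assumption.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "normal"
    FAULTY = "faulty"


class FaultType(Enum):
    UP_FAULT = "first user plane network element fault"
    LINK_FAULT = "fault on the link between first CP and first UP"
    UNDETERMINED = "not covered by the described rule"  # assumption


def classify_first_aspect(first_probe: Status, second_probe: Status) -> FaultType:
    """Combine two views of the first user plane network element.

    first_probe:  state reported by the first control plane network element.
    second_probe: state reported by the second user plane network element.
    """
    if first_probe is Status.FAULTY and second_probe is Status.FAULTY:
        # Both observers see the fault: the network element itself is faulty.
        return FaultType.UP_FAULT
    if first_probe is Status.FAULTY and second_probe is Status.NORMAL:
        # Only the control plane side sees a fault: the CP-UP link is faulty.
        return FaultType.LINK_FAULT
    return FaultType.UNDETERMINED
```

The two explicit branches mirror the two cases named in the text; any other combination is left to the caller.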
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, after the signaling processing network element determines that the fault type is a fault on the link between the first control plane network element and the first user plane network element, the signaling processing network element may perform fault recovery. Fault recovery includes, but is not limited to, the following two approaches: the signaling processing network element reselects a control plane network element for the first user plane network element and sends the identifier of the first user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the first user plane network element; or, the signaling processing network element instructs the first control plane network element to wait for the link to recover.
The signaling processing network element may reselect a control plane network element and send it the identifier of the first user plane network element. Optionally, in addition to that identifier, the signaling processing network element may also send a fault indication to the reselected control plane network element, indicating where the fault occurred. After receiving this information, the reselected control plane network element knows where the fault occurred and can promptly establish a link with the first user plane network element based on its identifier, so that the network returns to normal as soon as possible. Alternatively, the signaling processing network element may instruct the first control plane network element to wait for the link to recover, with no other processing required. In this way, the first control plane network element can continue to be used after recovery, which improves network element utilization.
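The two recovery options for a link fault can be sketched as below. This is only an illustration under stated assumptions: the `RecoveryAction` record, the action strings, and the policy of picking the first alternative candidate are all hypothetical and are not specified by the embodiments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecoveryAction:
    """Instruction sent by the signaling processing network element (hypothetical format)."""
    target_cp: str                          # control plane network element receiving the instruction
    action: str                             # "MANAGE_UP" or "WAIT_FOR_LINK_RECOVERY"
    up_id: Optional[str] = None             # identifier of the first user plane network element
    fault_indication: Optional[str] = None  # optional hint about where the fault occurred


def recover_from_link_fault(first_cp: str, first_up: str,
                            candidate_cps: List[str],
                            reselect: bool = True) -> RecoveryAction:
    """Choose one of the two recovery options described for a CP-UP link fault."""
    if reselect:
        # Assumed policy: take the first candidate other than the affected one.
        alternatives = [cp for cp in candidate_cps if cp != first_cp]
        if alternatives:
            return RecoveryAction(
                target_cp=alternatives[0],
                action="MANAGE_UP",
                up_id=first_up,
                fault_indication=f"link fault between {first_cp} and {first_up}",
            )
    # Otherwise keep the first control plane network element and wait for the link.
    return RecoveryAction(target_cp=first_cp, action="WAIT_FOR_LINK_RECOVERY")
```

Falling back to waiting when no alternative control plane network element exists is a design choice of this sketch; the text only says that both options are available.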
With reference to the first aspect, or the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect: the first control plane network element is a control plane serving gateway, the second user plane network element is a base station, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane packet data network gateway, the second user plane network element is a user plane serving gateway, the first user plane network element is a user plane packet data network gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane serving gateway, the second user plane network element is a user plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity.
Depending on factors such as the scenario to be handled, the network elements can be selected in different ways. The embodiments of the present invention therefore support a wide range of application scenarios, and the solutions they provide can be applied in different network element scenarios.
According to a second aspect, a fault processing method is provided. The method is performed by a signaling processing network element, which is implemented, for example, by an MME. The method includes the following. The signaling processing network element obtains first probe information, which indicates the state of a first control plane network element. The signaling processing network element receives second probe information sent by a first user plane network element; the second probe information indicates the state of a second user plane network element as obtained by the first user plane network element. The signaling processing network element determines a fault type according to the first probe information and the second probe information, where the fault type includes a fault of only the first control plane network element, a fault of only the second user plane network element, or faults of both the first control plane network element and the second user plane network element. If both the first control plane network element and the second user plane network element are faulty, and the first control plane network element manages the second user plane network element, the signaling processing network element releases the services associated with the first control plane network element and/or the second user plane network element.
In this embodiment of the present invention, if the signaling processing network element determines that both the first control plane network element and the second user plane network element are faulty and that the first control plane network element manages the second user plane network element, that is, the two faulty network elements are associated with each other, the signaling processing network element may locally release the services associated with the first control plane network element and/or the second user plane network element. These services can then be restored on other network elements, keeping the service interruption as short as possible.
With reference to the second aspect, in a first possible implementation of the second aspect, the signaling processing network element determines the fault type according to the first probe information and the second probe information as follows: if the first probe information indicates that the first control plane network element is faulty, the signaling processing network element determines that the first control plane network element is faulty; or, if the second probe information indicates that the second user plane network element is faulty, the signaling processing network element determines that the second user plane network element is faulty.
That is, the signaling processing network element determines from the first probe information and the second probe information whether the first control plane network element and the second user plane network element are faulty. This determination is direct and allows the faulty network element to be identified quickly.
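The second-aspect determination, together with the release condition, can be sketched as follows. This is a hedged illustration: `SecondAspectFault`, `classify_second_aspect` and `should_release_services` are hypothetical names, and representing each probe result as a boolean is an assumption.

```python
from enum import Enum
from typing import Optional


class SecondAspectFault(Enum):
    CP_ONLY = "only the first control plane network element is faulty"
    UP_ONLY = "only the second user plane network element is faulty"
    BOTH = "both network elements are faulty"


def classify_second_aspect(cp_faulty: bool, up_faulty: bool) -> Optional[SecondAspectFault]:
    """cp_faulty comes from the first probe information, up_faulty from the second."""
    if cp_faulty and up_faulty:
        return SecondAspectFault.BOTH
    if cp_faulty:
        return SecondAspectFault.CP_ONLY
    if up_faulty:
        return SecondAspectFault.UP_ONLY
    return None  # neither probe reports a fault


def should_release_services(fault: Optional[SecondAspectFault],
                            cp_manages_up: bool) -> bool:
    """Release associated services only if both are faulty and the CP manages the UP."""
    return fault is SecondAspectFault.BOTH and cp_manages_up
```

Each probe result maps directly to the state of one network element, which is why no cross-checking is needed in this sketch.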
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, after the signaling processing network element determines the fault type, if only the first control plane network element is faulty, the signaling processing network element reselects a control plane network element for the second user plane network element and sends the identifier of the second user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the second user plane network element.
If only the first control plane network element is faulty, the signaling processing network element may reselect a control plane network element and send it the identifier of the second user plane network element. Optionally, in addition to that identifier, the signaling processing network element may also send a fault indication to the reselected control plane network element, indicating where the fault occurred. After receiving this information, the reselected control plane network element knows where the fault occurred and can promptly establish a link with the second user plane network element based on its identifier, so that the network returns to normal as soon as possible. During this process, because the second user plane network element is not faulty, its services can continue, which minimizes the possibility of service interruption.
With reference to the second aspect, or the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the signaling processing network element obtains the first probe information as follows: the signaling processing network element receives the first probe information sent by a second control plane network element, where the first probe information indicates the state of the first control plane network element as obtained by the second control plane network element; or, the signaling processing network element probes the first control plane network element and generates the first probe information according to the probe result.
That is, the signaling processing network element may probe the first control plane network element directly to obtain the first probe information, or the second control plane network element may probe the first control plane network element and send the first probe information to the signaling processing network element. The signaling processing network element therefore has flexible ways of obtaining the first probe information. In practice, a suitable way can be chosen according to which network element needs to be probed.
With reference to the second aspect or the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect: the signaling processing network element is a mobility management entity, the first control plane network element is a control plane serving gateway, the first user plane network element is a base station, and the second user plane network element is a user plane serving gateway; or, the signaling processing network element is a mobility management entity, the second control plane network element is a control plane serving gateway, the first control plane network element is a control plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the second user plane network element is a user plane packet data network gateway.
Depending on factors such as the scenario to be handled, the network elements can be selected in different ways. The embodiments of the present invention therefore support a wide range of application scenarios, and the solutions they provide can be applied in different network element scenarios.
According to a third aspect, a fault processing method is provided. The method is implemented by an SDN controller and includes the following. The SDN controller probes a first switch and obtains first probe information. The SDN controller receives second probe information sent by a second switch, where the second probe information indicates the state of the first switch as obtained by the second switch. The SDN controller determines a fault type according to the first probe information and the second probe information, where the fault type includes a fault of the first switch, or a fault on the link between the SDN controller and the first switch.
In this embodiment of the present invention, in addition to probing the first switch itself, the SDN controller can receive the second probe information sent by the second switch. The SDN controller can therefore combine the first probe information and the second probe information to determine the fault type. Because the determination draws on information from multiple sources rather than from a single network element, its accuracy improves. A network element fault can then be handled as a network element fault, and a link fault as a link fault. This avoids, as far as possible, interrupting the services of a fault-free switch and so preserves service continuity; when a switch is faulty, its services can be restored as quickly as possible, minimizing the impact on the user's service experience.
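A possible decision rule for the SDN controller is sketched below. The text of the third aspect does not spell out the rule, so the logic here is an assumption made by analogy with the first aspect; `classify_switch_fault`, the injected `probe_first_switch` callable and the result strings are all hypothetical.

```python
from typing import Callable, Optional


def classify_switch_fault(probe_first_switch: Callable[[], bool],
                          second_switch_reports_normal: bool) -> Optional[str]:
    """Combine the controller's own probe with the second switch's report.

    probe_first_switch: returns True if the controller can reach the first switch.
    second_switch_reports_normal: True if the second probe information says the
        first switch is normal.
    """
    controller_reaches_switch = probe_first_switch()  # yields the first probe information
    if controller_reaches_switch:
        return None  # the controller sees no fault
    if second_switch_reports_normal:
        # The switch is alive from its neighbour's view: blame the controller link.
        return "CONTROLLER_SWITCH_LINK_FAULT"
    # Neither observer sees the switch as normal: treat the switch as faulty.
    return "FIRST_SWITCH_FAULT"
```

Passing the probe in as a callable keeps the sketch independent of any particular probing mechanism.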
According to a fourth aspect, a fault processing method is provided. The method is performed by a first network element and includes the following. The first network element learns through probing that the state of a second network element is faulty. The first network element generates probe information according to its probing of the second network element and sends the probe information to a signaling processing network element. The probe information carries the identifier of the second network element and is used to determine the fault type.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the first network element is a control plane network element or a user plane network element, and the second network element is a control plane network element or a user plane network element.
That is, a first network element that is a control plane network element or a user plane network element can probe the state of the second network element. If it learns that the second network element is faulty, the first network element may defer fault handling, for example by not yet locally releasing the services associated with the second network element. Instead, it sends probe information (which may include the first probe information or the second probe information described in the foregoing aspects) to the signaling processing network element, and the probe information may carry the identifier of the second network element. After receiving the probe information, the signaling processing network element can determine from that identifier that the probe information indicates a fault of the second network element. The signaling processing network element can thus combine the probe information sent by multiple network elements to determine the fault type, which improves the accuracy of fault determination and helps distinguish network element faults from link faults. A network element fault can then be handled as a network element fault, and a link fault as a link fault. This avoids, as far as possible, interrupting the services of a fault-free UP and so preserves service continuity; when the UP is faulty, its services can be restored as quickly as possible, minimizing the impact on the user's service experience.
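The reporting step of the fourth aspect can be sketched as follows. This is an illustration only: the `ProbeInfo` record, its `reporter_id` field, and the helper names are hypothetical; the text requires only that the probe information carry the identifier of the second network element.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ProbeInfo:
    reporter_id: str  # network element that performed the probe (assumed field)
    target_id: str    # identifier of the probed (second) network element
    faulty: bool      # True if the probe found the target faulty


def report_fault(reporter_id: str, target_id: str) -> ProbeInfo:
    """Built by the first network element once its probe finds the target faulty.

    Nothing is released locally here; the decision is left to the
    signaling processing network element.
    """
    return ProbeInfo(reporter_id=reporter_id, target_id=target_id, faulty=True)


def group_by_target(reports: Iterable[ProbeInfo]) -> Dict[str, List[ProbeInfo]]:
    """On the signaling processing side: collect all reports about each target."""
    grouped: Dict[str, List[ProbeInfo]] = defaultdict(list)
    for report in reports:
        grouped[report.target_id].append(report)
    return dict(grouped)
```

Grouping by `target_id` is what lets the signaling processing network element compare the views of several reporters before it decides on a fault type.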
According to a fifth aspect, a signaling processing network element is provided, including a receiver and a processor. The receiver is configured to receive first probe information sent by a first control plane network element and to receive second probe information sent by a second user plane network element. The first probe information indicates the state of a first user plane network element as obtained by the first control plane network element, and the second probe information indicates the state of the first user plane network element as obtained by the second user plane network element. The processor is configured to determine a fault type according to the first probe information and the second probe information, where the fault type includes a fault of the first user plane network element, or a fault on the link between the first control plane network element and the first user plane network element.
With reference to the fifth aspect, in a first possible implementation of the fifth aspect, the processor being configured to determine the fault type according to the first probe information and the second probe information includes: if both the first probe information and the second probe information indicate that the first user plane network element is faulty, determining that the fault type is a fault of the first user plane network element; or, if the first probe information indicates that the first user plane network element is faulty while the second probe information indicates that the first user plane network element is normal, determining that the first user plane network element is normal and that the fault type is a fault on the link between the first control plane network element and the first user plane network element.
With reference to the first possible implementation of the fifth aspect, in a second possible implementation of the fifth aspect, the signaling processing network element further includes a transmitter. The processor is further configured to: after determining that the fault type is a fault on the link between the first control plane network element and the first user plane network element, reselect a control plane network element for the first user plane network element and send, through the transmitter, the identifier of the first user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the first user plane network element; or, after determining that the fault type is a fault on the link between the first control plane network element and the first user plane network element, instruct the first control plane network element to wait for the link to recover.
With reference to the fifth aspect, or the first or second possible implementation of the fifth aspect, in a third possible implementation of the fifth aspect: the first control plane network element is a control plane serving gateway, the second user plane network element is a base station, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane packet data network gateway, the second user plane network element is a user plane serving gateway, the first user plane network element is a user plane packet data network gateway, and the signaling processing network element is a mobility management entity; or, the first control plane network element is a control plane serving gateway, the second user plane network element is a user plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity.
According to a sixth aspect, a signaling processing network element is provided, including a processor and a receiver. The processor is configured to obtain first probe information, which indicates the state of a first control plane network element. The receiver is configured to receive second probe information sent by a first user plane network element, where the second probe information indicates the state of a second user plane network element as obtained by the first user plane network element. The processor is further configured to determine a fault type according to the first probe information and the second probe information, where the fault type includes a fault of only the first control plane network element, a fault of only the second user plane network element, or faults of both the first control plane network element and the second user plane network element. If the fault type is that both the first control plane network element and the second user plane network element are faulty, and the first control plane network element manages the second user plane network element, the processor releases the services associated with the first control plane network element and/or the second user plane network element.
With reference to the sixth aspect, in a first possible implementation of the sixth aspect, the processor being configured to determine the fault type according to the first probe information and the second probe information includes: if the first probe information indicates that the first control plane network element is faulty, determining that the first control plane network element is faulty; or, if the second probe information indicates that the second user plane network element is faulty, determining that the second user plane network element is faulty.
With reference to the first possible implementation of the sixth aspect, in a second possible implementation of the sixth aspect, the signaling processing network element further includes a transmitter. The processor is further configured to: after determining the fault type, if the fault type is a fault of only the first control plane network element, reselect a control plane network element for the second user plane network element and send, through the transmitter, the identifier of the second user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the second user plane network element.
With reference to the sixth aspect, or the first or second possible implementation of the sixth aspect, in a third possible implementation of the sixth aspect, the processor being configured to obtain the first probe information includes: obtaining the first probe information that was sent by a second control plane network element and received by the receiving unit, where the first probe information indicates the state of the first control plane network element as obtained by the second control plane network element; or, probing the first control plane network element and generating the first probe information according to the probe result.
With reference to the third possible implementation of the sixth aspect, in a fourth possible implementation of the sixth aspect: the signaling processing network element is a mobility management entity, the first control plane network element is a control plane serving gateway, the first user plane network element is a base station, and the second user plane network element is a user plane serving gateway; or, the signaling processing network element is a mobility management entity, the second control plane network element is a control plane serving gateway, the first control plane network element is a control plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the second user plane network element is a user plane packet data network gateway.
According to a seventh aspect, an SDN controller is provided, including a processor and a receiver. The processor is configured to probe a first switch and obtain first probe information. The receiver is configured to receive second probe information sent by a second switch, where the second probe information indicates the state of the first switch as obtained by the second switch. The processor is further configured to determine a fault type according to the first probe information and the second probe information, where the fault type includes a fault of the first switch, or a fault on the link between the SDN controller and the first switch.
According to an eighth aspect, a network element is provided, including a processor and a transmitter. The processor is configured to learn through probing that the state of a second network element is faulty, and to generate probe information according to the probing of the second network element. The transmitter is configured to send the probe information to a signaling processing network element. The probe information carries the identifier of the second network element and is used to determine the fault type.
With reference to the eighth aspect, in a first possible implementation of the eighth aspect, the network element is a control plane network element or a user plane network element, and the second network element is a control plane network element or a user plane network element.
According to a ninth aspect, a signaling processing network element is provided, including functional units configured to perform the method provided by the first aspect or any possible implementation of the first aspect.
According to a tenth aspect, a signaling processing network element is provided, including functional units configured to perform the method provided by the second aspect or any possible implementation of the second aspect.
According to an eleventh aspect, an SDN controller is provided, including functional units configured to perform the method provided by the third aspect or any possible implementation of the third aspect.
According to a twelfth aspect, a network element is provided. The network element is a first network element and includes functional units configured to perform the method provided by the fourth aspect or any possible implementation of the fourth aspect.
In a thirteenth aspect, a computer storage medium is provided for storing computer software instructions used by the foregoing signaling processing network element. The instructions include a program designed for the signaling processing network element to perform the first aspect or any possible implementation of the first aspect.
In a fourteenth aspect, a computer storage medium is provided for storing computer software instructions used by the foregoing signaling processing network element. The instructions include a program designed for the signaling processing network element to perform the second aspect or any possible implementation of the second aspect.
In a fifteenth aspect, a computer storage medium is provided for storing computer software instructions used by the foregoing SDN controller. The instructions include a program designed for the SDN controller to perform the third aspect or any possible implementation of the third aspect.
In a sixteenth aspect, a computer storage medium is provided for storing computer software instructions used by the foregoing first network element. The instructions include a program designed for the first network element to perform the fourth aspect or any possible implementation of the fourth aspect.
In the embodiments of the present invention, the signaling processing network element considers information from multiple sources when determining a fault, rather than information from a single network element only, which improves the accuracy of the determination. A network element fault can then be handled as a network element fault, and a link fault can be handled as a link fault. This avoids, as far as possible, interrupting the services of a UP that is not faulty and preserves service continuity. If the UP is in fact faulty, its services can be restored as quickly as possible, so that the user's service experience is affected as little as possible.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a network architecture to which an embodiment of the present invention is applied;
FIG. 2 is a flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 5 is a flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 6 is a flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a network architecture to which an embodiment of the present invention is applied;
FIG. 8 is a flowchart of a fault processing method according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
FIG. 10 and FIG. 11 are schematic structural diagrams of a signaling processing network element according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of an SDN controller according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a first network element according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the embodiments of the present invention.
The techniques described herein may be used in various communication systems, such as Long Term Evolution (LTE) systems, 4.5G systems, 5G systems, other such communication systems, and evolved systems that emerge in the future. In addition, the technical solutions provided herein apply not only to 3rd Generation Partnership Project (3GPP) access, but also to separation of the control plane and the user plane under non-3GPP (Non-3GPP) access.
Some terms used in the embodiments of the present invention are explained below to help a person skilled in the art understand them.
1) User equipment (UE) is a device that provides voice and/or data connectivity to a user. It may include, for example, a handheld device with a wireless connection function, or a processing device connected to a wireless modem. The user equipment can communicate with the core network via a radio access network (RAN) and exchange voice and/or data with the RAN. The user equipment may also be called a wireless terminal device, a mobile terminal device, a subscriber unit, a subscriber station, a mobile station, a mobile, a remote station, an access point (AP), a remote terminal, an access terminal, a user terminal, a user agent, or a user device. For example, the user equipment may include a mobile phone (also called a "cellular" phone), a computer with a mobile terminal device, a dedicated terminal device in NB-IoT, or a portable, pocket-sized, handheld, computer-built-in, or vehicle-mounted mobile apparatus. Examples include Personal Communication Service (PCS) phones, cordless phones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations, and Personal Digital Assistants (PDAs).
2) A network device is also called a network element. For example, network devices include control plane network elements, user plane network elements, and signaling processing network elements. Each network element in the embodiments of the present invention may be a physical device or a logical device.
In a network architecture in which the control plane (also called the signaling plane) is separated from the user plane (also called the forwarding plane), the S-GW is split into an SGW-C and an SGW-U, and the PGW is split into a PGW-C and a PGW-U. The SGW-C and the PGW-C, which belong to the control plane, may be components of a CGW. Likewise, the SGW-U and the PGW-U, which belong to the user plane, may be components of a DGW. One CGW can manage multiple DGWs, and one DGW can belong to multiple CGWs. CGW and CP can be understood as the same concept, and DGW and UP can be understood as the same concept.
The control plane network element in the embodiments of the present invention may include a CGW. Alternatively, in a future communication system (for example, a 5G system), the current Mobility Management Entity (MME) and the CGW (and possibly other devices) may be merged to form a new control plane network element, or the new control plane network element may include other network devices that implement control plane functions.
The user plane network element in the embodiments of the present invention may include a DGW, or may include other network devices that implement user plane functions.
In addition, in the embodiments of the present invention, the user plane network element may further include an access network element, such as a base station (for example, an access point). A base station may be a device in the access network that communicates with wireless terminals over the air interface through one or more sectors. The base station can convert between received air frames and Internet Protocol (IP) packets, acting as a router between the wireless terminal device and the rest of the access network, where the rest of the access network may include an IP network. The base station can also coordinate attribute management of the air interface. For example, the base station may be an evolved base station (NodeB, eNB, or e-NodeB, evolutional Node B) in a system such as Long Term Evolution (LTE) or LTE-Advanced (LTE-A). This is not limited in the embodiments of the present invention.
In the embodiments of the present invention, the signaling processing network element mainly performs fault determination. The signaling processing network element may be implemented by an MME, or by another network device.
Alternatively, the signaling processing network element may be implemented by a controller in a Software Defined Network (SDN), referred to below as an SDN controller. In this case, the user plane network element may include a switch in the SDN.
3) In the embodiments of the present invention, the concept of a "user session context (User Bearer Context)" may be subordinate to the concept of a "service". For example, if a service is interrupted, the user session context of that service may be interrupted or lost.
4) In the embodiments of the present invention, the terms "system" and "network" may be used interchangeably. "Multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone. In addition, unless otherwise specified, the character "/" generally indicates an "or" relationship between the objects before and after it.
The network architecture to which the embodiments of the present invention are applied is one in which the control plane and the user plane are separated. It is described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a network architecture to which an embodiment of the present invention is applied. In FIG. 1, the signaling processing network element is implemented by an MME as an example. The user equipment connects to the base station through the Uu interface. The base station connects to the SGW-U in the DGW through the S1-U interface. The SGW-U connects to the PGW-U in the same DGW through the S5/S8-U interface, and the PGW-U connects to the Internet through the SGi interface; the connection between the PGW-U and the Internet is not shown in FIG. 1. In addition, the base station connects to the MME through the S1-MME interface, the MME connects to the SGW-C in the CGW through the S11 interface, and the SGW-C connects to the PGW-C in the same CGW through the S5/S8-C interface. The SGW-C can also connect to a PGW-C in a different CGW, which is not shown in FIG. 1. The SGW-C connects to the SGW-U through the Sxa interface, and the PGW-C connects to the PGW-U through the Sxb interface. In other words, the CGW in FIG. 1 manages the DGW; specifically, the SGW-C manages the SGW-U, and the PGW-C manages the PGW-U.
The names of the interfaces and network elements introduced in the embodiments of the present invention do not limit the devices themselves. In practice, the interfaces and network elements may have other names.
Currently, the CP and the UP generally exchange heartbeat messages. For example, the CP periodically sends a heartbeat message to the UP, and the UP replies to the CP with a response message after receiving it. Alternatively, the UP periodically sends a heartbeat message to the CP, and the CP replies to the UP with a response message after receiving it. If the CP has not received a heartbeat message or a response message from the UP for a long time, the CP determines that the UP is faulty and enters the UP fault handling procedure. For example, the CP reselects a UP and reactivates the users.
Similarly, if the UP has not received a heartbeat message or a response message from the CP for a long time, the UP determines that the CP is faulty and enters the CP fault handling procedure. For example, the UP releases its local services.
However, when the CP has not received a heartbeat or response message from the UP for a long time, or the UP has not received a heartbeat or response message from the CP for a long time, the cause may be a UP fault or a CP fault, but it may also be a fault on the link between the CP and the UP. Currently, the CP treats both a UP fault and a CP-UP link fault as a UP fault and handles them as such. Likewise, the UP treats both a CP fault and a CP-UP link fault as a CP fault and handles them as such. If the actual problem is a link fault between the CP and the UP, and the CP handles it as a UP fault or the UP handles it as a CP fault, the services of a UP that is not faulty are interrupted or lost, and service recovery takes a relatively long time, which greatly degrades the user's service experience. In short, because the detection result of the CP or the UP alone is not very accurate, the services of the UP are likely to be interrupted or lost.
In the embodiments of the present invention, because the signaling processing network element can receive information sent by multiple network elements, the task of fault determination is assigned to the signaling processing network element. The signaling processing network element combines the received first probe information and second probe information to determine whether the fault is a user plane network element fault, a control plane network element fault, or a fault on the link between the control plane network element and the user plane network element. Because the determination considers information from multiple sources rather than from a single network element only, its accuracy is improved. A user plane network element fault can then be handled as a user plane network element fault, a control plane network element fault as a control plane network element fault, and a link fault as a link fault. This avoids, as far as possible, interrupting the services of a UP that is not faulty and preserves service continuity. If the UP is in fact faulty, its services can be restored as quickly as possible, so that the user's service experience is affected as little as possible.
The foregoing describes the network architecture to which the embodiments of the present invention are applied and the devices provided by the embodiments. The methods provided by the embodiments are described below with reference to the accompanying drawings. The embodiments shown in FIG. 2 to FIG. 7, described below, all use the network architecture shown in FIG. 1 as an example. Based on the foregoing description, a person skilled in the art will understand that the application scenarios of the embodiments of the present invention are not limited to this architecture.
An embodiment of the present invention provides a fault processing method. In this embodiment, a first control plane network element learns through probing that the state of a first user plane network element is faulty, generates first probe information based on its probing of the first user plane network element, and sends the first probe information to the signaling processing network element. In addition, a second user plane network element learns through probing that the state of the first user plane network element is faulty, generates second probe information based on its probing of the first user plane network element, and also sends the second probe information to the signaling processing network element. The signaling processing network element determines the fault type according to the first probe information and the second probe information. The technical solution provided by this embodiment makes it possible to determine effectively whether the fault is a user plane network element fault or a link fault. In the following description, the first control plane network element is an SGW-C, the first user plane network element is an SGW-U, the second user plane network element is a base station, and the signaling processing network element is an MME, as shown in FIG. 2.
S21. The SGW-C learns that the state of the SGW-U is faulty.
On the Sx interface, the SGW-C can probe the state of the SGW-U through heartbeat messages or other signaling messages. For example, the SGW-C periodically sends a heartbeat message to the SGW-U, and the SGW-U sends a response message to the SGW-C after receiving it. Alternatively, the SGW-U periodically sends a heartbeat message to the SGW-C, and the SGW-C sends a response message to the SGW-U after receiving it. If the SGW-C has not received a response message or a heartbeat message from the SGW-U for a long time, the SGW-C determines that the SGW-U is faulty.
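The heartbeat-based probing in S21 can be sketched as follows. This is only an illustrative sketch: the class name `HeartbeatMonitor`, the 30-second threshold, and the peer identifiers are assumptions, since the text only says "a long time" and does not specify a timeout or a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks when each peer (for example, an SGW-U) was last heard from."""
    timeout_s: float                      # hypothetical threshold for "a long time"
    last_seen: dict = field(default_factory=dict)

    def record(self, peer_id: str, now: float) -> None:
        # Call for every heartbeat or response message received from the peer.
        self.last_seen[peer_id] = now

    def faulty_peers(self, now: float) -> list:
        # A peer is judged faulty if it has been silent for longer than timeout_s.
        return sorted(p for p, t in self.last_seen.items() if now - t > self.timeout_s)


monitor = HeartbeatMonitor(timeout_s=30.0)
monitor.record("sgw-u-1", now=0.0)
monitor.record("sgw-u-2", now=0.0)
monitor.record("sgw-u-2", now=25.0)   # sgw-u-2 keeps answering
print(monitor.faulty_peers(now=40.0))
```

Note that silence alone cannot distinguish a faulty SGW-U from a faulty SGW-C/SGW-U link, which is why the result is only reported to the MME rather than acted on directly.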
In this embodiment of the present invention, after learning that the state of the SGW-U is faulty, the SGW-C locally keeps the services associated with the SGW-U running normally.
After determining that the SGW-U is faulty, the SGW-C can mark the SGW-U as faulty while still keeping the services associated with the SGW-U running normally. This prevents users from going offline and keeps services running as far as possible.
S22. The SGW-C sends first probe information to the MME. The first probe information is generated by the SGW-C according to the probing result; that is, the first probe information indicates the state of the SGW-U as obtained by the SGW-C.
If the SGW-C determines that the SGW-U is faulty, the first probe information can carry the identifier of the faulty SGW-U. For example, the identifier of the SGW-U may be the IP address of the forwarding plane of the SGW-U, or another identifier that identifies the SGW-U. If the SGW-C determines that the SGW-U is not faulty, the SGW-C may omit the identifier of that SGW-U from the first probe information. The MME can set a detection period and determine which SGW-Us are faulty according to the faulty SGW-U identifiers carried in the first probe information received from the SGW-C within that period. Alternatively, the SGW-C may not send first probe information to the MME at all. In that case, if the MME does not receive first probe information from the SGW-C within the detection period, it assumes by default that the SGW-C's probing result for the SGW-U is normal, and thereby learns the SGW-C's probing result.
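The detection-period behaviour described above can be sketched as follows. The names `DetectionWindow`, `on_probe_info`, and `close` are hypothetical, and how the MME learns the set of known SGW-Us is an assumption; the sketch only illustrates that an SGW-U not reported within the period is treated as normal by default.

```python
class DetectionWindow:
    """Collects faulty SGW-U identifiers reported within one detection period."""

    def __init__(self, known_sgw_us):
        self.known = set(known_sgw_us)     # assumed to be configured on the MME
        self.reported_faulty = set()

    def on_probe_info(self, faulty_ids):
        # Probe information carries only the identifiers of faulty SGW-Us.
        self.reported_faulty.update(faulty_ids)

    def close(self):
        # End of the detection period: unreported SGW-Us default to "normal".
        result = {
            u: ("faulty" if u in self.reported_faulty else "normal")
            for u in sorted(self.known)
        }
        self.reported_faulty.clear()
        return result


window = DetectionWindow(["10.0.0.1", "10.0.0.2"])
window.on_probe_info(["10.0.0.1"])         # first probe information from the SGW-C
status = window.close()
print(status)
```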
For example, the SGW-C sends the first probe information to the MME through an extended S11 interface message. The extended S11 interface message may be, for example, an Echo Request message, or a newly added fault processing message.
The foregoing describes the handling when the SGW-C learns that the state of the SGW-U is faulty. Because the SGW-C and the SGW-U probe each other, just as the SGW-C can learn that the state of the SGW-U is faulty, the SGW-U can learn that the state of the SGW-C is faulty.
After learning that the state of the SGW-C is faulty, the SGW-U can also locally keep the services associated with the SGW-C running normally.
After determining that the SGW-C is faulty, the SGW-U can mark the SGW-C as faulty while still keeping the services associated with the SGW-C running normally. This prevents users from going offline and keeps services running.
S23. The base station learns that the state of the SGW-U is faulty.
The base station can probe the state of the SGW-U over the user plane data transmission channel to determine whether the SGW-U is faulty.
S24. The base station sends second probe information to the MME. The second probe information is generated by the base station according to the probing result; that is, the second probe information indicates the state of the SGW-U as obtained by the base station.
If the base station determines that the SGW-U is faulty, the second probe information can carry the identifier of the faulty SGW-U. If the base station determines that the SGW-U is not faulty, the base station may omit the identifier of that SGW-U from the second probe information. The MME can set a detection period and determine which SGW-Us are faulty according to the faulty SGW-U identifiers carried in the second probe information received from the base station within that period. Alternatively, the base station may not send second probe information to the MME at all. In that case, if the MME does not receive second probe information from the base station within the detection period, it assumes by default that the base station's probing result for the SGW-U is normal, that is, the user plane data transmission channel is normal, and thereby learns the base station's probing result.
After learning that the state of the SGW-U is faulty, the base station can handle the local services associated with that SGW-U in different ways. Correspondingly, there are different ways for the base station to send the second probe information to the MME. They are briefly described below.
Manner 1:
In manner 1, if a UP (for example, the base station) learns that the state of another UP (for example, the SGW-U) is faulty, the UP locally releases the services associated with the other UP, and can carry the second probe information in the message that it triggers toward the MME as a result of releasing the services.
For example, the UE Context Release Request message that the base station sends to the MME can carry the second probe information. If the base station learns that the state of the SGW-U is faulty, the base station can trigger an S1 UE Context Release Request message to the MME for all services associated with that SGW-U. In the S1 UE Context Release Request message, the cause indication (Cause Indicates) can be extended with the value S1-U Failure. Optionally, the S1 UE Context Release Request message can also carry the identifier of the faulty SGW-U. After receiving the S1 UE Context Release Request message, the MME knows the base station's probing result. In addition, the MME can send an S1 UE Context Release Command message to the base station to instruct the base station to release the air interface and the local services.
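As a rough illustration of manner 1, the following sketch builds one release request per affected UE. It is not the S1AP encoding: the class `UeContextReleaseRequest`, its fields, and the `ue_to_sgw_u` mapping are hypothetical names introduced here; only the cause value "S1-U Failure" and the optional SGW-U identifier come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UeContextReleaseRequest:
    """Illustrative stand-in for an S1 UE Context Release Request message."""
    ue_id: str                              # hypothetical per-UE handle
    cause: str                              # extended cause value from the text
    faulty_sgw_u_id: Optional[str] = None   # optional, per the text


def release_requests_for(sgw_u_id, ue_to_sgw_u, include_id=True):
    # One request per UE whose service is associated with the faulty SGW-U.
    return [
        UeContextReleaseRequest(ue, "S1-U Failure", sgw_u_id if include_id else None)
        for ue, gw in sorted(ue_to_sgw_u.items())
        if gw == sgw_u_id
    ]


requests = release_requests_for(
    "10.0.0.1",
    {"ue-a": "10.0.0.1", "ue-b": "10.0.0.2", "ue-c": "10.0.0.1"},
)
for r in requests:
    print(r)
```

When the identifier is omitted, the MME would have to derive the SGW-U from the user's context, which matches the "SGW-U associated with users whose cause indication is S1-U Failure" wording used in S25.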
Manner 2:
In manner 2, if a UP (for example, the base station) learns that the state of another UP (for example, the SGW-U) is faulty, the UP can locally keep the services associated with the other UP running normally. This prevents users from going offline and lets services continue. The UP can send the second probe information to the MME through an existing message or a newly added fault processing message. For example, if the base station learns that the state of the SGW-U is faulty, the base station locally keeps the services associated with the SGW-U running and sends the second probe information to the MME through an extended S1-MME interface message.
S21-S22 and S23-S24 can be regarded as two parts, and these two parts can be executed in any order.
S25. The MME determines the fault type according to the first probe information and the second probe information. The fault type is either an SGW-U fault or a fault on the link between the SGW-C and the SGW-U.
For example, the MME determines whether the fault is an SGW-U fault or an SGW-C/SGW-U link fault by combining the faulty SGW-U identifier carried in the first probe information with either the faulty SGW-U identifier carried in the second probe information or the identifier of the SGW-U associated with the users whose Cause Indicates value is S1-U Failure.
可能的实施方式中,若第二探测信息中携带的故障的SGW-U的标识中包括第一探测信息中携带的故障的SGW-U的标识,或第二探测信息中的Cause Indicates标识为S1-U Failure的用户关联的SGW-U的标识中包括第一探测信息中携带的故障的SGW-U的标识,即第一探测信息和第二探测信息均指示SGW-U故障,则MME确定SGW-U故障。In a possible implementation, if the identifier of the faulty SGW-U carried in the second probe information includes the identifier of the faulty SGW-U carried in the first probe information, or the Cause Indicates identifier in the second probe information is S1 The identifier of the SGW-U associated with the user of the -U Failure includes the identifier of the faulty SGW-U carried in the first probe information, that is, the first probe information and the second probe information both indicate that the SGW-U is faulty, and the MME determines the SGW. -U failure.
可能的实施方式中,若第二探测信息中携带的故障的SGW-U的标识中不包括第一探测信息中携带的故障的SGW-U的标识,或第二探测信息中的Cause Indicates标识为S1-U Failure的用户关联的SGW-U的标识中不包括第一探测信息中携带的故障的SGW-U的标识,或MME在检测周期内未收到第二探测信息,即第一探测信息指示SGW-U故障,而第二探测信息指示SGW-U正常,则MME确定SGW-U正常,而SGW-C和SGW-U之间的链路故障。 In a possible implementation, if the identifier of the faulty SGW-U carried in the second probe information does not include the identifier of the faulty SGW-U carried in the first probe information, or the Cause Indicates identifier in the second probe information is The identifier of the SGW-U that is associated with the user-initiated SGW-U of the S1-U Failure does not include the identifier of the faulty SGW-U carried in the first probe information, or the MME does not receive the second probe information in the detection period, that is, the first probe information. Instructing the SGW-U to fail, and the second probe information indicates that the SGW-U is normal, the MME determines that the SGW-U is normal, and the link between the SGW-C and the SGW-U is faulty.
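The decision in S25 can be sketched as a small function. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `FaultType` and `classify_fault` are hypothetical, identifiers are modelled as plain strings, and the `NONE` result (the control plane reports nothing) is an added case that the text does not discuss.

```python
from enum import Enum
from typing import Optional, Set


class FaultType(Enum):
    NONE = "no fault reported by the control plane"  # assumed extra case
    SGW_U_FAULT = "SGW-U fault"
    SX_LINK_FAULT = "fault of the link between SGW-C and SGW-U"


def classify_fault(
    first_probe: Set[str],
    second_probe: Optional[Set[str]],
    sgw_u_id: str,
) -> FaultType:
    """Combine control-plane and user-plane probe results for one SGW-U.

    first_probe:  IDs of SGW-Us reported faulty by the SGW-C.
    second_probe: IDs of SGW-Us reported faulty by the base station, or
                  None when nothing arrived within the detection period.
    """
    if sgw_u_id not in first_probe:
        return FaultType.NONE
    if second_probe is not None and sgw_u_id in second_probe:
        # Both planes agree: the SGW-U itself has failed.
        return FaultType.SGW_U_FAULT
    # Only the control plane sees a fault: the SGW-U is normal, the link is not.
    return FaultType.SX_LINK_FAULT
```

A missing second report and a second report that omits the identifier lead to the same result, which mirrors the two "or" branches in the paragraph above.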
S26. The MME performs fault recovery processing.
In this embodiment of the present invention, the MME may perform link fault recovery according to a predefined link fault handling policy. The link fault handling policy may be predefined by the operator, or by a protocol or standard. The link fault handling policies in this embodiment of the present invention include but are not limited to the following:
1. UP fault handling policy.
The MME may obtain, based on the user session context, the SGW-C to which the faulty SGW-U belongs, that is, the SGW-C that manages the faulty SGW-U. The MME sends an extended S11 interface message to this SGW-C, instructing it to start fault processing for the SGW-U. The extended S11 interface message sent by the MME to the SGW-C may carry the identifier of the faulty SGW-U, a faulty-service processing indication, and the like. Specifically, how a user plane network element fault is handled depends on how the user plane network elements are deployed. This embodiment of the present invention provides several UP deployment modes. The following describes these deployment modes and how UP fault processing is performed in each of them.
Deployment mode 1: One CP manages multiple UPs, and the UPs managed by one CP are deployed in load-balancing mode. If a UP fails, the CP may release the services of the failed UP, and the user equipment served by the failed UP has to be reactivated and select another UP. Handling a failed UP is simple in this deployment mode, but all services of the failed UP may be lost, service recovery takes a long time, and user experience may be poor.
Deployment mode 2: The UPs managed by one CP are deployed in N+1 backup mode, that is, they include N active UPs and one standby UP. If an active UP fails, the CP loads the user session contexts of the failed active UP, which the CP has saved locally, onto the standby UP, thereby restoring the users' services. In this deployment mode, most services of the failed UP can be recovered. However, because the user session contexts of a UP are backed up to the CP periodically, contexts that were not yet backed up within the current backup period when the fault occurred may still be lost. In addition, the whole service recovery takes a long time, service interruption may still occur, and user experience is average.
Deployment mode 3: The UPs managed by one CP are deployed in N-way redundancy mode, that is, all UPs are active, but a certain amount of redundant resources is reserved on each UP. If a UP fails, the CP distributes the user session contexts of the failed UP, which the CP has saved locally, across the other UPs, thereby restoring the users' services. In this deployment mode, the service recovery time and the completeness of the recovered services are similar to those of deployment mode 2.
Deployment mode 4: The UPs managed by one CP are deployed in 1+1 backup mode, that is, each active UP has one corresponding standby UP. In this deployment mode, the active UP and the standby UP can detect each other. If an active UP fails, its standby UP can switch over to become the active UP and continue to provide services. In this deployment mode, user service recovery time is short, service interruption time is short, and user experience is good.
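The four deployment modes differ mainly in where the sessions of the failed UP end up. The sketch below illustrates that difference only; `Deployment`, `recover_sessions`, and the round-robin spreading used for N-way redundancy are assumptions made for the example, and a released session is represented by `None`.

```python
from enum import Enum
from itertools import cycle
from typing import Dict, List, Optional


class Deployment(Enum):
    LOAD_BALANCED = 1   # deployment mode 1
    N_PLUS_1 = 2        # deployment mode 2
    N_WAY = 3           # deployment mode 3
    ONE_PLUS_ONE = 4    # deployment mode 4


def recover_sessions(
    mode: Deployment,
    sessions: List[str],
    healthy_ups: List[str],
    standby: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Map each session of the failed UP to the UP that serves it afterwards."""
    if mode is Deployment.LOAD_BALANCED:
        # Services are released; the UEs must reactivate and select another UP.
        return {s: None for s in sessions}
    if mode in (Deployment.N_PLUS_1, Deployment.ONE_PLUS_ONE):
        if standby is None:
            raise ValueError("this deployment mode needs a standby UP")
        # Saved contexts are loaded onto (or taken over by) the standby UP.
        return {s: standby for s in sessions}
    if not healthy_ups:
        raise ValueError("N-way redundancy needs at least one healthy UP")
    # N-way: spread the saved contexts over the remaining active UPs.
    targets = cycle(healthy_ups)
    return {s: next(targets) for s in sessions}
```

Only the contexts the CP actually holds can be passed in as `sessions`, so contexts missed by the periodic backup are lost in modes 2 and 3 regardless of how they are redistributed.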
2. Control plane network element reselection policy.
That is, the MME may reselect a CP, and the link is re-established between the reselected CP and the UP. Services no longer have to be processed over the faulty link, so they can continue as soon as possible.
Under this link fault handling policy, if the MME determines that the link between the CP and the UP is faulty, the MME may reselect a CP. After reselecting the CP, the MME may send a Modify Bearer Request message to the reselected CP. The Modify Bearer Request message may be extended to carry an Sx link recovery indication and may carry the identifier of the UP. In addition, to restore services as soon as possible, the Modify Bearer Request message may also carry the address of the original CP, so that the reselected CP can obtain the service information of the UP from the original CP.
After receiving the Modify Bearer Request message, the reselected CP may trigger an Sx Session Modification message according to the Sx link recovery indication carried in the request, to restore the link to the UP.
After the link between the reselected CP and the UP is restored, the CP may send a Modify Bearer Response message to the MME, and the Modify Bearer Response message may be extended to indicate that the Sx link has returned to normal.
Taking this embodiment of the present invention as an example, the MME may reselect an SGW-C, and the link is re-established between the reselected SGW-C and the SGW-U. Services no longer have to be processed over the faulty link, so they can continue as soon as possible. If the MME determines that the link between the SGW-C and the SGW-U is faulty, the MME may reselect an SGW-C. After reselecting the SGW-C, the MME may send a Modify Bearer Request message to the reselected SGW-C. The Modify Bearer Request message may be extended to carry an Sx link recovery indication and may carry the identifier of the SGW-U. In addition, to restore services as soon as possible, the Modify Bearer Request message may also carry the address of the original SGW-C, so that the reselected SGW-C can obtain the service information of the SGW-U from the original SGW-C.
After receiving the Modify Bearer Request message, the reselected SGW-C may trigger an Sx Session Modification message according to the Sx link recovery indication carried in the request, to restore the link to the SGW-U.
After the link between the reselected SGW-C and the SGW-U is restored, the SGW-C may send a Modify Bearer Response message to the MME, and the Modify Bearer Response message may be extended to indicate that the Sx link has returned to normal.
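The exchange between the MME and the reselected SGW-C can be sketched as follows. The field names (`sx_link_recovery`, `up_id`, `old_cp_address`, `sx_link_restored`) are hypothetical stand-ins for the extended information elements, whose encoding the text does not specify, and `restore_sx_link` is an assumed callback that stands in for the Sx Session Modification exchange with the SGW-U.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModifyBearerRequest:
    sx_link_recovery: bool                 # extended Sx link recovery indication
    up_id: str                             # identifier of the SGW-U
    old_cp_address: Optional[str] = None   # address of the original SGW-C


@dataclass
class ModifyBearerResponse:
    sx_link_restored: bool                 # extended "Sx link is normal again" flag


def handle_modify_bearer_request(
    req: ModifyBearerRequest,
    restore_sx_link: Callable[[str, Optional[str]], bool],
) -> ModifyBearerResponse:
    """Reselected SGW-C side: restore the Sx link only when asked to."""
    if not req.sx_link_recovery:
        # No recovery indication, so no Sx Session Modification is triggered.
        return ModifyBearerResponse(sx_link_restored=False)
    restored = restore_sx_link(req.up_id, req.old_cp_address)
    return ModifyBearerResponse(sx_link_restored=restored)
```

Passing the original SGW-C address through to the callback reflects the option of fetching the SGW-U service information from the original SGW-C before the link is re-established.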
3. Wait-for-link-recovery policy.
That is, the MME may instruct the original CP to take no action and wait for the link between the CP and the UP to recover.
Under this link fault handling policy, the MME does not reselect a CP. Instead, it may send an extended S11 interface message, for example an Echo Request message or a newly added fault processing message, to the original CP. The extended S11 interface message may carry the identifier of the faulty UP and an indication to wait for Sx link recovery. After receiving the extended S11 interface message, the CP takes no action, in accordance with the wait-for-Sx-link-recovery indication carried in the message, and waits for the link between the CP and the UP to recover.
4. Service release policy.
That is, the MME may locally release all services associated with the faulty network element.
5. Wait-for-network-element-recovery policy.
That is, the MME may take no action and wait for the faulty network element to recover. After it recovers, the network element is a normal network element again and can continue to be used, and the original services can continue without being moved to another network element midway. This reduces the possibility of losing user session information and improves network element utilization.
Taking this embodiment of the present invention as an example, the MME may instruct the original SGW-C to take no action and wait for the link between the SGW-C and the SGW-U to recover. If the MME determines that the link between the SGW-C and the SGW-U is faulty, the MME does not reselect an SGW-C. Instead, it sends an extended S11 interface message, for example an Echo Request message or a newly added fault processing message, to the original SGW-C. The extended S11 interface message may carry the identifier of the faulty SGW-U and an indication to wait for Sx link recovery. After receiving the extended S11 interface message, the SGW-C takes no action, in accordance with the wait-for-Sx-link-recovery indication carried in the message, and waits for the link between the SGW-C and the SGW-U to recover.
The foregoing link fault handling policies are only examples. Other link fault handling policies are possible in practice, and all of them fall within the protection scope of the embodiments of the present invention.
In a possible implementation, if the MME determines that the SGW-U is faulty while the link between the SGW-C and the SGW-U is normal, the MME may apply the first link fault handling policy described above, that is, the UP fault handling policy, to perform fault recovery. Under the first policy, recovery can be carried out differently for different UP deployment modes, which is flexible and matches real network conditions.
In a possible implementation, if the MME determines that the SGW-U is normal while the link between the SGW-C and the SGW-U is faulty, the MME may apply the second or the third link fault handling policy described above, that is, the control plane network element reselection policy or the wait-for-link-recovery policy, to perform link fault recovery. With the second policy, recovery is fast and services can continue as soon as possible. With the third policy, once the link between the SGW-C and the SGW-U recovers, the original SGW-C can continue to be used and the original services can continue without being moved to another network element midway, which reduces the possibility of losing user session contexts and improves network element utilization.
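The mapping from the determined fault type to a handling policy can be written down in a few lines. This is a sketch only: the string labels and the `prefer_fast_recovery` switch are assumptions introduced for the example, since the text leaves the choice between the second and third policies to the operator or to a standard.

```python
def select_policy(fault_type: str, prefer_fast_recovery: bool = True) -> str:
    """Pick a link fault handling policy for the fault type found in S25."""
    if fault_type == "up_fault":
        # SGW-U faulty, Sx link normal: policy 1.
        return "UP fault handling"
    if fault_type == "link_fault":
        # SGW-U normal, Sx link faulty: policy 2 or policy 3.
        if prefer_fast_recovery:
            return "reselect control plane network element"
        return "wait for link recovery"
    raise ValueError(f"unknown fault type: {fault_type!r}")
```

Choosing reselection trades reuse of the original SGW-C for speed, while waiting keeps the original SGW-C and its session contexts in place.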
With the technical solution provided in this embodiment of the present invention, after the SGW-C or the SGW-U detects a fault at the peer end of the Sx link, it may defer faulty-service processing and instead report the detected peer fault state to the MME in probe information. By combining the user plane probe information sent by the base station with the control plane probe information sent by the SGW-C, the MME can determine fairly accurately whether the SGW-U has failed or the Sx link has failed.
Link faults caused by the network are more likely than SGW-U faults, and handling the services of a failed SGW-U is costly. Identifying Sx link faults separately, so that they are not handled as SGW-U faults, can greatly improve the service experience of the network.
An embodiment of the present invention further provides a fault processing method. The example in FIG. 3 is described with the first control plane network element being a PGW-C, the first user plane network element being a PGW-U, the second user plane network element being an SGW-U, and the signaling processing network element being an MME.
S31. The PGW-C learns that the PGW-U is in the fault state.
The PGW-C may determine whether the PGW-U is in the fault state through a mechanism such as heartbeat detection. Details are not described here.
In this embodiment of the present invention, after learning that the PGW-U is in the fault state, the PGW-C still keeps the PGW-U-associated services running normally locally.
After learning that the PGW-U is in the fault state, the PGW-C may mark the PGW-U as faulty but still keep the PGW-U-associated services running normally. This avoids taking users offline and lets services continue as far as possible.
S32. The PGW-C sends first probe information to the MME, where the first probe information is generated by the PGW-C according to its detection result, that is, the first probe information indicates the state of the PGW-U as determined by the PGW-C.
If the PGW-C determines that the PGW-U is faulty, the first probe information may carry the identifier of the faulty PGW-U. The identifier of the PGW-U is, for example, the IP address of the forwarding plane of the PGW-U, and may also be any other identifier that identifies the PGW-U. The first probe information may further carry the identifier of the PGW-C. If the PGW-C determines that the PGW-U is not faulty, the first probe information may omit the identifier of the PGW-U. The MME may set a detection period and determine which PGW-Us are faulty from the identifiers of faulty PGW-Us carried in the first probe information received from the PGW-C within that detection period. Alternatively, the PGW-C may send no first probe information to the MME at all. In that case, if the MME receives no first probe information from the PGW-C within the detection period, it assumes by default that the PGW-C detected the PGW-U as normal, and in this way the MME learns the detection result of the PGW-C.
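The detection-period rule can be illustrated with a short helper. It is a sketch under assumptions: reports are modelled as `(timestamp, faulty_ids)` pairs, the period is a half-open interval, and the function name is hypothetical.

```python
from typing import Iterable, Set, Tuple


def faulty_ids_in_period(
    reports: Iterable[Tuple[float, Set[str]]],
    period_start: float,
    period_length: float,
) -> Set[str]:
    """Collect the PGW-U identifiers reported faulty within one detection period.

    An empty result covers both cases in the text: reports that carry no
    faulty identifier, and no report at all, are treated as "PGW-U normal".
    """
    period_end = period_start + period_length
    faulty: Set[str] = set()
    for timestamp, faulty_ids in reports:
        if period_start <= timestamp < period_end:
            faulty |= faulty_ids
    return faulty
```

The same helper applies unchanged to the second probe information, since the MME handles both sources with the same detection-period rule.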
For example, the PGW-C sends the first probe information to the SGW-C in an extended S5/S8 interface message, and the SGW-C forwards the first probe information to the MME in an S11 interface message. The first probe information is carried, for example, in an extended Echo Request message or in a newly added fault processing message.
The foregoing describes the handling when the PGW-C learns that the PGW-U is in the fault state. Because the PGW-C and the PGW-U detect each other, just as the PGW-C can learn that the PGW-U is in the fault state, the PGW-U can learn that the PGW-C is in the fault state.
After learning that the PGW-C is in the fault state, the PGW-U may likewise keep the PGW-C-associated services running normally locally.
After determining that the PGW-C is faulty, the PGW-U may mark the PGW-C as faulty but still keep the PGW-C-associated services running normally. This avoids taking users offline and ensures that services continue.
S33. The SGW-U learns that the PGW-U is in the fault state.
The SGW-U may detect the state of the PGW-U over the user plane data transmission channel and thereby determine whether the PGW-U is faulty.
S34. The SGW-U sends second probe information to the MME, where the second probe information is generated by the SGW-U according to its detection result, that is, the second probe information indicates the state of the PGW-U as determined by the SGW-U.
If the SGW-U determines that the PGW-U is faulty, the second probe information may carry the identifier of the faulty PGW-U. If the SGW-U determines that the PGW-U is not faulty, the second probe information may omit the identifier of the PGW-U. The MME may set a detection period and determine which PGW-Us are faulty from the identifiers of faulty PGW-Us carried in the second probe information received from the SGW-U within that detection period. Alternatively, the SGW-U may send no second probe information to the MME at all. In that case, if the MME receives no second probe information from the SGW-U within the detection period, it assumes by default that the SGW-U detected the PGW-U as normal, and in this way the MME learns the detection result of the SGW-U.
After learning that the PGW-U is in the fault state, the SGW-U may handle the local services associated with the PGW-U in different ways, and accordingly there are different ways for the SGW-U to send the second probe information to the MME. These are briefly described below.
Mode 1:
In mode 1, if a UP (for example, the SGW-U) learns that another UP (for example, the PGW-U) is in the fault state, the UP locally releases the services associated with that other UP, and may carry the second probe information in the message that it triggers toward the MME as a result of releasing the services.
For example, if the SGW-U learns that the PGW-U is in the fault state, it may trigger release of the user service bearers associated with the PGW-U. For example, the SGW-U sends a message for fault processing, such as a UPlane Session Delete Request message, to the SGW-C over the Sxa interface. The SGW-C then sends a message for fault processing, such as a Delete Bearer Request message, to the MME over the S11 interface, and sends a message for fault processing, such as a Delete Bearer Command message, to the PGW-C over the S5/S8 interface, thereby releasing the user service bearers associated with the faulty PGW-U. The SGW-U may extend the UPlane Session Delete Request message to carry the identifier of the faulty PGW-U, and correspondingly the Delete Bearer Request message may be extended to carry the identifier of the faulty PGW-U. The SGW-C is not shown in FIG. 3.
Mode 2:
In mode 2, if a UP (for example, the SGW-U) learns that another UP (for example, the PGW-U) is in the fault state, the UP may locally keep the services associated with that other UP running normally. This avoids taking users offline and lets services continue. The UP may send the second probe information to the MME in an existing message, a newly added fault processing message, or the like. For example, when the SGW-U learns that the PGW-U is in the fault state, the SGW-U keeps the PGW-U-associated services running locally, and the SGW-U may send the second probe information to the MME in an extended S1-MME interface message.
S31-S32 and S33-S34 may be regarded as two parts, and the two parts may be performed in any order.
S35. The MME determines the fault type according to the first probe information and the second probe information, where the fault type includes a PGW-U fault or a fault of the link between the PGW-C and the PGW-U.
For example, the MME combines the identifier of the faulty PGW-U carried in the first probe information with the identifier of the faulty PGW-U carried in the second probe information, and thereby determines whether the PGW-U itself has failed or the link between the PGW-C and the PGW-U has failed.
In a possible implementation, if the identifiers of faulty PGW-Us carried in the second probe information include the identifier of the faulty PGW-U carried in the first probe information, that is, both the first probe information and the second probe information indicate that the PGW-U is faulty, the MME determines that the PGW-U is faulty.
In a possible implementation, if the identifiers of faulty PGW-Us carried in the second probe information do not include the identifier of the faulty PGW-U carried in the first probe information, that is, the first probe information indicates that the PGW-U is faulty while the second probe information indicates that the PGW-U is normal, the MME determines that the PGW-U is normal and that the link between the PGW-C and the PGW-U is faulty.
S36. The MME performs fault recovery processing.
In a possible implementation, if the MME determines that the PGW-U is faulty while the link between the PGW-C and the PGW-U is normal, the MME may apply the first link fault handling policy described in S26 of the embodiment shown in FIG. 2 to perform fault recovery.
In a possible implementation, if the MME determines that the PGW-U is normal while the link between the PGW-C and the PGW-U is faulty, the MME may apply the second or the third link fault handling policy described in S26 of the embodiment shown in FIG. 2 to perform link fault recovery.
With the technical solution of this embodiment of the present invention, after the PGW-C or the PGW-U detects a fault at the peer end of the Sx link, it may defer faulty-service processing and instead report the detected peer fault state to the MME in probe information. By combining the user plane probe information sent by the SGW-U with the control plane probe information sent by the PGW-C, the MME can determine fairly accurately whether the PGW-U has failed or the Sx link has failed.
Link faults caused by the network are more likely than PGW-U faults, and handling the services of a failed PGW-U is costly. Identifying Sx link faults separately, so that they are not handled as PGW-U faults, can greatly improve the service experience of the network.
An embodiment of the present invention provides a fault processing method. FIG. 4 is described with the first control plane network element being an SGW-C, the first user plane network element being an SGW-U, the second user plane network element being a base station and/or a PGW-U, and the signaling processing network element being an MME.
S41. The SGW-C learns that the SGW-U is in the fault state.
The SGW-C may determine whether the SGW-U is in the fault state through a mechanism such as heartbeat detection. Details are not described here.
In this embodiment of the present invention, after learning that the SGW-U is in the fault state, the SGW-C still keeps the SGW-U-associated services running normally locally.
After determining that the SGW-U is faulty, the SGW-C may mark the SGW-U as faulty but still keep the SGW-U-associated services running normally. This avoids taking users offline and lets services continue as far as possible.
S42. The SGW-C sends first probe information to the MME, where the first probe information is generated by the SGW-C according to its detection result, that is, the first probe information indicates the state of the SGW-U as determined by the SGW-C.
If the SGW-C determines that the SGW-U is faulty, the first probe information may carry the identifier of the faulty SGW-U. The identifier of the SGW-U is, for example, the IP address of the forwarding plane of the SGW-U, and may also be any other identifier that identifies the SGW-U. If the SGW-C determines that the SGW-U is not faulty, the first probe information may omit the identifier of the SGW-U, or the SGW-C may send no first probe information to the MME at all. Either way, the MME can learn the detection result of the SGW-C.
For example, the SGW-C sends the first probe information to the MME in an extended S11 interface message.
S43. The PGW-U learns that the SGW-U is in the fault state.
The PGW-U may determine whether the SGW-U is in the fault state through a mechanism such as heartbeat detection. Details are not described here.
In this embodiment of the present invention, after learning that the SGW-U is in the fault state, the PGW-U still keeps the SGW-U-associated services running normally locally.
After learning that the SGW-U is in the fault state, the PGW-U may mark the SGW-U as faulty but still keep the SGW-U-associated services running normally. This avoids taking users offline and lets services continue as far as possible.
S44. The PGW-U sends second probe information to the MME. The second probe information is generated by the PGW-U according to its detection result; that is, it indicates the state of the SGW-U as observed by the PGW-U.
If the PGW-U determines that the SGW-U is faulty, the second probe information may carry the identifier of the faulty SGW-U. If the PGW-U determines that the SGW-U is not faulty, the PGW-U may omit the SGW-U identifier from the second probe information. The MME may set a detection period and determine which SGW-Us are faulty from the faulty-SGW-U identifiers carried in the second probe information received from the PGW-U within that period. Alternatively, the PGW-U may not send the second probe information to the MME at all; if the MME receives no second probe information from the PGW-U within the detection period, it assumes by default that the PGW-U's detection result for the SGW-U is normal. Either way, the MME can learn the PGW-U's detection result.
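The detection-period rule described above (identifiers received within the period mark faulty SGW-Us, and silence is read as "normal") might be sketched as follows. This is an illustrative Python sketch under stated assumptions: `DetectionWindow` and its methods are hypothetical names, and the timing of the period is left to the caller.

```python
from typing import Iterable, Set


class DetectionWindow:
    """Collects faulty-SGW-U identifiers reported during one detection period."""

    def __init__(self) -> None:
        self._reported: Set[str] = set()

    def on_probe_info(self, faulty_ids: Iterable[str]) -> None:
        # One probe message may carry zero or more faulty SGW-U identifiers.
        self._reported.update(faulty_ids)

    def close(self) -> Set[str]:
        # End of the period: every reported identifier is treated as faulty.
        # An empty set (including the case of no message at all) means the
        # reporter's result is taken as "normal" by default.
        result = set(self._reported)
        self._reported.clear()
        return result
```

Calling `close()` at the end of every period and starting a fresh window keeps stale reports from leaking into the next period.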
For example, the PGW-U sends the second probe information to the PGW-C, the PGW-C forwards it to the SGW-C in an extended S5/S8 interface message, and the SGW-C forwards it to the MME in an extended S11 interface message. The PGW-C is not shown in FIG. 4.
S41-S42 and S43-S44 can be regarded as two parts, and the two parts may be performed in any order.
S45. The MME determines the fault type according to the first probe information and the second probe information. The fault type is either an SGW-U fault or a fault of the link between the SGW-C and the SGW-U.
For example, the MME combines the faulty-SGW-U identifier carried in the first probe information with the faulty-SGW-U identifiers carried in the two pieces of second probe information to determine whether the SGW-U is faulty or the link between the SGW-C and the SGW-U is faulty.
In a possible implementation, the faulty-SGW-U identifiers carried in the second probe information sent by the base station to the MME include the identifier carried in the first probe information, and the faulty-SGW-U identifiers carried in the second probe information sent by the PGW-U to the MME also include it. That is, both the first probe information and the second probe information indicate that the SGW-U is faulty. In this case the MME determines that the SGW-U is faulty and that the link between the SGW-C and the SGW-U is normal.
In a possible implementation, the faulty-SGW-U identifiers carried in the second probe information sent by the base station to the MME do not include the identifier carried in the first probe information, and the faulty-SGW-U identifiers carried in the second probe information sent by the PGW-U to the MME do not include it either. That is, the first probe information indicates that the SGW-U is faulty, whereas the second probe information indicates that the SGW-U is normal. In this case the MME determines that the SGW-U is normal and that the link between the SGW-C and the SGW-U is faulty.
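The two cases in S45 amount to a membership check on the identifier reported by the SGW-C. A minimal Python sketch, assuming the probe information has already been reduced to sets of identifiers, is given below. The function and enum names are hypothetical, and the `UNDETERMINED` result for the mixed case is an assumption, since the text does not say how the MME handles disagreement between the base station and the PGW-U.

```python
from enum import Enum
from typing import Set


class FaultType(Enum):
    SGW_U_FAULT = "SGW-U fault"
    LINK_FAULT = "SGW-C/SGW-U link fault"
    UNDETERMINED = "undetermined"  # assumption: mixed reports are not covered by the text


def classify(reported_by_sgw_c: str,
             reported_by_base_station: Set[str],
             reported_by_pgw_u: Set[str]) -> FaultType:
    """Classify the fault for one SGW-U that the SGW-C reported as faulty."""
    seen_by_bs = reported_by_sgw_c in reported_by_base_station
    seen_by_pgw_u = reported_by_sgw_c in reported_by_pgw_u
    if seen_by_bs and seen_by_pgw_u:
        # Control plane and user plane agree: the SGW-U itself is faulty.
        return FaultType.SGW_U_FAULT
    if not seen_by_bs and not seen_by_pgw_u:
        # Only the SGW-C lost contact: the SGW-C/SGW-U link is faulty.
        return FaultType.LINK_FAULT
    return FaultType.UNDETERMINED
```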
S46. The MME performs fault recovery processing.
In a possible implementation, if the MME determines that the SGW-U is faulty and the link between the SGW-C and the SGW-U is normal, the MME may apply the first link fault handling policy described in S26 of the embodiment shown in FIG. 2 to perform link fault recovery.
In a possible implementation, if the MME determines that the SGW-U is normal and the link between the SGW-C and the SGW-U is faulty, the MME may apply the second or the third link fault handling policy described in S26 of the embodiment shown in FIG. 2 to perform link fault recovery.
With the solution provided by this embodiment of the present invention, after detecting an SGW-U fault, the base station on the user forwarding plane may refrain from initiating release of the user service bearers associated with the SGW-U. This gives the MME and the SGW-C the opportunity to restore the user services associated with the faulty SGW-U based on predefined link fault handling policies and the redundancy mechanisms provided by the current UP deployment. By combining the probe information obtained by the base station through the user forwarding plane and/or the probe information obtained by the PGW-U with the fault detection information obtained by the SGW-C through the control plane, the MME can determine fairly accurately whether the fault is an SGW-U fault or an Sx link fault.
If an SGW-U fault causes its associated user services to be deactivated, service recovery takes a long time. By contrast, the redundant resources provided by the SGW-U deployment can quickly restore the services of the faulty SGW-U, so the recovery time is shorter and the user's service experience is better.
Based on the technical solution of this embodiment, the heavyweight recovery procedure of releasing and reactivating user services after an SGW-U fault can be avoided. Instead, faulty user services are restored with the lightweight redundant resources provided by the SGW-U deployment. The recovery time is short, the user experience is good, and the network is spared the signaling storm caused by deactivating and reactivating the bearers of a large number of users.
An embodiment of the present invention further provides a fault processing method. In this embodiment, the signaling processing network element probes the first control plane network element and obtains first probe information. If the signaling processing network element determines through probing that the first control plane network element is faulty, the first probe information indicates that the first control plane network element is faulty. In addition, the first user plane network element learns through probing that the state of the second user plane network element is faulty. The first user plane network element generates second probe information according to its probing of the second user plane network element and sends the second probe information to the signaling processing network element. The signaling processing network element determines the fault type according to the first probe information and the second probe information. The following description uses an example in which the first control plane network element is the SGW-C, the first user plane network element is the base station, the second user plane network element is the SGW-U, and the signaling processing network element is the MME, as shown in FIG. 5.
S51. The MME determines that the SGW-C is faulty and generates first probe information; that is, the first probe information indicates the state of the SGW-C as observed by the MME.
The MME probes the SGW-C and generates the first probe information according to the probing result. The first probe information may indicate whether the SGW-C is faulty.
In this embodiment of the present invention, after determining that the SGW-C is faulty, the MME locally keeps the services associated with the SGW-C running normally.
After determining that the SGW-C is faulty, the MME may mark the SGW-C as faulty but still keep the services associated with it running. Whether the fault is an SGW-C fault or a link fault can only be decided in combination with other information, so while the location of the fault is uncertain the MME may keep the services associated with the SGW-C running. This lets services continue, avoids sudden service interruption, and improves the user experience.
S52. The SGW-U learns that the state of the SGW-C is faulty.
The SGW-U may learn whether the SGW-C is faulty through a mechanism such as heartbeat detection; details are not repeated here.
In this embodiment of the present invention, after determining that the SGW-C is faulty, the SGW-U locally keeps the services associated with the SGW-U running normally.
After determining that the SGW-C is faulty, the SGW-U may mark the SGW-C as faulty but still keep the services associated with the SGW-U running. Whether the fault is an SGW-C fault or a link fault is decided by the MME, so while the location of the fault is uncertain the SGW-U may keep its associated services running. If the fault is in fact a link fault, it generally does not affect the SGW-U's ability to continue processing services. This lets services continue, avoids sudden service interruption, and improves the user experience.
S53. The base station learns that the state of the SGW-U is faulty.
The base station can detect the state of the SGW-U over the user-plane forwarding channel, and the SGW-U detects the state of the PGW-U. If the PGW-U fails, the SGW-U may detect the fault and directly release the user service bearers associated with it locally, and the SGW-U may send the PGW-U fault information to the MME through the SGW-C. Similarly, if the SGW-U fails, the base station or the PGW-U may detect the fault and release the locally associated user service bearers.
S54. The base station sends second probe information to the MME. The second probe information is generated by the base station according to its detection result; that is, it indicates the state of the SGW-U as observed by the base station.
If the base station determines that the SGW-U is faulty, the second probe information may carry the identifier of the faulty SGW-U. If the base station determines that the SGW-U is not faulty, the base station may omit the SGW-U identifier from the second probe information, or may not send the second probe information to the MME at all. In either case the MME can infer the base station's detection result.
In this embodiment of the present invention, if a UP (for example, the base station) learns that the state of another UP (for example, the SGW-U) is faulty, the UP locally releases the services associated with the other UP, and may carry the second probe information in the message that it sends to the MME as a result of the release. For example, when the base station determines that the SGW-U is faulty, it sends an S1 UE Context Release Request message to the MME to release the air-interface and local services related to the SGW-U; that is, the second probe information can be conveyed in the S1 UE Context Release Request message. In this message, the Cause Indicates field may be extended with the value S1-U Failure, and the message may also carry the identifier of the faulty SGW-U, so that the MME learns the base station's detection result. After receiving the S1 UE Context Release Request message, the MME may send an S1 UE Context Release Command message to the base station to instruct it to release the air-interface and local services.
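How the second probe information might ride on the release request can be sketched with a small data model. This is a hypothetical Python illustration, not an S1AP encoding: the class `UeContextReleaseRequest` and its fields are invented for the example, and only the cause value "S1-U Failure" and the optional faulty SGW-U identifier come from the text.

```python
from dataclasses import dataclass
from typing import Optional

S1_U_FAILURE = "S1-U Failure"  # extended cause value named in the text


@dataclass(frozen=True)
class UeContextReleaseRequest:
    """Hypothetical stand-in for the extended release request message."""
    cause: str
    faulty_sgw_u_id: Optional[str] = None


def build_release_request(faulty_sgw_u_id: str) -> UeContextReleaseRequest:
    # Base station side: attach the detection result to the release request.
    return UeContextReleaseRequest(cause=S1_U_FAILURE, faulty_sgw_u_id=faulty_sgw_u_id)


def extract_second_probe_info(msg: UeContextReleaseRequest) -> Optional[str]:
    # MME side: only a request with the extended cause carries probe information.
    if msg.cause == S1_U_FAILURE:
        return msg.faulty_sgw_u_id
    return None
```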
S51, S52, and S53-S54 can be regarded as three parts, and the three parts may be performed in any order.
S55. The MME determines the fault type according to the first probe information and the second probe information. The fault type includes an SGW-C fault, or a fault of both the SGW-C and the SGW-U; that is, the MME determines whether the SGW-U is normal or faulty when the SGW-C is faulty.
For example, the MME combines the faulty-SGW-C identifier carried in the first probe information with the faulty-SGW-U identifier carried in the second probe information to determine whether only the SGW-C is faulty, only the SGW-U is faulty, or both the SGW-C and the SGW-U are faulty.
In a possible implementation, the MME determines that the SGW-C corresponding to the faulty-SGW-C identifier carried in the first probe information is faulty.
In a possible implementation, the MME determines that the SGW-U corresponding to the faulty-SGW-U identifier carried in the second probe information is faulty.
S56. The MME performs fault recovery processing.
In a possible implementation, if the MME determines that only the SGW-C is faulty, the MME may apply the second or the fifth fault handling policy described in S26 of the embodiment shown in FIG. 2 to perform fault recovery.
In a possible implementation, if the MME determines that only the SGW-U is faulty, the MME may apply the first or the fifth fault handling policy described in S26 of the embodiment shown in FIG. 2 to perform fault recovery.
In a possible implementation, if the MME determines that both SGW-Cs and SGW-Us are faulty but none of the faulty SGW-Cs is associated with any of the faulty SGW-Us, the MME may handle the faults separately as SGW-C faults and SGW-U faults, as described above.
In a possible implementation, if the MME determines that both SGW-Cs and SGW-Us are faulty and some of the faulty SGW-Cs and SGW-Us are associated with each other, then for each associated SGW-C and SGW-U the MME may apply the fourth fault handling policy described in S26 of the embodiment shown in FIG. 2, that is, the service release policy, to perform fault recovery. For example, the MME may locally release all services associated with the associated faulty SGW-C and/or SGW-U.
The SGW-Us associated with an SGW-C may include the SGW-Us managed by that SGW-C.
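The case analysis in S56 can be summarized as a policy lookup keyed on whether a faulty SGW-C and a faulty SGW-U are associated. The Python sketch below is an illustration under stated assumptions: the function name and the `managed_by` mapping (SGW-U identifier to managing SGW-C identifier) are hypothetical, and the returned numbers simply refer to the numbered fault handling policies of S26 in FIG. 2 without modeling what each policy does.

```python
from typing import Dict, List, Set


def plan_recovery(failed_sgw_c: Set[str],
                  failed_sgw_u: Set[str],
                  managed_by: Dict[str, str]) -> Dict[str, List[int]]:
    """Map each faulty element to the candidate S26 policy numbers."""
    plan: Dict[str, List[int]] = {}
    paired_sgw_c: Set[str] = set()
    for sgw_u in sorted(failed_sgw_u):
        sgw_c = managed_by.get(sgw_u)
        if sgw_c in failed_sgw_c:
            # Associated SGW-C and SGW-U are both faulty: release services.
            plan[sgw_u] = [4]
            paired_sgw_c.add(sgw_c)
        else:
            # Treated as an SGW-U-only fault.
            plan[sgw_u] = [1, 5]
    for sgw_c in sorted(failed_sgw_c):
        # An SGW-C without a faulty associated SGW-U is an SGW-C-only fault.
        plan[sgw_c] = [4] if sgw_c in paired_sgw_c else [2, 5]
    return plan
```

For instance, with faulty SGW-Cs `c1` and `c2`, faulty SGW-Us `u1` and `u3`, and `u1` managed by `c1`, only the pair `c1`/`u1` falls under the release policy.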
With the technical solution provided by this embodiment of the present invention, after detecting an SGW-C fault, the MME may hold off on fault service processing. By combining its own result with the second probe information sent by the base station, the MME handles the following cases differently: an SGW-C fault, an SGW-U fault, faults of an SGW-C and an SGW-U that are not associated, and faults of an SGW-C and an SGW-U that are associated. This keeps the cost of service processing as low as possible, for example by reducing the likelihood of deactivating the services associated with a network element, and improves the service experience of the network.
An embodiment of the present invention further provides a fault processing method. In this embodiment, the second control plane network element learns through probing that the state of the first control plane network element is faulty, generates first probe information according to its probing of the first control plane network element, and sends the first probe information to the signaling processing network element. In addition, the first user plane network element learns through probing that the state of the second user plane network element is faulty, generates second probe information according to its probing of the second user plane network element, and also sends the second probe information to the signaling processing network element. The signaling processing network element determines the fault type according to the first probe information and the second probe information. With this technical solution, it can be determined whether only the first control plane network element is faulty, only the second user plane network element is faulty, or both are faulty. The following description uses an example in which the first control plane network element is the PGW-C, the first user plane network element is the SGW-U, the second user plane network element is the PGW-U, the second control plane network element is the SGW-C, and the signaling processing network element is the MME, as shown in FIG. 6.
S61. The SGW-C learns that the state of the PGW-C is faulty.
The SGW-C may learn whether the PGW-C is faulty through a mechanism such as heartbeat detection; details are not repeated here.
In this embodiment of the present invention, after determining that the PGW-C is faulty, the SGW-C locally keeps the services associated with the PGW-C running normally.
After determining that the PGW-C is faulty, the SGW-C may mark the PGW-C as faulty but still keep the services associated with it running. Whether the fault is a PGW-C fault or a link fault is decided by the MME, so while the location of the fault is uncertain the SGW-C may keep the services associated with the PGW-C running. If the fault is in fact a link fault, it generally does not affect the PGW-C's ability to continue processing services. This lets services continue, avoids sudden service interruption, and improves the user experience.
S62. The SGW-C sends first probe information to the MME. The first probe information is generated by the SGW-C according to its detection result; that is, it indicates the state of the PGW-C as observed by the SGW-C.
If the SGW-C learns that the state of the PGW-C is faulty, the first probe information may carry the identifier of the faulty PGW-C. If the SGW-C determines that the PGW-C is not faulty, the SGW-C may omit the PGW-C identifier from the first probe information. The MME may set a detection period and determine which PGW-Cs are faulty from the faulty-PGW-C identifiers carried in the first probe information received from the SGW-C within that period. Alternatively, the SGW-C may not send the first probe information to the MME at all; if the MME receives no first probe information from the SGW-C within the detection period, it assumes by default that the SGW-C's detection result for the PGW-C is normal. Either way, the MME can learn the SGW-C's detection result.
For example, the SGW-C sends the first probe information to the MME in an extended S11 interface message. The extended S11 interface message is, for example, an Echo Request message or a newly added fault handling message.
S63. The PGW-U learns that the state of the PGW-C is faulty.
The PGW-U may learn whether the PGW-C is faulty through a mechanism such as heartbeat detection; details are not repeated here.
In this embodiment of the present invention, after determining that the PGW-C is faulty, the PGW-U locally keeps the services associated with the PGW-U running normally.
After determining that the PGW-C is faulty, the PGW-U may mark the PGW-C as faulty but still keep the services associated with the PGW-U running. Whether the fault is a PGW-C fault or a link fault is decided by the MME, so while the location of the fault is uncertain the PGW-U may keep its associated services running. If the fault is in fact a link fault, it generally does not affect the PGW-U's ability to continue processing services. This lets services continue, avoids sudden service interruption, and improves the user experience.
S64. The SGW-U learns that the state of the PGW-U is faulty.
The SGW-U may learn whether the PGW-U is faulty through a mechanism such as heartbeat detection; details are not repeated here.
S65. The SGW-U sends second probe information to the MME. The second probe information is generated by the SGW-U according to its detection result; that is, it indicates the state of the PGW-U as observed by the SGW-U.
If the SGW-U determines that the PGW-U is faulty, the second probe information may carry the identifier of the faulty PGW-U, and may additionally carry the identifier of the PGW-C associated with the faulty PGW-U. If the SGW-U determines that the PGW-U is not faulty, the SGW-U may omit both the PGW-U identifier and the associated PGW-C identifier from the second probe information. The MME may set a detection period and determine which PGW-Us are faulty from the faulty-PGW-U identifiers carried in the second probe information received from the SGW-U within that period. Alternatively, the SGW-U may not send the second probe information to the MME at all; if the MME receives no second probe information from the SGW-U within the detection period, it assumes by default that the SGW-U's detection result for the PGW-U is normal. Either way, the MME can learn the SGW-U's detection result.
The SGW-C determines the identifier of the PGW-C associated with the faulty PGW-U based on a locally built association between PGW-Cs and PGW-Us. For example, the SGW-C may build this association from the PGW-U information exchanged between the SGW-C and the PGW-C when the user-plane tunnel between the SGW-U and the PGW-U is created; alternatively, the PGW-C may send the association between the PGW-C and the PGW-U to the SGW-C. The SGW-C may also send the association to the MME.
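The locally built PGW-C/PGW-U association can be pictured as a lookup table that is filled in whenever a user-plane tunnel is created. The following Python sketch is hypothetical: the class and method names are invented for illustration, and it assumes each PGW-U is associated with one PGW-C.

```python
from typing import Dict, Optional


class PgwAssociation:
    """Hypothetical table an SGW-C might keep: PGW-U id -> PGW-C id."""

    def __init__(self) -> None:
        self._pgw_c_of: Dict[str, str] = {}

    def on_user_plane_tunnel_created(self, pgw_c_id: str, pgw_u_id: str) -> None:
        # Learned from the PGW-U information exchanged with the PGW-C
        # while the SGW-U/PGW-U user-plane tunnel is being set up.
        self._pgw_c_of[pgw_u_id] = pgw_c_id

    def pgw_c_for(self, pgw_u_id: str) -> Optional[str]:
        # Returns None when no association has been learned for this PGW-U.
        return self._pgw_c_of.get(pgw_u_id)
```

With such a table, the identifier of the PGW-C associated with a faulty PGW-U can be added to the probe information in a single lookup.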
After learning that the state of the PGW-U is faulty, the SGW-U may handle the local services associated with the PGW-U in different manners, and accordingly there are different manners in which the SGW-U sends the second probe information to the MME. These are briefly described below.
Manner 1:
In Manner 1, if a UP (for example, the SGW-U) learns that the state of another UP (for example, the PGW-U) is faulty, the UP locally releases the services associated with the other UP, and may carry the second probe information in the message that it sends to the MME as a result of the release.
For example, if the SGW-U learns that the state of the PGW-U is faulty, it may trigger release of the user service bearers associated with the PGW-U. The SGW-U sends a fault handling message, for example a UPlane Session Delete Request message, to the SGW-C over the Sxa interface. The SGW-C then sends a fault handling message, for example a Delete Bearer Request message, to the MME over the S11 interface, and another, for example a Delete Bearer Request message, to the PGW-C over the S5/S8 interface, so that the user service bearers associated with the faulty PGW-U are released. The SGW-U may extend the UPlane Session Delete Request message to carry the identifier of the faulty PGW-U, and the Delete Bearer Request message may correspondingly be extended to carry the identifier of the faulty PGW-U.
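The release signaling of Manner 1 can be laid out as a list of hops, each carrying the identifier of the faulty PGW-U. The Python sketch below only restates the interfaces and message names given in the text. The `Hop` tuple and `release_hops` function are hypothetical, and no real message encoding is implied.

```python
from typing import List, NamedTuple


class Hop(NamedTuple):
    interface: str
    sender: str
    receiver: str
    message: str
    faulty_pgw_u_id: str  # extended field carrying the detection result


def release_hops(faulty_pgw_u_id: str) -> List[Hop]:
    """List the messages triggered when the SGW-U detects a faulty PGW-U."""
    return [
        Hop("Sxa", "SGW-U", "SGW-C", "UPlane Session Delete Request", faulty_pgw_u_id),
        Hop("S11", "SGW-C", "MME", "Delete Bearer Request", faulty_pgw_u_id),
        Hop("S5/S8", "SGW-C", "PGW-C", "Delete Bearer Request", faulty_pgw_u_id),
    ]
```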
Manner 2:
In Manner 2, if a UP (for example, the SGW-U) learns that the state of another UP (for example, the PGW-U) is faulty, the UP may locally keep the services associated with the other UP running normally. This avoids forcing users offline and lets services continue. The UP may send the second probe information to the MME in an existing message, a newly added fault handling message, or the like. For example, when the SGW-U learns that the state of the PGW-U is faulty, the SGW-U locally keeps the services associated with the PGW-U running and sends the second probe information to the MME through the SGW-C.
S61-S62, S63, and S64-S65 can be regarded as three parts, and the three parts may be performed in any order.
S66. The MME determines the fault type according to the first probe information and the second probe information. The fault type is one of the following: only the PGW-C is faulty, only the PGW-U is faulty, or both the PGW-C and the PGW-U are faulty.
For example, the MME combines the faulty-PGW-C identifier carried in the first probe information with the faulty-PGW-U identifier carried in the second probe information to determine whether the PGW-C is faulty, the PGW-U is faulty, or both are faulty.
In a possible implementation, the MME determines that the PGW-C corresponding to the faulty-PGW-C identifier carried in the first probe information is faulty.
In a possible implementation, the MME determines that the PGW-U corresponding to the faulty-PGW-U identifier carried in the second probe information is faulty.
S67、MME进行故障恢复处理。S67. The MME performs fault recovery processing.
可能的实施方式中,若MME确定PGW-C故障,则MME可以采取如图2所示的实施例中的S26中介绍的第2种故障处理策略或第5种故障处理策略来进行故障恢复处理。In a possible implementation manner, if the MME determines that the PGW-C is faulty, the MME may adopt the second fault handling policy or the fifth fault handling policy introduced in S26 in the embodiment shown in FIG. 2 to perform fault recovery processing. .
In a possible implementation, if the MME determines that the PGW-U is faulty, the MME may perform fault recovery processing by using the first or the fifth fault handling policy described in S26 of the embodiment shown in FIG. 2.
In a possible implementation, if the MME determines that both PGW-Cs and PGW-Us are faulty and none of the faulty PGW-Cs is associated with any of the faulty PGW-Us, the MME may handle the PGW-C faults and the PGW-U faults separately, in the manner described above.
In a possible implementation, if the MME determines that both PGW-Cs and PGW-Us are faulty and the faulty PGW-Cs and PGW-Us include a PGW-C and a PGW-U that are associated with each other, then for the associated PGW-C and PGW-U the MME may use the fourth fault handling policy described in S26 of the embodiment shown in FIG. 2, namely the service release policy, to perform fault recovery processing. For example, the MME may locally release all services associated with the faulty, mutually associated PGW-C and/or PGW-U.
Here, the PGW-Us associated with a PGW-C may include the PGW-Us managed by that PGW-C.
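The policy selection in S67 can be sketched as below. This is a hedged illustration only: the function name `select_policies`, the `managed_by` mapping (PGW-C identifier to the set of PGW-U identifiers it manages), and the use of bare policy numbers are assumptions for the example. The numbers refer to the fault handling policies of S26, whose contents are not restated here.

```python
def select_policies(faulty_c, faulty_u, managed_by):
    """Return {network element id: tuple of candidate S26 policy numbers}.

    faulty_c: identifiers of faulty PGW-Cs.
    faulty_u: identifiers of faulty PGW-Us.
    managed_by: hypothetical map from a PGW-C id to the PGW-U ids it manages.
    """
    faulty_c = set(faulty_c)
    faulty_u = set(faulty_u)
    associated_c = set()
    associated_u = set()
    # A faulty PGW-C and a faulty PGW-U are "associated" if the PGW-C manages the PGW-U.
    for c in faulty_c:
        hit = set(managed_by.get(c, ())) & faulty_u
        if hit:
            associated_c.add(c)
            associated_u |= hit

    plan = {}
    for c in faulty_c:
        # Associated pair: policy 4 (release services); otherwise policy 2 or 5.
        plan[c] = (4,) if c in associated_c else (2, 5)
    for u in faulty_u:
        # Associated pair: policy 4 (release services); otherwise policy 1 or 5.
        plan[u] = (4,) if u in associated_u else (1, 5)
    return plan
```

The sketch treats each faulty network element independently unless it belongs to an associated faulty pair, which mirrors the four cases described above.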
With the technical solution provided in this embodiment of the present invention, after detecting a PGW-C fault, the SGW-C may temporarily refrain from starting fault service processing, and the PGW-U may likewise refrain after detecting a PGW-C fault. The MME combines the probe information sent by the SGW-C and the PGW-U and handles the following cases in different ways: a PGW-C fault, a PGW-U fault, faults of a PGW-C and a PGW-U that are not associated, and faults of a PGW-C and a PGW-U that are associated. This keeps the cost of service processing as low as possible, for example by reducing the likelihood of deactivating the services associated with a network element, and improves the service experience of the network.
The foregoing embodiments all use the network architecture shown in FIG. 1 as an example. In practice, the solutions of the embodiments of the present invention may also be applied to other network architectures. Any network architecture in which the control plane is separated from the user plane can use the technical solution provided in the embodiments of the present invention. Another network architecture is described below as an example.
Refer to FIG. 7, which is a schematic diagram of a network architecture to which an embodiment of the present invention is applied. FIG. 7 shows an SDN network architecture that includes an SDN controller and multiple switches. Three switches are used as an example in FIG. 7; in practice, the number of switches may be set as required. In FIG. 7, the signaling processing network element is implemented by the SDN controller, as an example.
For better understanding, the following describes how the technical solution provided in the embodiments of the present invention is applied in the network architecture shown in FIG. 7.
An embodiment of the present invention provides a fault determination and processing method. In this embodiment, the signaling processing network element determines through probing that the first user plane network element is faulty, and generates first probe information according to its probing of the first user plane network element. In addition, the second user plane network element learns through probing that the state of the first user plane network element is faulty, generates second probe information according to its probing of the first user plane network element, and sends the second probe information to the signaling processing network element. The signaling processing network element determines the fault type according to the first probe information and the second probe information. In the following description, the first user plane network element and the second user plane network element are both switches, and the signaling processing network element is an SDN controller, as an example. Because both user plane network elements are switches, the three switches in FIG. 7 are given different reference numerals to simplify the description below, namely switch 71, switch 72, and switch 73. For example, switch 71 is the first user plane network element in this embodiment, and switch 72 is the second user plane network element in this embodiment. Note that devices of the same type are given different reference numerals in FIG. 7 only for ease of description; this does not mean that the devices are of different types. See FIG. 8.
S81. The SDN controller determines that switch 71 is faulty and generates first probe information; that is, the first probe information indicates the state of switch 71 as obtained by the SDN controller.
The SDN controller probes switch 71 to determine whether switch 71 is faulty. The SDN controller may generate the first probe information according to the result of probing switch 71. The first probe information may indicate whether switch 71 is normal or faulty; in this embodiment, switch 71 being faulty is used as an example. When the SDN controller probes switch 71, it is probing the state of the control plane connection between the SDN controller and switch 71, that is, the state of the link between the SDN controller and switch 71. In other words, the first probe information may indicate whether the signaling connection between the SDN controller and switch 71 is normal or faulty.
In this embodiment of the present invention, after determining that switch 71 is faulty, the SDN controller locally keeps the services associated with switch 71 running normally.
After determining that switch 71 is faulty, the SDN controller may mark switch 71 as faulty but still keep the services associated with switch 71 running normally. Whether the fault lies in switch 71 itself or in the link can be determined in combination with other information. Therefore, while the location of the fault is uncertain, the SDN controller may keep the services associated with switch 71 running, so that services continue, sudden service interruption is avoided, and user experience is improved.
S82. Switch 72 learns that the state of switch 71 is faulty.
Switch 72 probes switch 71 to determine whether switch 71 is faulty. When switch 72 probes switch 71, it is probing the state of the user plane connection between switch 72 and switch 71.
S83. Switch 72 sends second probe information to the SDN controller, where the second probe information is generated by switch 72 according to the probing result; that is, the second probe information indicates the state of switch 71 as obtained by switch 72.
Switch 72 may generate the second probe information according to the result of probing switch 71. The second probe information may indicate whether switch 71 is normal or faulty; in this embodiment, switch 71 being faulty is used as an example. In other words, the second probe information may indicate whether the user plane connection between switch 71 and switch 72 is normal or faulty. Switch 72 may also probe switches other than switch 71, for example switch 73. Therefore, in addition to indicating whether switch 71 is normal or faulty, the second probe information may also indicate whether switch 73 is normal or faulty. This is not limited in this embodiment of the present invention.
S81 and S82-S83 may be regarded as two parts, and these two parts may be performed in any order.
S84. The SDN controller determines the fault type according to the first probe information and the second probe information, where the fault type is either a fault of switch 71 or a fault of the link between the SDN controller and switch 71.
For example, the SDN controller combines the identifier of the faulty switch carried in the first probe information with the identifier of the faulty switch carried in the second probe information to determine whether switch 71 is faulty or the link between switch 71 and the SDN controller is faulty.
In a possible implementation, if the identifiers of faulty switches carried in the second probe information include the identifier of the faulty switch carried in the first probe information, that is, the first probe information carries the identifier of faulty switch 71 and the second probe information also carries the identifier of faulty switch 71, the SDN controller may determine that switch 71 is faulty.
In a possible implementation, if the identifiers of faulty switches carried in the second probe information do not include the identifier of the faulty switch carried in the first probe information, that is, the first probe information carries the identifier of faulty switch 71 but the second probe information does not carry the identifier of switch 71, the SDN controller may determine that switch 71 is normal and that the link between switch 71 and the SDN controller is faulty.
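The comparison in S84 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `classify_switch_fault`, the string labels, and the modeling of each probe information as a set of faulty-switch identifiers are all hypothetical choices for illustration.

```python
def classify_switch_fault(first_probe_ids, second_probe_ids):
    """Decide, for each switch the controller could not reach, where the fault lies.

    first_probe_ids: switch ids reported faulty by the SDN controller
                     (control plane probing, first probe information).
    second_probe_ids: switch ids reported faulty by a peer switch
                      (user plane probing, second probe information).
    """
    result = {}
    for switch_id in first_probe_ids:
        if switch_id in second_probe_ids:
            # Both the controller and the peer switch see the fault:
            # the switch itself is considered faulty.
            result[switch_id] = "switch_fault"
        else:
            # Only the controller sees the fault: the switch is considered
            # normal and the controller-to-switch link is considered faulty.
            result[switch_id] = "link_fault"
    return result
```

Under these assumptions, a switch that appears in both probes is classified as faulty, while a switch that appears only in the controller's probe is attributed to a link fault.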
S85. The SDN controller performs fault recovery processing.
In a possible implementation, if the SDN controller determines that switch 71 itself is faulty, rather than the link between switch 71 and the SDN controller, the SDN controller may select a new switch to take over from the faulty switch 71, for example switch 73. The forwarding table of the faulty switch 71 needs to be installed on switch 73, and the forwarding tables of the upstream and downstream switches of switch 71 need to be modified so that those switches forward traffic through the newly selected switch 73.
In a possible implementation, if the SDN controller determines that switch 71 is normal and that the link between switch 71 and the SDN controller is faulty, the SDN controller may wait for the SDN control connection interface to recover and continue to let user plane data flows be transmitted through switch 71.
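The two recovery branches of S85 can be sketched as below. This is only an illustrative sketch: the `recover` function, the fault labels, and the modeling of each forwarding table as a `{flow: next_hop}` dictionary are assumptions made for the example and do not reflect a real SDN controller API.

```python
def recover(fault_type, failed, standby, tables):
    """Return the forwarding tables after recovery.

    fault_type: "switch_fault" or "link_fault" (hypothetical labels).
    failed: id of the switch reported faulty (for example, switch 71).
    standby: id of the switch selected to take over (for example, switch 73).
    tables: hypothetical map {switch id: {flow id: next-hop switch id}}.
    """
    if fault_type == "link_fault":
        # The switch is normal: leave forwarding untouched and wait for
        # the control connection to recover.
        return tables

    # Switch fault: drop the failed switch and copy every other table.
    new_tables = {sw: dict(t) for sw, t in tables.items() if sw != failed}
    # Install the failed switch's forwarding table on the standby switch.
    new_tables[standby] = dict(tables.get(failed, {}))
    # Redirect neighbours that forwarded through the failed switch.
    for table in new_tables.values():
        for flow, next_hop in table.items():
            if next_hop == failed:
                table[flow] = standby
    return new_tables
```

The sketch only rewrites next hops; a real deployment would also have to handle flows already in transit, which the embodiment does not detail.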
The technical solution provided in this embodiment of the present invention makes it possible to distinguish a network element fault from a link fault, so that different measures can be taken in each case. This helps keep user plane services running, reduces the likelihood of sudden service interruption, and improves network performance.
The devices provided in the embodiments of the present invention are described below with reference to the accompanying drawings.
FIG. 9 is a schematic diagram of a computer device 100 according to an embodiment of the present invention. The computer device 100 includes at least one processor 101, a communication bus 102, a memory 103, and at least one communication interface 104. In the embodiments of the present invention, devices such as the signaling processing network element or the first network element may each be implemented by the computer device 100 shown in FIG. 9. The first network element may be a control plane network element (such as the first control plane network element or the second control plane network element) or a user plane network element (such as the first user plane network element or the second user plane network element), and the second network element may likewise be a control plane network element or a user plane network element. The first network element learns through probing that the state of the second network element is faulty and generates probe information according to its probing of the second network element. The probe information may include the first probe information or the second probe information, and the first network element sends the generated probe information to the signaling processing network element. For which network elements the first network element and the second network element actually are, refer to the description of the method provided in any one of FIG. 2 to FIG. 6 or FIG. 8.
The processor 101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the solution of the present invention.
The communication bus 102 may include a path for transferring information between the foregoing components. The communication interface 104 uses any transceiver-type apparatus to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 103 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without being limited thereto. The memory 103 may exist independently and be connected to the processor 101 through the bus, or may be integrated with the processor 101.
The memory 103 is configured to store the application program code for executing the solution of the present invention, and execution is controlled by the processor 101. The processor 101 is configured to execute the application program code stored in the memory 103. If the signaling processing network element, a control plane network element, or a user plane network element is implemented by the computer device 100, its memory 103 may store one or more software modules, and the network element may run the stored software modules through the processor 101 and the program code in the memory 103 to determine or handle faults.
In a specific implementation, as an embodiment, the processor 101 may include one or more CPUs, for example CPU0 and CPU1 in FIG. 9.
In a specific implementation, as an embodiment, the computer device 100 may include multiple processors 101, for example the first processor 1011 and the second processor 1012 in FIG. 9. The first processor 1011 and the second processor 1012 have different names and reference numerals only to distinguish between the multiple processors 101. Each of these processors 101 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor 101 here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
The computer device 100 may be a general-purpose computer device or a special-purpose computer device. In a specific implementation, the computer device 100 may be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, a communication device, an embedded device, or a device with a structure similar to that in FIG. 9. The embodiments of the present invention do not limit the type of the computer device 100.
Referring to FIG. 10, an embodiment of the present invention provides a signaling processing network element, which includes a receiving unit 1001 and a processing unit 1002.
Optionally, the signaling processing network element may further include a sending unit 1003, which is also shown in FIG. 10. The sending unit 1003 is an optional functional unit and is drawn with a dashed line in FIG. 10 to distinguish it from the mandatory functional units.
In practice, the physical device corresponding to the receiving unit 1001 and the sending unit 1003 may include the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1002 may be the processor 101 in FIG. 9. Among the communication interfaces 104 in FIG. 9, some may implement the function of the receiving unit 1001 while others implement the function of the sending unit 1003; alternatively, each communication interface 104 may implement the functions of both the receiving unit 1001 and the sending unit 1003.
The signaling processing network element may be configured to perform the method provided in the embodiment shown in any one of FIG. 2 to FIG. 4; for example, it may be the signaling processing network element described above. For the functions implemented by the units of the signaling processing network element, refer to the description in the foregoing method part; details are not repeated here.
Referring to FIG. 11, an embodiment of the present invention provides a signaling processing network element, which includes a receiving unit 1101 and a processing unit 1102.
Optionally, the signaling processing network element may further include a sending unit 1103, which is also shown in FIG. 11. The sending unit 1103 is an optional functional unit and is drawn with a dashed line in FIG. 11 to distinguish it from the mandatory functional units.
In practice, the physical device corresponding to the receiving unit 1101 and the sending unit 1103 may include the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1102 may be the processor 101 in FIG. 9. Among the communication interfaces 104 in FIG. 9, some may implement the function of the receiving unit 1101 while others implement the function of the sending unit 1103; alternatively, each communication interface 104 may implement the functions of both the receiving unit 1101 and the sending unit 1103.
The signaling processing network element may be configured to perform the method provided in the embodiment shown in FIG. 5 or FIG. 6; for example, it may be the signaling processing network element described above. For the functions implemented by the units of the signaling processing network element, refer to the description in the foregoing method part; details are not repeated here.
Referring to FIG. 12, an embodiment of the present invention provides an SDN controller, which includes a receiving unit 1201 and a processing unit 1202.
In practice, the physical device corresponding to the receiving unit 1201 may be the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1202 may be the processor 101 in FIG. 9. Among the communication interfaces 104 in FIG. 9, some may implement the function of the receiving unit 1201 while others implement the function of sending data; alternatively, each communication interface 104 may implement both the function of the receiving unit 1201 and the function of sending data.
The SDN controller, acting as the signaling processing network element, may be configured to perform the method provided in the embodiment shown in FIG. 8; for example, it may be the SDN controller described in the embodiment shown in FIG. 7 or FIG. 8. For the functions implemented by its units, refer to the description in the foregoing method part; details are not repeated here.
Referring to FIG. 13, an embodiment of the present invention provides a network element. The network element is a first network element and includes a sending unit 1301 and a processing unit 1302.
In practice, the physical device corresponding to the sending unit 1301 may be the communication interface 104 in FIG. 9, and the physical device corresponding to the processing unit 1302 may be the processor 101 in FIG. 9. Among the communication interfaces 104 in FIG. 9, some may implement the function of the sending unit 1301 while others implement the function of receiving data; alternatively, each communication interface 104 may implement both the function of the sending unit 1301 and the function of receiving data.
The network element may be configured to perform the method provided in the embodiment shown in any one of FIG. 2 to FIG. 6 or FIG. 8; for example, it may be the first control plane network element, the second control plane network element, the first user plane network element, or the second user plane network element described above. For the functions implemented by the units of the network element, refer to the description in the foregoing method part; details are not repeated here.
In the embodiments of the present invention, because the signaling processing network element can receive information sent by multiple network elements, the task of fault determination is assigned to the signaling processing network element. The signaling processing network element can combine the received first probe information and second probe information to determine whether the fault is a user plane network element fault, a control plane network element fault, or a fault of the link between a control plane network element and a user plane network element. Because the determination takes information from multiple sources into account rather than from a single network element, it is more accurate. A network element fault can then be handled as a network element fault, and a link fault as a link fault. This avoids interrupting the services of a fault-free UP where possible and preserves service continuity; when a UP is faulty, its services can be restored as quickly as possible with minimal impact on the user's service experience.
In the present invention, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical or in other forms.
An embodiment of the present invention further provides a computer storage medium. The computer storage medium may store a program, and when the program is executed, it performs some or all of the steps of any one of the fault processing methods described in the foregoing method embodiments.
The functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be an independent physical module.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the technical solution of the present invention may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (for example, a personal computer, a server, or a network device) or a processor to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a universal serial bus (USB) flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
以上所述,以上实施例仅用以对本发明的技术方案进行了详细介绍,但以上实施例的说明只是用于帮助理解本发明实施例的方法,不应理解为对本发明实施例的限制。本技术领域的技术人员可轻易想到的变化或替换,都应涵盖在本发明实施例的保护范围之内。 The above embodiments are only used to describe the technical solutions of the present invention in detail, but the description of the above embodiments is only for the purpose of facilitating the understanding of the embodiments of the present invention, and should not be construed as limiting the embodiments of the present invention. Variations or substitutions that may be readily conceived by those skilled in the art are intended to be included within the scope of the present invention.

Claims (24)

  1. A fault processing method, comprising:
    receiving, by a signaling processing network element, first probe information sent by a first control plane network element, wherein the first probe information indicates a state of a first user plane network element as obtained by the first control plane network element;
    receiving, by the signaling processing network element, second probe information sent by a second user plane network element, wherein the second probe information indicates the state of the first user plane network element as obtained by the second user plane network element; and
    determining, by the signaling processing network element, a fault type according to the first probe information and the second probe information, wherein the fault type comprises a fault of the first user plane network element, or a fault of the link between the first control plane network element and the first user plane network element.
  2. The method according to claim 1, wherein the determining, by the signaling processing network element, of a fault type according to the first probe information and the second probe information comprises:
    if both the first probe information and the second probe information indicate that the first user plane network element is faulty, determining, by the signaling processing network element, that the fault type is a fault of the first user plane network element; or
    if the first probe information indicates that the first user plane network element is faulty and the second probe information indicates that the first user plane network element is normal, determining, by the signaling processing network element, that the fault type is that the first user plane network element is normal and the link between the first control plane network element and the first user plane network element is faulty.
  3. The method according to claim 2, further comprising, after the signaling processing network element determines that the fault type is a fault of the link between the first control plane network element and the first user plane network element:
    reselecting, by the signaling processing network element, a control plane network element for the first user plane network element, and sending an identifier of the first user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the first user plane network element; or
    instructing, by the signaling processing network element, the first control plane network element to wait for link recovery.
  4. The method according to any one of claims 1 to 3, wherein:
    the first control plane network element is a control plane serving gateway, the second user plane network element is a base station, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity; or
    the first control plane network element is a control plane packet data network gateway, the second user plane network element is a user plane serving gateway, the first user plane network element is a user plane packet data network gateway, and the signaling processing network element is a mobility management entity; or
    the first control plane network element is a control plane serving gateway, the second user plane network element is a user plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity.
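The decision rule of claim 2 and the two recovery options of claim 3 can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the names `State`, `FaultType`, `classify_fault`, and `recover_from_link_fault` are hypothetical, the `UNDETERMINED` result covers probe combinations the claims do not address, and the policy of preferring reassignment over waiting is an assumption, since claim 3 presents the two options only as alternatives.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    FAULTY = "faulty"


class FaultType(Enum):
    USER_PLANE_FAULT = "first user plane network element fault"
    LINK_FAULT = "control plane to user plane link fault"
    UNDETERMINED = "undetermined"  # assumption: case not covered by the claims


def classify_fault(first_probe: State, second_probe: State) -> FaultType:
    """Apply the rule of claim 2.

    first_probe:  state of the first user plane NE as seen by the first control plane NE.
    second_probe: state of the first user plane NE as seen by the second user plane NE.
    """
    if first_probe is State.FAULTY and second_probe is State.FAULTY:
        # Both observers agree: the user plane NE itself has failed.
        return FaultType.USER_PLANE_FAULT
    if first_probe is State.FAULTY and second_probe is State.NORMAL:
        # Only the control plane NE cannot reach it: the link has failed.
        return FaultType.LINK_FAULT
    return FaultType.UNDETERMINED


def recover_from_link_fault(user_plane_id, candidates, current_control_plane):
    """Choose between the two options of claim 3 (selection policy is assumed)."""
    for candidate in candidates:
        if candidate != current_control_plane:
            # Option 1: hand the user plane NE identifier to another control plane NE.
            return ("reassign", candidate, user_plane_id)
    # Option 2: no alternative available, so wait for the link to recover.
    return ("wait_for_link_recovery", current_control_plane, user_plane_id)
```

With the first role mapping of claim 4, `first_probe` would come from the control plane serving gateway and `second_probe` from the base station, with the mobility management entity running `classify_fault`.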
  5. A fault processing method, comprising:
    obtaining, by a signaling processing network element, first probe information, wherein the first probe information indicates a state of a first control plane network element;
    receiving, by the signaling processing network element, second probe information sent by a first user plane network element, wherein the second probe information indicates a state of a second user plane network element as obtained by the first user plane network element;
    determining, by the signaling processing network element, a fault type according to the first probe information and the second probe information, wherein the fault type comprises a fault of only the first control plane network element, a fault of only the second user plane network element, or a fault of both the first control plane network element and the second user plane network element; and
    if both the first control plane network element and the second user plane network element are faulty and the first control plane network element manages the second user plane network element, releasing, by the signaling processing network element, the services associated with the first control plane network element and/or the second user plane network element.
  6. The method according to claim 5, wherein the determining, by the signaling processing network element, of a fault type according to the first probe information and the second probe information comprises:
    if the first probe information indicates that the first control plane network element is faulty, determining, by the signaling processing network element, that the first control plane network element is faulty; or
    if the second probe information indicates that the second user plane network element is faulty, determining, by the signaling processing network element, that the second user plane network element is faulty.
  7. The method according to claim 6, further comprising, after the signaling processing network element determines the fault type:
    if only the first control plane network element is faulty, reselecting, by the signaling processing network element, a control plane network element for the second user plane network element, and sending an identifier of the second user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the second user plane network element.
  8. The method according to any one of claims 5 to 7, wherein the obtaining, by the signaling processing network element, of first probe information comprises:
    receiving, by the signaling processing network element, the first probe information sent by a second control plane network element, wherein the first probe information indicates the state of the first control plane network element as obtained by the second control plane network element; or
    probing, by the signaling processing network element, the first control plane network element, and generating the first probe information according to the probe result.
  9. The method according to claim 8, wherein:
    the signaling processing network element is a mobility management entity, the first control plane network element is a control plane serving gateway, the first user plane network element is a base station, and the second user plane network element is a user plane serving gateway; or
    the signaling processing network element is a mobility management entity, the second control plane network element is a control plane serving gateway, the first control plane network element is a control plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the second user plane network element is a user plane packet data network gateway.
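The classification and follow-up actions of claims 5 to 7 can be summarized in a short Python sketch. The function name `handle_probes`, the string labels, and the boolean inputs are all hypothetical; the `"none"` action marks cases for which these claims specify no handling, and is an assumption rather than something stated in the source.

```python
from typing import Tuple


def handle_probes(cp_faulty: bool, up_faulty: bool, cp_manages_up: bool) -> Tuple[str, str]:
    """Return (fault_type, action) for the signaling processing network element.

    cp_faulty:     first probe information reports the first control plane NE as faulty.
    up_faulty:     second probe information reports the second user plane NE as faulty.
    cp_manages_up: the first control plane NE manages the second user plane NE.
    """
    if cp_faulty and up_faulty:
        # Claim 5: release associated services only when the faulty control
        # plane NE is the one managing the faulty user plane NE.
        action = "release_services" if cp_manages_up else "none"
        return ("both_faulty", action)
    if cp_faulty:
        # Claim 7: choose another control plane NE and send it the
        # identifier of the second user plane NE.
        return ("control_plane_only", "reselect_control_plane")
    if up_faulty:
        # Claims 5 to 7 name this fault type but specify no action for it.
        return ("user_plane_only", "none")
    return ("no_fault", "none")
```

Keeping classification and action in one return value makes the dependency explicit: the release step of claim 5 is reached only through the "both faulty" branch.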
  10. A fault processing method, comprising:
    probing, by a software-defined network (SDN) controller, a first switch to obtain first probe information;
    receiving, by the SDN controller, second probe information sent by a second switch, wherein the second probe information indicates a state of the first switch as obtained by the second switch; and
    determining, by the SDN controller, a fault type according to the first probe information and the second probe information, wherein the fault type comprises a fault of the first switch, or a fault of the link between the SDN controller and the first switch.
  11. A fault processing method, comprising:
    learning, by a first network element through probing, that a state of a second network element is faulty; and
    generating, by the first network element, probe information according to the probing of the second network element, and sending the probe information to a signaling processing network element, wherein the probe information carries an identifier of the second network element and is used to determine a fault type.
  12. The method according to claim 11, wherein the first network element is a control plane network element or a user plane network element, and the second network element is a control plane network element or a user plane network element.
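The reporting side described in claims 11 and 12 can be illustrated with a minimal Python sketch. The claims require only that the probe information carry the identifier of the second network element; the `ProbeReport` fields, the JSON encoding, the `reporter_id` field, and the injected `is_reachable` callback are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Optional


@dataclass
class ProbeReport:
    reporter_id: str   # assumption: identifies the probing (first) network element
    target_id: str     # identifier of the probed (second) network element
    target_state: str  # "faulty" in the case covered by claim 11


def probe_and_report(reporter_id: str, target_id: str,
                     is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Probe the target and build the message for the signaling processing NE.

    Returns the encoded probe information when the target is found faulty,
    or None when the probe succeeds (claim 11 covers only the faulty case).
    """
    if is_reachable(target_id):
        return None
    report = ProbeReport(reporter_id, target_id, "faulty")
    return json.dumps(asdict(report), sort_keys=True)
```

For example, a base station acting as the first network element could call `probe_and_report("eNB-1", "SGW-U-1", probe_fn)` and forward any non-`None` result to the mobility management entity; the identifiers shown here are placeholders.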
  13. A signaling processing network element, comprising:
    a receiving unit, configured to receive first probe information sent by a first control plane network element and to receive second probe information sent by a second user plane network element, wherein the first probe information indicates a state of a first user plane network element as obtained by the first control plane network element, and the second probe information indicates the state of the first user plane network element as obtained by the second user plane network element; and
    a processing unit, configured to determine a fault type according to the first probe information and the second probe information, wherein the fault type comprises a fault of the first user plane network element, or a fault of the link between the first control plane network element and the first user plane network element.
  14. The signaling processing network element according to claim 13, wherein the processing unit is configured to:
    if both the first probe information and the second probe information indicate that the first user plane network element is faulty, determine that the fault type is a fault of the first user plane network element; or
    if the first probe information indicates that the first user plane network element is faulty and the second probe information indicates that the first user plane network element is normal, determine that the fault type is that the first user plane network element is normal and the link between the first control plane network element and the first user plane network element is faulty.
  15. The signaling processing network element according to claim 14, further comprising a sending unit, wherein the processing unit is further configured to:
    after determining that the fault type is a fault of the link between the first control plane network element and the first user plane network element, reselect a control plane network element for the first user plane network element, and send, through the sending unit, an identifier of the first user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the first user plane network element; or
    after determining that the fault type is a fault of the link between the first control plane network element and the first user plane network element, instruct the first control plane network element to wait for link recovery.
  16. The signaling processing network element according to any one of claims 13 to 15, wherein:
    the first control plane network element is a control plane serving gateway, the second user plane network element is a base station, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity; or
    the first control plane network element is a control plane packet data network gateway, the second user plane network element is a user plane serving gateway, the first user plane network element is a user plane packet data network gateway, and the signaling processing network element is a mobility management entity; or
    the first control plane network element is a control plane serving gateway, the second user plane network element is a user plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the signaling processing network element is a mobility management entity.
  17. A signaling processing network element, comprising:
    a processing unit, configured to obtain first probe information, wherein the first probe information indicates a state of a first control plane network element; and
    a receiving unit, configured to receive second probe information sent by a first user plane network element, wherein the second probe information indicates a state of a second user plane network element as obtained by the first user plane network element;
    wherein the processing unit is further configured to: determine a fault type according to the first probe information and the second probe information, wherein the fault type comprises a fault of only the first control plane network element, a fault of only the second user plane network element, or a fault of both the first control plane network element and the second user plane network element; and, if both the first control plane network element and the second user plane network element are faulty and the first control plane network element manages the second user plane network element, release the services associated with the first control plane network element and/or the second user plane network element.
  18. The signaling processing network element according to claim 17, wherein the processing unit being configured to determine a fault type according to the first probe information and the second probe information comprises:
    if the first probe information indicates that the first control plane network element is faulty, determining that the first control plane network element is faulty; or
    if the second probe information indicates that the second user plane network element is faulty, determining that the second user plane network element is faulty.
  19. The signaling processing network element according to claim 18, further comprising a sending unit, wherein the processing unit is further configured to:
    after determining the fault type, if only the first control plane network element is faulty, reselect a control plane network element for the second user plane network element, and send, through the sending unit, an identifier of the second user plane network element to the reselected control plane network element, so that the reselected control plane network element manages the second user plane network element.
  20. The signaling processing network element according to any one of claims 17 to 19, wherein the processing unit being configured to obtain first probe information comprises:
    obtaining the first probe information that is sent by a second control plane network element and received by the receiving unit, wherein the first probe information indicates the state of the first control plane network element as obtained by the second control plane network element; or
    probing the first control plane network element, and generating the first probe information according to the probe result.
  21. The signaling processing network element according to claim 20, wherein:
    the signaling processing network element is a mobility management entity, the first control plane network element is a control plane serving gateway, the first user plane network element is a base station, and the second user plane network element is a user plane serving gateway; or
    the signaling processing network element is a mobility management entity, the second control plane network element is a control plane serving gateway, the first control plane network element is a control plane packet data network gateway, the first user plane network element is a user plane serving gateway, and the second user plane network element is a user plane packet data network gateway.
  22. A software-defined network (SDN) controller, comprising:
    a processing unit, configured to probe a first switch to obtain first probe information; and
    a receiving unit, configured to receive second probe information sent by a second switch, wherein the second probe information indicates a state of the first switch as obtained by the second switch;
    wherein the processing unit is further configured to determine a fault type according to the first probe information and the second probe information, wherein the fault type comprises a fault of the first switch, or a fault of the link between the SDN controller and the first switch.
  23. A network element, comprising:
    a processing unit, configured to learn through probing that a state of a second network element is faulty, and to generate probe information according to the probing of the second network element; and
    a sending unit, configured to send the probe information to a signaling processing network element, wherein the probe information carries an identifier of the second network element and is used to determine a fault type.
  24. The network element according to claim 23, wherein the network element is a control plane network element or a user plane network element, and the second network element is a control plane network element or a user plane network element.
PCT/CN2016/101284 2016-09-30 2016-09-30 Fault processing method and device WO2018058618A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/101284 WO2018058618A1 (en) 2016-09-30 2016-09-30 Fault processing method and device

Publications (1)

Publication Number Publication Date
WO2018058618A1 true WO2018058618A1 (en) 2018-04-05

Family

ID=61763552

Country Status (1)

Country Link
WO (1) WO2018058618A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102137487A (en) * 2010-12-31 2011-07-27 华为技术有限公司 Method and equipment for selecting service gateway
CN102821415A (en) * 2012-08-21 2012-12-12 华为技术有限公司 Fault detecting and processing method and fault detecting and processing device
CN103139820A (en) * 2013-03-12 2013-06-05 华为技术有限公司 Link detection method and network elements
EP2615874A1 (en) * 2010-09-08 2013-07-17 Huawei Technologies Co., Ltd. Paging processing method and system, serving gateway

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020200136A1 (en) * 2019-03-29 2020-10-08 华为技术有限公司 Gateway selection system and method
CN111757401A (en) * 2019-03-29 2020-10-09 华为技术有限公司 Gateway selection system and method
CN111757401B (en) * 2019-03-29 2021-12-14 华为技术有限公司 Gateway selection system and method
CN112867039A (en) * 2019-11-28 2021-05-28 大唐移动通信设备有限公司 User plane network element fault processing method and device
WO2021136047A1 (en) * 2019-12-31 2021-07-08 华为技术有限公司 Fault recovery method and apparatus for gateway
WO2022012598A1 (en) * 2020-07-17 2022-01-20 华为技术有限公司 Communication method and apparatus
CN112019378A (en) * 2020-08-04 2020-12-01 中国联合网络通信集团有限公司 Troubleshooting method and device
CN113242141A (en) * 2021-03-31 2021-08-10 联想(北京)有限公司 Fault detection method and device for user plane network element
CN113242141B (en) * 2021-03-31 2022-07-26 联想(北京)有限公司 Fault detection method and device for user plane network element
CN115460635A (en) * 2021-06-08 2022-12-09 中国移动通信集团重庆有限公司 Fault detection method, device, equipment and computer storage medium
CN113992557A (en) * 2021-09-10 2022-01-28 新华三信息安全技术有限公司 Message processing method and device
CN116367204A (en) * 2023-05-31 2023-06-30 阿里巴巴(中国)有限公司 User equipment service processing method, electronic equipment, storage medium and system
CN116367204B (en) * 2023-05-31 2023-09-12 阿里巴巴(中国)有限公司 User equipment service processing method, electronic equipment, storage medium and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16917368

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16917368

Country of ref document: EP

Kind code of ref document: A1