WO2023155468A1 - Method and device for determining a root cause failure - Google Patents

Method and device for determining a root cause failure

Info

Publication number
WO2023155468A1
Authority
WO
WIPO (PCT)
Prior art keywords
objects
alarm information
network element
association relationship
root cause
Prior art date
Application number
PCT/CN2022/127162
Other languages
English (en)
French (fr)
Inventor
王姗姗
丁鼎
具睿
杨永军
王海明
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP22926772.9A (EP4451729A1, en)
Priority to KR1020247026778A (KR20240134185A, ko)
Publication of WO2023155468A1 (zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/065 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/064 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/085 Retrieval of network configuration; Tracking network configuration history
    • H04L41/0853 Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/04 Arrangements for maintaining operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W88/00 Devices specially adapted for wireless communication networks, e.g. terminals, base stations or access point devices
    • H04W88/08 Access point devices
    • H04W88/085 Access point devices with remote components

Definitions

  • the present disclosure relates to the technical field of communications, and in particular to a method and device for determining a root cause failure.
  • in a wireless communication network, such as a mobile communication network, the services supported by the network are becoming increasingly diverse, so more and more requirements need to be met.
  • the network needs to be able to support ultra-high speed, ultra-low latency, and/or very large connections.
  • This feature makes network planning, network configuration, and/or resource scheduling increasingly complex.
  • the present disclosure provides a method and device for determining root cause failures, so as to determine root cause failures of multiple alarm information, so that work orders can be dispatched only for root cause failures, reducing operation and maintenance costs.
  • a method for determining the root cause of a fault is provided.
  • the method is executed by a first network element; the first network element may be a CU, DU, RU, or SMOF, etc., or may be a component (a processor, a chip, or others) in the first network element.
  • the first network element can determine the root cause failure of the alarm information of multiple objects, so that only the alarm information corresponding to the root cause failure is reported to the operator's network management system and a work order is dispatched for it. No work order is dispatched for alarm information other than the root cause failure, which reduces operation and maintenance costs.
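  • the determining the root cause failure of the alarm information of the N objects according to the object association relationship includes: determining the root cause failure of the alarm information of the N objects according to the object association relationship and the generation time of the alarm information of the N objects.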
  • the determining the root cause failure of the alarm information of the N objects according to the object association relationship and the generation time of the alarm information of the N objects includes: determining X association relationship sets according to the object association relationship; for each association relationship set: determining, according to the generation time of the alarm information of the objects included in the association relationship set, L objects in the association relationship set, where the difference between the generation times of the alarm information of the L objects is less than (or, less than or equal to) a threshold; and determining the root cause failure of the alarm information of the L objects; wherein the root cause failure is the alarm information of at least one object among the L objects, and X and L are both positive integers.
  • each association relationship set includes at least one object with an association relationship; then, according to the generation time of the different alarm information, objects that do not meet the time condition are eliminated from the association relationship set. Finally, for each association relationship set, the alarm information of one or more objects is determined as the root cause failure, which ensures the accuracy of the determined root cause failure.
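The two-stage procedure above (association relationship sets first, then a time filter inside each set) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the object identifiers, the pair-list encoding of the object association relationship, the choice to measure time differences against the earliest alarm in each set, and the rule that picks the earliest alarm as the root cause are all hypothetical.

```python
from collections import defaultdict


def association_sets(objects, relations):
    """Split alarmed objects into X sets of directly or indirectly associated objects."""
    parent = {obj: obj for obj in objects}

    def find(obj):
        # Union-find lookup with path halving.
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    for a, b in relations:
        if a in parent and b in parent:  # ignore relations to objects without alarms
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for obj in objects:
        groups[find(obj)].add(obj)
    return list(groups.values())


def filter_by_time(group, gen_time, threshold):
    """Keep the L objects whose alarms lie within `threshold` seconds of the earliest one."""
    earliest = min(gen_time[obj] for obj in group)
    return {obj for obj in group if gen_time[obj] - earliest < threshold}


def pick_root_cause(group, gen_time):
    """Assumed selection rule: the earliest alarm in the set is the root cause."""
    return min(group, key=lambda obj: gen_time[obj])


# Hypothetical alarm generation times (seconds) and object association relationship.
gen_time = {"RU1.port": 10.0, "DU1.cell": 10.5, "CU1.cell": 11.0,
            "DU2.cell": 300.0, "RU9.fan": 12.0}
relations = [("RU1.port", "DU1.cell"), ("DU1.cell", "CU1.cell"),
             ("DU2.cell", "CU1.cell")]

root_causes = []
for group in association_sets(list(gen_time), relations):
    kept = filter_by_time(group, gen_time, threshold=5.0)
    root_causes.append(pick_root_cause(kept, gen_time))
```

In this example `DU2.cell` is associated with the first set but is eliminated by the time filter; how such leftover alarms are handled is outside the scope of the sketch.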
  • the determining the root cause failure of the alarm information of the N objects according to the object association relationship and the generation time of the alarm information of the N objects includes: determining P association relationship sets according to the object association relationship and the generation time of the alarm information of the N objects; for each association relationship set: determining the root cause failure of the alarm information of the Q objects included in the association relationship set; wherein the root cause failure is the alarm information of at least one object among the Q objects, the Q objects have an association relationship, the difference between the generation times of the alarm information of the Q associated objects is less than a threshold, and P and Q are both positive integers.
  • each determined association relationship set includes at least one object that has an association relationship, and the generation time of the alarm information of the objects included in the set satisfies the condition. For each association relationship set, the alarm information of one or more objects is determined as the root cause failure, which ensures the accuracy of the determined root cause failure.
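The variant above, in which the association relationship and the generation time jointly determine the P sets, might look like the following sketch. It assumes (hypothetically) that two objects fall into the same set only when they are associated and their alarm generation times differ by less than the threshold; the identifiers and data layout are illustrative and do not come from the disclosure.

```python
def joint_association_sets(gen_time, relations, threshold):
    """Build P sets from pairs that are both associated and close in time."""
    neighbours = {obj: set() for obj in gen_time}
    for a, b in relations:
        close_in_time = (a in neighbours and b in neighbours
                         and abs(gen_time[a] - gen_time[b]) < threshold)
        if close_in_time:
            neighbours[a].add(b)
            neighbours[b].add(a)

    visited, sets = set(), []
    for start in gen_time:
        if start in visited:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one connected component
            obj = stack.pop()
            if obj in component:
                continue
            component.add(obj)
            stack.extend(neighbours[obj] - component)
        visited |= component
        sets.append(component)
    return sets


# Hypothetical alarm generation times (seconds) and object association relationship.
gen_time = {"RU1.port": 10.0, "DU1.cell": 10.5, "CU1.cell": 11.0,
            "DU2.cell": 300.0, "RU9.fan": 12.0}
relations = [("RU1.port", "DU1.cell"), ("DU1.cell", "CU1.cell"),
             ("DU2.cell", "CU1.cell")]

sets = joint_association_sets(gen_time, relations, threshold=5.0)
```

Because the time check is applied pairwise along associations, a long chain of objects could span more than the threshold end to end; a stricter per-set check would be a different design choice.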
  • the N objects include N1 objects and N2 objects, N1 and N2 are both positive integers whose sum is equal to N, and the determining the alarm information of the N objects includes: detecting the alarm information of the N1 objects; and receiving the alarm information of the N2 objects from the second network element.
  • the first network element can detect the alarm information of N1 objects and receive the alarm information of N2 objects from the second network element, and then determine the root cause failure of the alarm information of the N1 objects and the N2 objects. Work orders are dispatched only for root cause failures and not for other alarm information, which reduces operation and maintenance costs.
  • the first network element is a centralized unit CU
  • the second network element is a distributed unit DU
  • the alarm information of the N1 objects includes the alarm information of the objects of the CU
  • the alarm information of the N2 objects includes at least one of the following: the alarm information of an object of the DU, the alarm information of an object of the radio unit RU, the alarm information of an object of the cloud resource corresponding to the RU, or the alarm information of an object of the cloud resource corresponding to the DU.
  • the first network element is a CU
  • the second network element is a DU.
  • the DU collects alarm information and reports the collected alarm information to the CU.
  • the alarm information collected by the DU includes alarm information of N2 objects.
  • the CU detects the alarm information of N1 objects; the CU can analyze the root causes of the alarm information of the N1 objects and the alarm information of the N2 objects, determine the root cause of the fault, and reduce operation and maintenance costs.
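As a rough illustration of how a CU might assemble the N alarms from its own N1 detected alarms and the N2 alarms reported by a DU, consider the sketch below. The `Alarm` record, its field names, and the rule of keeping the earliest alarm per object are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    object_id: str       # identifier of the alarmed object (hypothetical format)
    source_ne: str       # network element on which the alarm was raised
    generated_at: float  # alarm generation time in seconds


def determine_alarms(detected, received):
    """Combine the N1 locally detected alarms with the N2 alarms received from the DU."""
    by_object = {}
    for alarm in list(detected) + list(received):
        # Assumption: if an object is reported twice, keep its earliest alarm.
        current = by_object.get(alarm.object_id)
        if current is None or alarm.generated_at < current.generated_at:
            by_object[alarm.object_id] = alarm
    return list(by_object.values())


cu_detected = [Alarm("CU1.cell", "CU1", 11.0)]              # N1 = 1
du_reported = [Alarm("DU1.cell", "DU1", 10.5),
               Alarm("RU1.port", "RU1", 10.0)]              # N2 = 2
alarms = determine_alarms(cu_detected, du_reported)         # N = 3
```

The resulting list is what a root cause analysis step on the CU would take as input.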
  • the first network element is a DU
  • the second network element is an RU
  • the alarm information of the N1 objects includes the alarm information of the objects of the DU
  • the alarm information of the N2 objects includes at least one of the following: alarm information of the object of the RU, or alarm information of the object of the cloud resource corresponding to the RU.
  • the first network element is a DU
  • the second network element is an RU
  • the RU collects alarm information of N2 objects and reports them to the DU.
  • the DU performs root cause analysis on the detected alarm information of N1 objects and the received alarm information of N2 objects to determine the root cause of the fault and reduce operation and maintenance costs.
  • the first network element is RU, DU, or CU
  • the alarm information of the N1 objects includes the alarm information of the objects of the first network element
  • the second network element is the cloud resource corresponding to the first network element
  • the alarm information of the N2 objects includes the alarm information of the object of the cloud resource corresponding to the first network element.
  • the first network element is an RU, DU, or CU; the following takes the first network element being an RU as an example.
  • the cloud resource may report the alarm information to the RU.
  • after the RU detects the alarm information of its own N1 objects and receives the alarm information of the N2 objects corresponding to the RU reported by the cloud resource, it can perform root cause analysis to determine the root cause failure and reduce operation and maintenance costs.
  • the alarm information of each object is the first type of alarm information
  • the first type of alarm information includes at least one of the following: the identifier of the object, the identifier of the second network element, or an identifier of a network element related to the second network element.
  • the alarm information generated for the CU, DU, RU, and cloud resources is reported to the network element that manages them or has a corresponding relationship with them.
  • the alarm information generated or collected by the RU is reported to the DU that manages the RU or has a corresponding relationship with it
  • the alarm information generated or collected by the DU is reported to the CU that manages the DU or has a corresponding relationship with it.
  • the alarm information generated by the cloud resource Cloud can be reported to the network element corresponding to the alarm information.
  • the root cause analysis is performed by the CU, DU or RU to determine the root cause failure and reduce operation and maintenance costs.
  • the alarm information reported to the upper-layer network element, that is, the above-mentioned first type of alarm information, is further improved compared with existing alarm information: it may no longer include the fault cause, the fault identifier, etc. Instead, it includes an object identifier, an identifier of the corresponding network element, or an association relationship between different network elements.
  • the association relationship among different network elements can be easily obtained according to the association relationship among different network elements included in the alarm information, so as to improve the efficiency of root cause analysis.
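One possible in-memory shape for the first type of alarm information, and a way to read network element associations out of it, is sketched below. The class name, field names, and identifier strings are hypothetical; the disclosure only states which kinds of identifiers the alarm information may carry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FirstTypeAlarm:
    object_id: Optional[str] = None      # identifier of the alarmed object
    ne_id: Optional[str] = None          # identifier of the reporting (second) network element
    related_ne_id: Optional[str] = None  # identifier of a network element related to it

    def __post_init__(self):
        # The alarm carries "at least one of" the three identifiers.
        if not (self.object_id or self.ne_id or self.related_ne_id):
            raise ValueError("a first-type alarm needs at least one identifier")


def ne_associations(alarms):
    """Collect (network element, related network element) pairs carried in the alarms."""
    return {(a.ne_id, a.related_ne_id) for a in alarms if a.ne_id and a.related_ne_id}


alarms = [
    FirstTypeAlarm(object_id="RU1.port", ne_id="RU1", related_ne_id="DU1"),
    FirstTypeAlarm(object_id="DU1.cell", ne_id="DU1", related_ne_id="CU1"),
    FirstTypeAlarm(object_id="CU1.cell", ne_id="CU1"),
]
pairs = ne_associations(alarms)
```

Here the receiver can derive the RU1 to DU1 and DU1 to CU1 associations directly from the alarms, without a separate lookup.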
  • the method further includes: sending first indication information to a third network element, where the first indication information is used to indicate a root cause failure of the alarm information of the N objects.
  • the third network element can be an SMOF, etc.
  • the first network element can be a CU, DU, or RU, etc.
  • CU, DU or RU can report the determined root cause failure to SMOF.
  • the SMOF can report only the alarm information corresponding to the root cause failure to the operator's network management system, reducing operation and maintenance costs.
  • the method further includes: acquiring the object association relationship, where the object association relationship is indicated by a configuration file or a configuration message from a third network element.
  • the third network element can be the SMOF, and the SMOF can configure the association relationship of objects to the CU, DU, or RU.
  • SMOF can use big data or artificial intelligence to update the relationship between objects periodically or based on trigger conditions, and can synchronize the updated object relationship to CU, DU or RU.
  • the object association relationship can be flexibly configured or updated, and the accuracy of determining the root cause of the fault can be improved.
  • the acquiring the object association relationship includes: receiving a first configuration file or a first configuration message from the third network element, where the first configuration file or the first configuration message is used to indicate a first object association relationship; receiving a second configuration file or a second configuration message from the third network element, where the second configuration file or the second configuration message is used to indicate a second object association relationship; and determining the object association relationship according to the first object association relationship and the second object association relationship.
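The step of deriving one object association relationship from two configurations could be sketched as follows. The JSON layout with a `relations` key and the use of a set union as the combination rule are assumptions for illustration; the disclosure does not specify the configuration format or how the two relationships are combined.

```python
import json


def parse_association_config(text):
    """Parse a hypothetical JSON configuration into a set of undirected object pairs."""
    return {frozenset(pair) for pair in json.loads(text)["relations"]}


def determine_object_association(first, second):
    """Assumed combination rule: the union of both configured relationships."""
    return first | second


first_cfg = '{"relations": [["RU1.port", "DU1.cell"]]}'
second_cfg = '{"relations": [["DU1.cell", "CU1.cell"], ["DU1.cell", "RU1.port"]]}'

association = determine_object_association(
    parse_association_config(first_cfg),
    parse_association_config(second_cfg),
)
```

Storing each pair as a `frozenset` treats the association as undirected, so the pair that appears in both configurations is kept only once.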
  • the N objects include N1 objects, N2 objects, and N3 objects, N1, N2, and N3 are all positive integers whose sum is equal to N, and the determining the alarm information of the N objects includes: receiving alarm information from the first network element, where the alarm information includes the alarm information of the N1 objects; receiving alarm information from the second network element, where the alarm information includes the alarm information of the N2 objects; and receiving alarm information from the third network element, where the alarm information includes the alarm information of the N3 objects.
  • the fourth network element receives alarm information from the first network element, the second network element, and the third network element respectively, and performs root cause analysis on the alarm information of the three network elements to determine the root cause failure. Subsequently, only the root cause failure is reported to the operator's network management system and a work order is dispatched for it; no work order is dispatched for other alarm information, which reduces operation and maintenance costs.
  • the alarm information is the second type of alarm information
  • the second type of alarm information includes at least one of the following: an object identifier, an identifier of a corresponding network element, a fault identifier, or a fault cause.
  • a method for determining a root cause failure is provided. The method is executed by a second network element; the second network element is a cloud resource (Cloud), an RU, or a DU, etc., or is a component (a processor, a chip, or others) in the second network element. The method includes: sending alarm information of N2 objects to the first network element, where the alarm information is the first type of alarm information; for the N2 objects, the first type of alarm information of each object includes at least one of the following: an identifier of the object, an identifier of the second network element, or an identifier of a network element related to the second network element; wherein N2 is a positive integer.
  • a method for determining a root cause failure is provided. The method is executed by a third network element; the third network element is an SMOF, etc., or a component (a processor, a chip, or others) in the third network element. The method includes: receiving first indication information, where the first indication information is used to indicate the root cause failure of the alarm information of N objects; the root cause failure is the alarm information of M objects among the N objects, N is a positive integer greater than or equal to 2, and M is a positive integer less than or equal to N.
  • the first network element can report indication information of the analyzed root cause failure to the third network element, and the third network element can determine the root cause failure according to the indication information and report the alarm information corresponding to the root cause failure to the operator's network management system, reducing operation and maintenance costs.
  • the third network element may supplement and/or update the root cause fault analyzed by the first network element to improve the accuracy of determining the root cause fault.
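The exchange of first indication information might be modelled as below. The payload layout, the function names, the sample alarm texts, and the assumption that the third network element already holds the alarms keyed by object identifier are all hypothetical; only the constraints N >= 2 and 1 <= M <= N come from the text.

```python
def build_first_indication(alarmed_objects, root_cause_objects):
    """Build a hypothetical first-indication payload naming M root-cause objects out of N."""
    n, m = len(alarmed_objects), len(root_cause_objects)
    if n < 2:
        raise ValueError("N must be at least 2")
    if not 1 <= m <= n or not set(root_cause_objects) <= set(alarmed_objects):
        raise ValueError("root causes must be M of the N objects, with 1 <= M <= N")
    return {"root_cause_objects": sorted(root_cause_objects)}


def alarms_to_report(indication, alarms_by_object):
    """Third network element side: keep only the alarms named as root causes."""
    return {obj: alarms_by_object[obj] for obj in indication["root_cause_objects"]}


# Illustrative alarms only; the descriptions are invented for the example.
alarms_by_object = {
    "RU1.port": "optical port down",
    "DU1.cell": "cell unavailable",
    "CU1.cell": "cell unavailable",
}
indication = build_first_indication(list(alarms_by_object), ["RU1.port"])
reported = alarms_to_report(indication, alarms_by_object)
```

Only the `RU1.port` alarm would be forwarded for a work order in this example; the other two alarms are suppressed.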
  • the method further includes: sending a configuration file or a configuration message for indicating an object association relationship to the first network element.
  • the sending the configuration file or the configuration message used to indicate the object association relationship to the first network element includes: sending a first configuration file or a first configuration message to the first network element, where the first configuration file or the first configuration message is used to indicate a first object association relationship; and sending a second configuration file or a second configuration message to the first network element, where the second configuration file or the second configuration message is used to indicate a second object association relationship.
  • the N objects include N1 objects and N2 objects, both N1 and N2 are positive integers, and the sum of the two is equal to N; the N1 objects include an object of the centralized unit CU, and the N2 objects include at least one of the following: an object of a distributed unit DU, an object of a cloud resource corresponding to the DU, an object of a radio unit RU, or an object of a cloud resource corresponding to the RU; or, the N1 objects include an object of a DU, and the N2 objects include at least one of the following: an object of an RU, or an object of a cloud resource corresponding to the RU; or, the N1 objects include an object of a CU, an object of a DU, or an object of an RU, and the N2 objects include an object of the cloud resource corresponding to the CU, an object of the cloud resource corresponding to the DU, or an object of the cloud resource corresponding to the RU.
  • the N objects include N1 objects, N2 objects, and N3 objects, N1, N2, and N3 are all positive integers, and the sum of the three is equal to N; the N1 objects include CU objects, the N2 objects include DU objects, and the N3 objects include RU objects.
  • the device may be the first network element, a device configured in the first network element, or a device that can be used in conjunction with the first network element; the first network element may be a CU, DU, RU, or SMOF.
  • the device includes units in one-to-one correspondence with the methods/operations/steps/actions described in the first aspect, and each unit may be a hardware circuit, software, or a combination of a hardware circuit and software.
  • the device may include a processing unit and a transceiver unit, and the processing unit and the transceiver unit may perform corresponding functions in any design example of the first aspect above, specifically:
  • a processing unit configured to determine alarm information of N objects, where N is an integer greater than or equal to 2.
  • the processing unit is further configured to determine the root cause failure of the alarm information of the N objects according to the object association relationship; wherein the root cause failure is the alarm information of M objects among the N objects, and M is a positive integer less than or equal to N.
  • the transceiver unit is configured to receive corresponding information from other network elements.
  • the device includes a processor configured to implement the method described in the first aspect above.
  • the apparatus may also include memory for storing instructions and/or data.
  • the memory is coupled to a processor, and the processor executes the program instructions stored in the memory to implement the method described in the first aspect above.
  • the device includes:
  • a processor configured to determine the alarm information of N objects, where N is an integer greater than or equal to 2; determine the root cause failure of the alarm information of the N objects according to the object association relationship; wherein, the root cause failure is the alarm information of M objects among the N objects, and the M is a positive integer less than or equal to N.
  • the device may be the second network element, a device configured in the second network element, or a device that can be used in conjunction with the second network element; the second network element may be a cloud resource (Cloud), an RU, or a DU.
  • the device includes units in one-to-one correspondence with the methods/operations/steps/actions described in the second aspect.
  • the unit may be a hardware circuit, or software, or a hardware circuit combined with software.
  • the device includes a transceiver unit.
  • a processing unit may also be included.
  • the processing unit and the transceiver unit can perform the corresponding functions in any design example of the second aspect above, specifically:
  • the transceiver unit is configured to send alarm information of N2 objects to the first network element, where the alarm information is the first type of alarm information; for the N2 objects, the first type of alarm information of each object includes at least one of the following: an identifier of the object, an identifier of the second network element, or an identifier of a network element associated with the second network element; wherein N2 is a positive integer.
  • the processing unit is configured to determine alarm information of N2 objects.
  • the device includes a processor configured to implement the method described in the second aspect above.
  • the apparatus may also include memory for storing programs and/or instructions.
  • the memory is coupled to the processor, and when the processor executes the program instructions stored in the memory, the method of the second aspect above can be realized.
  • the device may also include a communication interface, which may be used for the device to communicate with other devices.
  • the communication interface may be a transceiver, circuit, bus, module, pin or other types of communication interface.
  • the device includes:
  • a processor configured to determine alarm information of N objects
  • a communication interface configured to send alarm information of N2 objects to the first network element, where the alarm information is the first type of alarm information; for the N2 objects, the first type of alarm information of each object includes at least one of the following: the identifier of the object, the identifier of the second network element, or the identifier of a network element related to the second network element; wherein N2 is a positive integer.
  • the sixth aspect provides a device.
  • the device may be a third network element, a device configured in the third network element, or a device that can be used in conjunction with the third network element.
  • the third network element may be an SMOF or the like.
  • the device includes units in one-to-one correspondence with the methods/operations/steps/actions described in the third aspect above.
  • the unit may be a hardware circuit, or software, or a combination of hardware circuit and software.
  • the device includes a transceiver unit.
  • a processing unit may also be included.
  • the processing unit and the transceiver unit can perform the corresponding functions in any design example of the third aspect above, specifically:
  • a transceiver unit configured to receive first indication information, where the first indication information is used to indicate a root cause failure of alarm information of N objects; wherein the root cause failure is the alarm information of M objects among the N objects, N is a positive integer greater than or equal to 2, and M is a positive integer less than or equal to N.
  • the processing unit is configured to process the first indication information.
  • the device includes a processor configured to implement the method described in the third aspect above.
  • the apparatus may also include memory for storing programs and/or instructions.
  • the memory is coupled to the processor, and when the processor executes the program instructions stored in the memory, the method in the third aspect above can be implemented.
  • the device may also include a communication interface, which may be used for the device to communicate with other devices.
  • the communication interface may be a transceiver, circuit, bus, module, pin or other types of communication interface.
  • the device includes:
  • a communication interface configured to receive first indication information, where the first indication information is used to indicate a root cause failure of alarm information of N objects; wherein the root cause failure is the alarm information of M objects among the N objects, N is a positive integer greater than or equal to 2, and M is a positive integer less than or equal to N.
  • a processor configured to process the first indication information.
  • a computer-readable storage medium including instructions, which, when run on a computer, cause the computer to execute the method of the first aspect, the second aspect or the third aspect.
  • a chip system includes a processor and may further include a memory for implementing the method of the first aspect, the second aspect, or the third aspect.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • a computer program product including instructions, which, when run on a computer, cause the computer to execute the method of the first aspect, the second aspect or the third aspect.
  • in a tenth aspect, a system is provided, which includes the device in the fourth aspect, the device in the fifth aspect, and the device in the sixth aspect.
  • FIG. 1 is a schematic diagram of a communication system provided by the present disclosure
  • FIG. 2 is an architecture diagram of the O-RAN provided by the present disclosure
  • FIG. 3 is a flowchart of determining a root cause failure provided by the present disclosure
  • FIG. 4 is a schematic diagram of reporting alarm information provided by the present disclosure.
  • FIG. 5 and FIG. 6 are schematic structural diagrams of the device provided by the present disclosure.
  • FIG. 1 is a schematic structural diagram of a communication system 1000 to which the present disclosure can be applied.
  • the communication system includes a radio access network 100 and a core network 200; optionally, the communication system 1000 may also include the Internet 300.
  • the radio access network 100 may include at least one access network device (such as 110a and 110b in FIG. 1), and may also include at least one terminal device (such as 120a-120j in FIG. 1).
  • the terminal device is connected to the access network device in a wireless manner, and the access network device is connected to the core network in a wireless or wired manner.
  • the core network device and the access network device can be independent and different physical devices, or the functions of the core network device and the logical functions of the access network device can be integrated on the same physical device, or some functions of the core network device and some functions of the access network device can be integrated on one physical device.
  • terminal devices may be connected to each other, and access network devices may be connected to each other, in a wired or wireless manner.
  • FIG. 1 is only a schematic diagram.
  • the communication system may include other network devices, such as wireless relay devices and wireless backhaul devices, which are not shown in FIG. 1.
  • the access network device can be a base station, an evolved base station (evolved NodeB, eNodeB), a transmission reception point (transmission reception point, TRP), a next-generation base station (next generation NodeB, gNB) in the fifth generation (5th generation, 5G) mobile communication system, an access network device in an open radio access network (open radio access network, O-RAN), a next-generation base station in the sixth generation (6th generation, 6G) mobile communication system, future mobile
  • DU (distributed unit)
  • CU control plane (centralized unit control plane, CU-CP)
  • CU user plane (centralized unit user plane, CU-UP)
  • the access network device may be a macro base station (such as 110a in Figure 1), a micro base station or an indoor station (such as 110b in Figure 1), or a relay node or a donor node.
  • the specific technology and specific device form adopted by the access network device are not limited in the present disclosure.
  • the device for implementing the function of the access network device may be the access network device itself, or may be a device capable of supporting the access network device in realizing the function, such as a chip system, a hardware circuit, a software module, or a hardware circuit plus a software module; the device can be installed in the access network device or used in conjunction with the access network device.
  • the system-on-a-chip may be composed of chips, and may also include chips and other discrete devices.
  • The technical solutions provided by the present disclosure are described below by taking an example in which the apparatus for implementing the functions of the access network device is the access network device, and the access network device is a base station.
  • the protocol layer structure may include a control plane protocol layer structure and a user plane protocol layer structure.
  • The control plane protocol layer structure may include the functions of protocol layers such as a radio resource control (radio resource control, RRC) layer, a packet data convergence protocol (packet data convergence protocol, PDCP) layer, a radio link control (radio link control, RLC) layer, a media access control (media access control, MAC) layer, and a physical layer.
  • the user plane protocol layer structure may include the functions of the PDCP layer, the RLC layer, the MAC layer, and the physical layer.
  • A service data adaptation protocol (service data adaptation protocol, SDAP) layer may also be included above the PDCP layer.
  • the protocol layer structure between the access network device and the terminal device may further include an artificial intelligence (AI) layer, which is used to transmit data related to the AI function.
  • Access devices may include CUs and DUs. Multiple DUs can be centrally controlled by one CU.
  • the interface between the CU and the DU may be referred to as an F1 interface.
  • The control plane (control plane, CP) interface may be F1-C, and the user plane (user plane, UP) interface may be F1-U.
  • the present disclosure does not limit the specific names of the interfaces.
  • The CU and DU can be divided according to the protocol layers of the wireless network. For example, the functions of the PDCP layer and the protocol layers above it are set in the CU, and the functions of the protocol layers below the PDCP layer (such as the RLC layer and the MAC layer) are set in the DU. For another example, the functions of the protocol layers above the PDCP layer are set in the CU, and the functions of the PDCP layer and the protocol layers below it are set in the DU, without restriction.
  • For example, the CU or DU may be divided to have the functions of more protocol layers; for another example, the CU or DU may be divided to have part of the processing functions of a protocol layer.
  • part of the functions of the RLC layer and the functions of the protocol layers above the RLC layer are set in the CU, and the rest of the functions of the RLC layer and the functions of the protocol layers below the RLC layer are set in the DU.
  • The functions of the CU or DU can also be divided according to the service type or other system requirements, for example, according to delay: functions whose processing time needs to meet the delay requirement are set in the DU, and functions that do not need to meet the delay requirement are set in the CU.
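  • As an illustration of the protocol-layer-based split described above, the following minimal Python sketch assigns the layers of the control plane stack to the CU and the DU for a given split point. The layer list follows the control plane structure named earlier; the function name `split_cu_du` and its parameter are hypothetical and are not defined by the present disclosure.

```python
# Control plane protocol layers, ordered from top to bottom.
CONTROL_PLANE_LAYERS = ["RRC", "PDCP", "RLC", "MAC", "PHY"]


def split_cu_du(lowest_cu_layer):
    """Place `lowest_cu_layer` and every layer above it in the CU; the rest go to the DU.

    Hypothetical helper used only for illustration.
    """
    cut = CONTROL_PLANE_LAYERS.index(lowest_cu_layer) + 1
    return {"CU": CONTROL_PLANE_LAYERS[:cut], "DU": CONTROL_PLANE_LAYERS[cut:]}


# First example: the PDCP layer and the layers above it are set in the CU.
pdcp_split = split_cu_du("PDCP")
# Second example: only the layers above the PDCP layer are set in the CU.
rrc_split = split_cu_du("RRC")
```

  • The sketch covers only whole-layer splits; a split inside one layer (for example, part of the RLC functions in the CU) would need a finer-grained description than a single cut index.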
  • the CU may also have one or more functions of the core network.
  • the CU can be set on the network side to facilitate centralized management.
  • The radio unit (radio unit, RU) of the DU is deployed remotely.
  • the RU may have a radio frequency function.
  • The DU and the RU can be split at the physical layer (physical layer, PHY).
  • For example, the DU can implement the high-level functions in the PHY layer, and the RU can implement the low-level functions in the PHY layer.
  • For sending, the functions of the PHY layer may include at least one of the following: adding a cyclic redundancy check (cyclic redundancy check, CRC) code, channel coding, rate matching, scrambling, modulation, layer mapping, precoding, resource mapping, physical antenna mapping, or radio frequency transmission functions.
  • For receiving, the functions of the PHY layer may include at least one of the following: CRC check, channel decoding, de-rate matching, descrambling, demodulation, de-layer mapping, channel detection, resource de-mapping, physical antenna de-mapping, or radio frequency receiving functions.
  • the high-level functions in the PHY layer may include a part of the functions of the PHY layer, for example, this part of the functions is closer to the MAC layer, and the lower-level functions in the PHY layer may include another part of the functions of the PHY layer, for example, this part of the functions is closer to the radio frequency function.
  • For example, for sending, the high-level functions in the PHY layer may include adding CRC codes, channel coding, rate matching, scrambling, modulation, and layer mapping, and the low-level functions in the PHY layer may include precoding, resource mapping, physical antenna mapping, and radio frequency transmission functions.
  • Alternatively, the high-level functions in the PHY layer may include adding CRC codes, channel coding, rate matching, scrambling, modulation, layer mapping, and precoding, and the low-level functions in the PHY layer may include resource mapping, physical antenna mapping, and radio frequency transmission functions.
  • For example, for receiving, the high-level functions in the PHY layer may include CRC check, channel decoding, de-rate matching, descrambling, demodulation, and de-layer mapping, and the low-level functions in the PHY layer may include channel detection, resource de-mapping, physical antenna de-mapping, and radio frequency receiving functions.
  • Alternatively, the high-level functions in the PHY layer may include CRC check, channel decoding, de-rate matching, descrambling, demodulation, de-layer mapping, and channel detection, and the low-level functions in the PHY layer may include resource de-mapping, physical antenna de-mapping, and radio frequency receiving functions.
  • the function of the CU may be implemented by one entity, or may also be implemented by different entities.
  • the functions of the CU can be further divided, that is, the control plane and the user plane are separated and realized by different entities, namely, the control plane CU entity (ie, the CU-CP entity) and the user plane CU entity (ie, the CU-UP entity) .
  • the CU-CP entity and the CU-UP entity can be coupled with the DU to jointly complete the functions of the access network equipment.
  • any one of the foregoing DU, CU, CU-CP, CU-UP, and RU may be a software module, a hardware structure, or a software module+hardware structure, without limitation.
  • the existence forms of different entities may be different, which is not limited.
  • For example, the DU, CU, CU-CP, and CU-UP are software modules, and the RU is a hardware structure.
  • the access network device includes CU-CP, CU-UP, DU and RU.
  • the execution subject of the present disclosure includes DU, or includes DU and RU, or includes CU-CP, DU and RU, or includes CU-UP, DU and RU, without limitation.
  • the methods performed by each module are also within the protection scope of the present disclosure.
  • the terminal equipment may also be called a terminal, user equipment (user equipment, UE), mobile station, mobile terminal equipment, and the like.
  • Terminal devices can be widely used for communication in various scenarios, including but not limited to at least one of the following: device-to-device (device-to-device, D2D), vehicle-to-everything (vehicle-to-everything, V2X), machine-type communication (machine-type communication, MTC), Internet of Things (IoT), virtual reality, augmented reality, industrial control, automatic driving, telemedicine, smart grid, smart furniture, smart office, smart wear, smart transportation, or smart city.
  • the terminal device can be a mobile phone, a tablet computer, a computer with wireless transceiver function, a wearable device, a vehicle, a drone, a helicopter, an airplane, a ship, a robot, a mechanical arm, or a smart home device, etc.
  • the present disclosure does not limit the specific technology and specific device form adopted by the terminal device.
  • the device for realizing the function of the terminal device may be a terminal device; it may also be a device capable of supporting the terminal device to realize the function, such as a chip system, a hardware circuit, a software module, or a hardware circuit plus a software module.
  • the device can be installed in the terminal equipment or can be matched with the terminal equipment for use.
  • The technical solutions provided by the present disclosure are described below by taking an example in which the apparatus for implementing the functions of the terminal device is the terminal device, and the terminal device is a UE.
  • Base stations and terminal equipment can be fixed or mobile.
  • Base stations and/or terminal equipment can be deployed on land, including indoors or outdoors, handheld or vehicle-mounted; they can also be deployed on water; they can also be deployed on aircraft, balloons and artificial satellites in the air.
  • the present disclosure does not limit the application scenarios of the base station and the terminal equipment.
  • the base station and the terminal device can be deployed in the same scene or in different scenarios. For example, the base station and the terminal device are deployed on land at the same time; or, the base station is deployed on land and the terminal device is deployed on water.
  • The roles of base station and terminal device can be relative. For example, the helicopter or drone 120i in FIG. 1 may act as a base station in some cases; however, for base station 110a, 120i is a terminal device, that is, communication between 110a and 120i is performed through a wireless air interface protocol. Communication between 110a and 120i may also be performed through an interface protocol between base stations; in this case, relative to 110a, 120i is also a base station. Therefore, both the base station and the terminal device can be collectively referred to as communication devices: 110a and 110b in FIG. 1 can be referred to as communication devices with a base station function, and 120a-120j in FIG. 1 can be referred to as communication devices with a terminal device function.
  • The traditional base station is split into different components such as cloud resources (Cloud), RU, DU, CU-CP, CU-UP, the RAN intelligent controller (RAN intelligent controller, RIC), and the service management and orchestration framework (service management and orchestration framework, SMOF).
  • SMOF: provides a service-based framework for network operations and management of supporting networks.
  • the SMOF includes the operation, administration and maintenance (operation, administration and maintenance, OAM) of the cloud infrastructure (such as cloud resources) and the OAM of the base station.
  • RIC: with reference to the software defined network (software defined network, SDN), the RIC introduces intelligent scheduling and realizes the separation of policy and execution.
  • RIC is used to realize artificial intelligence.
  • The RIC is divided into a near-real-time (near real time, Near-RT) RIC and a non-real-time (non real time, Non-RT) RIC.
  • the near-real-time RIC is an O-RAN near-real-time RAN intelligent controller, which realizes near-real-time control and optimization of O-RAN elements and resources through the fine-grained data collection and actions of the E2 interface.
  • The non-real-time RIC is the O-RAN non-real-time RAN intelligent controller, which implements the logic function of non-real-time control and optimization of RAN elements and resources, including artificial intelligence (artificial intelligence, AI) or machine learning (machine learning, ML) workflows that include model training and updating, and policy-based guidance of near-real-time RIC applications and functions.
  • Optionally, in the example in FIG. 2, the non-real-time RIC is set in the SMOF.
  • Cloud resource (Cloud): the O-RAN Alliance defines the cloud resource Cloud as a cloud computing platform consisting of a collection of physical infrastructure nodes that meet O-RAN requirements and can host the related O-RAN functions, supporting software components, and appropriate management and coordination functions.
  • RU, DU, and Cloud can be called open RU (open RU, O-RU), open DU (open DU, O-DU), and open Cloud (open Cloud, O-Cloud), respectively, and so on.
  • each component in the base station reports an alarm message to the operator's network management when a fault is identified, for example, by software.
  • the operator's network management system may be a network management system (network management system, NMS).
  • the operator's monitoring personnel judge and analyze the alarm information, and distribute tasks through work orders for faults that need to be dealt with. Because in the O-RAN architecture, there is an association between different components, a component failure may cause multiple components to report alarm information. If a work order is dispatched for each alarm message, the workload of monitoring and maintenance personnel will increase significantly, and the operation and maintenance cost will be high. Therefore, how to determine the root cause failure of multiple alarm messages is a problem worth studying.
  • The present disclosure provides a method for determining a root cause failure. In the method, when a plurality of pieces of alarm information are determined, the root cause failure can be determined among them, where the root cause failure is one or more of the plurality of pieces of alarm information. Work orders are dispatched for the alarm information of root cause failures, and no work orders are dispatched for the alarm information of non-root-cause failures, thereby reducing the workload of monitoring and maintenance personnel and reducing operation and maintenance costs.
  • the method of the present disclosure is applied in the base station of the O-RAN architecture as an example, which is not intended to limit the present disclosure.
  • the method disclosed in the present disclosure can also be applied to other equipment except the O-RAN base station to determine the root cause of failure, such as core network equipment or terminal equipment.
  • The present disclosure provides a procedure of a method for determining a root cause failure, which includes at least the following steps:
  • Step 301 the first network element determines alarm information of N objects, where N is an integer greater than or equal to 2.
  • Step 302 The first network element determines the root cause failure of the alarm information of the N objects according to the object association relationship.
  • the root cause failure is alarm information of M objects among the N objects, and M is a positive integer less than or equal to N.
  • the root cause failure of the alarm information of N-M objects is the alarm information of the M objects.
  • For example, the value of N is 3, and the first network element determines alarm information of 3 objects.
  • According to the object association relationship, it is determined that there is an association relationship among the three objects. For example, there is an association relationship among object 1 of the network element CU, object 2 of the network element DU, and object 3 of the network element RU. It is then determined that the alarm information of object 3 of the RU is the root cause failure of the alarm information of object 1 of the CU and the alarm information of object 2 of the DU.
  • In other words, the CU and the DU generate alarm information due to a failure of the RU.
  • For the root cause failure, such as the alarm information of object 3 of the RU, the operator's network management issues a work order for corresponding maintenance.
  • For non-root-cause failures, such as the alarm information of object 2 of the DU and the alarm information of object 1 of the CU, no work order needs to be issued.
  • For another example, the value of N is 5, and the first network element determines alarm information of 5 objects.
  • The first network element determines two association relationship sets according to the object association relationship: one association relationship set includes 3 objects, and the other includes 2 objects.
  • For the association relationship set including 3 objects, one root cause failure is determined; for the association relationship set including 2 objects, another root cause failure is determined.
  • In this case, the value of M is 2.
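  • The N = 5 example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the object names (`CU1`, `DU1`, `RU1`, `CU2`, `DU2`), the `DEPENDS_ON` mapping, and the rule "an object whose alarm is not explained by another alarming object in the same set is the root cause" are all hypothetical choices made for the sketch.

```python
from collections import defaultdict

# Hypothetical object association relationship: each key sits above the listed objects.
DEPENDS_ON = {
    "CU1": ["DU1"],
    "DU1": ["RU1"],
    "CU2": ["DU2"],
}


def association_sets(alarming, depends_on):
    """Split the alarming objects into association relationship sets."""
    neighbours = defaultdict(set)
    for upper, lowers in depends_on.items():
        for lower in lowers:
            if upper in alarming and lower in alarming:
                neighbours[upper].add(lower)
                neighbours[lower].add(upper)
    seen, groups = set(), []
    for obj in sorted(alarming):
        if obj in seen:
            continue
        # Depth-first walk over directly associated alarming objects.
        stack, group = [obj], set()
        while stack:
            cur = stack.pop()
            if cur in group:
                continue
            group.add(cur)
            stack.extend(neighbours[cur] - group)
        seen |= group
        groups.append(group)
    return groups


def root_causes(group, depends_on):
    """Objects in the set that depend on no other alarming object in the set."""
    return sorted(
        obj for obj in group
        if not any(dep in group for dep in depends_on.get(obj, []))
    )


alarming = {"CU1", "DU1", "RU1", "CU2", "DU2"}          # N = 5
groups = association_sets(alarming, DEPENDS_ON)          # two sets: 3 objects and 2 objects
roots = [r for g in groups for r in root_causes(g, DEPENDS_ON)]  # M = len(roots)
```

  • With these inputs the sketch yields one set of three objects and one set of two objects, and one root cause per set, so M is 2. Only directly associated alarming objects are linked here; indirect associations are outside the scope of this sketch.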
  • the object association relationship may be pre-configured or stipulated by a protocol.
  • the third network element may configure the object association relationship to the first network element through a configuration file or a configuration message.
  • the third network element may send a configuration file or a configuration message to the first network element, where the configuration file or configuration message is used to indicate an object association relationship.
  • the first network element may obtain the object association relationship through the configuration file or the configuration message.
  • the object association relationship configured by the third network element through the configuration file or configuration message is the association relationship among CU, DU and RU.
  • the third network element may configure object association relationships through multiple configuration files or multiple configuration messages. The first network element stitches together the multiple configured object association relationships to form a final object association relationship.
  • the third network element sends the first configuration file or the first configuration message to the first network element, where the first configuration file or the first configuration message is used to indicate the first object association relationship.
  • the third network element sends a second configuration file or a second configuration message to the first network element, where the second configuration file or the second configuration message is used to configure the second object association relationship.
  • the first network element determines the object association relationship according to the first object association relationship and the second object association relationship.
  • For example, the first object association relationship is the association relationship between CU and DU, and the second object association relationship is the association relationship between DU and RU. By combining the two, the association relationship among the CU, DU, and RU is finally determined.
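  • Combining several configured object association relationships into a final one can be sketched as follows. The dictionary representation of a configuration and the helper names `merge_relationships` and `managed_closure` are assumptions made for illustration; the present disclosure does not define the format of the configuration file or configuration message.

```python
def merge_relationships(*configs):
    """Stitch several configured association relationships into one mapping."""
    merged = {}
    for config in configs:
        for manager, managed in config.items():
            merged.setdefault(manager, set()).update(managed)
    return merged


def managed_closure(obj, relation):
    """All objects directly or indirectly associated below `obj`."""
    result, stack = set(), list(relation.get(obj, ()))
    while stack:
        cur = stack.pop()
        if cur not in result:
            result.add(cur)
            stack.extend(relation.get(cur, ()))
    return result


# Hypothetical content of the first and second configuration messages.
first_relationship = {"CU": {"DU"}}
second_relationship = {"DU": {"RU"}}

relation = merge_relationships(first_relationship, second_relationship)
below_cu = managed_closure("CU", relation)
```

  • After merging, the closure for the CU contains both the DU and the RU, which reflects the combined CU, DU and RU association relationship.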
  • For example, the first network element may be a CU, DU, or RU, and the third network element that configures the object association relationship for the first network element may be an SMOF.
  • an object may refer to a network element.
  • For example, the first network element is an SMOF, and the SMOF can receive alarm information from at least two of the following network elements: RU, DU, or CU.
  • For example, the SMOF receives alarm information from the CU, DU and RU. When the RU, DU and CU have an association relationship, the alarm information of the RU is considered to be the root cause failure of the alarm information of the DU and the alarm information of the CU.
  • SMOF can report RU alarm information to the operator's network management, and no longer reports CU and DU alarm information to the operator, reducing network operation and maintenance costs.
  • the SMOF receives the alarm information from the CU and the DU, and when the CU and the DU have an association relationship, it considers that the alarm information of the DU is the root cause failure of the alarm information of the CU.
  • the SMOF reports the alarm information of the DU to the network management of the operator, and no longer reports the alarm information of the CU, reducing network operation and maintenance costs.
  • The SMOF can receive alarm information from the DU and the RU. When the DU and the RU have an association relationship, the SMOF considers the alarm information of the RU to be the root cause failure of the alarm information of the DU, reports the alarm information of the RU to the operator's network management, and no longer reports the alarm information of the DU, reducing network operation and maintenance costs.
  • The SMOF can receive alarm information from the CU and the RU. Since there is an association relationship between CU and DU, and an association relationship between DU and RU, there is an indirect association relationship between CU and RU. In this case, the alarm information of the RU can be considered the root cause failure of the alarm information of the CU, and the alarm information of the RU is reported to the operator's network management. It should be noted that in some scenarios, an RU failure may not cause a corresponding DU failure, but may cause a corresponding CU failure. Therefore, in some scenarios, the RU and the CU may report alarm information at the same time, while the DU does not report alarm information.
  • the relationship between at least two of RU, DU and CU may include the following meanings: one CU can centrally manage multiple DUs, and there is an association relationship between the CU and the multiple DUs it manages.
  • a DU can centrally manage multiple RUs, and there is an association relationship between the DU and the multiple RUs it manages.
  • the DU acts as a bridge in the middle, and there is also an association relationship between the CU and the RU. For example, if the target CU manages the target DU, and the target DU manages the target RU, then there is an association relationship between the target CU and the target RU.
  • the association relationship may refer to an association relationship of more than two terms.
  • For example, the association relationship may refer to the association relationship among RUs, DUs, and CUs. Similar to the above, if the target CU manages the target DU, and the target DU manages the target RU, it is considered that there is an association relationship among the target CU, the target DU, and the target RU. For example, if CU1 manages DU11 to DU13, and DU11 manages RU111 to RU113, it is considered that there is an association relationship among CU1, DU11 and RU111.
  • CU1, DU11, and RU111 each report alarm information to the SMOF. Since there is an association relationship among the RU, DU, and CU that reported the alarm information, the alarm information of RU111 is considered to be the root cause failure of the alarm information of DU11 and the alarm information of CU1.
  • For example, the alarm information reported by the RU may be an RU function abnormality alarm, the alarm information reported by the DU may be a DU cell unavailable alarm, and the alarm information reported by the CU may be a CU cell unavailable alarm.
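  • The CU1, DU11 and RU111 example can be sketched as a filter that the SMOF might apply before reporting to the operator's network management. This is a hedged illustration: the `Alarm` type, the `MANAGED_BY` table and the function `alarms_to_report` are hypothetical, and the rule used (an alarm is suppressed when an alarming object exists anywhere below its source in the management hierarchy) is one possible reading of the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    source: str  # network element that reported the alarm
    name: str    # alarm name


# Hypothetical management topology: managed element -> managing element.
MANAGED_BY = {
    "DU11": "CU1", "DU12": "CU1", "DU13": "CU1",
    "RU111": "DU11", "RU112": "DU11", "RU113": "DU11",
}


def alarms_to_report(alarms, managed_by):
    """Keep only alarms whose source has no alarming element beneath it."""
    alarming = {alarm.source for alarm in alarms}
    explained = set()
    for obj in alarming:
        # Every element above an alarming element is treated as explained by it.
        parent = managed_by.get(obj)
        while parent is not None:
            explained.add(parent)
            parent = managed_by.get(parent)
    return [alarm for alarm in alarms if alarm.source not in explained]


alarms = [
    Alarm("CU1", "CU cell unavailable"),
    Alarm("DU11", "DU cell unavailable"),
    Alarm("RU111", "RU function abnormality"),
]
reported = alarms_to_report(alarms, MANAGED_BY)
```

  • Because the walk goes up the whole management chain, the same filter also covers the case in which only the CU and the RU report alarm information and the DU in between does not.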
  • an object may refer to an object of a network element.
  • For a network element such as a CU, DU or RU, different objects can be divided according to different functions.
  • For example, in a DU, an object that manages the functions of one cell may be called one DU object.
  • Similarly, in a CU, an object that manages the functions of one cell may be called one CU object, and so on.
  • Different objects have different identifiers.
  • the identifier may be assigned by the SMOF, or assigned by other network elements, or preset, or stipulated by a protocol, etc., without limitation.
  • the cloud resource Cloud includes a computing resource pool, a storage resource pool, and a network resource pool.
  • A computing resource pool includes multiple computing objects, and each computing object corresponds to a different identifier.
  • a storage resource pool includes multiple storage objects, and different objects have different identifiers.
  • the network resource pool includes multiple network resource objects.
  • the specific functions of CU, DU, and RU can be implemented in the cloud resource Cloud.
  • For example, the computing resource pool of the cloud resource Cloud includes 10 computing objects.
  • Five computing objects implement the computing function of the RU, and these five computing objects may be referred to as computing objects of the RU.
  • Three computing objects implement the computing function of the DU, and these three computing objects may be referred to as computing objects of the DU.
  • Two computing objects implement the computing function of the CU, and these two computing objects may be referred to as computing objects of the CU.
  • There is an association relationship among the computing objects of the CU, the computing objects of the DU, and the computing objects of the RU, and the root cause failure can be determined according to the association relationship among the three.
  • the computing objects of RU, DU, and CU all report alarm information. Since there is an association relationship among RU, DU, and CU, it is determined that the alarm information of the computing object of the RU is the root cause of other alarm information. Subsequently, the SMOF only reports the alarm information of the computing object of the RU to the network management of the operator, thereby reducing operation and maintenance costs.
  • In the following, the description continues by taking an example in which an object refers to an object corresponding to a network element.
  • The root cause failure of the alarm information of the N objects can also be determined according to the generation time of the alarm information of the N objects; that is, the root cause failure of the alarm information of the N objects is determined according to the object association relationship and the generation time of the alarm information of the N objects.
  • the alarm information of each object may carry a time stamp, and the generation time of the alarm information of each object may be determined according to the time stamp of the alarm information of each object.
  • X association relationship sets may be determined according to the object association relationship.
  • the association relationship set includes at least one object that has an association relationship.
  • the numbers of objects with associated relationships included in different association relationship sets among the X association relationship sets are the same or different, which is not limited.
  • one or more root cause failures may be determined for any one of the X association relationship sets.
  • a total of M root cause faults can be determined. The sum of the root cause failures determined by the X association relationship sets is less than or equal to the above M.
  • the object association relationship includes the association relationship between CU, DU and RU, and the association relationship between CU and cloud resource Cloud.
  • CU objects, DU objects, RU objects, and CU-corresponding cloud resource Cloud objects all report alarm information.
  • the above-mentioned multiple objects for reporting alarm information may be divided into two association relationship sets.
  • an association relationship set includes CU objects, DU objects, and RU objects, wherein the CU objects, DU objects, and RU objects have an association relationship.
  • Another association relationship set includes the objects of the CU and the objects of the cloud resource Cloud corresponding to the CU, wherein the objects of the CU and the objects of the cloud resource Cloud corresponding to the CU have an association relationship.
  • For any association relationship set i among the X association relationship sets, where i is a positive integer greater than or equal to 1 and less than or equal to X, the following operations are performed: according to the generation time of the alarm information of the objects included in association relationship set i, determine L objects in the association relationship set, where the generation time difference of the alarm information of the L objects is less than (or less than or equal to) a threshold; and determine the root cause failure of the alarm information of the L objects. The root cause failure is the alarm information of at least one object among the L objects, and L is a positive integer.
  • In other words, for association relationship set i: obtain the generation time of the alarm information of the objects included in the set; among those objects, determine the objects whose alarm information generation time satisfies the condition, which are the above L objects; and among the L objects, determine the root cause failure.
  • the association relationship set includes CU objects, DU objects and RU objects.
  • If the generation time difference between the alarm information of the CU object and that of the DU object is less than the threshold (the condition is satisfied), and the generation time of the alarm information of the RU object differs from that of the CU object and/or that of the DU object by more than the threshold (the condition is not satisfied), the L objects determined above are the CU object and the DU object. It is then determined that the alarm information of the DU object is the root cause failure of the alarm information of the CU object. In the present disclosure, one root cause failure or multiple root cause failures may be determined for one association relationship set, without limitation.
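  • The time-based selection of the L objects within one association relationship set can be sketched as follows. Several details are assumptions, since the description leaves them open: timestamps are taken as seconds, the L objects are chosen as the largest group whose alarm times span less than the threshold, and the root cause within that group is the lowest-level object according to a hypothetical `LEVEL` table.

```python
# Assumption: a lower-level object (RU below DU below CU) explains the higher ones.
LEVEL = {"CU": 0, "DU": 1, "RU": 2}


def time_correlated(times, threshold):
    """Return the largest group of objects whose alarm times span less than `threshold`.

    `times` maps object name -> alarm generation time; `threshold` must be positive.
    """
    ordered = sorted(times.items(), key=lambda item: item[1])
    best, start = [], 0
    for end in range(len(ordered)):
        # Shrink the window until its time span is below the threshold.
        while ordered[end][1] - ordered[start][1] >= threshold:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return [name for name, _ in best]


def root_cause(objects):
    """Pick the lowest-level object among the selected objects."""
    return max(objects, key=LEVEL.__getitem__)


# The RU alarm is generated much later than the CU and DU alarms.
times = {"CU": 100.0, "DU": 100.4, "RU": 130.0}
selected = time_correlated(times, threshold=5.0)  # the L objects
root = root_cause(selected)
```

  • With these hypothetical timestamps the L objects are the CU object and the DU object, and the alarm information of the DU object is taken as the root cause failure, matching the example above.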
  • P association relationship sets can be determined according to the object association relationship and the generation time of the alarm information of the N objects. Different association relationship sets among the P association relationship sets include the same or different numbers of objects, which is not limited, and P is an integer greater than or equal to 1.
  • For any one of the P association relationship sets, at least one root cause failure may be determined. As defined in the aforementioned step 302, a total of M root cause failures are determined for the alarm information of the N objects. The total number of root cause failures determined in the P association relationship sets is the above M.
  • In the preceding design, the association relationship set is first determined according to the object association relationship, and then the L objects whose alarm information generation time meets the condition are determined in the association relationship set according to the generation time of the different alarm information.
  • In this design, the generation time of the alarm information of different objects is already considered when the association relationship set is determined; that is, the generation time of the alarm information corresponding to the objects included in each association relationship set meets the condition.
  • For example, the CU object, the DU object, the RU object, and the object of the cloud resource Cloud corresponding to the CU all report alarm information. The object association relationship includes the association relationship among CU, DU and RU, and the association relationship between the CU and the cloud resource Cloud corresponding to the CU.
  • Therefore, the objects of the CU, DU and RU are classified into one set, and the object of the CU and the object of the cloud resource Cloud corresponding to the CU are classified into another set.
  • For the set including the CU object, the DU object and the RU object, the generation times of the alarm information corresponding to the CU object, the DU object and the RU object are acquired respectively.
  • It is then determined whether the generation times of the alarm information of the three meet the condition, for example, that the difference between the generation times of the alarm information is less than the threshold.
  • If the generation time of the alarm information of the RU object does not meet the condition, the RU object is removed from the original set, and the determined association relationship set includes the CU object and the DU object.
  • The removed object, such as the RU object, may form a separate set, and the root cause failure corresponding to this set is the alarm information of the object included in the set, for example, the alarm information of the RU.
  • for any one of the P association relationship sets, determine the root cause fault of the alarm information of the Q objects included in the set, where Q is a positive integer. The root cause fault is the alarm information of at least one of the Q objects; the Q objects are associated with one another, and the differences between the generation times of their alarm information are less than (or less than or equal to) a threshold.
  • one of the P association relationship sets includes CU objects, DU objects, and RU objects. Since there is an association among CU, DU, and RU, and CU manages DU, and DU manages RU, it can be considered that the alarm information corresponding to RU is the root cause of the alarm information of CU and DU.
  • an implementation of the above step 301 is as follows: the first network element detects the alarm information of N1 objects, and receives the alarm information of N2 objects from the second network element, where N1 and N2 are both positive integers and their sum is equal to N.
  • the first network element is a CU, and the alarm information of the N1 objects detected by the first network element includes the alarm information of the N1 objects detected by the CU.
  • the second network element is a DU, and the alarm information of the N2 objects sent by the DU to the CU includes at least one of the following: alarm information of objects of the DU, alarm information of objects of the RU, alarm information of objects of the cloud resource Cloud corresponding to the DU, or alarm information of objects of the cloud resource Cloud corresponding to the RU.
  • the first network element is a DU, and the alarm information of the N1 objects detected by the first network element includes the alarm information of the N1 objects detected by the DU.
  • the second network element is an RU, and the alarm information of the N2 objects sent by the RU to the DU includes at least one of the following: alarm information corresponding to the RU, or alarm information of an object of the cloud resource Cloud corresponding to the RU.
  • the first network element is a CU, DU or RU, and the second network element is the cloud resource Cloud corresponding to the first network element.
  • the first network element is a CU
  • the alarm information of the N1 objects detected by the first network element includes the alarm information of the N1 objects detected by the CU, and the cloud resource Cloud sends to the CU the alarm information of the N2 objects of the cloud resource Cloud corresponding to the CU.
  • the alarm information of the N2 objects sent by the second network element to the first network element is called the first type of alarm information.
  • the first type of alarm information includes at least one of the following: an object identifier, an identifier of the second network element, or an identifier of a network element associated with the second network element.
  • the DU can collect the alarm information of N2 objects, which includes at least one of the following: alarm information of the DU's own objects detected by the DU, alarm information of RU objects collected from the RU, alarm information of objects of the cloud resource Cloud corresponding to the RU collected from the RU, or alarm information of objects corresponding to the DU collected from the cloud resource Cloud.
  • the DU reports the alarm information of the N2 objects to the CU associated with the DU.
  • the CU performs root cause analysis on the alarm information of N1 objects and the alarm information of N2 objects according to the object association relationship, and determines the root cause of the fault.
  • the alarm information of N2 objects reported by the DU is called the first type of alarm information.
  • the first type of alarm information of the DU object reported by the DU includes at least one of the following items: the identifier of the CU related to the DU, the identifier of the DU, or the identifier of the object.
  • the first type of alarm information of the DU object reported by the DU is: CU ID + DU ID + object ID.
  • the first type of alarm information of the RU object reported by the DU includes at least one of the following items: the identifier of the CU associated with the DU, the identifier of the DU, the identifier of the RU, or the identifier of the object.
  • the first type of alarm information of the RU object reported by the DU includes: CU ID + DU ID + RU ID + object ID. It can be understood that the alarm information of the RU object is reported by the RU to the DU. In this regard, the following two situations are discussed:
  • the RU is not aware of the CU, that is, the RU knows the corresponding DU but not the corresponding CU.
  • the alarm information reported by the RU to the DU may include: DU ID + RU ID + object ID.
  • when the DU receives the alarm information reported by the RU, it adds the CU ID to the RU's alarm information according to the correspondence between the DU ID and the CU ID.
  • the first type of alarm information reported by the DU to the CU is: CU ID + DU ID + RU ID + object ID.
  • alternatively, when the DU receives the alarm information of the RU, it does not process it but reports it directly to the CU as the first type of alarm information; that is, the alarm information of the RU object reported by the DU to the CU is: DU ID + RU ID + object ID.
  • the object association relationship stored in the CU is: CU identifier+DU identifier, DU identifier+RU identifier.
  • the association relationship between RU and DU is determined through DU identifier+RU identifier. Then, according to the CU identifier + the DU identifier, the association relationship between the DU and the CU is determined. Finally determine the association relationship between CU, DU and RU, that is, CU identifier + DU identifier + RU identifier.
  • if the object association relationship stored in the CU is CU ID + DU ID + RU ID, the CU uses CU ID + DU ID + RU ID as the association key and performs a single association step to directly determine the association relationship among the CU, DU and RU.
  • the RU is aware of the CU, that is, the RU knows both its corresponding DU and its corresponding CU.
  • the alarm information reported by the RU to the DU may include: CU ID + DU ID + RU ID + object ID.
  • when the DU receives the alarm information reported by the RU, it forwards the alarm information to the CU as the first type of alarm information.
  • the method may further include: the first network element sends first indication information to the third network element, where the first indication information indicates the root cause fault of the alarm information of the N objects.
  • the first network element is a CU, DU or RU.
  • after the CU, DU or RU determines the root cause fault using the method in the process shown in Figure 3 above, it can send indication information indicating the root cause fault to a third network element, which can be an SMOF or the like.
  • an implementation of the above step 301 is as follows: the first network element receives alarm information from the second network element, which includes alarm information of N1 objects; receives alarm information from the third network element, which includes alarm information of N2 objects; and receives alarm information from the fourth network element, which includes alarm information of N3 objects.
  • the N1, N2 and N3 are all positive integers, and the sum of the three is equal to N.
  • the alarm information reported by the second network element, the third network element and the fourth network element is called the second type of alarm information.
  • the second type of alarm information includes at least one of the following: an object identifier, an identifier of a corresponding network element, a fault identifier, or a fault reason.
  • the first network element may be an SMOF
  • the second network element may be a CU
  • the third network element may be a DU
  • the fourth network element may be a RU, and so on.
  • the cloud resource Cloud can use an existing interface to report alarm information to the CU, DU, or RU. Alternatively, add an interface between the cloud resource Cloud and the CU, DU, or RU for reporting alarm information. Alternatively, the cloud resource Cloud may use an existing interface to report part of the alarm information, and the remaining content of the alarm information may be reported through a newly added interface, without limitation.
  • the RU can report alarm information to the DU through an existing interface, such as an open fronthaul interface. Alternatively, an interface can be added between the RU and the DU for reporting alarm information.
  • the DU can report alarm information to the CU through an existing interface, such as the F1 interface, or an interface can be added between the DU and the CU for reporting alarm information.
  • part of the content can be reported through the new interface, and the remaining part of the content of the alarm information can be reported through the existing interface.
  • the specific process is described by taking the first network element as an SMOF, and the second network element, the third network element and the fourth network element as CU, DU and RU respectively as an example.
  • the alarm information may be called the second type of alarm information, which includes at least one of the following: an object identifier, a CU identifier, a fault identifier, or a fault cause.
  • the CU reports the alarm information of N1 objects to the SMOF, and the alarm information of each of the N1 objects includes: object identifier+CU identifier+fault identifier.
  • the fault identification may implicitly indicate the cause of the fault.
  • when the DU detects faults of N2 objects of the DU, it can report the alarm information of the N2 objects to the SMOF.
  • the alarm information of each of the N2 objects includes: object ID + DU ID + fault ID.
  • the fault identifier can implicitly indicate the cause of the fault.
  • the process for the RU to report N3 corresponding alarm information to the SMOF is similar to the process for the CU or DU to report the alarm information to the SMOF, and will not be repeated here.
  • the SMOF can determine the root cause of the alarm information reported by the CU, DU, and RU according to the object association relationship, such as the association relationship between the CU, DU, and RU.
  • one CU can centrally control multiple DUs, and the CU has an association relationship with the DUs it centrally controls, and there is no association relationship between the CU and other DUs except the DUs centrally controlled.
  • a DU can centrally control multiple RUs. There is an association relationship between the DU and the RUs it centrally controls, and there is no association relationship between the DU and other RUs except the RUs that are centrally controlled. For example, if CU1 centrally controls DU11 to DU13, and DU11 centrally controls RU111 to RU113, then there is an association relationship between CU1 and DU11 to DU13. DU11 is associated with RU111 to RU113.
  • CU1 reports alarm information of N1 objects to SMOF
  • DU11 reports alarm information of N2 objects to SMOF
  • RU113 reports alarm information of N3 objects to SMOF. Since there is an association relationship among CU1, DU11, and RU113, it can be determined that the alarm information of N3 objects of RU113 is the root cause of the alarm information of N2 objects of DU11 and N1 objects of CU1.
  • the above description takes as an example the case in which the CU, DU or RU reports alarm information to the SMOF and the SMOF determines the root cause fault.
  • the cloud resource Cloud can also report alarm information to the SMOF, and the SMOF can determine the root cause of the alarm information of the CU, DU, RU, or cloud resource Cloud based on the association relationship.
  • CU1, DU11, RU113, and the cloud resource Cloud each report alarm information to the SMOF. As analyzed above, there is an association relationship among CU1, DU11, and RU113; assume that the alarm information reported by the cloud resource Cloud is the alarm information of the object corresponding to RU113.
  • the SMOF may determine that the alarm information of the cloud resource Cloud is the root cause of the alarm information of the CU1, DU11, and RU113.
  • the CU, DU, RU or cloud resource Cloud may use existing interfaces to report alarm information to the SMOF, for example, the O1 interface. Alternatively, an interface may be added between the CU, DU, RU, or cloud resource Cloud and the SMOF, and alarm information reported to the SMOF through this newly added interface. Alternatively, part of the content of the alarm information may be reported using an existing standard interface and the remainder using a new interface, without limitation.
  • a process for determining the root cause of the failure is provided, at least including:
  • the cloud resource Cloud When the cloud resource Cloud detects the failure of the cloud resource Cloud object corresponding to the RU, it reports alarm information to the RU.
  • the alarm information is called the first type of alarm information, and includes at least one of the following items: the object identifier of the cloud resource Cloud, the cloud resource Cloud ID, or the RU ID.
  • the alarm information may also include associated DU identifiers, CU identifiers, and the like.
  • the RU determines, according to the association relationship between the RU and the cloud resource Cloud object corresponding to the RU, that the alarm information of the cloud resource Cloud is the root cause fault of the alarm information of the RU.
  • the RU reports the correlation analysis of the above root cause faults to the SMOF through the O1 interface.
  • the RU can send alarm information to the DU, and the alarm information includes: alarm information of the RU and alarm information of the cloud resource Cloud.
  • the alarm information reported by the RU may be the first type of alarm information, including: the object ID of the RU, and the ID of the DU corresponding to the RU.
  • the alarm information reported by the RU may further include: the CU identifier corresponding to the RU.
  • the DU determines, according to the association relationship between the RU and the DU and the association relationship between the RU and the faulty object in the cloud resource Cloud, that the alarm information of the cloud resource Cloud is the root cause fault of the alarm information of the RU and the alarm information of the DU.
  • the DU reports the correlation analysis of the above root cause faults to the SMOF through the O1 interface.
  • the DU can send alarm information to the CU, and the alarm information includes: alarm information of the DU, alarm information of the RU, and alarm information of the cloud resource Cloud.
  • the alarm information of the DU includes at least one of the following items: the CU identifier associated with the DU, the DU identifier, and the object identifier of the DU.
  • when the CU detects an internal fault, it generates alarm information.
  • the CU determines the root cause of the multiple alarm messages above based on the object association relationship.
  • the above multiple alarm information includes: alarm information detected by the CU, received alarm information of the DU, alarm information of the RU, and alarm information of the cloud resource Cloud.
  • the alarm information of the CU is caused by the alarm information of the DU
  • the alarm information of the DU is caused by the alarm information of the RU
  • the alarm information of the RU is caused by the alarm information of the cloud resource Cloud.
  • the CU finally determines that the alarm information of the cloud resource Cloud is the root cause of the alarm information of the CU, DU, and RU.
  • the CU reports the correlation analysis of the above root cause faults to the SMOF through the O1 interface.
  • when the SMOF receives the related analysis of the root cause fault reported by the CU, DU, and RU, it can report the alarm information corresponding to the root cause fault to the operator's network management, which then dispatches a work order.
  • SMOF can also determine the root cause failure by itself, as follows:
  • the cloud resource Cloud, RU, DU or CU can also each report the alarm information they detect to the SMOF. This alarm information is called the second type of alarm information, and includes at least one of the following: an object identifier, an identifier of the corresponding network element, a fault identifier, or a fault cause.
  • the alarm information reported by the cloud resource Cloud to the SMOF includes: an object identifier of the cloud resource Cloud, an identifier of the cloud resource Cloud, a fault identifier, and the like.
  • the alarm information reported by the RU to the SMOF includes: the object ID of the RU, the ID of the RU, and the fault ID.
  • the alarm information reported by the DU to the SMOF includes: the object ID of the DU, the ID of the DU, and the ID of the fault.
  • the alarm information reported by the CU to the SMOF includes: the CU's object ID, CU ID, and fault ID. Wherein, the fault identification may implicitly indicate the cause of the fault.
  • when the SMOF receives the alarm information reported by the cloud resource Cloud, RU, DU, and CU, it can obtain the object identifier of the cloud resource Cloud from the alarm information of the cloud resource Cloud, the RU identifier from the alarm information of the RU, the DU identifier from the alarm information of the DU, and the CU identifier from the alarm information of the CU.
  • according to the object association relationship, the SMOF determines whether the obtained cloud resource Cloud object identifier is associated with the RU identifier, whether the obtained RU identifier is associated with the DU identifier, and whether the obtained DU identifier is associated with the CU identifier; if all three associations exist, the alarm information of the cloud resource Cloud can be determined to be the root cause fault of the alarm information of the RU, DU, and CU.
  • the SMOF can report the alarm information of the cloud resource Cloud as the root cause of the fault to the operator's network management, and the operator's network management will issue a work order for the root cause of the fault.
  • the alarm information reported by the SMOF to the operator's network management may be the second type of alarm information reported by each network element.
  • the alarm information reported by the SMOF to the network management of the operator includes at least one of the following: the object identifier of the cloud resource Cloud, the identifier of the cloud resource Cloud, or the fault identifier, etc.
  • the fault flag can implicitly indicate the cause of the fault, etc.
  • the SMOF can compare the root cause faults determined by the CU, DU, and RU with the root cause fault determined by the SMOF itself, determining whether the root cause faults reported by the network elements are the same as the one determined by the SMOF; if they are the same, the SMOF reports that root cause fault to the operator's network management; if not, it reports both the root cause faults reported by the network elements and the root cause fault determined by the SMOF to the operator's network management.
  • the SMOF can report the root cause failure determined by the SMOF to the operator's network management in consideration of the high accuracy of its own judgment.
  • the SMOF can also report the root cause failure of each network element to the operator's network management without limitation.
  • SMOF can learn and update the object association relationship through big data or AI, so the accuracy of the root cause failure determined by SMOF is high.
  • the root cause fault may not be determinable through the CU, DU, or RU.
  • SMOF can be used for aggregation and unified processing to determine the root cause of the fault more accurately.
  • the SMOF can also send the updated object association relationship to CU, DU or RU synchronously.
  • RU, DU or CU, etc. can report their respective root cause failure analysis to the SMOF.
  • the SMOF can also determine the root cause of the failure according to the alarm information reported by the cloud resources Cloud, RU, DU, and CU.
  • the process of determining the root cause of failure by SMOF can be used as a supplement to RU, DU or CU to determine the root cause of failure, and improve the accuracy of determining the root cause of failure.
  • the first network element, the second network element and the third network element include hardware structures and/or software modules corresponding to each function.
  • the present disclosure can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
  • Figures 5 and 6 are schematic structural diagrams of possible devices provided by the present disclosure. These communication devices can be used to implement the functions of the first network element, the second network element, or the third network element in the above method, and thus can also realize the beneficial effects of the above method.
  • a communication device 500 includes a processing unit 510 and a transceiver unit 520 .
  • the communication device can realize the functions of the first network element, the second network element or the third network element in the above method.
  • the processing unit 510 is configured to determine alarm information of N objects, where N is an integer greater than or equal to 2.
  • the processing unit 510 is further configured to determine the root cause failure of the alarm information of the N objects according to the object association relationship; wherein, the root cause failure is the alarm information of M objects in the N objects, and the M is a positive integer less than or equal to N.
  • the transceiver unit 520 is configured to receive corresponding information from other network elements.
  • the processing unit 510 is used to determine the alarm information of N2 objects; the transceiver unit 520 is used to send the alarm information of the N2 objects to the first network element, where the alarm information is called the first type of alarm information. For the N2 objects, the first type of alarm information of each object includes at least one of the following: the identifier of the object, the identifier of the second network element, or the identifier of a network element associated with the second network element, where N2 is a positive integer.
  • the transceiver unit 520 is used to receive the first indication information, which indicates the root cause fault of the alarm information of the N objects, where the root cause fault is the alarm information of M objects among the N objects, N is a positive integer greater than or equal to 2, and M is a positive integer less than or equal to N.
  • the processing unit 510 is configured to process the root cause failures of the alarm information of the N objects.
  • the communication device 600 includes a processor 610 and an interface circuit 620 .
  • the processor 610 and the interface circuit 620 are coupled to each other.
  • the interface circuit 620 may be a transceiver, an input/output interface, or a pin.
  • the communication device 600 may further include a memory 630 for storing instructions executed by the processor 610 or storing input data required by the processor 610 to execute the instructions or storing data generated after the processor 610 executes the instructions.
  • the processor 610 is used to implement the functions of the processing unit 510
  • the interface circuit 620 is used to implement the functions of the transceiver unit 520 .
  • the module realizes the functions of the CU, DU, RU or SMOF in the above method.
  • the module may be a chip in CU, DU, RU or SMOF, or other modules.
  • the processor in the present disclosure may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • a general-purpose processor can be a microprocessor, or any conventional processor.
  • the memory in the present disclosure can be random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, register, hard disk, mobile hard disk, CD-ROM or any other form of storage media known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
  • a storage medium may also be an integral part of the processor.
  • the processor and storage medium can be located in the ASIC.
  • the ASIC can be located in the base station or the terminal.
  • the processor and the storage medium may also exist in the base station or the terminal as discrete components.
  • the methods in the present disclosure may be fully or partially implemented by software, hardware, firmware or any combination thereof.
  • When implemented using software, the methods may be implemented in whole or in part in the form of a computer program product.
  • the computer program product comprises one or more computer programs or instructions.
  • the processes or functions described in the present disclosure are executed in whole or in part.
  • the computer may be a general computer, a special computer, a computer network, a network device, a user device, a core network device, an OAM or other programmable devices.
  • the computer program or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer program or instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media.
  • the available medium may be a magnetic medium, such as a floppy disk, a hard disk, or a magnetic tape; it may also be an optical medium, such as a digital video disk; or it may be a semiconductor medium, such as a solid state disk.
  • the computer readable storage medium may be a volatile or a nonvolatile storage medium, or may include both volatile and nonvolatile types of storage media.
  • “at least one” means one or more, and “plurality” means two or more.
  • “And/or” describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B, which can mean: A exists alone, A and B exist at the same time, and B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the associated objects before and after it are in an “or” relationship; in the formulas of the present disclosure, the character “/” indicates that the associated objects before and after it are in a “division” relationship.
  • “Including at least one of A, B or C” may mean: including A; including B; including C; including A and B; including A and C; including B and C; including A, B, and C.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

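A method and apparatus for determining a root cause fault, including: a first network element determines alarm information of N objects, where N is an integer greater than or equal to 2; the first network element determines, according to an object association relationship, a root cause fault of the alarm information of the N objects, where the root cause fault is the alarm information of M objects among the N objects, and M is a positive integer less than or equal to N. With the method and apparatus of the present disclosure, the root cause fault of multiple pieces of alarm information can be determined, and work orders are dispatched only for the root cause fault, reducing operation and maintenance costs.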

Description

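Method and apparatus for determining a root cause fault
This application claims priority to Chinese patent application No. 202210152355.7, entitled "Method and apparatus for determining a root cause fault", filed with the China National Intellectual Property Administration on February 18, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communication technologies, and in particular to a method and apparatus for determining a root cause fault.
Background
In a wireless communication network, for example a mobile communication network, the services supported by the network are increasingly diverse, so more and more requirements must be met. For example, the network needs to support ultra-high rates, ultra-low latency, and/or ultra-large numbers of connections. This makes network planning, network configuration, and/or resource scheduling increasingly complex. These new requirements, scenarios and features pose unprecedented challenges to network planning, operation and maintenance, and efficient operation. How to improve the operation and maintenance efficiency of the network is a problem worth studying.
Summary
The present disclosure provides a method and apparatus for determining a root cause fault, so as to determine the root cause fault of multiple pieces of alarm information, so that work orders can be dispatched only for the root cause fault, reducing operation and maintenance costs.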
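In a first aspect, a method for determining a root cause fault is provided. The method is performed by a first network element, which may be a CU, DU, RU, SMOF or the like, or may be a component of the first network element (a processor, a chip, or the like), or may be a software module. The method includes: determining alarm information of N objects, where N is an integer greater than or equal to 2; and determining, according to an object association relationship, a root cause fault of the alarm information of the N objects, where the root cause fault is the alarm information of M objects among the N objects, and M is a positive integer less than or equal to N.
With the above method, the first network element can determine the root cause fault of the alarm information of multiple objects. Subsequently, only the alarm information corresponding to the root cause fault needs to be reported to the operator's network management and a work order dispatched; no work order is dispatched for the other alarm information, which reduces operation and maintenance costs.
In one design, determining the root cause fault of the alarm information of the N objects according to the object association relationship includes: determining the root cause fault of the alarm information of the N objects according to the object association relationship and the generation times of the alarm information of the N objects.
With the above design, both the association relationship between objects and the generation times of the different pieces of alarm information are considered when determining the root cause fault. Even when the objects of some alarm information are associated, if the generation times of their alarm information differ greatly, there is in essence no causal relationship between those pieces of alarm information. The above method reduces the likelihood of such misattribution and improves the accuracy of the determined root cause fault.
In one design, determining the root cause fault of the alarm information of the N objects according to the object association relationship and the generation times of the alarm information of the N objects includes: determining X association relationship sets according to the object association relationship; and, for each association relationship set: determining L objects in the set according to the generation times of the alarm information of the objects included in the set, where the differences between the generation times of the alarm information of the L objects are less than (or less than or equal to) a threshold; and determining the root cause fault of the alarm information of the L objects, where the root cause fault is the alarm information of at least one of the L objects, and X and L are both positive integers.
With the above method, the association relationship sets are first determined according to the object association relationship, each set including at least one object that has an association relationship; then, according to the generation times of the different pieces of alarm information, objects whose generation times do not meet the condition are removed from each set. Finally, for each association relationship set, the alarm information of one or more objects is determined as the root cause fault, which ensures the accuracy of the determined root cause fault.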
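The two-stage design above (group by association first, then filter by alarm generation time) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `Alarm` record, the function names, the representation of an association set as a list ordered from managing to managed object, and the rule that the most-managed remaining object is taken as the root cause are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    obj: str      # object identifier, e.g. "DU11" (hypothetical naming)
    time: float   # alarm generation time, in seconds


def select_within_threshold(alarms, threshold):
    """Return the largest group of alarms whose generation times all differ by less than threshold."""
    ordered = sorted(alarms, key=lambda a: a.time)
    best, start = [], 0
    for end in range(len(ordered)):
        # Shrink the window until its oldest and newest alarms are close enough.
        while start < end and ordered[end].time - ordered[start].time >= threshold:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return best


def root_causes(association_sets, alarms, threshold):
    """Pick one root cause alarm per association set (each set ordered from managing to managed object)."""
    by_obj = {a.obj: a for a in alarms}
    result = []
    for chain in association_sets:
        present = [by_obj[o] for o in chain if o in by_obj]
        kept = select_within_threshold(present, threshold)
        if not kept:
            continue
        # Assumption: the remaining object furthest down the chain is the root cause.
        root = max(kept, key=lambda a: chain.index(a.obj))
        if root not in result:
            result.append(root)
    return result
```

With alarms for a CU, DU and RU where the RU alarm is generated much later than the other two, the RU is dropped from the first set and the DU alarm is selected for it under the stated assumption.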
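In one design, determining the root cause fault of the alarm information of the N objects according to the object association relationship and the generation times of the alarm information of the N objects includes: determining P association relationship sets according to the object association relationship and the generation times of the alarm information of the N objects; and, for each association relationship set: determining the root cause fault of the alarm information of the Q objects included in the set, where the root cause fault is the alarm information of at least one of the Q objects, the Q objects are associated with one another, the differences between the generation times of the alarm information of the Q associated objects are less than a threshold, and P and Q are both positive integers.
With the above method, factors such as the association relationship between objects and the generation times of the alarm information are considered together when determining the association relationship sets. Each determined set includes at least one object that has an association relationship, and the generation times of the alarm information of the objects in the set meet the condition. For each association relationship set, the alarm information of one or more objects is determined as the root cause fault, which ensures the accuracy of the determined root cause fault.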
在一种设计中,所述N个对象中包括N1个对象和N2个对象,所述N1和N2均为正整数,且两者之和等于N,所述确定N个对象的告警信息,包括:检测到所述N1个对象的告警信息;接收来自所述第二网元的所述N2个对象的告警信息。
通过上述方法,第一网元可检测到N1个对象的告警信息,接收来自第二网元的N2个对象的告警信息。确定N1个对象的告警信息和N2个对象的告警信息的根因故障。可以仅针对根因故障派发工单,对于其它告警信息不再派发工单,降低运营和维护成本。
在一种设计中,所述第一网元为集中式单元CU,所述第二网元为分布式单元DU,所述N1个对象的告警信息中包括所述CU的对象的告警信息,所述N2个对象的告警信息中包括以下至少一项:所述DU的对象的告警信息、无线单元RU的对象的告警信息、所述RU对应的云资源对象的告警信息、或所述DU对应的云资源对象的告警信息。
通过上述方法,第一网元为CU,第二网元为DU,DU收集告警信息,且向CU上报收集的告警信息,DU收集的告警信息中包括N2个对象的告警信息。CU检测到N1个对象的告警信息;CU可对该N1个对象的告警信息和N2个对象的告警信息作根因分析,确定根因故障,降低运营和维护成本。
在一种设计中,所述第一网元为DU,所述第二网元为RU,所述N1个对象的告警信息中包括所述DU的对象的告警信息,所述N2个对象的告警信息中包括以下至少一项:所述RU的对象的告警信息、或所述RU对应的云资源的对象的告警信息。
通过上述方法,第一网元为DU,第二网元为RU,RU收集N2个对象的告警信息,且上报给DU。DU对检测到的N1个对象的告警信息和接收的N2个对象的告警信息作根因分析,确定根因故障,降低运营和维护成本。
在一种设计中,所述第一网元为RU,DU、或CU,所述N1个对象的告警信息中包括所述第一网元的对象的告警信息;所述第二网元为所述第一网元对应的云资源,所述N2个对象的告警信息中包括所述第一网元对应的云资源的对象的告警信息。
通过上述方法,第一网元为RU、DU或CU,以第一网元为RU为例。如果在云资源中,RU对应的对象产生告警信息时,则云资源可将该告警信息上报给RU。RU在检测到RU的N1个对象的告警信息,和云资源上报的RU对应的N2个对象的告警信息时,可作根因分析,确定根因故障,降低运营和维护成本。
在一种设计中,针对所述N2个对象,每个对象的告警信息为第一类告警信息,所述第一类告警信息中包括以下至少一项:所述对象的标识、所述第二网元的标识、或所述第二网元相关联网元的标识。
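In current designs, alarm information generated by a CU, DU, RU, cloud resource, or the like is not reported to the network element that manages it or that has a correspondence with it. In this design, by contrast, alarm information generated or collected by an RU is reported to the DU that manages it or has a correspondence with it, and alarm information generated or collected by a DU is reported to the CU that manages it or has a correspondence with it. Alarm information generated by the cloud resource (Cloud) may be reported to the network element to which that alarm information corresponds. In the present disclosure, the CU, DU, RU, or the like performs root cause analysis and determines the root cause fault, reducing operation and maintenance costs. In addition, the alarm information that a managed or lower-layer network element reports to its upper-layer network element is further improved relative to existing alarm information. This is the first-type alarm information described above: it need no longer include a fault cause, a fault identifier, or the like, and instead includes an object identifier, the identifier of the corresponding network element, or the association relationships between different network elements. The network element performing root cause analysis can readily obtain the association relationships between different network elements from this alarm information, which improves the efficiency of root cause analysis.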
在一种设计中,还包括:向第三网元发送第一指示信息,所述第一指示信息用于指示所述N个对象的告警信息的根因故障。
通过上述方法,第三网元可以为SMOF等,第一网元可为CU、DU或RU等。CU、DU或RU可将确定的根因故障上报给SMOF。SMOF可向运营商网管仅上报该根因故障对应的告警信息,降低运营和维护成本。
在一种设计中,还包括:获取所述对象关联关系,所述对象关联关系是由来自第三网元的配置文件或配置消息指示的。
通过上述方法,第三网元可以为SMOF,SMOF可以向CU、DU或RU等配置对象的关联关系。可选的,SMOF可利用大数据或人工智能等方式,周期性或基于触发条件等,更新对象间的关联关系,且可将更新的对象关联关系同步给CU、DU或RU等。采用上述方法,可灵活配置或更新对象关联关系,提高确定根因故障的准确率。
在一种设计中,所述获取对象关联关系,包括:接收来自所述第三网元的第一配置文件或第一配置消息,所述第一配置文件或第一配置消息用于指示第一对象关联关系;接收来自所述第三网元的第二配置文件或第二配置消息,所述第二配置文件或第二配置消息用于指示所述第二对象关联关系;根据所述第一对象关联关系和所述第二对象关联关系,确定所述对象关联关系。
在一种设计中,所述N个对象中包括N1个对象、N2个对象和N3个对象,所述N1、N2和N3的取值均为正整数,且三者之和等于N,所述确定N个对象的告警信息,包括:接收来自所述第一网元的告警信息,所述告警信息中包括所述N1个对象的告警信息;接收来自所述第二网元的告警信息,所述告警信息中包括所述N2个对象的告警信息;接收来自所述第三网元的告警信息,所述告警信息中包括所述N3个对象的告警信息。
通过上述方法,第四网元分别接收来自第一网元、第二网元和第三网元的告警信息。对上述三个网元的告警信息作根因分析,确定根因故障。后续,可以仅将根因故障上报到运营商网管,派发工单,对于其它告警信息不作派发工单,降低运营和维护成本。
在一种设计中,所述告警信息为第二类告警信息,所述第二类告警信息中包括以下至少一项:对象标识、对应网元的标识、故障标识、或故障原因。
第二方面,提供一种确定根因故障的方法,该方法的执行主体为第二网元,该第二网元为云资源Cloud、RU或DU等,或者,为第二网元中的部件(处理器、芯片或其它等),包括:向第一网元发送N2个对象的告警信息,所述告警信息为第一类告警信息;针对所述N2个对象,每个对象的第一类告警信息中包括以下至少一项:所述对象的标识、第二网元的标识、或所述第二网元相关联网元的标识;其中,所述N2为正整数。
第三方面,提供一种确定根因故障的方法,该方法的执行主体为第三网元,该第三网元为SMOF等,或者为第三网元中的部件(处理器、芯片或其它等),包括:接收第一指示信息,所述第一指示信息用于指示N个对象的告警信息的根因故障;所述根因故障是所述N个对象中M个对象的告警信息,所述N为大于或等于2的正整数,所述M为小于或等于N的正整数。
通过上述设计,第一网元可向第三网元上报分析的根因故障的指示信息,第三网元可根据该指示信息,确定根因故障,向运营商网管上报该根因故障对应的告警信息,降低运营和维护成本。可选的,第三网元可对第一网元分析的根因故障作补充和/或更新等,提高确定根因故障的准确性。
在一种设计中,还包括:向第一网元发送用于指示对象关联关系的配置文件或配置消息。
在一种设计中,所述向第一网元发送用于指示对象关联关系的配置文件或配置消息,包括:向所述第一网元发送第一配置文件或第一配置消息,所述第一配置文件或第一配置消息用于指示第一对象关联关系;向所述第一网元发送第二配置文件或第二配置消息,所述第二配置文件或第二配置消息用于指示第二对象关联关系。
在一种设计中,所述N个对象中包括N1个对象和N2个对象,所述N1与N2均为正整数,且两者之和等于N;所述N1个对象包括集中式单元CU的对象,所述N2个对象包括以下至少一项:分布式单元DU的对象、DU对应云资源的对象、无线单元RU的对象、或RU对应云资源的对象;或者,所述N1个对象包括DU的对象,所述N2个对象包括以下至少一项:所述RU的对象、或所述RU对应云资源的对象;或者,所述N1个对象包括CU的对象、DU的对象、或RU的对象,所述N2个对象中包括所述CU对应云资源的对象、DU对应云资源的对象、或RU对应云资源的对象。
在一种设计中,所述N个对象中包括N1个对象、N2个对象和N3个对象,所述N1、N2与N3均为正整数,且三者之和等于N;所述N1个对象中包括CU的对象,所述N2个对象中包括DU的对象,所述N3个对象中包括RU的对象。
第四方面,提供一种装置,有益效果可参见第一方面的记载,该装置可以是第一网元,或者配置于第一网元中的装置,或者能够和第一网元匹配使用的装置,该第一网元可以是CU、DU、RU或SMOF等。
在一种设计中,该装置包括执行第一方面所描述的方法/操作/步骤/动作一一对应的单元,该单元可以是硬件电路、也可以是软件,也可以是硬件电路结合软件实现。示例性地,该装置可以包括处理单元和收发单元,且处理单元和收发单元可以执行上述第一方面任一种设计示例中的相应功能,具体的:
处理单元,用于确定N个对象的告警信息,所述N为大于或等于2的整数。处理单元,还用于根据对象关联关系,确定所述N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述M为小于或等于N的正整数。收发单元,用于从其他网元接收相应的信息。
关于处理单元和收发单元的具体执行过程,可参见第一方面的记载,这里不再赘述。
在另一种设计中,该装置包括处理器,用于实现上述第一方面描述的方法。所述装置还可以包括存储器,用于存储指令和/或数据。所述存储器与处理器耦合,所述处理器执行存储器中存储的程序指令,可以实现上述第一方面描述的方法。例如,该装置包括:
存储器,用于存储程序指令;
处理器,用于确定N个对象的告警信息,所述N为大于或等于2的整数;根据对象关联关系,确定所述N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述M为小于或等于N的正整数。
关于处理器的具体执行过程,可参见上述第一方面的记载,不再赘述。
第五方面,提供一种装置,有益效果可参见第二方面的记载,该装置可以是第二网元,或者配置于第二网元中的装置,或者能够和第二网元匹配使用的装置,该第二网元可以是云资源Cloud、RU或DU等。
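In one design, the apparatus includes units in one-to-one correspondence with the methods/operations/steps/actions described in the second aspect; a unit may be a hardware circuit, software, or a hardware circuit combined with software. For example, the apparatus includes a transceiver unit. Optionally, it may further include a processing unit. The processing unit and the transceiver unit can perform the corresponding functions in any design example of the second aspect, specifically:
The transceiver unit is configured to send alarm information of N2 objects to the first network element, the alarm information being first-type alarm information; for the N2 objects, the first-type alarm information of each object includes at least one of the following: the identifier of the object, the identifier of the second network element, or the identifier of a network element associated with the second network element; N2 is a positive integer.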
可选的,处理单元,用于确定N2个对象的告警信息。
关于收发单元和处理单元的具体执行过程可参见第二方面的记载,这里不再赘述。
在另一种设计中,该装置包括处理器,用于实现上述第二方面描述的方法。所述装置还可以包括存储器,用于存储程序和/或指令。所述存储器与所述处理器耦合,所述处理器执行所述存储器中存储的程序指令时,可以实现上述第二方面的方法。所述装置还可以包括通信接口,所述通信接口可以用于该装置和其它设备进行通信。示例性地,该通信接口可以是收发器、电路、总线、模块、管脚或其它类型的通信接口。该装置包括:
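a memory, configured to store program instructions;
a processor, configured to determine alarm information of N2 objects;
a communication interface, configured to send the alarm information of the N2 objects to the first network element, the alarm information being first-type alarm information; for the N2 objects, the first-type alarm information of each object includes at least one of the following: the identifier of the object, the identifier of the second network element, or the identifier of a network element associated with the second network element; N2 is a positive integer.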
关于处理器和通信接口的具体执行过程可参见第二方面的记载,这里不再赘述。
第六方面,提供一种装置,有益效果可参见第三方面的记载,该装置可以是第三网元,或者配置于第三网元中的装置,或者能够和第三网元匹配使用的装置。该第三网元可以是SMOF等。
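In one design, the apparatus includes units in one-to-one correspondence with the methods/operations/steps/actions described in the third aspect; a unit may be a hardware circuit, software, or a hardware circuit combined with software. For example, the apparatus includes a transceiver unit. Optionally, it may further include a processing unit. The processing unit and the transceiver unit can perform the corresponding functions in any design example of the third aspect, specifically: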
收发单元,用于接收第一指示信息,所述第一指示信息用于指示N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述N为大于或等于2的正整数,所述M为小于或等于N的正整数。
可选的,处理单元,用于对第一指示信息进行处理。
关于收发单元和处理单元的具体执行过程可参见上述第三方面的记载,这里不再赘述。
在另一种设计中,该装置包括处理器,用于实现上述第三方面描述的方法。所述装置还可以包括存储器,用于存储程序和/或指令。所述存储器与所述处理器耦合,所述处理器执行所述存储器中存储的程序指令时,可以实现上述第三方面的方法。所述装置还可以包括通信接口,所述通信接口可以用于该装置和其它设备进行通信。示例性地,该通信接口可以是收发器、电路、总线、模块、管脚或其它类型的通信接口。该装置包括:
存储器,用于存储程序指令;
通信接口,用于接收第一指示信息,所述第一指示信息用于指示N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述N为大于或等于2的正整数,所述M为小于或等于N的正整数。
处理器,用于对第一指示信息进行处理。
关于通信接口和处理器的具体执行过程可参见上述第三方面的记载,这里不再赘述。
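In a seventh aspect, a computer-readable storage medium is provided, including instructions which, when run on a computer, cause the computer to perform the method of the first aspect, the second aspect, or the third aspect.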
第八方面,提供一种芯片系统,该芯片系统包括处理器,还可以包括存储器,用于实现第一方面、第二方面或第三方面的方法。该芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。
第九方面,提供一种计算机程序产品,包括指令,当其在计算机上运行时,使得计算机执行第一方面、第二方面或第三方面的方法。
第十方面,提供一种系统,该系统中包括第四方面的装置、第五方面的装置和第六方面的装置。
附图说明
图1为本公开提供的通信系统的示意图;
图2为本公开提供的O-RAN的架构图;
图3为本公开提供的确定根因故障的流程图;
图4为本公开提供的上报告警信息的示意图;
图5和图6为本公开提供的装置的结构示意图。
具体实施方式
图1是本公开能够应用的通信系统1000的架构示意图。如图1所示,该通信系统包括无线接入网100和核心网200,可选的,通信系统1000还可以包括互联网300。其中,无线接入网100可以包括至少一个接入网设备(如图1中的110a和110b),还可以包括至少一个终端设备(如图1中的120a-120j)。终端设备通过无线的方式与接入网设备相连,接入网设备通过无线或有线方式与核心网连接。核心网设备与接入网设备可以是独立的不同的物理设备,或者可以将核心网设备的功能与接入网设备的逻辑功能集成在同一个物理设备上,或者可以是一个物理设备上集成了部分核心网设备的功能和部分的接入网设备的功能。终端设备和终端设备之间,以及接入网设备和接入网设备之间可以通过有线或无线的方式相互连接。图1只是示意图,该通信系统中可以包括其它网络设备,如还可以包括无线中继设备和无线回传设备等,在图1中未画出。
接入网设备可以是基站(base station)、演进型基站(evolved NodeB,eNodeB)、发送接收点(transmission reception point,TRP)、第五代(5th generation,5G)移动通信系统中的下一代基站(next generation NodeB,gNB)、开放无线接入网(open radio access network,O-RAN)中的接入网设备、第六代(6th generation,6G)移动通信系统中的下一代基站、未来移动通信系统中的基站或无线保真(wireless fidelity,WiFi)系统中的接入节点等;或者可以是完成基站部分功能的模块或单元,例如,可以是集中式单元(central unit,CU)、分布式单元(distributed unit,DU)、集中式单元控制面(CU control plane,CU-CP)模块、或集中式单元用户面(CU user plane,CU-UP)模块。接入网设备可以是宏基站(如图1中的110a),也可以是微基站或室内站(如图1中的110b),还可以是中继节点或施主节点等。本公开中对接入网设备所采用的具体技术和具体设备形态不做限定。
在本公开中,用于实现接入网设备的功能的装置可以是接入网设备;也可以是能够支持接入网设备实现该功能的装置,例如芯片系统、硬件电路、软件模块、或硬件电路加软件模块,该装置可以被安装在接入网设备中或可以与接入网设备匹配使用。在本公开中,芯片系统可以由芯片构成,也可以包括芯片和其他分立器件。为了便于描述,下文以用于实现接入网设备的功能的装置是接入网设备,接入网设备为基站为例,描述本公开提供的技术方案。
(1)协议层结构。
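Communication between the access network device and the terminal device follows a certain protocol layer structure, which may include a control plane protocol layer structure and a user plane protocol layer structure. For example, the control plane protocol layer structure may include the functions of protocol layers such as the radio resource control (RRC) layer, the packet data convergence protocol (PDCP) layer, the radio link control (RLC) layer, the media access control (MAC) layer, and the physical layer. The user plane protocol layer structure may include the functions of protocol layers such as the PDCP layer, the RLC layer, the MAC layer, and the physical layer; in one possible implementation, a service data adaptation protocol (SDAP) layer may also be included above the PDCP layer.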
可选的,接入网设备和终端设备之间的协议层结构还可以包括人工智能(artificial intelligence,AI)层,用于传输AI功能相关的数据。
(2)集中式单元(central unit,CU)和分布式单元(distributed unit,DU)。
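An access network device may include a CU and a DU. Multiple DUs may be centrally controlled by one CU. As an example, the interface between the CU and the DU may be called the F1 interface, where the control plane (CP) interface may be F1-C and the user plane (UP) interface may be F1-U. The present disclosure does not limit the specific names of the interfaces. The CU and the DU may be split according to the protocol layers of the wireless network: for example, the functions of the PDCP layer and the protocol layers above it are placed in the CU, and the functions of the protocol layers below the PDCP layer (such as the RLC layer and the MAC layer) are placed in the DU; as another example, the functions of the protocol layers above the PDCP layer are placed in the CU, and the functions of the PDCP layer and the protocol layers below it are placed in the DU. No limitation is imposed.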
上述对CU和DU的处理功能按照协议层的划分仅仅是一种举例,也可以按照其他的方式进行划分。例如可以将CU或者DU划分为具有更多协议层的功能,又例如将CU或DU还可以划分为具有协议层的部分处理功能。在一种设计中,将RLC层的部分功能和RLC层以上的协议层的功能设置在CU,将RLC层的剩余功能和RLC层以下的协议层的功能设置在DU。在另一种设计中,还可以按照业务类型或者其他系统需求对CU或者DU的功能进行划分,例如按时延划分,将处理时间需要满足时延要求的功能设置在DU,不需要满足该时延要求的功能设置在CU。在另一种设计中,CU也可以具有核心网的一个或多个功能。示例性的,CU可以设置在网络侧方便集中管理。在另一种设计中,将DU的无线单元(radio unit,RU)拉远设置。可选的,RU可以具有射频功能。
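Optionally, the DU and the RU may be split at the physical layer (PHY). For example, the DU may implement the higher-layer functions of the PHY layer and the RU may implement the lower-layer functions of the PHY layer. For transmission, the functions of the PHY layer may include at least one of the following: adding a cyclic redundancy check (CRC) code, channel coding, rate matching, scrambling, modulation, layer mapping, precoding, resource mapping, physical antenna mapping, or radio frequency transmission. For reception, the functions of the PHY layer may include at least one of the following: CRC check, channel decoding, rate de-matching, descrambling, demodulation, layer demapping, channel detection, resource demapping, physical antenna demapping, or radio frequency reception. The higher-layer functions of the PHY layer may include one part of the PHY functions, for example the part closer to the MAC layer, and the lower-layer functions of the PHY layer may include another part, for example the part closer to the radio frequency functions. For example, on the transmit side, the higher-layer PHY functions may include adding a CRC code, channel coding, rate matching, scrambling, modulation, and layer mapping, and the lower-layer PHY functions may include precoding, resource mapping, physical antenna mapping, and radio frequency transmission; alternatively, the higher-layer PHY functions may include adding a CRC code, channel coding, rate matching, scrambling, modulation, layer mapping, and precoding, and the lower-layer PHY functions may include resource mapping, physical antenna mapping, and radio frequency transmission. On the receive side, the higher-layer PHY functions may include CRC check, channel decoding, rate de-matching, descrambling, demodulation, and layer demapping, and the lower-layer PHY functions may include channel detection, resource demapping, physical antenna demapping, and radio frequency reception; alternatively, the higher-layer PHY functions may include CRC check, channel decoding, rate de-matching, descrambling, demodulation, layer demapping, and channel detection, and the lower-layer PHY functions may include resource demapping, physical antenna demapping, and radio frequency reception.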
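For example, the functions of the CU may be implemented by one entity or by different entities. For instance, the functions of the CU may be further divided by separating the control plane and the user plane and implementing them in different entities, namely a control plane CU entity (the CU-CP entity) and a user plane CU entity (the CU-UP entity). The CU-CP entity and the CU-UP entity may be coupled with the DU to jointly perform the functions of the access network device.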
可选的,上述DU、CU、CU-CP、CU-UP和RU中的任一个可以是软件模块、硬件结构、或者软件模块+硬件结构,不予限制。其中,不同实体的存在形式可以是不同的,不予限制。例如DU、CU、CU-CP、CU-UP是软件模块,RU是硬件结构。这些模块及其执行的方法也在本公开的保护范围内。
一种可能的实现中,接入网设备包括CU-CP、CU-UP、DU和RU。例如,本公开的执行主体包括DU,或者包括DU和RU,或者包括CU-CP、DU和RU,或者包括CU-UP、DU和RU,不予限制。各模块所执行的方法也在本公开的保护范围内。
终端设备也可以称为终端、用户设备(user equipment,UE)、移动台、移动终端设备等。终端设备可以广泛应用于各种场景中的通信,例如包括但不限于以下至少一个场景:设备到设备(device-to-device,D2D)、车物(vehicle to everything,V2X)、机器类通信(machine-type communication,MTC)、物联网(internet of things,IOT)、虚拟现实、增强现实、工业控制、自动驾驶、远程医疗、智能电网、智能家具、智能办公、智能穿戴、智能交通、或智慧城市等。终端设备可以是手机、平板电脑、带无线收发功能的电脑、可穿戴设备、车辆、无人机、直升机、飞机、轮船、机器人、机械臂、或智能家居设备等。本公开对终端设备所采用的具体技术和具体设备形态不做限定。
在本公开中,用于实现终端设备的功能的装置可以是终端设备;也可以是能够支持终端设备实现该功能的装置,例如芯片系统、硬件电路、软件模块、或硬件电路加软件模块,该装置可以被安装在终端设备中或可以与终端设备匹配使用。为了便于描述,下文以用于实现终端设备的功能的装置是终端设备,终端设备为UE为例,描述本公开提供的技术方案。
基站和终端设备可以是固定位置的,也可以是可移动的。基站和/或终端设备可以部署在陆地上,包括室内或室外、手持或车载;也可以部署在水面上;还可以部署在空中的飞机、气球和人造卫星上。本公开对基站和终端设备的应用场景不做限定。基站和终端设备可以部署在相同的场景或不同的场景,例如,基站和终端设备同时部署在陆地上;或者,基站部署在陆地上,终端设备部署在水面上等,不再一一举例。
基站和终端设备的角色可以是相对的,例如,图1中的直升机或无人机120i可以被配置成移动基站,对于那些通过120i接入到无线接入网100的终端设备120j来说,终端设备120i是基站;但对于基站110a来说,120i是终端设备,即110a与120i之间是通过无线空口协议进行通信的。110a与120i之间也可以是通过基站与基站之间的接口协议进行通信的,此时,相对于110a来说,120i也是基站。因此,基站和终端设备都可以统一称为通信装置,图1中的110a和110b可以称为具有基站功能的通信装置,图1中的120a-120j可以称为具有终端设备功能的通信装置。
在一种设计中,如图2所示,在O-RAN架构中,传统的基站被拆分为云资源(Cloud)、RU、DU、CU-CP、CU-UP、接入网智能控制(RAN intelligent controller,RIC)、服务管理和编排框架(service management and orchestration framework,SMOF)等不同组件。各组件的功能如下:
SMOF:为支持网络的网络运营和管理提供基于服务的框架。可选的,SMOF中包括云基础设施(例如云资源)的操作管理维护(operation,administration and maintenance,OAM)和基站的OAM。
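RIC: modeled on software defined networking (SDN), it introduces intelligent scheduling and separates policy from execution. The RIC is used to implement artificial intelligence. It is divided into a near-real-time (Near-RT) RIC and a non-real-time (Non-RT) RIC. The near-real-time RIC is the O-RAN near-real-time RAN intelligent controller, which achieves near-real-time control and optimization of O-RAN elements and resources through fine-grained data collection and actions over the E2 interface. The non-real-time RIC is the O-RAN non-real-time RAN intelligent controller, a logical function that provides non-real-time control and optimization of RAN elements and resources, artificial intelligence (AI) or machine learning (ML) workflows including model training and updating, and policy-based guidance of near-real-time RIC applications and functions. Optionally, in the example of FIG. 2, the non-real-time RIC is located in the SMOF.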
云资源Cloud:O-RAN联盟将云资源Cloud定义为一个云计算平台,由满足O-RAN要求的物理基础设施节点集合组成,可托管相关的O-RAN功能、支持软件组件,以及适当的管理和协调功能。
关于RU、DU、CU-CP和CU-UP等,可参见前述图1中的说明,不再赘述。可选的,在O-RAN架构中,RU、DU、Cloud可分别称为开放RU(open RU,O-RU)、开放DU(open DU,O-DU)、开放Cloud(open Cloud,O-Cloud)等。
在一种设计中,基站内的各个组件,当识别到故障时,例如通过软件识别到故障时,向运营商网管上报告警信息。所述运营商网管可以为网络管理系统(network management system,NMS)。运营商的监控人员对告警信息进行判断分析,对需要处理的故障,通过工单进行任务的分发。由于在O-RAN架构中,不同组件之间是存在关联关系的,某一组件故障可能会引发多个组件均上报告警信息。如果针对每个告警信息均派发工单,监控与维护人员的工作量将大幅增加,运维成本较高。因此,如何确定多个告警信息的根因故障,是一个值得研究的问题。
本公开提供一种确定根因故障的方法,在该方法中:在确定多个告警信息时,可在多个告警信息中,确定根因故障,所述根因故障是多个告警信息中的一个或多个告警信息。针对根因故障的告警信息派发工单,对非根因故障的告警信息不再派发工单,从而减少监控与维护人员的工作量,降低运维成本。在后续描述中,是以在O-RAN架构的基站中,应用本公开的方法为例描述的,不作为对本公开的限制。例如,本公开的方法,还可以应用于除O-RAN基站外的其它设备中,确定根因故障,例如核心网设备或终端设备等。如图3所示,本公开提供一种确定根因故障方法的流程,至少包括:
步骤301:第一网元确定N个对象的告警信息,所述N为大于或等于2的整数。
步骤302:第一网元根据对象关联关系,确定所述N个对象的告警信息的根因故障。
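The root cause fault is the alarm information of M of the N objects, where M is a positive integer less than or equal to N. Put differently, the alarm information of the M objects is the root cause fault of the alarm information of the other N-M objects. For example, let N be 3: the first network element determines alarm information of 3 objects and, according to the object association relationship, determines that the 3 objects are associated. For instance, the network element CU of object 1, the network element DU of object 2, and the network element RU of object 3 are associated. The alarm information of object 3 of the RU is then determined to be the root cause fault of the alarm information of object 1 of the CU and of object 2 of the DU; that is, the fault in the RU caused the CU and the DU to generate alarm information. In the present disclosure, the root cause fault, for example the alarm information of object 3 of the RU, is reported to the operator network management system, which dispatches a work order for the corresponding maintenance. Alarm information other than the root cause fault, for example that of object 2 of the DU and object 1 of the CU, is neither reported to the operator network management system nor given a work order, which reduces network operation and maintenance costs. Alternatively, let N be 5: the first network element determines alarm information of 5 objects and, according to the object association relationship, determines 2 association relationship sets, one containing 3 objects and the other containing 2 objects. One root cause fault is determined from the set of 3 objects and another from the set of 2 objects. In this example, M is 2.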
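The two examples above (N = 3 with M = 1, and N = 5 with M = 2) can be sketched in Python. This is a minimal illustration, not the method as claimed: the `parent` map (managed element to managing element), the identifiers, and the rule "an alarmed object is a root cause if no other alarmed object lies below it" are assumptions chosen to reproduce the RU, DU, CU examples in the text.

```python
def find_root_causes(alarmed, parent):
    """Return the alarmed objects that are not an ancestor of another alarmed object.

    alarmed: iterable of object identifiers that raised alarm information.
    parent:  hypothetical association map, managed element -> managing element
             (assumed to be acyclic).
    """
    alarmed = set(alarmed)
    ancestors_of_alarmed = set()
    for obj in alarmed:
        node = parent.get(obj)
        while node is not None:  # walk up RU -> DU -> CU
            ancestors_of_alarmed.add(node)
            node = parent.get(node)
    return sorted(alarmed - ancestors_of_alarmed)


# Hypothetical topology: CU1 manages DU11, DU11 manages RU111, CU2 manages DU21.
parent = {"DU11": "CU1", "RU111": "DU11", "DU21": "CU2"}

three_alarms = find_root_causes(["CU1", "DU11", "RU111"], parent)
five_alarms = find_root_causes(["CU1", "DU11", "RU111", "CU2", "DU21"], parent)
```

Under these assumptions `three_alarms` contains only the RU alarm, and `five_alarms` contains one root cause per association relationship set.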
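In the present disclosure, the object association relationship may be preconfigured or specified by a protocol. For example, a third network element may configure the object association relationship for the first network element through a configuration file, a configuration message, or the like: the third network element sends the first network element a configuration file or configuration message that indicates the object association relationship, and the first network element obtains the object association relationship from it. For example, the object association relationship configured in this way is the association relationship among the CU, DU, and RU. Alternatively, the third network element may configure the object association relationship through multiple configuration files or multiple configuration messages, and the first network element combines the configured object association relationships into the final object association relationship. For example, the third network element sends the first network element a first configuration file or first configuration message indicating a first object association relationship, and a second configuration file or second configuration message indicating a second object association relationship. The first network element determines the object association relationship from the first and second object association relationships. For example, the first object association relationship is the association between the CU and the DU, and the second is the association between the DU and the RU; combining the two yields the association relationship among the CU, DU, and RU. Optionally, the first network element may be a CU, DU, or RU, and the third network element that configures the object association relationship for it may be the SMOF.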
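As a rough sketch of how a first network element might combine two configured association relationships, the Python below merges a CU-DU mapping with a DU-RU mapping and then walks the result. The dictionary layout (managing element to set of managed elements) and all identifiers are assumptions for illustration; the source does not specify the format of the configuration file or message.

```python
def merge_associations(*configs):
    """Combine several association configurations into one mapping.

    Each config is assumed to map a managing element to the set of
    elements it manages, e.g. {"CU1": {"DU11"}}.
    """
    merged = {}
    for config in configs:
        for manager, managed in config.items():
            merged.setdefault(manager, set()).update(managed)
    return merged


def descendants(assoc, node):
    """Return every element reachable below `node` in the merged mapping."""
    seen, stack = set(), [node]
    while stack:
        for child in assoc.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


first_config = {"CU1": {"DU11"}}      # hypothetical first configuration: CU-DU
second_config = {"DU11": {"RU111"}}   # hypothetical second configuration: DU-RU
merged = merge_associations(first_config, second_config)
below_cu1 = descendants(merged, "CU1")
```

With these inputs, `below_cu1` holds both DU11 and RU111, which corresponds to the combined CU, DU, RU association described above.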
在本公开的描述中,对象可指网元。例如,第一网元为SMOF,SMOF可接收来自以下至少两个网元的告警信息:RU、DU、或CU等。例如,SMOF接收来自CU、DU和RU的告警信息。当RU、DU和CU间存在关联关系时,认为RU的告警信息,是DU和CU的告警信息的根因故障。SMOF可向运营商网管上报RU的告警信息,不再向运营商上报CU和DU的告警信息,减少网络运营和维护成本。或者,SMOF接收来自CU和DU的告警信息,当CU和DU存在关联关系时,认为DU的告警信息,是CU的告警信息的根因故障。SMOF向运营商网管上报DU的告警信息,不再上报CU的告警信息,减少网络运营和维护成本。或者,SMOF可接收来自DU和RU的告警信息,当DU与RU存在关联关系时,认为RU的告警信息是DU的告警信息的根因,向运营商上报RU的告警信息,不再上报DU的告警信息,减少网络运营和维护成本。或者,SMOF可接收来自CU和RU的告警信息。由于CU与DU间存在关联关系,DU与RU间存在关联关系,当CU与RU间存在间接关联关系时,可认为RU的告警信息是CU的告警信息的根因故障,可向运营商网管上报RU的告警信息。应当指出,在有些场景下,RU故障可能不会引起对应的DU故障,但可能会引起对应的CU故障。因此,在某些场景下,可能会存在RU和CU同时上报告警信息,但DU不上报告警信息的场景。
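In the present disclosure, an association relationship among at least two of the RU, DU, and CU may be understood as follows: one CU can centrally manage multiple DUs, and the CU is associated with the DUs it manages. One DU can centrally manage multiple RUs, and the DU is associated with the RUs it manages. With the DU as a bridge, the CU and the RU are also associated: if a target CU manages a target DU and the target DU manages a target RU, the target CU is associated with the target RU. The description above uses association relationships between any two of the RU, DU, and CU as an example. In the present disclosure, an association relationship may also involve more than two items, for example an association relationship among the RU, DU, and CU. Similarly to the above, if a target CU manages a target DU and the target DU manages a target RU, the target CU, the target DU, and the target RU are considered associated. For example, if CU1 manages DU11 to DU13 and DU11 manages RU111 to RU113, then CU1, DU11, and RU111 are considered associated. Suppose that at some moment or within some period CU1, DU11, and RU111 each report alarm information to the SMOF; because the reporting RU, DU, and CU are associated, the alarm information of RU111 is considered the root cause fault of the alarm information of DU11 and CU1. For example, the alarm information reported by the RU may be an RU function abnormality alarm, that reported by the DU may be a DU cell unavailable alarm, and that reported by the CU may be a CU cell unavailable alarm.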
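Alternatively, in another description of the present disclosure, an object may refer to an object of a network element. For example, within a CU, DU, or RU, different objects may be defined according to the functions implemented. In a DU, the object that manages the functions of one cell may be called a DU object; in a CU, the object that manages the functions of one cell may be called a CU object, and so on. Different objects have different identifiers. An identifier may be allocated by the SMOF, allocated by another network element, preset, or specified by a protocol; no limitation is imposed. As another example, the cloud resource (Cloud) includes a computing resource pool, a storage resource pool, and a network resource pool. The computing resource pool includes multiple computing objects, each with a different identifier. The storage resource pool includes multiple storage objects, which likewise have different identifiers. The network resource pool includes multiple network resource objects.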
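In the O-RAN architecture, the specific functions of the CU, DU, and RU may be implemented in the cloud resource (Cloud). For example, the computing resource pool of the Cloud includes 10 computing objects. Five of them implement the computing functions of the CU and may be called computing objects of the CU; three implement the computing functions of the DU and may be called computing objects of the DU; two implement the computing functions of the RU and may be called computing objects of the RU. In the present disclosure, the computing objects of the CU, of the DU, and of the RU are associated, and the root cause fault can be determined from the association among the three. For example, a computing object of the RU, a computing object of the DU, and a computing object of the CU all report alarm information. Because the RU, DU, and CU are associated, the alarm information of the RU's computing object is determined to be the root cause fault of the other alarm information. The SMOF subsequently reports only the alarm information of the RU's computing object to the operator network management system, which reduces operation and maintenance costs. It should be noted that, in the description of the present disclosure, objects of the RU, objects of the DU, and objects of the CU are associated with one another, whereas different RU objects, different DU objects, or different CU objects are not associated among themselves. The following description continues with the example in which an object refers to an object of the corresponding network element.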
可选的,除根据对象关联关系外,还可根据N个对象的告警信息的生成时间,确定N个对象的告警信息的根因故障,也就是:可根据对象关联关系和N个对象的告警信息的生成时间,确定N个对象的告警信息的根因故障。其中,每个对象的告警信息中,可携带时间戳,根据每个对象告警信息的时间戳,可确定每个对象告警信息的生成时间。
在一种设计中,可根据对象关联关系,确定X个关联关系集合。所述关联关系集合包括存在关联关系的至少一个对象。该X个关联关系集合中不同关联关系集合中包括的存在关联关系的对象数目相同或不同,不予限制。可选的,针对X个关联关系集合中的任一个关联关系集合,可确定一个或多个根因故障。在前述步骤302中限定,在本公开中,总共可确定M个根因故障。所述X个关联关系集合所确定的根因故障的和,小于或等于上述M。
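For example, the object association relationship includes the association among the CU, DU, and RU, and the association between the CU and the cloud resource (Cloud). An object of the CU, an object of the DU, an object of the RU, and an object of the Cloud corresponding to the CU all report alarm information. According to the object association relationship, the objects reporting alarm information can then be divided into two association relationship sets. One set includes the object of the CU, the object of the DU, and the object of the RU, which are associated with one another. The other set includes the object of the CU and the object of the Cloud corresponding to the CU, which are associated with each other.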
针对X个关联关系集合中的任一个关联关系集合i,所述i为大于或等于1,小于或等于X的正整数,执行以下操作:根据所述关联关系集合i中包括的对象的告警信息的生成时间,确定所述关联关系集合中的L个对象,所述L个对象的告警信息的生成时间差小于(或,小于等于)阈值;确定所述L个对象的告警信息的根因故障;其中,所述根因故障是所述L个对象中至少一个对象的告警信息,所述L为正整数。或者描述为:针对任一个关联关系集合i,获取该关联关系集合中包括的对象的告警信息的生成时间;在所述关联关系集合i所包括的对象中,确定告警信息的生成时间满足条件的对象,所述满足条件的对象即为上述L个对象;在所述L个对象中,确定根因故障。沿用上述举例,所述关联关系集合中包括CU的对象、DU的对象和RU的对象。其中,CU的对象和DU的对象的告警信息的生成时间差小于阈值(满足条件),而RU的对象的告警信息的生成时间,与CU对象的告警信息的生成时间,和/或DU对象的告警信息的生成时间差,大于阈值(不满足条件),则上述确定的所述L个对象,即为所述CU的对象和DU的对象。确定DU的对象的告警信息是CU的对象的告警信息的根因故障。在本公开中,针对一个关联关系集合,可确定一个根因故障,或者多个根因故障等,不作限定。
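The per-set procedure above can be sketched as follows. The source does not say how the L objects are chosen when several subsets satisfy the threshold, so this sketch assumes the largest group of alarms whose timestamps all lie within the threshold of one another; the field names (`obj`, `layer`, `ts`), the identifiers, and the `LAYER_DEPTH` ordering are likewise hypothetical.

```python
LAYER_DEPTH = {"CU": 0, "DU": 1, "RU": 2}  # assumed ordering: deeper layer = closer to the root cause


def select_within_threshold(alarms, threshold):
    """Pick the largest group whose generation-time spread is below the threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    ordered = sorted(alarms, key=lambda a: a["ts"])
    best, start = [], 0
    for end in range(len(ordered)):
        # Shrink the window until the newest and oldest alarm are close enough.
        while ordered[end]["ts"] - ordered[start]["ts"] >= threshold:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return best


def root_cause_of(group):
    """Return the objects in the deepest layer of the selected group."""
    deepest = max(LAYER_DEPTH[a["layer"]] for a in group)
    return [a["obj"] for a in group if LAYER_DEPTH[a["layer"]] == deepest]


# One association relationship set; the RU alarm was generated much earlier.
association_set = [
    {"obj": "CU1/cell1", "layer": "CU", "ts": 100.0},
    {"obj": "DU11/cell1", "layer": "DU", "ts": 101.0},
    {"obj": "RU111/port0", "layer": "RU", "ts": 40.0},
]
selected = select_within_threshold(association_set, threshold=5.0)
root = root_cause_of(selected)
```

As in the example in the text, the RU object falls outside the time window, the L objects are the CU and DU objects, and the DU alarm is taken as the root cause of the CU alarm.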
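In another design, P association relationship sets may be determined according to the object association relationship and the generation times of the alarm information of the N objects; different sets among the P association relationship sets may include the same or different numbers of objects, without limitation, and P is an integer greater than or equal to 1. Optionally, at least one root cause fault may be determined for any one of the P association relationship sets. As defined in step 302 above, a total of M root cause faults are determined for the alarm information of the N objects; the sum of the root cause faults determined from the P association relationship sets is this M.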
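This design differs from the previous one as follows. In the previous design, the association relationship sets are first determined according to the object association relationship, and then, according to the generation times of the different alarm information, the L objects whose generation times satisfy the condition are determined within each set. In this design, the generation times of the alarm information of the different objects are considered when the association relationship sets are determined, so that the generation times of the alarm information of all objects in a set satisfy the condition. Continuing the example above, an object of the CU, an object of the DU, an object of the RU, and an object of the cloud resource (Cloud) corresponding to the CU all report alarm information. The object association relationship includes the association among the CU, DU, and RU, and the association between the CU and its corresponding Cloud. In the present disclosure, the objects of the CU, DU, and RU are placed in one set, and the object of the CU and the object of the Cloud corresponding to the CU are placed in another. For the set formed by the objects of the CU, DU, and RU, the generation times of their alarm information are obtained, and it is judged whether the three generation times satisfy the condition (for example, the difference between generation times is less than a threshold). If so, the original set is kept; otherwise, the objects whose generation time does not satisfy the condition are removed from the original set to form the association relationship set. For example, if the generation time of the RU's alarm information differs from that of the CU's and/or the DU's alarm information by more than the threshold, the object of the RU is removed, and the resulting association relationship set includes the objects of the CU and the DU. Optionally, an object removed from an association relationship set, for example the RU object, may form a separate set whose root cause fault is the alarm information of the object in that set, for example the RU's alarm information, and this alarm information is reported separately to the operator network management system. Put differently, the alarm information of the removed object is reported directly to the operator network management system, which issues the corresponding work order.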
针对所述P个关联关系集合中的任一个关联关系集合:确定所述关联关系集合中包括的Q个对象的告警信息的根因故障,所述Q为正整数;其中,所述根因故障是所述Q个对象中的至少一个对象的告警信息,所述Q个对象存在关联关系,且所述Q个存在关联关系的对象的告警信息的生成时间差小于(或,小于等于)阈值。
例如,所述P个关联关系集合中的一个关联关系集合中包括CU的对象、DU的对象和RU的对象。由于CU、DU和RU间存在关联关系,且CU管理DU,DU管理RU,则可认为RU对应的告警信息是CU和DU告警信息的根因故障。
在一种设计中,上述步骤301的一种实现方式为:第一网元检测N1个对象的告警信息;第一网元接收来自第二网元的N2个对象的告警信息,所述N1与N2均为正整数,且两者之和等于N。
例如,第一网元为CU,所述第一网元检测到的N1个对象的告警信息包括CU检测到的N1个对象的告警信息。第二网元为DU,所述DU向CU发送的N2个对象的告警信息中包括以下至少一项:DU的对象的告警信息、RU的对象的告警信息、DU对应的云资源Cloud对象的告警信息、或RU对应的云资源Cloud对象的告警信息。或者,第一网元为DU,所述第一网元检测到的N1个对象的告警信息包括DU检测到的N1个对象的告警信息。第二网元为RU,所述RU向DU发送的N2个对象的告警信息中包括以下至少一项:所述RU对应的告警信息、或RU对应的云资源Cloud的对象的告警信息。或者,第一网元为CU、DU或RU,第二网元为第一网元对应的云资源Cloud的对象等。例如,第一网元为CU,第一网元检测到的N1个对象的告警信息包括CU检测到的N1个对象的告警信息,云资源Cloud向CU发送CU对应的云资源Cloud的N2个对象的告警信息等。
可选的,在本公开中,第二网元向第一网元发送的N2个对象的告警信息,称为第一类告警信息,所述第一类告警信息中包括以下至少一项:对象标识、所述第二网元的标识、或所述第二网元相关联的网元的标识等。
例如,当CU的N1个对象出现故障时,通过软件可识别该N1个对象的故障,生成对应的告警信息,即N1个对象的告警信息。DU可收集N2个对象的告警信息,该N2个对象的告警信息中包括以下至少一项:DU检测到的自己对象的告警信息、从RU收集的RU的对象的告警信息、从RU收集的RU对应的云资源Cloud的对象的告警信息,或从云资源Cloud收集DU对应对象的告警信息等。DU将所述N2个对象的告警信息上报给与DU存在关联关系的CU。CU根据对象关联关系,对N1个对象的告警信息和N2个对象的告警信息,作根因分析,确定根因故障。在该示例中,对于DU上报的N2个对象的告警信息,称为第一类告警信息。对于DU上报的DU对象的第一类告警信息中包括以下至少一项:DU相关CU的标识、DU标识、或对象标识。例如,DU上报的DU对象的第一类告警信息为:CU标识+DU标识+对象标识。对于DU上报的RU对象的第一类告警信息包括以下至少一项:DU相关联的CU的标识、DU标识、RU标识、或对象标识。例如,DU上报的RU对象的第一类告警信息中包括:CU标识+DU标识+RU标识+对象标识。可以理解的是,对于RU对象的告警信息是RU上报给DU的。对此,分以下两种情况讨论:
第一种设计,RU对于CU不感知,即RU知道对应的DU,但不知道对应的CU。RU向DU上报的告警信息可包括:DU标识+RU标识+对象标识。DU在接收到RU上报的告警信息时,根据DU标识与CU标识的对应关系,在RU的告警信息中增加CU标识,DU向CU上报的第一类告警信息为:CU标识+DU标识+RU标识+对象标识。或者,DU在接收到RU的告警信息时,不对该告警信息作处理,直接将该告警信息作为第一类告警信息,上报给CU,即DU向CU上报的RU对象的告警信息为:DU标识+RU标识+对象标识。在一种设计中,CU中存储的对象关联关系为:CU标识+DU标识,DU标识+RU标识。在CU根据对象关联关系,确定根因故障时。CU根据CU标识+DU标识、DU标识+RU标识作为关联使用的关键字(KEY),做两次关联处理。比如,通过DU标识+RU标识,确定RU与DU的关联关系。再根据CU标识+DU标识,确定DU与CU的关联关系。最终确定CU、DU与RU间的关联关系,即CU标识+DU标识+RU标识。或者,CU中存储的对象关联关系为:CU标识+DU标识+RU标识,则CU根据CU标识+DU标识+RU标识作为关联使用的关键字,做一次关联处理,直接确定CU、DU和RU间的关联关系。
第二种设计,RU对于CU可感知,即RU知道对应的DU和CU。则RU向DU上报的告警信息中可包括:CU标识+DU标识+RU标识+对象标识。DU在接收到RU上报的告警信息时,将该告警信息作为第一类告警信息转发给CU。
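The two cases above can be sketched in Python. The `FirstTypeAlarm` structure, its field names, and the lookup tables are hypothetical; the source only states which identifiers the first-type alarm information may carry and that the CU may associate in two steps using DU+RU and CU+DU keys.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FirstTypeAlarm:
    """Hypothetical layout of first-type alarm information."""
    object_id: str
    ru_id: Optional[str] = None
    du_id: Optional[str] = None
    cu_id: Optional[str] = None


def enrich_at_du(alarm, du_to_cu):
    """Case 1: the RU does not know its CU, so the DU may add the CU identifier."""
    if alarm.cu_id is None and alarm.du_id in du_to_cu:
        return replace(alarm, cu_id=du_to_cu[alarm.du_id])
    return alarm  # Case 2: the CU identifier is already present; forward unchanged.


def resolve_chain(alarm, cu_du_pairs, du_ru_pairs):
    """Two-step association at the CU: first DU+RU, then CU+DU."""
    if (alarm.du_id, alarm.ru_id) not in du_ru_pairs:
        return None
    for cu, du in cu_du_pairs:
        if du == alarm.du_id:
            return (cu, du, alarm.ru_id)
    return None


ru_alarm = FirstTypeAlarm(object_id="obj7", ru_id="RU111", du_id="DU11")
forwarded = enrich_at_du(ru_alarm, du_to_cu={"DU11": "CU1"})
chain = resolve_chain(ru_alarm, {("CU1", "DU11")}, {("DU11", "RU111")})
```

Either path gives the CU the same CU, DU, RU triple: the DU can add the CU identifier before forwarding, or the CU can recover it from its stored pairs when the alarm arrives without one.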
可选的,在上述图3所示的流程中,当第一网元确定根因故障之后,还可以包括:第一网元向第三网元发送第一指示信息,所述第一指示信息用于指示所述N个对象的告警信息的根因故障。例如,第一网元为CU、DU或RU。当CU、DU或RU采用上述图3所示流程中的方法,确定根因故障后,可向第三网元发送用于指示所述根因故障的指示信息,所述第三网元可为SMOF等。
在另一种设计中,上述步骤301的一种实现方式为:第一网元接收来自第二网元的告警信息,所述告警信息中包括N1个对象的告警信息;接收来自第三网元的告警信息,所述告警信息中包括N2个对象的告警信息;接收来自第四网元的告警信息,所述告警信息中包括N3个对象的告警信息。所述N1、N2与N3均为正整数,且三者之和等于N。其中,第二网元、第三网元和第四网元上报的告警信息称为第二类告警信息。该第二类告警信息中包括以下至少一项:对象标识、对应网元的标识、故障标识、或故障原因。示例的,第一网元可以为SMOF,第二网元为CU,第三网元为DU,第四网元为RU等。
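In the present disclosure, the cloud resource (Cloud) may use an existing interface to report alarm information to the CU, DU, RU, or the like. Alternatively, a new interface may be added between the Cloud and the CU, DU, or RU for reporting alarm information. Alternatively, the Cloud may report part of the alarm information over an existing interface and the remainder over a newly added interface; no limitation is imposed. The RU may report alarm information to the DU over an existing interface, for example the open fronthaul interface, or a new interface may be added between the RU and the DU for this purpose. The DU may report alarm information to the CU over an existing interface, for example the F1 interface, or a new interface may be added between the DU and the CU for this purpose. For alarm reporting between the RU and the DU, or between the DU and the CU, part of the alarm information may be reported over a newly added interface and the remainder over an existing interface.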
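For example, the process is described with the first network element being the SMOF and the second, third, and fourth network elements being the CU, DU, and RU respectively. When the CU detects faults in N1 objects of the CU, it may report the alarm information of the N1 objects to the SMOF. This alarm information may be called second-type alarm information, which includes at least one of the following: an object identifier, a CU identifier, a fault identifier, or a fault cause. For example, the CU reports the alarm information of N1 objects to the SMOF, and the alarm information of each of the N1 objects includes: object identifier + CU identifier + fault identifier, where the fault identifier may implicitly indicate the fault cause. Similarly, when the DU detects faults in N2 objects of the DU, it may report the alarm information of the N2 objects to the SMOF; the alarm information of each of the N2 objects includes: object identifier + DU identifier + fault identifier, where the fault identifier may implicitly indicate the fault cause. The process by which the RU reports the alarm information of N3 objects to the SMOF is similar to that of the CU or DU and is not repeated. In the present disclosure, the SMOF may determine the root cause fault of the alarm information reported by the CU, DU, and RU according to the object association relationship, for example the association among the CU, DU, and RU. One CU can centrally control multiple DUs; the CU is associated with the DUs it centrally controls and is not associated with other DUs. One DU can centrally control multiple RUs; the DU is associated with the RUs it centrally controls and is not associated with other RUs. For example, if CU1 centrally controls DU11 to DU13 and DU11 centrally controls RU111 to RU113, then CU1 is associated with DU11 to DU13, and DU11 is associated with RU111 to RU113. If CU1 reports the alarm information of N1 objects to the SMOF, DU11 reports the alarm information of N2 objects to the SMOF, and RU113 reports the alarm information of N3 objects to the SMOF, then, because CU1, DU11, and RU113 are associated, the alarm information of the N3 objects of RU113 can be determined to be the root cause fault of the alarm information of the N2 objects of DU11 and the N1 objects of CU1.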
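It should be understood that the description above uses the example of the CU, DU, or RU reporting alarm information to the SMOF and the SMOF determining the root cause fault. In addition, the cloud resource (Cloud) may also report alarm information to the SMOF, and the SMOF may determine, according to the association relationships, the root cause fault of the alarm information of the CU, DU, RU, or Cloud. Continuing the example above, CU1, DU11, RU113, and the Cloud each report alarm information to the SMOF. As analyzed above, CU1, DU11, and RU113 are associated; assume that the alarm information reported by the Cloud is the alarm information of an object corresponding to RU113. The SMOF can then determine that the Cloud's alarm information is the root cause fault of the alarm information of CU1, DU11, and RU113.
In the present disclosure, the CU, DU, RU, cloud resource (Cloud), or the like may report alarm information to the SMOF over an existing interface, for example the O1 interface. Alternatively, a new interface may be added between the CU, DU, RU, or Cloud and the SMOF and used to report alarm information to the SMOF. Alternatively, part of the alarm information may be reported over an existing standard interface and the rest over a newly added interface; no limitation is imposed.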
示例的,如图4所示,以RU对应的云资源Cloud对象故障为例,提供一种确定根因故障的流程,至少包括:
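When the cloud resource (Cloud) detects a fault in a Cloud object corresponding to the RU, it reports alarm information to the RU. This alarm information is called first-type alarm information and includes at least one of the following: the Cloud object identifier, the Cloud identifier, or the RU identifier. Optionally, the alarm information may also include the associated DU identifier, CU identifier, and so on. The RU detects its internal alarm information and, according to the association between the RU and the Cloud object corresponding to the RU, determines that the Cloud's alarm information is the root cause fault of the RU's alarm information. The RU reports this root cause correlation analysis to the SMOF over the O1 interface.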
RU可向DU发送告警信息,该告警信息中包括:RU的告警信息、和云资源Cloud的告警信息。关于云资源Cloud的告警信息的内容可参见前述。对于RU上报的告警信息,可为第一类告警信息,包括:RU的对象标识、和RU对应的DU标识。可选的,在RU可感知CU的情况下,RU上报的告警信息中还可以包括:RU对应的CU标识。DU根据RU与DU的关联关系,以及RU与云资源Cloud中故障对象的关联关系,确定云资源Cloud的告警信息,是RU的告警信息与DU的告警信息的根因故障。DU通过O1接口,将上述根因故障的相关性分析上报到SMOF。
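The DU may send alarm information to the CU, including: the DU's alarm information, the RU's alarm information, and the cloud resource (Cloud) alarm information. For the content of the RU's and the Cloud's alarm information, see the description above. The DU's alarm information includes at least one of the following: the identifier of the CU associated with the DU, the DU identifier, or the DU's object identifier. When the CU detects an internal fault, it generates alarm information. The CU determines the root cause fault of these multiple pieces of alarm information according to the object association relationship. They include: the alarm information detected by the CU itself, and the received alarm information of the DU, the RU, and the Cloud. Correlation analysis of the alarm information shows that the CU's alarm information is caused by the DU's alarm information, the DU's alarm information is caused by the RU's alarm information, and the RU's alarm information is caused by the Cloud's alarm information. The CU finally determines that the Cloud's alarm information is the root cause fault of the alarm information of the CU, DU, and RU. The CU reports this root cause correlation analysis to the SMOF over the O1 interface.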
示例的,SMOF在接收到CU、DU和RU等上报的根因故障的相关分析时,可将该根因故障对应的告警信息上报到运营商网管,由运营商网管派出工单。可选的,为了更加准确的确定根因故障,SMOF也可自己确定根因故障,如下:
云资源Cloud、RU、DU或CU等,还可以向SMOF上报各自检测到的告警信息,该告警信息称为第二类告警信息,该第二类告警信息中包括以下至少一项:对象标识、对应网元的标识、故障标识、或故障原因等。示例的,云资源Cloud向SMOF上报的告警信息中包括:云资源Cloud的对象标识、云资源Cloud的标识、和故障标识等。RU向SMOF上报的告警信息中包括:RU的对象标识、RU标识、和故障标识等。DU向SMOF上报的告警信息中包括:DU的对象标识、DU标识、和故障标识等。CU向SMOF上报的告警信息中包括:CU的对象标识、CU标识、和故障标识等。其中,故障标识可隐示指示故障原因。SMOF在接收到云资源Cloud、RU、DU和CU各自上报的告警信息时,可在云资源Cloud的告警信息中获取云资源Cloud的对象标识,在RU的告警信息中获取RU标识,在DU的告警信息中获取DU标识,在CU的告警信息中获取CU标识。根据对象关联关系,确定获取的云资源Cloud的对象标识与RU标识间是否存在关联关系,获取的RU标识与DU标识间是否存在关联关系,获取的DU标识与CU标识间是否存在关联关系;如果三者均存在关联关系,可确定云资源Cloud的告警信息,是RU、DU和CU的告警信息的根因故障。SMOF可将该云资源Cloud的告警信息作为根因故障,上报到运营商网管,由运营商网管针对该根因故障,派发工单。示例的,SMOF向运营商网管上报的告警信息,可为各个网元上报的第二类告警信息。例如,以根因故障为云资源Cloud的告警信息为例,则SMOF向运营商网管上报的告警信息中包括以下至少一项:云资源Cloud的对象标识、云资源Cloud的标识、或故障标识等,该故障标识可隐示指示故障原因等。
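The SMOF-side check described above can be sketched in a few lines. The dictionary fields (`object_id`, `ne_id`, `fault_id`) stand in for the second-type alarm information, and the set of associated pairs is a hypothetical representation of the object association relationship; the fault identifiers are made up for the example.

```python
def smof_root_cause(cloud, ru, du, cu, associated):
    """Return the Cloud alarm as root cause if all three links are associated.

    cloud, ru, du, cu: hypothetical second-type alarm records.
    associated: set of (lower, upper) identifier pairs known to the SMOF.
    """
    links = [
        (cloud["object_id"], ru["ne_id"]),  # Cloud object -> RU
        (ru["ne_id"], du["ne_id"]),         # RU -> DU
        (du["ne_id"], cu["ne_id"]),         # DU -> CU
    ]
    if all(link in associated for link in links):
        return cloud
    return None


cloud_alarm = {"object_id": "cloud-obj-3", "ne_id": "Cloud1", "fault_id": "F100"}
ru_alarm = {"object_id": "ru-obj-1", "ne_id": "RU113", "fault_id": "F200"}
du_alarm = {"object_id": "du-obj-1", "ne_id": "DU11", "fault_id": "F300"}
cu_alarm = {"object_id": "cu-obj-1", "ne_id": "CU1", "fault_id": "F400"}
associated = {("cloud-obj-3", "RU113"), ("RU113", "DU11"), ("DU11", "CU1")}

root = smof_root_cause(cloud_alarm, ru_alarm, du_alarm, cu_alarm, associated)
```

If any one of the three links is missing from `associated`, the sketch returns `None`, leaving the alarms to be handled individually; the source does not describe that fallback, so it is left open here.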
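For example, in one implementation, the SMOF may compare the root cause faults determined by the CU, DU, RU, and the like with the root cause fault determined by the SMOF itself, judging whether the root cause faults reported by the network elements are the same as the one the SMOF determined. If they are the same, the SMOF reports that common root cause fault to the operator network management system; if they differ, the SMOF reports both the root cause faults reported by the network elements and the root cause fault it determined itself to the operator network management system. Alternatively, considering that its own judgment is more accurate, the SMOF may report the root cause fault it determined to the operator network management system. Of course, if the root cause faults determined by the network elements are more accurate, the SMOF may instead report those; no limitation is imposed.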
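The first comparison strategy above (report the common root cause if all agree, otherwise report every candidate) can be sketched as below. Representing a root cause fault by a single identifier string is an assumption made for brevity.

```python
def reconcile(reported, own):
    """Decide what the SMOF forwards to the operator network management system.

    reported: root-cause identifiers received from the CU, DU, RU, and so on.
    own:      root-cause identifier the SMOF determined itself.
    """
    candidates = set(reported) | {own}
    if len(candidates) == 1:
        return [own]           # everyone agrees: report the common root cause
    return sorted(candidates)  # disagreement: report all candidates


agreed = reconcile(["cloud-obj-3", "cloud-obj-3", "cloud-obj-3"], "cloud-obj-3")
disagreed = reconcile(["RU111"], "cloud-obj-3")
```

The alternative strategies in the text, always preferring the SMOF's result or always preferring the network elements' results, would simply return `[own]` or the reported identifiers without the comparison.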
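In one design, the SMOF may learn and update the object association relationship by means of big data, AI, or the like, so the root cause fault determined by the SMOF tends to be more accurate. For some association relationships between alarm information, the CU, DU, or RU may be unable to determine the root cause fault; in that case the SMOF can aggregate the alarm information and process it centrally to determine the root cause fault more accurately. Optionally, the SMOF may also synchronize the updated object association relationship to the CU, DU, RU, or the like.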
通过上述可看出,在本公开中,RU、DU或CU等,可将各自的根因故障分析上报到SMOF。可选的,SMOF也可根据云资源Cloud、RU、DU和CU等上报的告警信息,确定根因故障。SMOF确定根因故障的过程,可作为对RU、DU或CU确定根因故障的补充,提高确定根因故障的准确性。
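To implement the functions in the above methods, the first network element, the second network element, and the third network element include hardware structures and/or software modules corresponding to each function. Those skilled in the art should readily appreciate that, in combination with the units and method steps of the examples described in the present disclosure, the present disclosure can be implemented in hardware or in a combination of hardware and computer software. Whether a given function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.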
图5和图6为本公开提供的可能的装置的结构示意图。这些通信装置可以用于实现上述方法中第一网元、第二网元或第三网元的功能,因此也能实现上述方法所具备的有益效果。
如图5所示,通信装置500包括处理单元510和收发单元520。通信装置能够实现上述方法中第一网元、第二网元或第三网元的功能。
例如,当通信装置500用于实现上述方法中第一网元的功能时:处理单元510,用于确定N个对象的告警信息,所述N为大于或等于2的整数。处理单元510,还用于根据对象关联关系,确定所述N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述M为小于或等于N的正整数。收发单元520,用于从其它网元接收相应的信息。
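For example, when the communication apparatus 500 is used to implement the functions of the second network element in the above method: the processing unit 510 is configured to determine alarm information of N2 objects; the transceiver unit 520 is configured to send the alarm information of the N2 objects to the first network element, the alarm information being called first-type alarm information; for the N2 objects, the first-type alarm information of each object includes at least one of the following: the identifier of the object, the identifier of the second network element, or the identifier of a network element associated with the second network element; N2 is a positive integer.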
例如,当通信装置500用于实现上述方法中第三网元的功能时:收发单元520,用于接收第一指示信息,所述第一指示信息用于指示N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述N为大于或等于2的正整数,所述M为小于或等于N的正整数。处理单元510,用于对所述N个对象的告警信息的根因故障进行处理。
有关上述处理单元510和收发单元520更详细的描述可以参考上述方法中相关描述直接得到,这里不加赘述。
如图6所示,通信装置600包括处理器610和接口电路620。处理器610和接口电路620之间相互耦合。可以理解的是,接口电路620可以为收发器、输入输出接口、或管脚等。可选的,通信装置600还可以包括存储器630,用于存储处理器610执行的指令或存储处理器610运行指令所需要的输入数据或存储处理器610运行指令后产生的数据。
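When the communication apparatus 600 is used to implement the above method, the processor 610 implements the functions of the processing unit 510, and the interface circuit 620 implements the functions of the transceiver unit 520.
When the apparatus is a module applied in a CU, DU, RU, or SMOF, the module implements the functions of the CU, DU, RU, or SMOF in the above method. The module may be a chip in the CU, DU, RU, or SMOF, or another module.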
可以理解的是,本公开中的处理器可以是中央处理单元(central processing unit,CPU),还可以是其它通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或者其它可编程逻辑器件、晶体管逻辑器件,硬件部件或者其任意组合。通用处理器可以是微处理器,也可以是任何常规的处理器。
本公开中的存储器可以是随机存取存储器、闪存、只读存储器、可编程只读存储器、可擦除可编程只读存储器、电可擦除可编程只读存储器、寄存器、硬盘、移动硬盘、CD-ROM或者本领域熟知的任何其它形式的存储介质。
一种示例性的存储介质耦合至处理器,从而使处理器能够从该存储介质读取信息,且可向该存储介质写入信息。存储介质也可以是处理器的组成部分。处理器和存储介质可以位于ASIC中。另外,该ASIC可以位于基站或终端中。当然,处理器和存储介质也可以作为分立组件存在于基站或终端中。
本公开中的方法可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机程序或指令。在计算机上加载和执行所述计算机程序或指令时,全部或部分地执行本公开所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、网络设备、用户设备、核心网设备、OAM或者其它可编程装置。所述计算机程序或指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机程序或指令可以从一个网站站点、计算机、服务器或数据中心通过有线或无线方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是集成一个或多个可用介质的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,例如,软盘、硬盘、磁带;也可以是光介质,例如,数字视频光盘;还可以是半导体介质,例如,固态硬盘。该计算机可读存储介质可以是易失性或非易失性存储介质,或可包括易失性和非易失性两种类型的存储介质。
在本公开中,如果没有特殊说明以及逻辑冲突,不同的示例之间的术语和/或描述具有一致性、且可以相互引用,不同的示例中的技术特征根据其内在的逻辑关系可以组合形成新的示例。
本公开中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A,B可以是单数或者复数。在本公开的文字描述中,字符“/”,一般表示前后关联对象是一种“或”的关系;在本公开的公式中,字符“/”,表示前后关联对象是一种“相除”的关系。“包括A,B或C中的至少一个”可以表示:包括A;包括B;包括C;包括A和B;包括A和C;包括B和C;包括A、B和C。
可以理解的是,在本公开中涉及的各种数字编号仅为描述方便进行的区分,并不用来限制本公开的范围。上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定。

Claims (29)

  1. 一种确定根因故障的方法,其特征在于,包括:
    确定N个对象的告警信息,所述N为大于或等于2的整数;
    根据对象关联关系,确定所述N个对象的告警信息的根因故障;其中,所述根因故障是所述N个对象中M个对象的告警信息,所述M为小于N的正整数。
  2. 如权利要求1所述的方法,其特征在于,所述根据对象关联关系,确定所述N个对象的告警信息的根因故障,包括:
    根据对象关联关系和所述N个对象的告警信息的生成时间,确定所述N个对象的告警信息的根因故障。
  3. 如权利要求2所述的方法,其特征在于,所述根据对象关联关系和所述N个对象的告警信息的生成时间,确定所述N个对象的告警信息的根因故障,包括:
    根据所述对象关联关系,确定X个关联关系集合;
    针对每个关联关系集合:根据所述关联关系集合中包括的对象的告警信息的生成时间,确定所述关联关系集合中的L个对象,所述L个对象的告警信息的生成时间差小于阈值;确定所述L个对象的告警信息的根因故障;其中,所述根因故障是所述L个对象中至少一个对象的告警信息,所述X和L均为正整数。
  4. 如权利要求2所述的方法,其特征在于,所述根据对象关联关系和所述N个对象的告警信息的生成时间,确定所述N个对象的告警信息的根因故障,包括:
    根据所述对象关联关系和所述N个对象的告警信息的生成时间,确定P个关联关系集合;
    针对每个关联关系集合:确定所述关联关系集合中包括的Q个对象的告警信息的根因故障;其中,所述根因故障是所述Q个对象中的至少一个对象的告警信息,所述Q个对象存在关联关系,且所述Q个存在关联关系的对象的告警信息的生成时间差小于阈值,所述P与Q均为正整数。
  5. 如权利要求1至4中任一项所述的方法,其特征在于,所述N个对象中包括N1个对象和N2个对象,所述N1和N2均为正整数,且两者之和等于N,所述确定N个对象的告警信息,包括:
    检测到所述N1个对象的告警信息;
    接收来自所述第二网元的所述N2个对象的告警信息。
  6. 如权利要求5所述的方法,其特征在于,所述第一网元为集中式单元CU,所述第二网元为分布式单元DU,所述N1个对象的告警信息中包括所述CU的对象的告警信息,所述N2个对象的告警信息中包括以下至少一项:所述DU的对象的告警信息、无线单元RU的对象的告警信息、所述RU对应的云资源对象的告警信息、或所述DU对应的云资源对象的告警信息。
  7. 如权利要求5所述的方法,其特征在于,所述第一网元为DU,所述第二网元为RU,所述N1个对象的告警信息中包括所述DU的对象的告警信息,所述N2个对象的告警信息中包括以下至少一项:所述RU的对象的告警信息、或所述RU对应的云资源的对象的告警信息。
  8. 如权利要求5所述的方法,其特征在于,所述第一网元为RU、DU、或CU,所述N1个对象的告警信息中包括所述第一网元的对象的告警信息;所述第二网元为所述第一网元对应的云资源,所述N2个对象的告警信息中包括所述第一网元对应的云资源的对象的告警信息。
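  9. The method according to any one of claims 5 to 8, wherein, for the N2 objects, the alarm information of each object is first-type alarm information, and the first-type alarm information includes at least one of the following: the identifier of the object, the identifier of the second network element, or the identifier of a network element associated with the second network element.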
  10. 如权利要求1至9中任一项所述的方法,其特征在于,还包括:
    向第三网元发送第一指示信息,所述第一指示信息用于指示所述N个对象的告警信息的根因故障。
  11. 如权利要求1至10中任一项所述的方法,其特征在于,还包括:
    获取所述对象关联关系,所述对象关联关系是由来自第三网元的配置文件或配置消息指示的。
  12. 如权利要求11所述的方法,其特征在于,所述获取对象关联关系,包括:
    接收来自所述第三网元的第一配置文件或第一配置消息,所述第一配置文件或第一配置消息用于指示第一对象关联关系;
    接收来自所述第三网元的第二配置文件或第二配置消息,所述第二配置文件或第二配置消息用于指示所述第二对象关联关系;
    根据所述第一对象关联关系和所述第二对象关联关系,确定所述对象关联关系。
  13. 如权利要求1至4中任一项所述的方法,其特征在于,所述N个对象中包括N1个对象、N2个对象和N3个对象,所述N1、N2和N3的取值均为正整数,且三者之和等于N,所述确定N个对象的告警信息,包括:
    接收来自所述第一网元的告警信息,所述告警信息中包括所述N1个对象的告警信息;
    接收来自所述第二网元的告警信息,所述告警信息中包括所述N2个对象的告警信息;
    接收来自所述第三网元的告警信息,所述告警信息中包括所述N3个对象的告警信息。
  14. 如权利要求13所述的方法,其特征在于,所述告警信息为第二类告警信息,所述第二类告警信息中包括以下至少一项:对象标识、对应网元的标识、故障标识、或故障原因。
  15. 一种确定根因故障的方法,其特征在于,包括:
    向第一网元发送N2个对象的告警信息,所述告警信息为第一类告警信息;
    针对所述N2个对象,每个对象的第一类告警信息中包括以下至少一项:所述对象的标识、第二网元的标识、或所述第二网元相关联的网元的标识;其中,所述N2为正整数。
  16. 一种确定根因故障的方法,其特征在于,包括:
    接收第一指示信息,所述第一指示信息用于指示N个对象的告警信息的根因故障;
    所述根因故障是所述N个对象中M个对象的告警信息,所述N为大于或等于2的正整数,所述M为小于N的正整数。
  17. 如权利要求16所述的方法,其特征在于,还包括:
    向第一网元发送用于指示对象关联关系的配置文件或配置消息。
  18. 如权利要求17所述的方法,其特征在于,所述向第一网元发送用于指示对象关联关系的配置文件或配置消息,包括:
    向所述第一网元发送第一配置文件或第一配置消息,所述第一配置文件或第一配置消息用于指示第一对象关联关系;
    向所述第一网元发送第二配置文件或第二配置消息,所述第二配置文件或第二配置消息用于指示第二对象关联关系。
  19. 如权利要求16至18中任一项所述的方法,其特征在于,所述N个对象中包括N1个对象和N2个对象,所述N1与N2均为正整数,且两者之和等于N;
    所述N1个对象包括集中式单元CU的对象,所述N2个对象包括以下至少一项:分布式单元DU的对象、DU对应的云资源的对象、无线单元RU的对象、或RU对应的云资源的对象;或者,
    所述N1个对象包括DU的对象,所述N2个对象包括以下至少一项:所述RU的对象、或所述RU对应的云资源的对象;或者,
    所述N1个对象包括CU的对象、DU的对象、或RU的对象,所述N2个对象中包括所述CU对应的云资源的对象、DU对应的云资源的对象、或RU对应的云资源的对象。
  20. 如权利要求16至18中任一项所述的方法,其特征在于,所述N个对象中包括N1个对象、N2个对象和N3个对象,所述N1、N2与N3均为正整数,且三者之和等于N;
    所述N1个对象中包括CU的对象,所述N2个对象中包括DU的对象,所述N3个对象中包括RU的对象。
  21. 一种通信装置,其特征在于,包括用于实现权利要求1至14中任一项所述方法的单元。
  22. 一种通信装置,其特征在于,包括处理器和存储器,所述处理器和存储器耦合,所述处理器用于实现权利要求1至14中任一项所述的方法。
  23. 一种通信装置,其特征在于,包括用于实现权利要求15所述方法的单元。
  24. 一种通信装置,其特征在于,包括处理器和存储器,所述处理器和存储器耦合,所述处理器用于实现权利要求15所述的方法。
  25. 一种通信装置,其特征在于,包括用于实现权利要求16至20中任一项所述方法的单元。
  26. 一种通信装置,其特征在于,包括处理器和存储器,所述处理器和存储器耦合,所述处理器用于实现权利要求16至20中任一项所述的方法。
  27. 一种通信系统,其特征在于,包括权利要求21或22所述的通信装置,权利要求23或24所述的通信装置,和权利要求25或26所述的通信装置。
  28. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有指令,当所述指令在计算机上运行时,使得计算机执行权利要求1至14中任一项所述的方法,或者权利要求15所述的方法,或者权利要求16至20中任一项所述的方法。
  29. 一种计算机程序产品,其特征在于,包括指令,当所述指令在计算机上运行时,使得计算机执行权利要求1至14中任一项所述的方法,或者权利要求15所述的方法,或者权利要求16至20中任一项所述的方法。
PCT/CN2022/127162 2022-02-18 2022-10-25 一种确定根因故障的方法及装置 WO2023155468A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22926772.9A EP4451729A1 (en) 2022-02-18 2022-10-25 Methods for determining root cause fault, and apparatuses
KR1020247026778A KR20240134185A (ko) 2022-02-18 2022-10-25 근본 원인 오류 결정 방법 및 장치

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210152355.7 2022-02-18
CN202210152355.7A CN114520994A (zh) 2022-02-18 2022-02-18 一种确定根因故障的方法及装置

Publications (1)

Publication Number Publication Date
WO2023155468A1 true WO2023155468A1 (zh) 2023-08-24

Family

ID=81599412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/127162 WO2023155468A1 (zh) 2022-02-18 2022-10-25 一种确定根因故障的方法及装置

Country Status (4)

Country Link
EP (1) EP4451729A1 (zh)
KR (1) KR20240134185A (zh)
CN (1) CN114520994A (zh)
WO (1) WO2023155468A1 (zh)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22926772

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022926772

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022926772

Country of ref document: EP

Effective date: 20240715

ENP Entry into the national phase

Ref document number: 20247026778

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 11202405143T

Country of ref document: SG