WO2022143025A1 - Alarm correlation detection method, system, network and medium based on federated learning


Info

Publication number
WO2022143025A1
WO2022143025A1 PCT/CN2021/135855 CN2021135855W WO2022143025A1 WO 2022143025 A1 WO2022143025 A1 WO 2022143025A1 CN 2021135855 W CN2021135855 W CN 2021135855W WO 2022143025 A1 WO2022143025 A1 WO 2022143025A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
alarm
data
optimal
mining
Prior art date
Application number
PCT/CN2021/135855
Other languages
English (en)
French (fr)
Inventor
肖雷
江其坤
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2022143025A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0677 Localisation of faults
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/04 Arrangements for maintaining operational condition

Definitions

  • the embodiments of the present application relate to the field of communication technologies, and in particular, to a method, system, network and medium for alarm correlation detection based on federated learning.
  • the communication network can be divided into multiple layers according to the region, and can be divided into countries, centers, provinces, cities, counties, districts, etc.
  • the overall operation and maintenance can be attributed to the network management center at the provincial level, while the communication networks at the prefecture, county, and district levels are managed by operation and maintenance personnel in their respective regions.
  • the physical equipment required to build a communication network can be divided into service domains such as wireless access equipment, bearer equipment, core network, and external power system according to the realization functions.
  • the wireless access equipment is mainly used to allow the user equipment to access the communication network
  • the bearer equipment mainly transmits the data of the user equipment to the communication equipment room in the corresponding geographical area
  • the core network mainly exchanges data among the communication equipment rooms in the geographical areas.
  • the external power system mainly provides power supply and cooling for the equipment room or tower where the wireless access equipment is located.
  • each business domain is managed by its own network management and operation and maintenance system, although the business domains in fact support one another.
  • an alarm in one business domain is likely to trigger alarms in other business domains.
  • Because data is missing across domains, cross-domain alarms cannot be correlated.
  • One or more embodiments of this specification provide a federated learning-based alarm correlation detection method, system, and medium.
  • an alarm correlation detection method based on federated learning is provided, which is suitable for a scenario where multiple intelligent nodes forming a topology structure are arranged in a communication network, and the multiple intelligent nodes are distributed in each service domain. The method includes: selecting optimal nodes from the plurality of intelligent nodes; obtaining final alarm correlation relationship data after the optimal nodes complete alarm correlation mining in their respective business domains based on federated learning; and generating alarm correlation rules based on the final alarm correlation relationship data.
  • an alarm correlation detection system based on federated learning is provided, in which a plurality of intelligent nodes forming a topology structure are arranged in a communication network and distributed in each service domain. The system further includes: a node selection module set to select optimal nodes from the plurality of intelligent nodes; a mining module set to obtain the final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective business domains based on federated learning; and a rule generation module configured to generate alarm correlation rules based on the final alarm correlation data.
  • a communication network comprising the federated learning-based alarm correlation detection system according to any one of claims 12 to 14.
  • an electronic device comprising: a processor; and
  • a memory arranged to store computer executable instructions which, when executed, cause the processor to perform the steps of the federated learning based alarm correlation detection method as described above.
  • a storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and when the one or more programs are executed by one or more processors, the steps of the federated learning-based alarm association detection method described above are implemented.
  • FIG. 1 is a schematic diagram of steps of an alarm association detection method based on federated learning provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an interface display of a knowledge graph displayed in another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of steps of another alarm association detection method based on federated learning provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of steps of another method for detecting alarm association based on federated learning provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of deployment of another method for alarm correlation detection based on federated learning provided by an embodiment of the present application applied to an intelligent node in a single service domain.
  • FIG. 15 is a schematic diagram of deployment of another method for alarm correlation detection based on federated learning provided by an embodiment of the present application applied to intelligent nodes in multiple service domains.
  • FIG. 16 is a schematic structural diagram of an alarm correlation detection system based on federated learning provided by an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of another alarm correlation detection system based on federated learning provided by an embodiment of the present application.
  • the method for alarm correlation detection based on federated learning is suitable for the scenario where multiple intelligent nodes forming a topology structure are arranged in a communication network, and the multiple intelligent nodes are distributed in each service domain. Intelligent nodes participate in the alarm correlation detection.
  • the alarm correlation detection method can realize cross-domain alarm correlation mining under the premise of ensuring data security by means of federated learning, quickly locate the alarm root cause, and improve the alarm detection efficiency.
  • the method for detecting alarm correlation based on federated learning provided in this specification and each step thereof will be described in detail below.
  • the intelligent node proposed in this application may be deployed on a processor device with data processing capability, such as a server, in the form of a service or micro-service, and the data processing capability of the intelligent node will be described in detail below.
  • FIG. 1 shows the current wireless communication network architecture.
  • the wireless communication network can be divided into business domains such as power network, wireless network, bearer network and core network.
  • Network devices in different business domains are managed by corresponding network management systems.
  • The network management system of each business domain can be divided into data domains such as the configuration domain, performance domain, alarm domain, dynamic domain, and operation and maintenance domain for the management and monitoring of network devices. Therefore, when troubleshooting an equipment failure in the wireless network, for example one indicated by the operation and maintenance status of the equipment or by the alarms it raises, the failure may have been caused by equipment failures in other domains that affect the associated wireless network equipment. It is therefore necessary to mine alarm correlation data comprehensively across multiple service domains.
  • Fixed alarm correlation rules are then formed from the obtained alarm correlation relationships involving each business domain, so that operation and maintenance personnel can subsequently locate the root cause of an alarm quickly according to the alarm correlation rules and resolve the fault in time.
  • FIG. 2 shows a schematic diagram of steps of an alarm association detection method based on federated learning provided by an embodiment of the present application.
  • the federated learning-based alarm correlation detection method is suitable for a scenario where multiple intelligent nodes forming a topology structure are arranged in a communication network, and multiple intelligent nodes are distributed in each service domain.
  • the federated learning-based alarm association detection method provided by the embodiment of the present application includes:
  • Step 10 Select the optimal node from multiple intelligent nodes
  • the intelligent node here may be the original one that has been set in the communication network, or may be a newly set intelligent node for implementing the alarm correlation detection method provided by the embodiment of the present application.
  • These intelligent nodes are connected in a topology structure to form a large intelligent node group, so that, in addition to processing data, the intelligent nodes can transmit data to one another.
  • a large number of intelligent nodes are distributed in various service domains. The purpose of the existence of intelligent nodes is to use these intelligent nodes to realize alarm correlation detection across service domains.
  • the purpose of selecting the optimal nodes from multiple intelligent nodes is to use the selected optimal nodes for alarm correlation mining. Because alarms occur at different times, each time the alarm correlation detection method is performed, the optimal nodes must be selected again from the multiple intelligent nodes according to the topology structure in which the intelligent nodes involved in the current alarm correlation are located, the routing relationship between the intelligent nodes, and the running state of each intelligent node.
  • Step 20 Based on federated learning, after the optimal node completes alarm correlation mining in their respective business domains, final alarm correlation data is obtained;
  • federated learning can effectively solve the problem of data silos, allowing participants to jointly model without sharing data, which can technically break data silos and achieve collaboration.
  • the optimal nodes complete alarm correlation mining based on federated learning in their respective business domains, carry out comprehensive alarm correlation data mining across multiple business domains, and obtain the alarm correlation relationships involving each business domain, that is, the correlation between an alarm in one business domain and equipment failures in other business domains.
  • Step 30 Generate an alarm association rule based on the final alarm association relationship data.
  • once the alarm correlation rules are generated, operation and maintenance personnel can use them to quickly locate the root cause of an alarm and resolve the equipment failure.
  • the federated learning-based alarm correlation detection method is carried out with the intelligent node clusters deployed in each business domain: the optimal nodes perform alarm correlation mining based on federated learning in the business domains where they are located to obtain the final alarm correlation data.
  • This solves alarm correlation mining across business domains and helps operation and maintenance personnel find the alarm root cause of equipment failures in a live network. The resulting alarm correlation rules can also be transferred to other live networks to quickly resolve equipment failures there.
  • In step 20, obtaining the final alarm correlation data after the optimal nodes complete alarm association mining in their respective business domains based on federated learning includes:
  • Step 200 scheduling and sorting the optimal nodes based on federated learning to form a scheduling sequence
  • the optimal nodes can be scheduled and sorted according to the longitudinal (vertical) modeling method of federated learning to form a scheduling sequence. That is, scheduling and sorting must consider the association relationships between the business domains involved in the alarm correlation and the supporting relationships between the business domains when each mines its alarm correlations.
  • the purpose of scheduling and sorting is to determine the data flow transmission direction of alarm correlation data between each optimal node, so that the optimal node can complete cross-domain alarm correlation mining.
  • Step 210 After the current node among the optimal nodes completes alarm correlation mining according to the scheduling sequence, the obtained first alarm correlation data is sent to the next node among the optimal nodes, so that the next node can complete its alarm correlation mining;
  • After the scheduling sequence is formed, the current node, which is ranked ahead of the next node in the scheduling sequence, completes alarm correlation mining and sends the obtained first alarm correlation data to the next node, which is ranked immediately after it, so that the next node can complete its own alarm correlation mining.
  • the current node and the next node are adjacent in the scheduling order, for example the optimal node in the second position and the optimal node in the third position.
  • The optimal nodes complete alarm correlation mining in sequence according to the scheduling order. Each next node (from the second optimal node to the last optimal node in the scheduling order) receives the first alarm correlation data from the current node, that is, the optimal node immediately before it in the scheduling order, and carries out alarm correlation mining in combination with its local alarm data.
  • the current node After the current node performs a complete alarm correlation mining for the data in its business domain, such as the device data of the current business, it filters out the alarm correlation data that does not involve cross-business domains, encrypts it uniformly, and sends it to the next slave node.
  • the direct sending from the current node to the next node is determined in consideration of the network topology where the intelligent node is located and the routing relationship between each intelligent node.
  • Step 220 After the last node in the scheduling sequence completes the alarm correlation mining, the final alarm correlation relationship data is obtained.
  • the final alarm association relationship data is obtained, and then an alarm association rule is generated based on the final alarm association relationship data.
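The hand-off described in steps 200 to 220 can be sketched as follows. This is a minimal illustration, not the patented implementation: the 60-second window rule stands in for whatever mining algorithm a node actually runs, the node names and alarm data are hypothetical, and the filtering and encryption of the transmitted data are omitted.

```python
WINDOW = 60  # seconds; hypothetical correlation window used by the toy mining rule


def mine_at_node(local_alarms, upstream):
    """Combine upstream correlation data with this node's local alarms.

    local_alarms: list of (alarm_name, timestamp) held only by this node.
    upstream: list of (alarm_chain, last_timestamp) from the previous node,
              or None for the first node in the scheduling sequence.
    """
    if upstream is None:
        return [([name], t) for name, t in local_alarms]
    extended = []
    for chain, t_prev in upstream:
        for name, t in local_alarms:
            # Toy rule: a local alarm shortly after the upstream alarm is correlated.
            if 0 <= t - t_prev <= WINDOW:
                extended.append((chain + [name], t))
    return extended


def run_chain(schedule, alarms_by_node):
    """Visit the optimal nodes in scheduling order; only mined data is passed on."""
    carried = None
    for node in schedule:
        carried = mine_at_node(alarms_by_node[node], carried)
    return [chain for chain, _ in carried]


schedule = ["power", "bearer", "wireless"]
alarms = {
    "power": [("mains_failure", 0)],
    "bearer": [("link_down", 20), ("fan_fault", 500)],
    "wireless": [("cell_out_of_service", 45)],
}
result = run_chain(schedule, alarms)
```

Each node sees only its own raw alarms plus the intermediate data handed over by its predecessor, which is the property the scheduling sequence is meant to preserve.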
  • In step 200, scheduling and sorting the optimal nodes based on federated learning to form a scheduling sequence specifically includes:
  • Step 201 The optimal nodes are scheduled and ordered according to the association relationship of each business domain and the support relationship for data mining between each business domain based on the longitudinal modeling method of federated learning.
  • the optimal nodes and the scheduling sequence selected for each alarm correlation detection are not necessarily the same.
  • the linear relationship between the business domains is mainly determined by the support relationship between the business domains, which in turn mainly depends on the support relationship between the physical devices of the business domains.
  • the current base station B is deployed in the computer room A
  • the computer room A is the centralized point of the association relationship between the above physical devices
  • the power supply device C supplies power to the line control switch D
  • the line control switch D controls the power-on and power-off of the base station B.
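One way to turn such support relationships into a scheduling order is a topological sort, shown here as a sketch. The source only says that the order follows the support relationships; the use of a topological sort, and the identifiers below, are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key depends on (is supported by) the devices in its value set.
# Hypothetical encoding of the example: C powers D, and D switches B on and off.
supported_by = {
    "line_control_switch_D": {"power_supply_C"},
    "base_station_B": {"line_control_switch_D"},
}

# static_order() yields supporting devices before the devices they support.
order = list(TopologicalSorter(supported_by).static_order())
```

With this ordering, the node responsible for the supporting device mines first and passes its correlation data downstream to the node of the supported device.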
  • the multiple intelligent nodes include a model design node, a master control node, and a slave node.
  • Before step 10 (selecting the optimal nodes from the multiple intelligent nodes), the alarm association detection method provided by the embodiment of the present application further includes:
  • Step 40 Determine one of the multiple intelligent nodes as the model design node based on the user's selection
  • the smart node which is the user's login device, can be used as the model design node.
  • the model design node, as its name implies, coordinates the initiation and operation of the entire alarm correlation detection and is used to design the intelligent node group for the current alarm correlation detection; the selected optimal nodes may differ from one detection to the next.
  • the model design node is a node in the network topology, and is also a node in the current intelligent node group currently used for alarm correlation detection.
  • Step 41 Determine the alarm correlation model
  • the alarm mining model here is selected by the alarm correlation detection.
  • the alarm correlation model in a business domain may include a device model related to the management device of the business domain and a data model related to the data domain, which will be mentioned separately below. It should be noted that, for the specific alarm data mining in a business domain or a data domain, an existing mining algorithm and mining process may be used.
  • Step 42 Select another one of the multiple intelligent nodes as the master control node based on the alarm mining model and the routing relationship, and other intelligent nodes among the multiple intelligent nodes as the slave nodes;
  • the main control node is selected based on the network topology and routing relationship formed by the optimal nodes.
  • the main control node here is different from the model design node.
  • the main control node plays an intermediate coordinating role: it enables the optimal nodes to complete alarm correlation detection in the scheduling sequence and obtain the final alarm correlation relationship data, and it generates alarm correlation rules from that data and stores them.
  • the master control node After the master control node obtains the alarm mining model, it selects other intelligent nodes as slave nodes according to the federated learning method. At this time, the node performance of the intelligent nodes is more considered, which will be described below. Then the master control node distributes the alarm mining start data such as the alarm mining model to each slave node, and plays the function of intermediate scheduling and coordination during alarm correlation detection.
  • a central intelligent node is selected as the main control node mainly according to the routing relationship of the intelligent nodes involved in the model design, which can reduce the network transmission delay caused by the scheduling of the main control node.
  • Other connected slave nodes mainly utilize the data computing capability of the slave node.
  • the master node can schedule according to the working and idle status of each slave node. If a slave node stores alarm correlation data, the master node preferentially uses that slave node for alarm correlation mining in its business domain; if it stores no alarm correlation data, the slave node is mainly responsible for the algorithm computations split off by the master control node.
  • the master control node finds the optimal nodes according to the network topology composed of the intelligent nodes involved in the alarm mining model, the routing relationship between the intelligent nodes, and the running status of each intelligent node. The master control node then arranges the scheduling order of the optimal nodes according to the longitudinal modeling method of federated learning, and the joint mining across multiple intelligent nodes is completed. Finally, the final alarm correlation data mined across domains is returned to the master control node for storage, then sent to the model design node and presented to the user through the model design node.
  • the data transmission between each intelligent node is mainly carried out by encryption.
  • if the time period involved in the alarm correlation is too long, or the alarm correlation data transmitted between the optimal nodes is too large for other reasons, the time period can be divided into slices and the data distributed slice by slice.
  • the main purpose here is to slice the data by time, which avoids the problem of the alarm mining data to be transmitted being too large because the time period selected in the alarm mining model is long.
  • After the next intelligent node receives the time-sliced alarm correlation data, it performs tolerance processing on the slice boundaries to ensure that correlations spanning slices are not lost because of the time slicing.
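A simple way to keep correlations that straddle a slice boundary is to let adjacent slices overlap slightly. The sketch below assumes this overlap approach; the source does not specify how the tolerance processing is done, and the function name and parameters are hypothetical.

```python
def slice_period(start, end, width, overlap):
    """Split [start, end) into time slices of `width` seconds.

    Every slice after the first begins `overlap` seconds early, so alarms
    close to a boundary appear in both neighbouring slices.
    """
    if width <= 0 or not 0 <= overlap < width:
        raise ValueError("width must be positive and 0 <= overlap < width")
    slices = []
    t = start
    while t < end:
        slices.append((max(start, t - overlap), min(end, t + width)))
        t += width
    return slices


slices = slice_period(0, 100, 40, 5)
```

The receiving node would then need to de-duplicate correlations found twice in the overlapping region, which is left out of this sketch.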
  • any intelligent node may become a model design node, a master node, or a slave node, so the functions that every intelligent node has include:
  • Model management: mainly obtains the device models related to the alarms managed and controlled by the business domain where the intelligent node is located, together with the device models of the business domains that have upstream or downstream support relationships with it (these device models are part of the alarm mining model); combines data into the overall alarm mining model according to the device model of the current business domain and presents it to the user; distributes the alarm mining start data, such as the alarm mining model, to each associated slave node, and performs scheduling and sorting according to the federated learning algorithm to determine the data flow path of the alarm correlation data; and obtains alarm association rules from the rule base according to the alarm mining model and presents them to the user in the form of a knowledge graph. The specific data mining algorithm used in alarm association mining can be the Pearson algorithm or the frequent pattern tree (FP-Growth) algorithm, which performs the relevant linear correlation analysis.
  • Rule management mainly responsible for manual design of alarm association rules, rule query, rule import, rule export, rule storage, etc.;
  • Intelligent data mining: one function is to determine the intermediate data that needs to be transmitted according to the attributes and the mining algorithm used in the design of the alarm mining model. These attributes can be included in the alarm mining start data that is sent to the first of the optimal nodes after the alarm mining model is designed, or, each time an optimal node is reached, the attributes involved in the business domain where that optimal node is located can be obtained on demand.
  • the attributes are used to determine that the physical devices of each service domain are within the same geographical area and that the physical devices are associated and can affect one another.
  • the attributes of the base station are mainly the physical location, GPS and other information of the computer room where it is located
  • the power equipment is mainly the physical location, GPS and other information of the computer room where it is located
  • the attributes of the core network are obtained through the network gateway of the transport layer of the core network.
  • the mining algorithm can be the Pearson algorithm or the frequent pattern tree (FP-Growth) algorithm.
  • the other is to perform alarm data mining in its own business domain or data domain, taking as input the alarm mining start data (such as the alarm mining model) distributed by the master control node, and to mine the related alarm correlations.
  • the stored alarm correlation rules are used to diagnose the alarm root cause of the current alarm; a repair plan is then selected from the repair strategy according to the diagnosis result and the repair is carried out. If the fault cannot be repaired, the operation and maintenance personnel are notified to repair it.
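As an illustration of the Pearson option mentioned above, the sketch below computes the Pearson correlation coefficient between two alarm-count series. The series (alarms per time bucket for two alarm types) are invented for the example, and how the patent bins or thresholds the data is not stated in the source.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant series carries no linear relationship
    return cov / (dev_x * dev_y)


# Hypothetical alarm counts per time bucket in two business domains.
power_alarms = [0, 1, 0, 3, 2, 0]
cell_alarms = [0, 2, 0, 6, 4, 0]
r = pearson(power_alarms, cell_alarms)
```

A coefficient close to 1 for two alarm types in different domains suggests they tend to rise and fall together, making the pair a candidate for a cross-domain association rule.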
  • In step 10, selecting the optimal nodes from a plurality of intelligent nodes specifically includes:
  • Step 100 Select the optimal node from the slave nodes according to the performance of the slave nodes.
  • the optimal node is selected from the slave nodes according to the performance of the slave node, and the master node schedules the optimal nodes based on federated learning to form a scheduling order.
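The selection in step 100 might look like the following sketch. The source names performance, running state, and whether a slave node already stores alarm correlation data as inputs; the concrete fields, the single `cpu_idle` metric, and the choice of one node per business domain are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlaveNode:
    name: str
    domain: str
    running: bool
    cpu_idle: float       # hypothetical performance metric in the range 0..1
    has_alarm_data: bool  # whether the node already stores alarm correlation data


def select_optimal(nodes):
    """Pick one running slave node per business domain.

    Nodes that already hold alarm correlation data are preferred; ties are
    broken by the larger share of idle CPU.
    """
    best = {}
    for node in nodes:
        if not node.running:
            continue
        current = best.get(node.domain)
        key = (node.has_alarm_data, node.cpu_idle)
        if current is None or key > (current.has_alarm_data, current.cpu_idle):
            best[node.domain] = node
    return best


nodes = [
    SlaveNode("w1", "wireless", True, 0.2, True),
    SlaveNode("w2", "wireless", True, 0.9, False),
    SlaveNode("p1", "power", False, 0.9, True),
    SlaveNode("p2", "power", True, 0.4, True),
]
chosen = {domain: node.name for domain, node in select_optimal(nodes).items()}
```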
  • After step 30 (generating an alarm association rule based on the final alarm association relationship data), the alarm association detection method provided by this embodiment of the present application further includes:
  • Step 50 the model design node obtains the alarm association rule
  • the model design node After the master control node generates the alarm association rules from the final alarm association relationship data, the model design node obtains the alarm association rules, so as to form the alarm association rules into a knowledge graph and display them to the user for confirmation and viewing.
  • Step 51 The model design node forms a knowledge graph based on the alarm association rules, and displays the knowledge graph.
  • the model design node displays the alarm association rules as a knowledge graph, as shown in Figure 8.
  • the communication network includes a network management system.
  • After step 51 (the model design node forms a knowledge graph based on the alarm association rules and displays the knowledge graph), the alarm association detection method provided by the embodiment of the present application further includes:
  • Step 60 the master control node receives the fault signal sent by the network management system
  • the alarm root cause of the current fault can be analyzed and diagnosed through the knowledge graph according to the alarm association rules that have been confirmed so far.
  • Step 61 The main control node performs fault diagnosis based on the alarm association rule, and determines the root cause of the alarm;
  • Step 62 The model design node displays the knowledge graph with the alarm root cause.
  • the knowledge graph can clearly identify the root cause of the alarm, and perform fault diagnosis and repair in a timely manner.
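Steps 60 to 62 can be pictured as walking the stored rules from the reported alarm back to an alarm that has no known cause. The rule set below is hypothetical, and representing each rule as a single effect-to-cause mapping is a simplification of whatever the knowledge graph actually stores.

```python
# Each rule maps an effect alarm to the alarm that causes it (hypothetical rules).
rules = {
    "cell_out_of_service": "base_station_power_off",
    "base_station_power_off": "line_control_switch_open",
    "line_control_switch_open": "power_supply_failure",
}


def root_cause(alarm, rules):
    """Follow effect -> cause rules until an alarm with no known cause is reached."""
    seen = {alarm}
    while alarm in rules:
        alarm = rules[alarm]
        if alarm in seen:  # guard against cyclic rules
            break
        seen.add(alarm)
    return alarm


diagnosis = root_cause("cell_out_of_service", rules)
```

An alarm that matches no rule is returned unchanged, which corresponds to the case where the fault must be handed to operation and maintenance personnel.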
  • the alarm correlation detection method provided by the embodiment of the present application further includes:
  • Step 70 The model design node sends the alarm mining model to the master control node, so that the master control node selects the optimal node from the multiple intelligent nodes.
  • After the model design node generates the alarm mining model, the master control node selects the optimal nodes according to the alarm mining model, the routing relationship between the intelligent nodes, and the performance and working status of the intelligent nodes.
  • Before step 210 (the current node among the optimal nodes completes alarm correlation mining according to the scheduling order), the alarm correlation detection method provided by the embodiment of the present application further includes:
  • Step 80 The master control node sends the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the business domain where the first node of the optimal nodes is located to the first node of the optimal nodes.
  • the master control node After determining the scheduling order, the master control node sends the alarm mining start data to the first node in the optimal nodes.
  • the alarm mining model here involves all the optimal nodes in the network topology structure composed of the optimal nodes.
  • the optimal node can be used to perform alarm data mining in the smallest network unit in the network topology structure composed of the optimal node.
  • In step 210, sending the first alarm correlation data obtained after the current node among the optimal nodes completes alarm correlation mining according to the scheduling order to the next node among the optimal nodes includes:
  • Step 211 the current node in the optimal node sends the first alarm associated data to the master control node;
  • Whether the current node sends the first alarm-related data to the next node needs to be forwarded by the master control node is mainly determined by the network topology structure of the alarm mining model and the routing relationship between the intelligent nodes.
  • the transmission between the current node and the next node can be forwarded by the master node when the forwarding effect of the master node is good.
  • Step 212 The master control node sends the first alarm correlation data to the next node in the optimal node, wherein the first alarm correlation data includes the alarm mining model, the data mining algorithm, and the business domain where the next node in the optimal node is located.
  • the associated alarm correlation data includes the alarm mining model, the data mining algorithm, and the business domain where the next node in the optimal node is located.
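  • The choice between direct delivery and forwarding through the master control node can be sketched as follows. This is only an illustration: the `link_cost` table, the cost values, and the function name `choose_path` are assumptions and do not appear in the application.

```python
def choose_path(link_cost, current, nxt, master):
    """Return the node sequence used to deliver data from `current` to `nxt`.

    link_cost maps (source, destination) to a hypothetical cost such as
    latency in ms; a missing key means there is no usable direct link.
    """
    direct = link_cost.get((current, nxt))
    first_leg = link_cost.get((current, master))
    second_leg = link_cost.get((master, nxt))
    via_master = None
    if first_leg is not None and second_leg is not None:
        via_master = first_leg + second_leg

    # Prefer the direct link when it exists and is no worse than forwarding.
    if direct is not None and (via_master is None or direct <= via_master):
        return [current, nxt]
    if via_master is not None:
        return [current, master, nxt]
    raise ValueError(f"no route from {current} to {nxt}")


# Hypothetical routing table for three domains and one master control node.
links = {("power", "master"): 3, ("master", "bearer"): 4, ("bearer", "wireless"): 2}
path_a = choose_path(links, "power", "bearer", "master")     # forwarded by the master
path_b = choose_path(links, "bearer", "wireless", "master")  # sent directly
```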
  • After mining the alarm data in its own business domain, the current node filters out the alarm mining data that does not involve other domains, and then keeps only the alarm mining data that falls within the time period defined by the alarm mining model.
  • The remaining alarm mining data is encrypted and sent to the next node.
  • The alarm mining model and the data mining algorithm also need to be sent to the next node.
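  • A minimal sketch of the filtering performed before transmission is given below. The record layout (`MinedAlarm`, `related_domains`) and the half-open time window are assumptions made for illustration; encryption is left out because the application does not name a cipher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinedAlarm:
    alarm_id: str
    timestamp: int               # seconds since epoch
    related_domains: frozenset   # domains touched by this alarm's correlations


def select_for_next_node(alarms, local_domain, window_start, window_end):
    """Keep alarms that involve another domain and lie inside the model's time window."""
    return [
        a for a in alarms
        if (a.related_domains - {local_domain})          # involves some other domain
        and window_start <= a.timestamp < window_end     # inside the designed period
    ]


alarms = [
    MinedAlarm("PWR-001", 100, frozenset({"power", "wireless"})),
    MinedAlarm("PWR-002", 150, frozenset({"power"})),              # local only
    MinedAlarm("PWR-003", 900, frozenset({"power", "wireless"})),  # outside window
]
forwarded = select_for_next_node(alarms, "power", 0, 600)
```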
  • The data mining algorithm here is the algorithm used in the alarm data mining process; it is determined together with the design of the alarm mining model.
  • Referring to FIG. 12, in the alarm correlation detection method provided by the embodiments of the present application, step 210, in which the current node among the optimal nodes completes alarm correlation mining according to the scheduling order, specifically includes:
  • Step 215: Perform a linear correlation operation in the business domain where the next node among the optimal nodes is located, based on the alarm mining model, the data mining algorithm, and the alarm correlation data associated with that business domain.
  • The business domain where an optimal node is located includes multiple sub-data domains.
  • Before step 80, in which the master control node sends the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the business domain where the first node of the optimal nodes is located to the first node of the optimal nodes, the alarm correlation detection method provided by the embodiment of the present application further includes:
  • Step 90: The master control node is arranged in each data domain, an optimal node is arranged in each of the multiple sub-data domains, and the model design node is arranged in the network management system of the business domain;
  • In step 215, performing the linear correlation operation in the business domain where the next node among the optimal nodes is located, based on the alarm mining model, the data mining algorithm, and the alarm correlation data associated with that business domain, includes:
  • Step 216: Based on the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the business domain where the next node among the optimal nodes is located, the next node performs alarm correlation mining in each of the multiple sub-data domains to obtain the sub-data-domain alarm data.
  • The next nodes among the optimal nodes are set up in the multiple sub-data domains; after performing alarm correlation mining, each sends the sub-data-domain alarm data it obtains to the master control node, which stores it.
  • FIG. 14 shows an example of the alarm correlation detection method provided by the embodiment of the present application.
  • In this example, one node is deployed in a business domain as the model design node.
  • An in-domain master node and in-domain slave nodes can be set up in the alarm domain and in each of the other data domains to perform alarm correlation mining.
  • This example is suitable for mining alarm correlation within a single business domain. It mainly mines the alarm correlation data in the current network management system, covering the alarm, performance, dynamic, and other data domains of the business domain in the current network management system, or even the sub-data domains within a single data domain.
  • Alarm correlation mining is performed for each data domain or sub-data domain. The data model of each data domain or sub-data domain is different, and the association relationships of the network management devices in the business domain determine the aggregate association relationships between the data domains and between the sub-data domains under the same data domain. Therefore, alarm data mining between data domains and sub-data domains adopts the horizontal modeling method of federated learning, so that different data domains can mine alarm data independently and also mine the alarm correlation data between data domains or between sub-data domains.
  • One master node can be deployed for a whole business domain, or the deployment can be split according to the number and scale of the network management devices in the business domain.
  • The splitting principle is to first deploy one master node for each data domain, then split that data domain into multiple sub-data domains, and then deploy sub-nodes according to the data volume of the sub-data domains.
  • For the network management devices in the alarm domain, the split can be weighed comprehensively against the number of network elements and the number of cells carried by those network elements in the alarm data domain.
  • When the number of cells managed by the network management devices in the alarm domain exceeds 10,000 or reaches the millions, multiple sub-nodes are deployed for the alarm data domain, such as in-domain slave node 1, in-domain slave node 2, and so on.
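  • The split of one data domain into sub-data domains can be sketched as a simple partition of network element identifiers. The limit of 1000 elements per slave node mirrors the "1-1000 network elements" example in this description; the function name and the idea of splitting purely by count are assumptions.

```python
def split_into_subdomains(element_ids, max_per_node=1000):
    """Partition network element IDs into consecutive chunks, one per in-domain slave node."""
    if max_per_node <= 0:
        raise ValueError("max_per_node must be positive")
    return [
        element_ids[i:i + max_per_node]
        for i in range(0, len(element_ids), max_per_node)
    ]


# 2500 hypothetical network elements, numbered from 1.
subdomains = split_into_subdomains(list(range(1, 2501)))
# subdomains[0] would be handled by in-domain slave node 1, subdomains[1] by slave node 2, ...
```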
  • After logging in to the current master node of the wireless domain, the user first obtains the data domain models supported by the current network management device, namely the alarm model and the performance model, that is, the physical resource model of the network management device.
  • The user designs an alarm mining model for the sub-data domains of each data domain, and then sends the mining start data, such as the alarm mining model of the sub-data domain, to the alarm sub-data domains for alarm data mining.
  • The final alarm correlation data obtained is mainly stored in the master node of the alarm data domain.
  • The description below uses the alarm domain, but it applies equally to the other data domains.
  • The data is cut to form alarm sub-data domains, and the master control node of the alarm domain distributes the alarm mining model to each sub-data domain for alarm data mining.
  • For example, slave node 1 performs alarm data mining for network elements 1 to 1000.
  • According to the alarm mining model, slave node 1 in this sub-data domain works from basic alarm information such as the alarm ID, the alarm reason code, the time zone of the alarm, the minimum alarm correlation duration, and the topological correlation of the alarm.
  • This basic alarm information is the basic table-building data used when the alarm mining model was designed.
  • Slave node 1 then configures a linear data mining algorithm, such as the Pearson algorithm or the FP algorithm, together with the parameters of that algorithm, performs alarm correlation mining, encrypts the final alarm correlation data mined, and returns it to the master control node of the alarm domain.
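  • As an illustration of the Pearson option, the sketch below bins the occurrence times of two alarm IDs into fixed windows and computes the Pearson correlation coefficient of the two count series. The bin width, the sample timestamps, and the function names are assumptions; the application does not specify how the series are built.

```python
import math


def bin_counts(timestamps, start, end, width):
    """Count alarm occurrences per window of `width` seconds in [start, end)."""
    n_bins = (end - start + width - 1) // width
    counts = [0] * n_bins
    for t in timestamps:
        if start <= t < end:
            counts[(t - start) // width] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical occurrence times (seconds) of three alarm IDs over 20 minutes.
alarm_a = bin_counts([10, 310, 620], 0, 1200, 300)
alarm_b = bin_counts([20, 330, 640], 0, 1200, 300)   # follows alarm A closely
alarm_c = bin_counts([900], 0, 1200, 300)            # occurs only when A is silent
r_ab = pearson(alarm_a, alarm_b)
r_ac = pearson(alarm_a, alarm_c)
```

  • In this sketch a coefficient near 1 marks two alarms that rise and fall together, which is the kind of linear relationship a slave node would report back to the master control node.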
  • The model design node set up in the current network management system of the business domain is mainly responsible for the design, storage, and display of the alarm mining model data, and for distributing the sub-domain data models.
  • The intelligent nodes of each sub-data domain mainly mine the association relationships of their own data domain.
  • FIG. 15 shows another example of the alarm correlation detection method provided by the embodiment of the present application.
  • In this example, the network deployment includes intelligent nodes in the power domain, intelligent nodes in the bearer domain, and intelligent nodes in the core domain.
  • There are also wireless domains, including the wireless domains of other manufacturers. This example is mainly explained from the wireless domain.
  • The model design node is the intelligent node that the user logs in to.
  • After the user logs in to the model design node, the node obtains the device models and model data of the network management equipment in each service domain, and the IP address of the server running the network management software in each service domain. It then selects the node with the most stable network connection and the fastest links to the master nodes of the other service domains as the wireless domain master node. The purpose is to keep the transmission of alarm correlation data stable and to reduce network transmission delay during federated learning.
  • After the power domain master control node completes alarm correlation mining, it sends the first alarm correlation data to the bearer domain master control node.
  • The master node can also dynamically maintain the network status between the slave nodes in its own business domain, and the activity state of each slave node.
  • The previous example focuses on the scheduling method after the intelligent nodes are deployed, while this example uses the alarm domain model to illustrate the data transfer method of vertical federated learning mining and the processing flow of the optimal node scheduling order.
  • After logging in to the intelligent node in the wireless domain, the user obtains the base station alarm data model of the service domain.
  • The user then designs the alarm mining model: the alarm ID, the alarm reason code, the time zone of the alarm, the minimum alarm correlation duration, and the network topology involved in the alarm model are used as the basic data, and the linear data mining algorithm (the Pearson algorithm and the FP algorithm) and its parameters are configured.
  • The alarm mining model and the other mining start data are encrypted and sent to the power domain.
  • The power domain divides its alarms into hours according to their time zone, and then performs alarm correlation mining using a period of 5 to 6 minutes as the minimum correlation time.
  • The first alarm correlation data associated with the wireless network is then screened out and distributed to the wireless domain master control node, together with the topology relationship between the power domain and the wireless domain and device attribute information of the power domain such as GPS positions and physical addresses.
  • The optimal node in the wireless domain performs secondary alarm correlation mining according to the correlation between the power domain and the wireless domain involved in the first alarm correlation data, and obtains the alarm correlation data in the wireless domain.
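  • The secondary mining step can be pictured as counting how often a local alarm follows an alarm received from the upstream domain within the minimum correlation time. The sketch below assumes a 300-second window (the lower end of the 5 to 6 minute range), tuple-shaped alarm records, and hypothetical alarm names; it is a plain co-occurrence count, not the FP algorithm mentioned in the description.

```python
from collections import Counter


def cooccurrence_counts(upstream, local, window_s=300):
    """Count (upstream_id, local_id) pairs where the local alarm follows within window_s seconds."""
    counts = Counter()
    for up_id, up_ts in upstream:
        for lo_id, lo_ts in local:
            if 0 <= lo_ts - up_ts <= window_s:
                counts[(up_id, lo_id)] += 1
    return counts


# First alarm correlation data received from the upstream domain (hypothetical).
upstream_alarms = [("MAINS_FAIL", 1000), ("MAINS_FAIL", 5000)]
# Local alarm data of the receiving domain (hypothetical).
local_alarms = [("CELL_DOWN", 1120), ("CELL_DOWN", 5200), ("LINK_FLAP", 9000)]

pairs = cooccurrence_counts(upstream_alarms, local_alarms)
```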
  • The alarm correlation detection method based on federated learning is suitable for a communication network in which multiple intelligent nodes are arranged to form a topology, with the intelligent nodes distributed across the service domains.
  • The optimal nodes need to be selected from the multiple intelligent nodes so that they can participate in the alarm correlation detection.
  • Based on federated learning, each optimal node completes alarm correlation mining in its own business domain, the final alarm correlation data of this alarm correlation detection is obtained, and alarm correlation rules are generated from the final alarm correlation data.
  • The alarm correlation detection method based on federated learning provided by the embodiment of this application is used to diagnose correlated faults caused by equipment failures across service domains. Through federated learning, cross-domain alarm correlation mining can be carried out while data security is preserved, so that the root cause of an alarm can be located quickly and the efficiency of alarm detection is improved.
  • FIG. 17 is a schematic structural diagram of an alarm correlation detection system 1 based on federated learning provided by an embodiment of the present application.
  • The federated learning-based alarm correlation detection system 1 includes a plurality of intelligent nodes 40 arranged in a communication network to form a topology, and the plurality of intelligent nodes 40 are distributed in each business domain.
  • The federated learning-based alarm correlation detection system 1 further includes:
  • The node selection module 10 is configured to select the optimal nodes from the plurality of intelligent nodes;
  • The intelligent nodes here may be nodes that already exist in the communication network, or newly deployed intelligent nodes for implementing the alarm correlation detection method provided by the embodiment of the present application.
  • These intelligent nodes are connected in a topology to form a large intelligent node group, so that besides processing data, the intelligent nodes can also transmit data to one another.
  • A large number of intelligent nodes are distributed in the various service domains. The intelligent nodes exist so that alarm correlation detection can be carried out across service domains.
  • The optimal nodes are selected from the multiple intelligent nodes so that the selected optimal nodes can be used for alarm correlation mining. Considering the time at which alarms occur, each time the alarm correlation detection method is performed, the optimal nodes need to be selected again from the multiple intelligent nodes according to the topology in which the intelligent nodes involved in the current alarm correlation are located, the routing relationships between the intelligent nodes, and the running state of each intelligent node.
  • The mining module 20 is configured to obtain the final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective business domains based on federated learning;
  • Federated learning can effectively solve the problem of data silos: participants can model jointly without sharing data, which technically breaks down the data silos and enables collaboration.
  • In this embodiment, the optimal nodes complete alarm correlation mining in their respective business domains based on federated learning, so that alarm correlation data is mined comprehensively across multiple business domains and the alarm correlation relationships involving each business domain are obtained, for example the correlation with equipment failures in other business domains.
  • The rule generation module 30 is configured to generate alarm association rules based on the final alarm correlation data.
  • Once the alarm association rules are generated, operation and maintenance personnel can use them to quickly locate the root cause of an alarm and resolve the equipment failure.
  • The federated learning-based alarm correlation detection method relies on the intelligent node clusters deployed in each business domain: based on federated learning, each optimal node performs alarm correlation mining in the business domain where it is located, and the final alarm correlation data is obtained.
  • This makes alarm correlation mining across business domains possible and helps operation and maintenance personnel dig out the root cause of alarms behind equipment failures in a live network. The resulting alarm association rules can also be transferred to other live networks to quickly resolve equipment failures there.
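  • One way to picture the rule generation step is to keep only the alarm pairs whose correlation score clears a threshold and to rank them. The score values, the 0.8 threshold, and the rule fields `cause`, `effect`, and `score` are assumptions for illustration; the application does not define a rule format.

```python
def generate_rules(correlations, min_score=0.8):
    """Turn {(cause_alarm, effect_alarm): score} into ranked association rules."""
    rules = [
        {"cause": cause, "effect": effect, "score": score}
        for (cause, effect), score in correlations.items()
        if score >= min_score
    ]
    # Strongest correlations first, so likely root causes are listed at the top.
    return sorted(rules, key=lambda rule: rule["score"], reverse=True)


# Hypothetical final alarm correlation data gathered across domains.
final_data = {
    ("MAINS_FAIL", "CELL_DOWN"): 0.93,
    ("MAINS_FAIL", "FAN_ALARM"): 0.41,
    ("LINK_DOWN", "CELL_DOWN"): 0.85,
}
rules = generate_rules(final_data)
```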
  • In the alarm correlation detection system provided by the embodiments of the present application, the mining module 20 is further configured to:
  • schedule and sort the optimal nodes to form a scheduling order;
  • The optimal nodes can be scheduled and sorted according to the vertical modeling method of federated learning to form the scheduling order. That is, the sorting needs to take into account the relationships between the business domains involved in the alarm correlation and the way the business domains support one another when each mines its alarm correlation.
  • The purpose of scheduling and sorting is to determine the direction in which alarm correlation data flows between the optimal nodes, so that the optimal nodes can complete cross-domain alarm correlation mining.
  • after the current node among the optimal nodes completes alarm correlation mining according to the scheduling order, send the obtained first alarm correlation data to the next node among the optimal nodes, so that the next node can complete alarm correlation mining;
  • Once the optimal nodes have been scheduled and sorted, the scheduling order is formed.
  • The current node, which is ranked ahead of the next node in the scheduling order, completes alarm correlation mining and then sends the obtained first alarm correlation data to the next node, which is ranked immediately after it, so that the next node can complete its own alarm correlation mining.
  • The current node and the next node among the optimal nodes are adjacent in the scheduling order, for example the optimal node in the second position and the optimal node in the third position.
  • The optimal nodes complete alarm correlation mining one after another in the scheduling order. It can be seen that each next node (from the second optimal node to the last optimal node in the scheduling order) receives the first alarm correlation data from the current node that precedes it and performs alarm correlation mining on it in combination with its local alarm data.
  • After the current node has fully mined the alarm correlation of the data in its own business domain, such as the device data of the current service, it filters out the alarm correlation data that does not involve other business domains, encrypts the rest uniformly, and sends it to the next slave node.
  • Whether the current node sends directly to the next node is decided in view of the network topology in which the intelligent nodes are located and the routing relationships between the intelligent nodes.
  • obtain the final alarm correlation data after the last node among the optimal nodes completes alarm correlation mining according to the scheduling order.
  • Once the last optimal node has completed alarm correlation mining, the final alarm correlation data is obtained, and the alarm association rules are then generated from it.
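  • The hand-off along the scheduling order can be sketched as a simple chain in which every node mines with the data it received and passes its result to the node after it. The per-domain `miners` below are stand-in functions with made-up alarm names; real mining, encryption, and transport are outside this sketch.

```python
def run_schedule(order, miners, start_data):
    """Run each node's mining step in scheduling order, feeding each result to the next node."""
    carried = list(start_data)
    for node in order:
        # Each miner combines the received correlation data with its local findings.
        carried = miners[node](carried)
    return carried   # output of the last node: the final alarm correlation data


# Hypothetical miners: each appends one locally discovered (cause, effect) pair.
miners = {
    "power": lambda received: received + [("MAINS_FAIL", "RECTIFIER_FAULT")],
    "bearer": lambda received: received + [("RECTIFIER_FAULT", "LINK_DOWN")],
    "wireless": lambda received: received + [("LINK_DOWN", "CELL_OUT_OF_SERVICE")],
}
final_data = run_schedule(["power", "bearer", "wireless"], miners, [])
```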
  • The multiple intelligent nodes include a model design node, a master control node, and slave nodes.
  • The alarm correlation detection system provided by the embodiment of the present application further includes the following:
  • The node selection module 10 is further configured to determine one of the multiple intelligent nodes as the model design node based on the user's selection;
  • The intelligent node that serves as the user's login device can be used as the model design node.
  • The model design node, as its name implies, coordinates the initiation and operation of the entire alarm correlation detection.
  • The model design node is a node in the network topology, and it is also a node in the intelligent node group currently used for alarm correlation detection.
  • A model determination module 50 is configured to determine the alarm mining model;
  • The alarm mining model differs from one alarm correlation detection to another.
  • The alarm mining model here is the one selected for the current alarm correlation detection.
  • The node selection module 10 is also configured to select, based on the alarm mining model and the routing relationship, another one of the multiple intelligent nodes as the master control node, with the other intelligent nodes among the multiple intelligent nodes serving as the slave nodes;
  • After the model design node and the alarm mining model are determined, another one of the multiple intelligent nodes is selected as the master control node based on the network topology and routing relationships formed by the optimal nodes.
  • The master control node here is different from the model design node.
  • The master control node plays an intermediate coordinating role: it enables the optimal nodes to complete alarm correlation detection in the scheduling order and obtain the final alarm correlation data, and the alarm association rules are generated from the final alarm correlation data and stored on the master control node.
  • After the master control node obtains the alarm mining model, it selects other intelligent nodes as slave nodes according to the federated learning method. At this stage the performance of the intelligent nodes is the main consideration, as described below.
  • The master control node distributes the alarm mining start data, such as the alarm mining model, the alarm mining algorithm, and the scheduling order, to each slave node, and performs intermediate scheduling and coordination during alarm correlation detection.
  • A centrally located intelligent node is selected as the master control node, mainly according to the routing relationships of the intelligent nodes involved in the model design, which reduces the network transmission delay caused by the master control node's scheduling.
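  • Choosing a "central" node can be illustrated by picking, among nodes with a stable connection, the one with the smallest total latency to all the others. The latency table, the stability flags, and the use of summed latency as the centrality measure are all assumptions; the application only states that the routing relationship is the main criterion.

```python
def pick_master(latency_ms, stable):
    """Return the stable node with the lowest summed latency to the other nodes."""
    candidates = [node for node in latency_ms if stable.get(node, False)]
    if not candidates:
        raise ValueError("no stable node available")
    # Ties are broken by node name so the result is deterministic.
    return min(candidates, key=lambda node: (sum(latency_ms[node].values()), node))


# Hypothetical pairwise latencies between three intelligent nodes.
latency = {
    "n1": {"n2": 5, "n3": 9},
    "n2": {"n1": 5, "n3": 4},
    "n3": {"n1": 9, "n2": 4},
}
master = pick_master(latency, {"n1": True, "n2": True, "n3": True})
fallback = pick_master(latency, {"n1": True, "n2": False, "n3": True})
```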
  • The other connected slave nodes mainly contribute their data computing capability.
  • The master node can schedule the slave nodes according to their working status and idle state. If a slave node stores alarm correlation data, the master node preferentially uses that slave node for alarm correlation mining in its business domain. If a slave node holds no alarm correlation data, it is mainly responsible for the algorithm operations split off by the master control node.
  • The master control node finds the optimal nodes according to the network topology formed by the intelligent nodes involved in the alarm mining model, the routing relationships between the intelligent nodes, and the running status of each intelligent node. The master control node then arranges the scheduling order of the optimal nodes according to the vertical modeling method of federated learning, and the intelligent nodes complete the joint mining. Finally, the final alarm correlation data mined across domains is returned to the master control node for storage, and the master control node sends it to the model design node, which presents it to the user.
  • Data transmitted between the intelligent nodes is mainly encrypted.
  • For the time period involved in the alarm correlation, if the period is too long, or the alarm correlation data transmitted between the optimal nodes is too large for other reasons, the time period can be divided into slices and the data distributed slice by slice.
  • The main purpose here is to slice by time, to avoid the alarm mining data to be transmitted becoming too large because the time period selected by the alarm mining model is too long.
  • After the next intelligent node receives the time-sliced alarm correlation data, it applies fault-tolerant processing to the slice boundaries, to ensure that correlations spanning two slices are not lost because of the time slicing.
  • The node selection module 10 is configured to:
  • select the optimal nodes from the slave nodes according to the performance of the slave nodes.
  • The optimal nodes are selected from the slave nodes according to slave node performance, and the master control node schedules the optimal nodes based on federated learning to form the scheduling order.
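  • A possible reading of this selection, combined with the earlier preference for slave nodes that already store alarm correlation data, is sketched below. The `SlaveNode` fields and the use of idle CPU share as the performance measure are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlaveNode:
    name: str
    domain: str
    online: bool
    idle_cpu: float        # fraction of CPU currently idle, 0.0 to 1.0
    has_alarm_data: bool   # node already stores alarm correlation data


def select_optimal(nodes):
    """Pick one online node per domain, preferring stored alarm data, then idle capacity."""
    best = {}
    for node in nodes:
        if not node.online:
            continue
        current = best.get(node.domain)
        rank = (node.has_alarm_data, node.idle_cpu)
        if current is None or rank > (current.has_alarm_data, current.idle_cpu):
            best[node.domain] = node
    return {domain: node.name for domain, node in best.items()}


nodes = [
    SlaveNode("w1", "wireless", True, 0.2, True),
    SlaveNode("w2", "wireless", True, 0.9, False),
    SlaveNode("b1", "bearer", False, 0.9, True),   # offline, cannot be chosen
    SlaveNode("b2", "bearer", True, 0.4, False),
]
optimal = select_optimal(nodes)
```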
  • The communication network includes the federated learning-based alarm correlation detection system described above.
  • The federated learning-based alarm correlation detection system includes:
  • The node selection module 10 is configured to select the optimal nodes from the plurality of intelligent nodes;
  • The mining module 20 is configured to obtain the final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective business domains based on federated learning;
  • The rule generation module 30 is configured to generate alarm association rules based on the final alarm correlation data.
  • A storage medium provided by an embodiment of the present application is used for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the federated learning-based alarm correlation detection method shown in FIGS. 1 to 7 and FIGS. 9 to 13. Specifically, the following steps may be performed:
  • Step 10: Select the optimal nodes from the multiple intelligent nodes;
  • Step 20: Based on federated learning, after the optimal nodes complete alarm correlation mining in their respective business domains, the final alarm correlation data is obtained;
  • Step 30: Generate alarm association rules based on the final alarm correlation data.
  • alarm correlation rules are generated, and it is convenient for operation and maintenance personnel to quickly locate the root cause of the alarm according to these alarm correlation rules and solve the equipment failure.
  • the federated learning-based alarm correlation detection method needs to be completed by using intelligent node clusters deployed in each business domain, and the optimal node based on federated learning is used to perform alarm correlation mining in the business domain where it is located to obtain the final result.
  • Alarm correlation data can solve alarm correlation mining across business domains, assist operation and maintenance personnel to dig out the alarm root cause of equipment failures in actual network operation, and the formed alarm correlation rules can be transferred to other actual operation networks to quickly solve actual operation problems. Network equipment failure.
  • the alarm correlation detection method based on federated learning is suitable for arranging multiple intelligent nodes forming a topology structure in a communication network, and multiple intelligent nodes are distributed in each service domain.
  • the optimal node needs to be selected from multiple intelligent nodes, so that the optimal node can participate in the alarm correlation detection.
  • Each optimal node completes alarm correlation mining in its own business domain based on federated learning, obtains the final alarm correlation data of this alarm correlation detection, and generates alarm correlation rules based on the final alarm correlation data.
  • the alarm correlation detection method based on federated learning provided by the embodiment of this application is used for correlation fault diagnosis caused by equipment failures between cross-service domains, and cross-domain alarm correlation mining can be realized under the premise of ensuring data security through federated learning. , quickly locate the root cause of the alarm, and improve the efficiency of alarm detection.
  • The systems, devices, modules, or units described in one or more of the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function.
  • One implementation device is a computer.
  • The computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • Computer-readable storage media include persistent and non-persistent, removable and non-removable media, which may store information by any method or technology.
  • The information may be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
  • As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

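An alarm correlation detection method, system, and medium based on federated learning. The method applies to a scenario in which multiple intelligent nodes forming a topology are deployed in a communication network and distributed across the service domains. The method comprises: selecting optimal nodes from the multiple intelligent nodes (S10); based on federated learning, obtaining final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective service domains (S20); and generating alarm correlation rules based on the final alarm correlation data (S30).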

Description

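Alarm correlation detection method, system, network, and medium based on federated learning
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 202011617624.X filed on December 31, 2020, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of communication technology, and in particular to an alarm correlation detection method, system, network, and medium based on federated learning.
Background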
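A communication network can be divided geographically into multiple levels, such as national, regional-center, provincial, municipal, county, and district levels. Overall operation and maintenance may be the responsibility of a provincial network-management center, while the networks at municipal, county, and district level are managed by the operations staff of the respective area. The physical equipment currently needed to build a communication network can be divided by function into service domains such as wireless access equipment, bearer equipment, the core network, and external power systems. Wireless access equipment mainly lets user equipment access the communication network; bearer equipment mainly carries user-equipment data to the communication equipment room of the corresponding geographic area; the core network mainly switches and transports data between the equipment rooms of the different geographic areas; and the external power system mainly provides power and cooling for the equipment rooms or towers where the wireless access equipment is located.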
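The problem is that each service domain is managed by its own network-management and operations staff, whereas in practice the domains support one another; in particular, an alarm in one service domain is likely to trigger alarms in other service domains. When underlying supporting equipment develops a problem, it causes failures in the equipment above it. For example, an unstable power supply at a base-station site can cause intermittent alarms in base stations and other wireless access equipment, and a problem with a single transformer or switch in the power equipment can even cause the base stations above it to lose their links. Cross-domain alarm correlation suffers from missing data, so the alarms cannot be correlated. This is mainly a consequence of how communication networks are currently built and deployed: on the one hand, service domains find it difficult to share data with one another; on the other hand, equipment from different vendors makes sharing data difficult. In addition, for data-security reasons, the operations data of each service domain requires a certain degree of isolation, which also makes sharing hard. How to achieve cross-domain alarm correlation mining while ensuring data security, locate alarm root causes quickly, and improve alarm detection efficiency has therefore become an urgent problem.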
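Summary
One or more embodiments of this specification provide an alarm correlation detection method, system, and medium based on federated learning.
In a first aspect, an alarm correlation detection method based on federated learning is provided, applicable to a scenario in which multiple intelligent nodes forming a topology are deployed in a communication network and distributed across the service domains. The method comprises: selecting optimal nodes from the multiple intelligent nodes; based on federated learning, obtaining final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective service domains; and generating alarm correlation rules based on the final alarm correlation data.
In a second aspect, an alarm correlation detection system based on federated learning is provided, comprising multiple intelligent nodes that form a topology in a communication network and are distributed across the service domains. The system further comprises: a node selection module configured to select optimal nodes from the multiple intelligent nodes; a mining module configured to obtain, based on federated learning, final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective service domains; and a rule generation module configured to generate alarm correlation rules based on the final alarm correlation data.
In a third aspect, a communication network is provided, comprising the alarm correlation detection system based on federated learning according to any one of claims 12 to 14.
In a fourth aspect, an electronic device is provided, comprising: a processor; and
a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the steps of the alarm correlation detection method based on federated learning described above.
In a fifth aspect, a storage medium for computer-readable storage is provided. The storage medium stores one or more programs which, when executed by one or more processors, implement the steps of the alarm correlation detection method based on federated learning described above.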
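Brief description of the drawings
To explain the technical solutions of one or more embodiments of this specification, or of the prior art, more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are only some of the embodiments recorded in this specification; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the steps of an alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 2 is a schematic diagram of the steps of another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 3 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 4 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 5 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 6 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 7 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 8 is a schematic diagram of the interface display of the knowledge graph shown in yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 9 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 10 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 11 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 12 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 13 is a schematic diagram of the steps of yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 14 is a schematic diagram of the deployment of intelligent nodes in a single service domain for yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 15 is a schematic diagram of the deployment of intelligent nodes in multiple service domains for yet another alarm correlation detection method based on federated learning provided by an embodiment of the present application.
Fig. 16 is a schematic structural diagram of an alarm correlation detection system based on federated learning provided by an embodiment of the present application.
Fig. 17 is a schematic structural diagram of another alarm correlation detection system based on federated learning provided by an embodiment of the present application.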
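Detailed description
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in one or more embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art from one or more embodiments in this specification without creative effort shall fall within the scope of protection of this document.
The alarm correlation detection method based on federated learning provided by this application applies to a scenario in which multiple intelligent nodes forming a topology are deployed in a communication network and distributed across the service domains; the intelligent nodes take part in carrying out alarm correlation detection. Through federated learning, the method achieves cross-domain alarm correlation mining while ensuring data security, locates alarm root causes quickly, and improves alarm detection efficiency. The method and its steps are described in detail below.
It should be noted that an intelligent node proposed in this application may be deployed as a service or microservice on a processor device with data-processing capability, such as a server. The data-processing capabilities of the intelligent nodes are described in detail below.
Fig. 1 shows a wireless communication network architecture currently in use. The wireless communication network can be divided into service domains such as the power network, the wireless network, the bearer network, and the core network. The network equipment of each service domain is managed by a corresponding network management system, and each such system's management and monitoring of the equipment can be divided into data domains such as the configuration, performance, alarm, dynamic, and operations domains. When resolving equipment failures, therefore, the operating state and alarms of equipment in, say, the wireless network may be the knock-on result of equipment failures in several other domains. Alarm correlation data must therefore be mined comprehensively across multiple service domains, and the alarm correlations obtained for the service domains are turned into fixed alarm correlation rules, so that operations staff can later use the rules to locate alarm root causes quickly and resolve failures promptly.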
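Embodiment 1
Fig. 2 is a schematic diagram of the steps of an alarm correlation detection method based on federated learning provided by an embodiment of the present application. It will be understood that the method applies to a scenario in which multiple intelligent nodes forming a topology are deployed in a communication network and distributed across the service domains. The method provided by the embodiment comprises:
Step 10: select optimal nodes from the multiple intelligent nodes;
It should be noted that the intelligent nodes here may be nodes already deployed in the communication network, or nodes newly deployed to implement the alarm correlation detection method provided by the embodiments of the present application. These intelligent nodes are connected in a topology to form a large intelligent-node group, so after processing data they can also transmit data to one another. A large number of intelligent nodes are distributed across the service domains; they exist so that alarm correlation detection can be performed across service domains.
The optimal nodes are selected from the multiple intelligent nodes so that alarm correlation mining can be carried out on them. Because alarms are time-dependent, each run of the alarm correlation detection method must re-select the optimal nodes according to the topology in which the intelligent nodes involved in the current alarm correlation are located, the routing relationships between the intelligent nodes, and the running state of each intelligent node.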
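The selection in step 10 can be illustrated with a minimal sketch. The patent does not define a scoring formula, so everything below is an assumption made for illustration: the `NodeStatus` fields, the load and latency metrics, and the rule "one online node per involved service domain, lowest load first, then lowest latency" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    node_id: str
    domain: str              # service domain the node is deployed in
    online: bool             # running state reported by the node
    cpu_load: float          # 0.0 (idle) .. 1.0 (saturated); hypothetical metric
    route_latency_ms: float  # latency to the coordinating node; hypothetical metric


def select_optimal_nodes(nodes, involved_domains):
    """Pick one online node per involved service domain (illustrative rule only)."""
    best = {}
    for n in nodes:
        if not n.online or n.domain not in involved_domains:
            continue
        score = (n.cpu_load, n.route_latency_ms)  # compare load first, then latency
        if n.domain not in best or score < best[n.domain][0]:
            best[n.domain] = (score, n)
    return {domain: entry[1].node_id for domain, entry in best.items()}


nodes = [
    NodeStatus("p1", "power", True, 0.70, 12.0),
    NodeStatus("p2", "power", True, 0.20, 30.0),
    NodeStatus("w1", "wireless", False, 0.10, 5.0),   # offline, never chosen
    NodeStatus("w2", "wireless", True, 0.40, 8.0),
    NodeStatus("c1", "core", True, 0.10, 3.0),        # domain not involved this run
]
chosen = select_optimal_nodes(nodes, {"power", "wireless"})
print(chosen)
```

Because the selection is recomputed from the current node states on every call, it mirrors the requirement that the optimal nodes be re-selected for each detection run.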
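Step 20: based on federated learning, obtain the final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective service domains;
As a distributed machine-learning paradigm, federated learning can effectively address the data-silo problem: participants build a model jointly without sharing their data, which breaks down data silos technically and enables collaboration.
The optimal nodes complete alarm correlation mining in their respective service domains based on federated learning, so that alarm correlation data is mined comprehensively across multiple service domains and the alarm correlations involving each service domain are obtained. This makes it possible to find the correlation between an equipment failure in one service domain and equipment failures in other service domains.
Step 30: generate alarm correlation rules based on the final alarm correlation data.
Alarm correlation rules are generated from the final alarm correlation data, so that operations staff can later use these rules to locate the root cause of an alarm quickly and resolve the equipment failure.
The alarm correlation detection method based on federated learning provided by the embodiments of the present application is carried out by clusters of intelligent nodes deployed in the service domains. Based on federated learning, each optimal node performs alarm correlation mining in its own service domain to obtain the final alarm correlation data. This enables alarm correlation mining across service domains and helps operations staff find the root cause of equipment-failure alarms in a running network; the resulting alarm correlation rules can also be migrated to other operational networks to resolve equipment failures there quickly.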
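Referring to Fig. 3, in some embodiments of the method, step 20 (based on federated learning, obtaining the final alarm correlation data after the optimal nodes complete alarm correlation mining in their respective service domains) specifically comprises:
Step 200: based on federated learning, sort the optimal nodes for scheduling to form a scheduling order;
The optimal nodes can be sorted according to the vertical modelling approach of federated learning to form the scheduling order. That is, the sort must take into account the relationships between the service domains involved in the alarm correlation and the way the service domains support one another when each performs alarm correlation mining. The purpose of the sort is to fix the direction in which alarm correlation data flows between the optimal nodes, so that the optimal nodes can complete cross-domain alarm correlation mining.
Step 210: following the scheduling order, the current node among the optimal nodes completes alarm correlation mining and sends the resulting first alarm correlation data to the next node among the optimal nodes, so that the next node can complete its alarm correlation mining;
As noted above, a scheduling order is formed. After the current node, which precedes the next node in the scheduling order, completes alarm correlation mining, it sends the resulting first alarm correlation data to the next node, which follows it in the scheduling order, so that the next node can complete its alarm correlation mining.
The current node and the next node are adjacent in the scheduling order, for example the optimal nodes in second and third position; the optimal nodes complete alarm correlation mining one after another in that order. Each next node (from the optimal node in second position to the last one in the scheduling order) performs alarm correlation mining by combining its local alarm data with the first alarm correlation data received from the current node (from the optimal node in first position to the second-to-last one).
After the current node has performed a complete round of alarm correlation mining on the data in its service domain, for example the equipment data of the current service, it filters out the alarm correlation data that does not involve other service domains, then encrypts the remainder and sends it to the next slave node. Here the current node sends directly to the next node; this is determined by the network topology in which the intelligent nodes are located and the routing relationships between them.
Step 220: obtain the final alarm correlation data after the last node in the scheduling order completes alarm correlation mining.
After the last node in the scheduling order completes alarm correlation mining, the final alarm correlation data is obtained, and the alarm correlation rules are then generated from it.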
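Steps 200 to 220 describe a chain: each optimal node mines its own domain, combines the result with what it received, drops whatever is not relevant across domains, and forwards the rest. The sketch below shows only that data flow. `mine_local` is a toy stand-in rather than the mining algorithm of the application, the alarm codes and the `cross_domain_sites` filter are hypothetical, and the encryption of the forwarded data is deliberately omitted.

```python
def mine_local(domain, local_alarms, received):
    """Toy stand-in for per-domain mining (not the algorithm of the application)."""
    if not received:
        # First node in the order: start one chain per local alarm.
        return [{"site": a["site"], "chain": [f"{domain}:{a['code']}"]}
                for a in local_alarms]
    # Later nodes: extend each upstream chain with local alarms at the same site.
    return [{"site": a["site"], "chain": up["chain"] + [f"{domain}:{a['code']}"]}
            for up in received for a in local_alarms if a["site"] == up["site"]]


def run_schedule(order, alarms_by_domain, cross_domain_sites):
    """Walk the scheduling order; the output of the last node is the final data."""
    data = []
    for domain in order:
        mined = mine_local(domain, alarms_by_domain[domain], data)
        # Forward only results that matter across domains (hypothetical filter).
        data = [m for m in mined if m["site"] in cross_domain_sites]
    return data


alarms_by_domain = {
    "power": [{"site": "A", "code": "MAINS_FAIL"}, {"site": "Z", "code": "FAN"}],
    "wireless": [{"site": "A", "code": "LINK_DOWN"}],
}
final = run_schedule(["power", "wireless"], alarms_by_domain, {"A"})
print(final)
```

Note that each node only ever sees its own raw alarms plus the filtered result of its predecessor, which is the property the scheme relies on for data isolation.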
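Referring to Fig. 4, in some embodiments of the method, step 200 (based on federated learning, sorting the optimal nodes for scheduling to form a scheduling order) specifically comprises:
Step 201: using the vertical modelling approach of federated learning, sort the optimal nodes according to the relationships between the service domains and the way the service domains support one another in data mining.
Because equipment failures and the detection targets of operations staff differ, the optimal nodes selected and the scheduling order differ from one run of alarm correlation detection to the next. The sort can be implemented with the vertical modelling approach of federated learning, taking into account the service domain of the failed equipment or detection target, the other service domains related to it, and the way the service domains support one another in data mining.
When handling cross-domain problems, alarm data mining for a service domain is performed at each supporting layer, mainly according to the network topology of the intelligent nodes, using the vertical modelling approach of federated learning. The linear correlations between service domains are determined mainly by the support relationships between the domains, which in turn depend mainly on the support relationships between their physical equipment. Suppose, for example, that base station B is deployed in equipment room A, that equipment room A is the point where the relationships between these physical devices converge, and that power supply C feeds line control switch D, which switches power to base station B on and off. To mine alarm correlation data for base station B, the power-domain data can first be analysed according to the model data of the upstream power network, mining alarm data over a set time period to obtain the first alarm correlation data. According to the network topology of the alarm mining model, the first alarm correlation data is then passed to the intelligent node of the wireless network, which performs linear correlation mining on the base station's alarm data according to the topological relationship between the power network and the wireless network. The linear correlations of the two domains are thereby spliced together to form a cross-domain alarm correlation rule. The specific scheme is explained in the implementation examples.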
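One way to turn support relationships into a scheduling order is a topological sort, shown here as a sketch using the example of power supply C, line control switch D, and base station B. The application does not name a sorting algorithm; the use of `graphlib.TopologicalSorter` (Python 3.9+), the device identifiers, and the device-to-domain mapping are assumptions for illustration.

```python
from graphlib import TopologicalSorter

# device -> set of devices that support it (its predecessors in the order)
supported_by = {
    "base_station_B": {"line_switch_D"},
    "line_switch_D": {"power_supply_C"},
    "power_supply_C": set(),
}
# Hypothetical mapping from device to service domain.
domain_of = {
    "power_supply_C": "power",
    "line_switch_D": "power",
    "base_station_B": "wireless",
}

# Supporting devices come first, so mining starts at the bottom of the stack.
device_order = list(TopologicalSorter(supported_by).static_order())

# Collapse to a domain order, keeping the first appearance of each domain.
domain_order = list(dict.fromkeys(domain_of[d] for d in device_order))

print(device_order)
print(domain_order)
```

With this input the power domain is scheduled before the wireless domain, matching the example in which power-domain mining produces the first alarm correlation data that the wireless node then consumes.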
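Referring to Fig. 5, in some embodiments the multiple intelligent nodes include a model design node, a master node, and slave nodes, and before step 10 (selecting optimal nodes from the multiple intelligent nodes) the method further comprises:
Step 40: determine one of the multiple intelligent nodes as the model design node based on the user's selection;
The intelligent node the user logs in to can serve as the model design node. As the name suggests, the model design node coordinates the initiation and running of the whole alarm correlation detection and is used to design the group of intelligent nodes used by the current detection, that is, the optimal nodes to be selected. The model design node is a node in the network topology and also a node in the group of intelligent nodes used by the current alarm correlation detection.
Step 41: determine the alarm mining model;
It can be seen that when the user logs in to a different intelligent node, the intelligent nodes used by the current alarm correlation detection, that is, the optimal nodes selected, differ, and so does the alarm mining model. The alarm mining model here is the network topology formed by the optimal nodes selected for the alarm correlation detection. Within a service domain, the alarm mining model may include a device model covering the managed equipment of that domain and a data model covering its data domains; both are discussed below. Note that the actual alarm data mining within a service domain or data domain can use existing mining algorithms and mining procedures.
Step 42: based on the alarm mining model and the routing relationships, select another of the multiple intelligent nodes as the master node, and the remaining intelligent nodes as slave nodes;
After the model design node and the alarm mining model have been determined, another of the intelligent nodes is selected as the master node based on the network topology formed by the optimal nodes and on the routing relationships. The master node is distinct from the model design node. It acts as an intermediate coordinator so that the optimal nodes in the scheduling order complete alarm correlation detection and the final alarm correlation data is obtained; the alarm correlation rules are generated from that data at the master node, which also stores them.
After obtaining the alarm mining model, the master node screens the other intelligent nodes as slave nodes in the manner of federated learning; here the main consideration is node performance, as described below. The master node then distributes the alarm mining start-up data, such as the alarm mining model, to the slave nodes, and acts as the intermediate scheduler and coordinator during alarm correlation detection.
Whichever intelligent node the user logs in to and designs the alarm mining model on, a central intelligent node is chosen as the master node mainly according to the routing relationships of the intelligent nodes involved in the model design. This reduces the loss of mining efficiency caused by network transmission delay when the master node schedules. The other connected slave nodes mainly contribute their computing capacity. The master node can schedule the slave nodes according to their working and idle states: if a slave node stores alarm correlation data, the master node preferentially uses it for alarm correlation mining in its own service domain; if it stores none, the slave node mainly takes on the computation that the master node splits off.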
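Choosing a "central" node by routing relationship can be sketched as picking the node whose worst-case latency to every other node is smallest. This criterion is an assumption; the application only says the choice follows the routing relationships and aims to reduce transmission delay. The `latency_ms` table and node names are hypothetical, and the table must contain at least two nodes with an entry for every pair.

```python
def pick_master(latency_ms):
    """Return the node with the smallest worst-case latency to all other nodes.

    latency_ms[a][b] is the measured latency between nodes a and b
    (hypothetical input; needs at least two nodes and every pair present).
    """
    def worst_case(node):
        return max(latency_ms[node][other] for other in latency_ms if other != node)

    # sorted() makes ties resolve deterministically by node name.
    return min(sorted(latency_ms), key=worst_case)


latency_ms = {
    "n1": {"n2": 10, "n3": 40},
    "n2": {"n1": 10, "n3": 15},
    "n3": {"n1": 40, "n2": 15},
}
master = pick_master(latency_ms)
print(master)
```

A worst-case criterion bounds the delay to the slowest slave node; minimising the average latency instead would be an equally plausible reading of the text.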
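It can be seen that the master node finds the optimal nodes according to the network topology formed by the intelligent nodes involved in the alarm mining model, the routing relationships between the intelligent nodes, and the running state of each intelligent node. The master node then arranges the scheduling order of the optimal nodes according to the vertical modelling approach of federated learning. Once the intelligent nodes have completed joint mining, the final alarm correlation data mined across domains is returned to the master node for storage; the master node forwards it to the model design node, which presents it to the user. Data passed between intelligent nodes is mainly encrypted. If the first alarm correlation data produced by the current optimal node in the scheduling order contains data unrelated to the next optimal node, that data is not sent to the master node; only the first alarm correlation data related to the next optimal node is returned to the master node, which then returns it to the model design node.
In addition, considering the time period covered by the alarm correlation: if the period is too long, or the alarm correlation data to be transmitted between optimal nodes is too large for other reasons, the period can be sliced and the slices distributed separately. Slicing is done mainly in time, to avoid transmitting too much alarm mining data because the period chosen for the alarm mining model is too long. When the next intelligent node receives the time-sliced alarm correlation data, it applies deviation tolerance to the slice times, ensuring that correlations spanning slice boundaries are not lost because of the slicing.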
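The time slicing with deviation tolerance can be sketched as fixed-length slices that each reach back by a tolerance into the previous slice, so that an alarm pair straddling a boundary still appears together in one slice. The overlap scheme, the function name, and the six-minute tolerance in the example are assumptions; the application does not specify how the tolerance is applied.

```python
from datetime import datetime, timedelta


def slice_window(start, end, slice_len, tolerance):
    """Split [start, end) into slices, each extended backwards by `tolerance`."""
    if slice_len <= timedelta(0):
        raise ValueError("slice_len must be positive")
    slices = []
    cursor = start
    while cursor < end:
        slice_end = min(cursor + slice_len, end)
        # Reach back into the previous slice so boundary pairs are seen together.
        slices.append((max(start, cursor - tolerance), slice_end))
        cursor = slice_end
    return slices


slices = slice_window(
    datetime(2021, 1, 1, 0, 0),
    datetime(2021, 1, 1, 3, 0),
    slice_len=timedelta(hours=1),
    tolerance=timedelta(minutes=6),
)
for begin, finish in slices:
    print(begin.time(), finish.time())
```

A receiver using overlapping slices has to deduplicate pairs found in two slices; that bookkeeping is left out of the sketch.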
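It can be seen that any intelligent node may become the model design node, the master node, or a slave node, so the functions of an intelligent node include:
1. Model management, rule management, intelligent data mining, and the fault diagnosis and fault repair mentioned later.
Model management mainly obtains the alarm-related device models managed by the service domain in which the intelligent node is located, together with the device models of the service domains that support it from above and below; these device models are part of the alarm mining model. It combines the overall alarm mining model with the device models of the current service domain and presents the result to the user; distributes alarm mining start-up data, such as the alarm mining model, to the associated slave nodes, and performs the scheduling sort according to the federated learning algorithm to determine the data-flow path of the alarm correlation data; and retrieves alarm correlation rules from the rule base according to the alarm mining model and presents them to the user as a knowledge graph. The data mining algorithm used for alarm correlation mining may be the Pearson algorithm or the FP-Growth (frequent-pattern growth) algorithm, used for the relevant linear analysis.
Rule management is mainly responsible for the manual design, query, import, export, and storage of alarm correlation rules.
Intelligent data mining has two parts. The first determines the intermediate data to be passed on, according to the attributes and mining algorithm used in the design of the alarm mining model. This data may be included in the alarm mining start-up data once the alarm mining model has been designed and sent to the first optimal node, or the attributes concerning each optimal node's service domain may be obtained on demand as each optimal node is reached. The attributes establish that the physical devices of the service domains are in the same geographic area and that the devices are related and can affect one another. For example, the attributes of a base station are mainly the physical location and GPS information of its equipment room; those of power equipment are likewise mainly the physical location and GPS information of its equipment room; and the attributes of the core network are expressed through the network gateway of the core network's transport layer. The mining algorithm may be the Pearson algorithm or the FP-Growth algorithm. The second part takes the alarm mining start-up data distributed by the master node, such as the alarm mining model, as input and performs alarm data mining in the node's own service domain or data domain to find the relevant alarm correlations.
Fault diagnosis and fault repair use the alarm information reported by the current communication network environment within a set time period, together with the stored alarm correlation rules, to diagnose the root cause of the current alarms. A repair plan is then chosen from the repair strategies according to the diagnosis; if the fault cannot be repaired, operations staff are notified to repair the environment.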
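The functions above name the Pearson algorithm as one option for the linear analysis. As a minimal sketch, the Pearson correlation coefficient can be computed between two alarm-count series, for example the number of alarms of two types per time bin. The binning, the sample counts, and the convention of returning 0.0 for a constant series are assumptions for illustration, not details from the application.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear relationship (convention)
    return cov / sqrt(var_x * var_y)


# Hypothetical alarm counts per 5-minute bin at one site.
power_alarms = [0, 2, 0, 3, 0, 1]
link_down_alarms = [0, 2, 0, 3, 0, 1]
print(pearson(power_alarms, link_down_alarms))
```

A coefficient near 1 suggests the two alarm types rise and fall together; it does not by itself say which one is the cause, which is where the support relationships between domains come in.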
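Correspondingly, step 10 (selecting optimal nodes from the multiple intelligent nodes) specifically comprises:
Step 100: select the optimal nodes from the slave nodes according to the performance of the slave nodes.
After it has been determined which intelligent nodes are the model design node and the master node, the optimal nodes are selected from the slave nodes according to slave-node performance, and the master node sorts the optimal nodes based on federated learning to form the scheduling order.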
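Referring to Fig. 6, in some embodiments, after step 30 (generating alarm correlation rules based on the final alarm correlation data) the method further comprises:
Step 50: the model design node obtains the alarm correlation rules;
After the master node has generated the alarm correlation rules from the final alarm correlation data, the model design node obtains the rules so that it can turn them into a knowledge graph and show it to the user for confirmation and review.
Step 51: the model design node forms a knowledge graph from the alarm correlation rules and displays it.
The model design node turns the alarm correlation rules into a knowledge graph and displays it, as shown in Fig. 8.
Referring to Fig. 7, in some embodiments the communication network includes a network management system, and after step 51 (the model design node forms a knowledge graph from the alarm correlation rules and displays it) the method further comprises:
Step 60: the master node receives a fault signal sent by the network management system;
When a fault is reported in the live network environment, the root cause of the current fault can be analysed and diagnosed through the knowledge graph according to the alarm correlation rules already confirmed.
Step 61: the master node performs fault diagnosis based on the alarm correlation rules and determines the alarm root cause;
Where possible, a repair measure can also be offered to the user; if a repair plan already exists, the fault can be repaired automatically.
Step 62: the model design node displays the knowledge graph with the alarm root cause.
As shown in Fig. 8, the knowledge graph makes the alarm root cause clear, so that faults can be diagnosed and repaired promptly.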
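Steps 60 to 62 can be sketched by treating the alarm correlation rules as directed (cause, effect) edges, which is also the natural edge list of a knowledge graph. Under that assumption, the root causes among the currently active alarms are those not explained by any other active alarm. The rule format and the alarm identifiers below are hypothetical; the application does not fix a rule representation.

```python
def find_root_causes(rules, active_alarms):
    """Return active alarms that no other active alarm explains.

    rules: iterable of (cause, effect) pairs (hypothetical rule format).
    active_alarms: identifiers of the alarms currently reported.
    """
    active = set(active_alarms)
    explained = {effect for cause, effect in rules
                 if cause in active and effect in active}
    return sorted(active - explained)


rules = [
    ("power:MAINS_FAIL", "power:SWITCH_TRIP"),
    ("power:SWITCH_TRIP", "wireless:LINK_DOWN"),
]
active = {"power:MAINS_FAIL", "power:SWITCH_TRIP",
          "wireless:LINK_DOWN", "core:HIGH_CPU"}
roots = find_root_causes(rules, active)
print(roots)
```

An alarm that matches no rule, such as the unrelated core-network alarm in the example, is reported as its own root cause, which is a reasonable default for alarms the rule base does not yet cover.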
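Referring to Fig. 9, in some embodiments, where step 42 (based on the alarm mining model and the routing relationships, selecting another of the multiple intelligent nodes as the master node) has been performed, before step 200 (based on federated learning, sorting the optimal nodes for scheduling to form a scheduling order) the method further comprises:
Step 70: the model design node sends the alarm mining model to the master node, so that the master node can select the optimal nodes from the multiple intelligent nodes.
It can be seen that, after the alarm mining model is generated at the model design node, the master node selects the optimal nodes according to the alarm mining model, the routing relationships between the intelligent nodes, and the performance and working state of the intelligent nodes.
Referring to Fig. 10, in some embodiments, before the current node among the optimal nodes completes alarm correlation mining according to the scheduling order in step 210, the method further comprises:
Step 80: the master node sends the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the service domain of the first node among the optimal nodes to that first node.
After determining the scheduling order, the master node sends the alarm mining start-up data to the first node among the optimal nodes. The alarm mining model here covers all optimal nodes in the network topology they form; an individual optimal node may perform alarm data mining within the smallest network unit of that topology.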
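Referring to Fig. 11, in some embodiments of the method, in step 210 the sending of the first alarm correlation data, obtained after the current node among the optimal nodes completes alarm correlation mining according to the scheduling order, to the next node among the optimal nodes specifically comprises:
Step 211: the current node among the optimal nodes sends the first alarm correlation data to the master node;
Whether the first alarm correlation data sent from the current node to the next node needs to be relayed by the master node is determined mainly by the network topology of the alarm mining model and the routing relationships between the intelligent nodes. Where relaying through the master node works well, the master node can forward the data between the current node and the next node.
Step 212: the master node sends the first alarm correlation data to the next node among the optimal nodes, where the first alarm correlation data includes the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the service domain of the next node.
The current node performs alarm data mining within its service domain according to the algorithm that the alarm mining model specifies for it, filters out the alarm mining data that does not involve other domains, and then encrypts the alarm mining data for the time period set in the alarm mining model and sends it to the next node. The alarm mining model and the data mining algorithm must of course be sent to the next node as well. The data mining algorithm here is the algorithm used during alarm data mining; it is fixed when the alarm mining model is designed.
Referring to Fig. 12, in some embodiments of the method, the completion of alarm correlation mining by the current node among the optimal nodes according to the scheduling order in step 210 specifically comprises:
Step 215: based on the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the service domain of the next node among the optimal nodes, perform a linear correlation computation in the service domain of the next node.
Linear correlation processing is performed in the node's own service domain; when it is complete, the alarm correlation data is encrypted and sent to the node after that, and so on until it reaches the master node.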
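Referring to Fig. 13, in some embodiments the service domain of an optimal node includes multiple sub-data domains, and before step 80 (the master node sends the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the service domain of the first node among the optimal nodes to that first node) the method further comprises:
Step 90: place a master node in each data domain, place an optimal node in each of the multiple sub-data domains, and place the model design node in the network management system of the service domain;
This concerns the deployment of intelligent nodes within one service domain; the arrangement can be used where one service domain involves multiple data domains.
Correspondingly, step 215 (based on the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the service domain of the next node among the optimal nodes, performing a linear correlation computation in the service domain of the next node) specifically comprises:
Step 216: the next node among the optimal nodes performs alarm correlation mining in each of the multiple sub-data domains, based on the alarm mining model, the data mining algorithm, and the alarm correlation data associated with the service domain of the next node, to obtain sub-data-domain alarm data.
Correspondingly, the next nodes among the optimal nodes are placed in the multiple sub-data domains; after alarm correlation mining, they send the resulting sub-data-domain alarm data to the master node for storage.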
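Fig. 14 shows one example of the alarm correlation detection method provided by the embodiments of the present application. In this example one intelligent node is deployed in a service domain as the model design node. In view of the data volume, a master node and in-domain slave nodes can be placed in each data domain, such as the alarm domain, to perform alarm correlation mining.
This example suits the case where alarm correlation mining targets a single service domain. It mainly mines the alarm correlation data in the current network management system, covering the service domain's alarm, performance, dynamic, and other data domains, and even the sub-data domains within one data domain. For mining on each data domain or sub-data domain, the data models differ from one data domain or sub-data domain to another, and the relationships between the service domain's network management devices determine how the data domains, and the sub-data domains under one data domain, aggregate and relate. Alarm data mining across data domains and sub-data domains therefore uses the horizontal modelling approach of federated learning, so that different data domains can mine alarm data independently and still uncover the alarm correlation data between data domains or between sub-data domains.
One master node may be deployed per service domain, or the deployment may be split according to the number of network management devices in the service domain. The split can begin with one master node per data domain; the data domain is then divided into multiple sub-data domains, and slave nodes are deployed according to the data volume of the sub-data domains. As shown in Fig. 14, the data domains of this service domain are the alarm domain, the performance domain, the dynamic domain, and so on, and one master node can be deployed for the alarm domain. The equipment managed in the alarm domain is also measured by the number of network elements and the number of cells they carry: when the network management equipment of the alarm domain manages more than ten thousand cells, or cells on the order of millions, multiple slave nodes are deployed for the alarm data domain, such as in-domain slave node 1, in-domain slave node 2, and so on.
After logging in to the master node of the current wireless domain, the user first obtains the models of the data domains supported by the current network management equipment, namely the alarm model, the performance model, and so on, which constitute the physical resource model of the network management equipment. The user designs the alarm mining model for the sub-data domains of each data domain, and the mining start-up data, such as the sub-data-domain alarm mining model, is then sent to the alarm sub-data domains for alarm data mining. The resulting final alarm correlation data is stored mainly in the master node of the alarm data domain.
The description here again uses the alarm domain but applies equally to the other data domains. The data is cut into alarm sub-data domains according to the number of network elements and the number of cells they carry, and the alarm-domain master node distributes the alarm mining model to the sub-data domains for alarm data mining. For example, slave node 1 in a sub-data domain mines the alarm data of network elements 1 to 1000. Slave node 1 takes the basic alarm information of the alarm mining model (the alarm ID, the alarm cause code, the time range in which the alarm occurred, the minimum alarm association duration, and the topological relationships of the alarm), which is the basic table-building data used when the alarm mining model is designed; configures a linear data mining algorithm, such as the Pearson algorithm or the FP algorithm, and its parameters; performs alarm correlation mining; and returns the mined final alarm correlation data, encrypted, to the alarm-domain master node.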
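Cutting an alarm domain into sub-data domains by network-element count, as in the "network elements 1 to 1000" example, amounts to splitting an index range into fixed-size blocks. The sketch below assumes network elements are numbered from 1 and that every slave node takes the same block size; both are illustrative assumptions.

```python
def partition_elements(total, per_node):
    """Split network elements 1..total into inclusive (first, last) ranges."""
    if total < 0 or per_node <= 0:
        raise ValueError("total must be >= 0 and per_node must be > 0")
    return [(lo, min(lo + per_node - 1, total))
            for lo in range(1, total + 1, per_node)]


# 2500 network elements, at most 1000 per in-domain slave node.
ranges = partition_elements(2500, 1000)
for index, (first, last) in enumerate(ranges, start=1):
    print(f"in-domain slave node {index}: elements {first}-{last}")
```

In practice the split might weight each network element by the number of cells it carries rather than counting elements equally, as the text suggests.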
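The model design node placed in the current network management system of the service domain is mainly responsible for the data design, storage, and display of the alarm mining model and for distributing the sub-domain data models. The intelligent nodes of the sub-data domains mainly mine the correlations of their own data domains.
Fig. 15 shows another example of the alarm correlation detection method provided by the embodiments of the present application. Besides the intelligent nodes of the wireless domain, the network deployment also has intelligent nodes in the power domain, the bearer domain, the core domain, and so on; and besides the network management equipment of the current equipment vendor, the wireless domain may also contain wireless domains of other vendors. This example is described mainly in terms of the wireless domain. The model design node is the intelligent node the user logs in to.
When the user logs in to the model design node, it obtains the device models and model data of the network management equipment of each service domain, along with information such as the IP addresses of the servers on which each domain's network management software runs. According to the routing relationships of the intelligent nodes, the node in the wireless domain whose network connections to the master nodes of the other service domains are the most stable and whose links are the fastest is chosen as the wireless-domain master node. The aim is stable transfer of alarm correlation data and lower network transmission delay during federated learning. After the power-domain master node completes alarm correlation mining, it sends the first alarm correlation data to the bearer-domain master node. The master node can also dynamically maintain the network conditions between the slave nodes in its own service domain and the activity state of each slave node.
The previous example focused on how scheduling works once the intelligent nodes are deployed; this example uses the alarm-domain model to illustrate how data is passed in vertical federated mining and how the scheduling order of the optimal nodes is processed.
After logging in to an intelligent node of the wireless domain, the user obtains the base-station alarm data model of that service domain and designs the alarm mining model: the alarm ID, alarm cause code, time range in which the alarm occurred, minimum alarm association duration, and network topology involved in the alarm model serve as the basic data, and a linear data mining algorithm (the Pearson algorithm and the FP algorithm) and its parameters are configured. The mining start-up data, such as the alarm mining model, is encrypted and sent to the power domain. The power domain splits the alarms into one-hour units according to the time range and then performs alarm correlation mining with a period of 5 to 6 minutes as the minimum association duration. When mining is complete, it selects the first alarm correlation data related to the wireless network and distributes it to the wireless-domain master node together with device attribute information such as the topological relationship between the power domain and the wireless domain and the GPS coordinates and physical addresses of the power domain. The optimal node of the wireless domain performs a second round of alarm correlation mining according to the relationship between the power domain and the wireless domain contained in the first alarm correlation data; the mining algorithm may be an existing data mining algorithm. This yields the alarm correlation data involving both the power domain and the wireless domain.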
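The second mining round in the wireless domain can be sketched as a windowed co-occurrence count: for each upstream power-domain alarm, count the local wireless alarms at the same site that follow within the minimum association duration. The six-minute window reflects the 5 to 6 minute figure in the example; the tuple layout, the alarm codes, and the plain counting (instead of Pearson or FP-Growth) are simplifying assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta


def cooccurrence(upstream, local, max_gap=timedelta(minutes=6)):
    """Count (upstream_code, local_code) pairs at one site within max_gap.

    Each alarm is a (site, code, time) tuple (hypothetical layout). A local
    alarm counts only if it occurs at or after the upstream alarm.
    """
    counts = Counter()
    for site, up_code, up_time in upstream:
        for l_site, l_code, l_time in local:
            if l_site == site and timedelta(0) <= l_time - up_time <= max_gap:
                counts[(up_code, l_code)] += 1
    return counts


day = datetime(2021, 1, 1)
power = [("A", "MAINS_FAIL", day.replace(hour=10)),
         ("A", "MAINS_FAIL", day.replace(hour=11))]
wireless = [("A", "LINK_DOWN", day.replace(hour=10, minute=3)),
            ("A", "LINK_DOWN", day.replace(hour=11, minute=5)),
            ("A", "LINK_DOWN", day.replace(hour=12, minute=30)),  # too late
            ("B", "LINK_DOWN", day.replace(hour=10, minute=2))]   # other site
counts = cooccurrence(power, wireless)
print(counts)
```

Pairs with a high count relative to the number of upstream alarms are candidates for a cross-domain rule such as "mains failure leads to link down at the same site".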
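The above analysis shows that the alarm correlation detection method based on federated learning provided by the embodiments of the present application applies to a scenario in which multiple intelligent nodes forming a topology are deployed in a communication network and distributed across the service domains. When alarm correlation detection is performed, the optimal nodes are selected from the multiple intelligent nodes so that they take part in the detection. Each optimal node completes alarm correlation mining in its own service domain based on federated learning; the final alarm correlation data of this detection run is obtained, and alarm correlation rules are generated from it. The method is thus used to diagnose correlated faults caused by equipment failures across service domains. Through federated learning it achieves cross-domain alarm correlation mining while ensuring data security, locates alarm root causes quickly, and improves alarm detection efficiency.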
实施例二
参照图17所示,为本申请实施例提供的一种基于联邦学习的告警关联检测系统1的结构示意图。该基于联邦学习的告警关联检测系统1,包括在通信网络中布置形成拓扑结构的多个智能节点40,多个智能节点40分布在各个业务域中,该基于联邦学习的告警关联检测系统1还包括:
节点选取模块10,被设置为从多个智能节点中选取最优节点;
需要说明的是这里的智能节点可以是原有的已经设置在通信网络中的,也可以是为了实施本申请实施例提供的告警关联检测方法而新设置的智能节点,这些多个智能节点采用拓扑结构连接,形成一个大的智能节点群,因此智能节点在实现数据处理后也可以实现智能节点之间的数据传输。智能节点的数量众多分布在各个业务域中,智能节点存在的目的是利用这些智能节点实现跨业务域的告警关联检测。
从多个智能节点中选取最优节点的目的是采用选出的最优节点进行告警关联挖掘,考虑到告警发生所涉及的时间问题,在每进行一次告警关联检测方法时均需要根据目前告警关联所涉及的智能节点所在的拓扑结构、各个智能节点之间的路由关系以及各个智能节点的运行状态重新从多个智能节点中选取最优节点。
挖掘模块20,被设置为基于联邦学习最优节点在各自的业务域中完成告警关联挖掘,得到最终告警关联关系数据;
联邦学习作为分布式的机器学习范式,可以有效解决数据孤岛问题,让参与方在不共享数据的基础上联合建模,能从技术上打破数据孤岛,实现协作。
最优节点在各自的业务域中基于联邦学习完成告警关联挖掘,对多个业务域进行综合的告警关联数据挖掘,得到涉及各个业务域的告警关联关系,从而可以找到引起一个业务域设备故障与其它业务域的设备故障之间的关联关系。
规则生成模块30,被设置为基于最终告警关联关系数据生成告警关联规则。
基于得到的最终告警关联关系数据生成告警关联规则,后续方便运维人员根据这些告警关联规则快速定位告警根因,解决设备故障。
本申请实施例提供的基于联邦学习的告警关联检测方法,需要利用部署在各个业务域中的智能节点集群来完成,在基于联邦学习最优节点在其所在的业务域进行告警关联挖掘,得到最终告警关联关系数据,可以解决跨业务域的告警关联关系挖掘,可以辅助运维人员挖掘出实际网络运行设备故障的告警根因,形成的告警关联规则可以迁移至其它实际运营网络,快速解决实际运营网络的设备故障。
在一些实施例中,本申请实施例提供的告警关联检测系统,挖掘模块20,还被设置为:
基于联邦学习将最优节点进行调度排序,形成调度顺序;
可以根据联邦学习的纵向建模方式将最优节点进行调度排序,形成调度顺序,即调度排序需要考虑告警关联所涉及的各个业务域之间的关联关系以及各个业务域分别进行告警关联挖掘时相互之间的支撑关系。调度排序的目的是决定告警关联数据在各个最优节点之间的数据流传输方向,从而最优节点完成跨域的告警关联挖掘。
按照调度顺序最优节点中的当前节点完成告警关联挖掘后将得到的第一告警关联数据发送至最优节点中的下一节点,以供最优节点中的下一节点完成告警关联挖掘;
前面已经提到形成调度顺序,在调度顺序中排在下一节点前面的最优节点中的当前节点在完成告警关联挖掘后,将得到的第一告警关联数据发送至调度顺序排在当前节点后面的最优节点中的下一节点,以供最优节点中的下一节点完成告警关联挖掘。
最优节点中的当前节点和下一节点是调度顺序中紧挨的当前节点和下一节点比如调度顺序中位于第二位的最优节点和位于第三位的最优节点,调度顺序中的最优节点按照调度顺序依次完成告警关联挖掘。可以看出,最优节点中的下一节点(从调度顺序中位于第二位的最优节点直至调度顺序中的最后一个最优节点)均是在收到当前节点(从调度顺序中位于第一位的最优节点直至调度顺序中的倒数第二位的最优节点)发过来的第一告警关联数据后,结合本地告警数据进行告警关联挖掘。
当前节点对其所在业务域内的数据比如当前业务的设备数据进行一次完整的告警关联挖掘后,将不涉及跨业务域的告警关联数据进行筛除后统一加密发送至下一个从节点。这里直接由当前节点发送至下一节点是考虑到智能节点所在的网络拓扑结构和各个智能节点之间的路由关系决定。
在调度顺序中的最后一个节点完成告警关联挖掘后得到最终告警关联关系数据。
在调度排序中的最后一个节点完成告警关联挖掘后得到最终告警关联关系数据,然后基于最后告警关联数据生成告警关联规则。
参照图18所示,在一些实施例中,多个智能节点包括模型设计节点、主控节点和从节点,本申请实施例提供的告警关联检测系统还包括:
节点选取模块10,还被设置为基于用户的选择确定多个智能节点中的一个作为模型设计节点;
作为用户登录设备的智能节点可以作为模型设计节点,模型设计节点顾名思义是统筹整个告警关联检测的发起和运行,被设置为设计当前告警关联检测所使用的当前智能节点群即选取的最优节点不同。模型设计节点是网络拓扑结构中的一个节点,也是当前进行告警关联检测所使用的当前智能节点群中的一个节点。
模型确定模块50,被设置为确定告警挖掘模型;以及,
可以看出,用户登录的智能节点不同,当前告警关联检测所使用的当前智能节点不同即选取的最优节点不同,那么告警挖掘模型亦不相同,这里的告警挖掘模型是告警关联检测所选取 的最优节点所形成的网络拓扑结构。
节点选取模块10,还被设置为基于告警挖掘模型和路由关系选取多个智能节点中的另一个作为主控节点,以及多个智能节点中的其它智能节点作为从节点;
在确定模型设计节点和告警挖掘模型之后,基于最优节点所形成的网络拓扑结构和路由关系选取多个智能节点中的另一个作为主控节点,这里的主控节点区别于模型设计节点,主控节点起到中间协调处理实现调度顺序中的最优节点完成告警关联检测,得到最终告警关联关系数据,并且最终告警关联关系数据在主控节点生成告警关联规则,存储告警关联规则。主控节点在获取到告警挖掘模型后,按照联邦学习的方式筛选其它智能节点作为从节点,这时更多的是考量智能节点的节点性能,下文将有所描述。然后主控节点将告警挖掘模型、告警挖掘算法、调度顺序等告警挖掘启动数据分发给各个从节点,在告警关联检测时起到中间调度协调的功能。
当用户在哪个智能节点上登陆并且设计告警挖掘模型时,则主要根据模型设计时所涉及的智能节点的路由关系选择一个中心智能节点为主控节点,可以减少主控节点调度时网络传输延迟引起的告警关联挖掘的效率问题。其他连接的从节点则主要利用该从节点的数据计算能力。主控节点可以根据各个从节点的工作状态以及空闲状态来调度,如果出现一个从节点存储有告警关联数据则主控节点优先使用该从节点做本业务域的告警关联挖掘,如没有告警关联数据则该分节点主要承担主控节点分拆的算法运算功能。
可以看出,主控节点根据告警挖掘模型所涉及的智能节点组成的网络拓扑结构,各个智能节点之间的路由关系、各个智能节点的运行状态找到最优节点。然后主控节点根据联邦学习的纵向建模方式将各个最优节点的调度顺序进行排列,完成多智能节点联合挖掘最后将跨域挖掘出的最终告警关联关系数据返回给主控节点存储,主控节点再发送至模型设计节点,通过模型设计节点呈现给用户。各个智能节点之间的数据传递主要采用加密方式进行传递。如果在调度顺序中的当前最优节点得出的第一告警关联数据中存在与下一个最优节点无关的数据时,则不发送给主控节点,只将与下一个最优节点相关的第一告警关联数据返回给主控节点,然后由主控节点返回给模型设计节点。
另外考虑到告警关联所涉及的时间段,如果时间段过大或者其它原因导致各个最优节点之间传输的告警关联数据过大,可以将该时间段进行切片处理,然后分片分发。这里主要是在时间上进行分片处理,避免告警挖掘模型所选取的时间段过长而导致需要传输的告警挖掘数据传输数据过大的问题。当下一个智能节点接收到该时间切片后的告警关联数据后对该分片分发的时间进行偏差容错处理,保证不能因为时间分片而导致跨分片的关联性丢失。
对应地,节点选取模块10,被设置为:
根据从节点的性能在从节点中选取最优节点。
在确定哪个智能节点为模型设计节点和主控节点后,根据从节点的性能在从节点中选取最优节点,主控节点基于联邦学习对最优节点进行调度排序,形成调度顺序。
通过以上分析可以看出,本申请实施例提供的基于联邦学习的告警关联检测方法,适用于在通信网络中布置形成拓扑结构的多个智能节点,多个智能节点分布在各个业务域中,在进行告警关联检测时,需要从多个智能节点中选取最优节点,使得最优节点参与到告警关联检测中。各个最优节点基于联邦学习在各自的业务域中完成告警关联挖掘,得到此次告警关联检测的最终告警关联关系数据,并且基于最终告警关联关系数据生成告警关联规则。可以看出本申请实施例提供的基于联邦学习的告警关联检测方法用于跨业务域之间设备故障引起的关联故障诊断,通过联邦学习的方式可以保证数据安全的前提下实现跨域告警关联挖掘,快速定位告警根因,提高告警检测效率。
实施例三
为本申请实施例提供的一种通信网络的结构示意图。该通信网络包括如上文所述的基于联邦学习的告警关联检测系统。如图17所示,该基于联邦学习的告警关联检测系统包括:
节点选取模块10,被设置为从多个智能节点中选取最优节点;
需要说明的是这里的智能节点可以是原有的已经设置在通信网络中的,也可以是为了实施本申请实施例提供的告警关联检测方法而新设置的智能节点,这些多个智能节点采用拓扑结构连接,形成一个大的智能节点群,因此智能节点在实现数据处理后也可以实现智能节点之间的 数据传输。智能节点的数量众多分布在各个业务域中,智能节点存在的目的是利用这些智能节点实现跨业务域的告警关联检测。
从多个智能节点中选取最优节点的目的是采用选出的最优节点进行告警关联挖掘,考虑到告警发生所涉及的时间问题,在每进行一次告警关联检测方法时均需要根据目前告警关联所涉及的智能节点所在的拓扑结构、各个智能节点之间的路由关系以及各个智能节点的运行状态重新从多个智能节点中选取最优节点。
挖掘模块20,被设置为基于联邦学习最优节点在各自的业务域中完成告警关联挖掘,得到最终告警关联关系数据;
联邦学习作为分布式的机器学习范式,可以有效解决数据孤岛问题,让参与方在不共享数据的基础上联合建模,能从技术上打破数据孤岛,实现协作。
最优节点在各自的业务域中基于联邦学习完成告警关联挖掘,对多个业务域进行综合的告警关联数据挖掘,得到涉及各个业务域的告警关联关系,从而可以找到引起一个业务域设备故障与其它业务域的设备故障之间的关联关系。
规则生成模块30,被设置为基于最终告警关联关系数据生成告警关联规则。
基于得到的最终告警关联关系数据生成告警关联规则,后续方便运维人员根据这些告警关联规则快速定位告警根因,解决设备故障。
本申请实施例提供的基于联邦学习的告警关联检测方法,需要利用部署在各个业务域中的智能节点集群来完成,在基于联邦学习最优节点在其所在的业务域进行告警关联挖掘,得到最终告警关联关系数据,可以解决跨业务域的告警关联关系挖掘,可以辅助运维人员挖掘出实际网络运行设备故障的告警根因,形成的告警关联规则可以迁移至其它实际运营网络,快速解决实际运营网络的设备故障。
通过以上分析可以看出,本申请实施例提供的基于联邦学习的告警关联检测方法,适用于在通信网络中布置形成拓扑结构的多个智能节点,多个智能节点分布在各个业务域中,在进行告警关联检测时,需要从多个智能节点中选取最优节点,使得最优节点参与到告警关联检测中。各个最优节点基于联邦学习在各自的业务域中完成告警关联挖掘,得到此次告警关联检测的最终告警关联关系数据,并且基于最终告警关联关系数据生成告警关联规则。可以看出本申请实施例提供的基于联邦学习的告警关联检测方法用于跨业务域之间设备故障引起的关联故障诊断,通过联邦学习的方式可以保证数据安全的前提下实现跨域告警关联挖掘,快速定位告警根因,提高告警检测效率。
实施例四
本申请实施例提供的一种存储介质,用于计算机可读存储,所述存储介质存储有一个或者多个程序,所述一个或者多个程序可被一个或者多个处理器执行,以实现如图1至图7,以及图9至图13所示的基于联邦学习的告警关联检测方法的步骤,具体可以执行以下步骤:
步骤10:从多个智能节点中选取最优节点;
需要说明的是这里的智能节点可以是原有的已经设置在通信网络中的,也可以是为了实施本申请实施例提供的告警关联检测方法而新设置的智能节点,这些多个智能节点采用拓扑结构连接,形成一个大的智能节点群,因此智能节点在实现数据处理后也可以实现智能节点之间的数据传输。智能节点的数量众多分布在各个业务域中,智能节点存在的目的是利用这些智能节点实现跨业务域的告警关联检测。
从多个智能节点中选取最优节点的目的是采用选出的最优节点进行告警关联挖掘,考虑到告警发生所涉及的时间问题,在每进行一次告警关联检测方法时均需要根据目前告警关联所涉及的智能节点所在的拓扑结构、各个智能节点之间的路由关系以及各个智能节点的运行状态重新从多个智能节点中选取最优节点。
步骤20:基于联邦学习在最优节点在各自的业务域中完成告警关联挖掘后,得到最终告警关联关系数据;
联邦学习作为分布式的机器学习范式,可以有效解决数据孤岛问题,让参与方在不共享数据的基础上联合建模,能从技术上打破数据孤岛,实现协作。
最优节点在各自的业务域中基于联邦学习完成告警关联挖掘,对多个业务域进行综合的告 警关联数据挖掘,得到涉及各个业务域的告警关联关系,从而可以找到引起一个业务域设备故障与其它业务域的设备故障之间的关联关系。
步骤30:基于最终告警关联关系数据生成告警关联规则。
基于得到的最终告警关联关系数据生成告警关联规则,后续方便运维人员根据这些告警关联规则快速定位告警根因,解决设备故障。
本申请实施例提供的基于联邦学习的告警关联检测方法,需要利用部署在各个业务域中的智能节点集群来完成,在基于联邦学习最优节点在其所在的业务域进行告警关联挖掘,得到最终告警关联关系数据,可以解决跨业务域的告警关联关系挖掘,可以辅助运维人员挖掘出实际网络运行设备故障的告警根因,形成的告警关联规则可以迁移至其它实际运营网络,快速解决实际运营网络的设备故障。
通过以上分析可以看出,本申请实施例提供的基于联邦学习的告警关联检测方法,适用于在通信网络中布置形成拓扑结构的多个智能节点,多个智能节点分布在各个业务域中,在进行告警关联检测时,需要从多个智能节点中选取最优节点,使得最优节点参与到告警关联检测中。各个最优节点基于联邦学习在各自的业务域中完成告警关联挖掘,得到此次告警关联检测的最终告警关联关系数据,并且基于最终告警关联关系数据生成告警关联规则。可以看出本申请实施例提供的基于联邦学习的告警关联检测方法用于跨业务域之间设备故障引起的关联故障诊断,通过联邦学习的方式可以保证数据安全的前提下实现跨域告警关联挖掘,快速定位告警根因,提高告警检测效率。
总之,以上所述仅为本说明书的若干实施例而已,并非用于限定本说明书的保护范围。凡在本说明书的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本说明书的保护范围之内。
上述一个或多个实施例阐明的系统、装置、模块或单元,具体可以由计算机芯片或实体实现,或者由具有某种功能的产品来实现。一种实现设备为计算机。具体的,计算机例如可以为个人计算机、膝上型计算机、蜂窝电话、相机电话、智能电话、个人数字助理、媒体播放器、导航设备、电子邮件设备、游戏控制台、平板计算机、可穿戴设备或者这些设备中的任何设备的组合。
计算机可读存储介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。
还需要说明的是,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、商品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、商品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、商品或者设备中还存在另外的相同要素。
本说明书中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于系统实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
上述对本说明书特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下,在权利要求书中记载的动作或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。

Claims (17)

  1. 一种基于联邦学习的告警关联检测方法,适用于在通信网络中布置形成拓扑结构的多个智能节点,所述多个智能节点分布在各个业务域中的场景,所述方法包括:
    从所述多个智能节点中选取最优节点;
    基于联邦学习在所述最优节点在各自的业务域中完成告警关联挖掘后,得到最终告警关联关系数据;
    基于所述最终告警关联关系数据生成告警关联规则。
  2. 如权利要求1所述的告警关联检测方法,基于联邦学习在所述最优节点在各自的业务域中完成告警关联挖掘后,得到最终告警关联关系数据,包括:
    基于所述联邦学习将所述最优节点进行调度排序,形成调度顺序;
    按照所述调度顺序所述最优节点中的当前节点完成告警关联挖掘后将得到的第一告警关联数据发送至所述最优节点中的下一节点,以供所述最优节点中的下一节点完成告警关联挖掘;
    在所述调度顺序中的最后一个节点完成告警关联挖掘后得到所述最终告警关联关系数据。
  3. 如权利要求2所述的告警关联检测方法,基于所述联邦学习将所述最优节点进行调度排序,形成调度顺序,包括:
    基于所述联邦学习的纵向建模方式按照所述各个业务域的关联关系和所述各个业务域之间进行数据挖掘的支撑关系对所述最优节点进行调度排序。
  4. 如权利要求1至3中任一项所述的告警关联检测方法,所述多个智能节点包括模型设计节点、主控节点和从节点,从所述多个智能节点中选取最优节点之前,所述方法还包括:
    基于用户的选择确定所述多个智能节点中的一个作为模型设计节点;
    确定告警挖掘模型;
    基于所述告警挖掘模型和路由关系选取所述多个智能节点中的另一个作为所述主控节点,以及所述多个智能节点中的其它智能节点作为从节点;
    对应地,从所述多个智能节点中选取最优节点,包括:
    根据所述从节点的性能在所述从节点中选取所述最优节点。
  5. 如权利要求4所述的告警关联检测方法,基于所述最终告警关联关系数据生成告警关联规则之后,所述方法还包括:
    所述模型设计节点获取所述告警关联规则;
    所述模型设计节点基于所述告警关联规则形成知识图谱,并且展示所述知识图谱。
  6. 如权利要求5所述的告警关联检测方法,所述通信网络包括网络管理系统,所述模型设计节点基于所述告警关联规则形成知识图谱,并且展示所述知识图谱之后,所述方法还包括:
    所述主控节点收到所述网络管理系统发出的故障信号;
    所述主控节点基于所述告警关联规则进行故障诊断,确定告警根因;
    所述模型设计节点展示带有所述告警根因的所述知识图谱。
  7. 如权利要求4所述的告警关联检测方法,基于所述告警挖掘模型和路由关系选取所述多个智能节点中的另一个作为所述主控节点的条件下,基于所述联邦学习将所述最优节点进行调度排序,形成调度顺序之前,所述方法还包括:
    所述模型设计节点发送所述告警挖掘模型至所述主控节点,以供所述主控节点从所述多个智能节点中选取最优节点。
  8. 如权利要求7所述的告警关联检测方法,按照所述调度顺序所述最优节点中的当前节点完成告警关联挖掘之前,所述方法还包括:
    所述主控节点将所述告警挖掘模型、数据挖掘算法,以及与所述最优节点中的第一个节点所在业务域关联的告警关联数据发送至所述最优节点中的第一个节点。
  9. 如权利要求8所述的告警关联检测方法,按照所述调度顺序所述最优节点中的当前节点完成告警关联挖掘后将得到的第一告警关联数据发送至所述最优节点中的下一节点,包括:
    所述最优节点中的当前节点将所述第一告警关联数据发送至所述主控节点;
    所述主控节点发送所述第一告警关联数据至所述最优节点中的下一节点,其中所述第一告 警关联数据包括所述告警挖掘模型、所述数据挖掘算法,以及与所述最优节点中的下一节点所在业务域关联的告警关联数据。
  10. 如权利要求8所述的告警关联检测方法,按照所述调度顺序所述最优节点中的当前节点完成告警关联挖掘,包括:
    基于所述告警挖掘模型、所述数据挖掘算法,以及与所述最优节点中的下一节点所在业务域关联的告警关联数据在所述最优节点中的下一节点所在的业务域进行线性相关性运算。
  11. 如权利要求10所述的告警关联检测方法,所述最优节点所在的业务域包括多个子数据域,所述主控节点将所述告警挖掘模型、数据挖掘算法,以及与所述最优节点中的第一个节点所在业务域关联的告警关联数据发送至所述最优节点中的第一个节点之前,所述方法还包括:
    将所述主控节点设置在每一个数据域中,所述最优节点分别设置于所述多个子数据域的每一个中,以及将所述模型设计节点设置于所述业务域的网络管理系统中;
    对应地,基于所述告警挖掘模型、所述数据挖掘算法,以及与所述最优节点中的下一节点所在业务域关联的告警关联数据在所述最优节点中的下一节点所在的业务域进行线性相关性运算,包括:
    所述最优节点基于所述告警挖掘模型、所述数据挖掘算法,以及与所述最优节点中的下一节点所在业务域关联的告警关联数据分别在所述多个子数据域中进行告警关联挖掘,得到子数据域告警数据。
  12. 一种基于联邦学习的告警关联检测系统,包括在通信网络中布置形成拓扑结构的多个智能节点,所述多个智能节点分布在各个业务域中,所述系统还包括:
    节点选取模块,被设置为从所述多个智能节点中选取最优节点;
    挖掘模块,被设置为基于联邦学习在所述最优节点在各自的业务域中完成告警关联挖掘后,得到最终告警关联关系数据;
    规则生成模块,被设置为基于所述最终告警关联关系数据生成告警关联规则。
  13. 如权利要求12所述的告警关联检测系统,所述挖掘模块还被设置为:
    基于所述联邦学习将所述最优节点进行调度排序,形成调度顺序;
    按照所述调度顺序所述最优节点中的当前节点完成告警关联挖掘后将得到的第一告警关联数据发送至所述最优节点中的下一节点,以供所述最优节点中的下一节点完成告警关联挖掘;
    在所述调度顺序中的最后一个节点完成告警关联挖掘后得到所述最终告警关联关系数据。
  14. 如权利要求12或13所述的告警关联检测系统,所述多个智能节点包括模型设计节点、主控节点和从节点,所述系统还包括:
    所述节点选取模块,还被设置为基于用户的选择确定所述多个智能节点中的一个作为模型设计节点;
    模型确定模块,被设置为确定告警挖掘模型;以及,
    所述节点选取模块,还被设置为基于所述告警挖掘模型和路由关系选取所述多个智能节点中的另一个作为所述主控节点,以及所述多个智能节点中的其它智能节点作为从节点;
    对应地,所述节点选取模块,被设置为:
    根据所述从节点的性能在所述从节点中选取所述最优节点。
  15. 一种通信网络,包括如权利要求12至14中任一项所述的基于联邦学习的告警关联检测系统。
  16. 一种电子设备,包括:
    处理器;以及
    被安排成存储计算机可执行指令的存储器,所述可执行指令在被执行时使所述处理器执行如权利要求1至11中任一项所述的基于联邦学习的告警关联检测方法的步骤。
  17. 一种存储介质,用于计算机可读存储,所述存储介质存储有一个或者多个程序,所述一个或者多个程序可被一个或者多个处理器执行时,实现如权利要求1至11中任一项所述的基于联邦学习的告警关联检测方法的步骤。
PCT/CN2021/135855 2020-12-31 2021-12-06 基于联邦学习的告警关联检测方法、系统、网络及介质 WO2022143025A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011617624.XA CN113259148B (zh) 2020-12-31 2020-12-31 基于联邦学习的告警关联检测方法、系统及介质
CN202011617624.X 2020-12-31

Publications (1)

Publication Number Publication Date
WO2022143025A1 true WO2022143025A1 (zh) 2022-07-07

Family

ID=77181379

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135855 WO2022143025A1 (zh) 2020-12-31 2021-12-06 基于联邦学习的告警关联检测方法、系统、网络及介质

Country Status (2)

Country Link
CN (1) CN113259148B (zh)
WO (1) WO2022143025A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116308721A (zh) * 2023-05-11 2023-06-23 菏泽市市场监管监测中心 一种信息监督管理方法、装置、电子设备及存储介质
CN117118849A (zh) * 2023-09-29 2023-11-24 江苏首捷智能设备有限公司 一种物联网网关系统及实现方法

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807697B (zh) * 2021-09-17 2023-10-31 中国联合网络通信集团有限公司 基于告警关联的派单方法及装置
CN116866740A (zh) * 2022-03-23 2023-10-10 中兴通讯股份有限公司 基于纵向联邦学习的otn数字孪生网络生成方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180107695A1 (en) * 2016-10-19 2018-04-19 Futurewei Technologies, Inc. Distributed fp-growth with node table for large-scale association rule mining
CN109167695A (zh) * 2018-10-26 2019-01-08 深圳前海微众银行股份有限公司 基于联邦学习的联盟网络构建方法、设备及可读存储介质
CN111666987A (zh) * 2020-05-22 2020-09-15 中国电子科技集团公司电子科学研究院 基于联邦学习的跨域数据安全互联方法及系统
CN111737749A (zh) * 2020-06-28 2020-10-02 南方电网科学研究院有限责任公司 基于联邦学习的计量装置告警预测方法及设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7840515B2 (en) * 2007-02-16 2010-11-23 Panasonic Corporation System architecture and process for automating intelligent surveillance center operations
US20090226162A1 (en) * 2008-03-07 2009-09-10 Jami Cheng Auto-prioritizing service impacted optical fibers in massive collapsed rings network outages
CN104376365B (zh) * 2014-11-28 2018-01-09 国家电网公司 一种基于关联规则挖掘的信息系统运行规则库的构造方法
CN111537945B (zh) * 2020-06-28 2021-05-11 南方电网科学研究院有限责任公司 基于联邦学习的智能电表故障诊断方法及设备

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116308721A (zh) * 2023-05-11 2023-06-23 菏泽市市场监管监测中心 一种信息监督管理方法、装置、电子设备及存储介质
CN116308721B (zh) * 2023-05-11 2023-10-20 菏泽市市场监管监测中心 一种信息监督管理方法、装置、电子设备及存储介质
CN117118849A (zh) * 2023-09-29 2023-11-24 江苏首捷智能设备有限公司 一种物联网网关系统及实现方法
CN117118849B (zh) * 2023-09-29 2024-02-20 江苏首捷智能设备有限公司 一种物联网网关系统及实现方法

Also Published As

Publication number Publication date
CN113259148B (zh) 2022-05-13
CN113259148A (zh) 2021-08-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913766

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.11.23)