WO2021259273A1 - Tree model construction method, apparatus, device and storage medium - Google Patents

Tree model construction method, apparatus, device and storage medium

Info

Publication number
WO2021259273A1
WO2021259273A1 (application PCT/CN2021/101572, CN2021101572W)
Authority
WO
WIPO (PCT)
Prior art keywords
alarm
data
model
alarm data
relationship
Prior art date
Application number
PCT/CN2021/101572
Other languages
English (en)
French (fr)
Inventor
姜磊
申山宏
周波
杜家强
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Priority to JP2022580024A priority Critical patent/JP2023532013A/ja
Priority to EP21830013.5A priority patent/EP4170975A4/en
Publication of WO2021259273A1 publication Critical patent/WO2021259273A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/0636Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis based on a decision tree analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing
    • G06F18/15Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0631Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/065Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis involving logical or physical relationship, e.g. grouping and hierarchies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/142Network analysis or design using statistical or mathematical methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12Discovery or management of network topologies

Definitions

  • This application relates to the field of data analysis and processing, and in particular to a tree model construction method, apparatus, device, and storage medium.
  • The current solution expresses alarm association relationships in a knowledge graph that is iteratively updated.
  • The knowledge graph is commonly updated either by a full update or an incremental update; a full update consumes considerable resources, while an incremental update requires semi-automatic, semi-manual intervention and a relatively large workload.
  • The embodiments of the present application propose a tree model construction method, apparatus, device, and storage medium.
  • An embodiment of the present application provides a tree model construction method that includes the following steps: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing a first feature of each alarm association relationship in the vectorized set; determining a label for each piece of alarm data in the set according to the vectorized set, the normalized first feature, and a classification model, where the classification model is obtained through training; and building a tree model based on the labels of the alarm data in the set.
  • An embodiment of the present application also proposes a tree model construction apparatus, which includes: a processing module configured to preprocess alarm data to obtain a vectorized set of alarm association relationships; the processing module being further configured to normalize a first feature of each alarm association relationship in the vectorized set; a determining module configured to determine a label for each piece of alarm data in the set according to the vectorized set, the normalized first feature, and a classification model, where the classification model is obtained through training; and a building module configured to build a tree model according to the labels of the alarm data in the set.
  • An embodiment of the present application also proposes a device that includes a memory, a processor, a program stored in the memory and runnable on the processor, and a data bus for connecting the processor and the memory; when the program is executed by the processor, the steps of the foregoing method are implemented.
  • This application provides a readable and writable storage medium for computer storage.
  • The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the foregoing method.
  • Fig. 1 is a flowchart of a method for constructing a tree model provided by an embodiment of the present application.
  • Figure 2 is a schematic diagram of one-hot encoding provided by an embodiment of the present application.
  • Fig. 3 is a schematic diagram of a storage method of an alarm association relationship after one-hot encoding provided by an embodiment of the present application.
  • Fig. 4 is a flowchart of a method for determining each alarm data label provided by an embodiment of the present application.
  • Fig. 5 is a schematic diagram of a subtree model provided by an embodiment of the present application.
  • Fig. 6 is a schematic diagram of a subtree model provided by an embodiment of the present application.
  • Fig. 7 is a schematic diagram of a subtree model provided by an embodiment of the present application.
  • Fig. 8 is a schematic diagram of a tree model provided by an embodiment of the present application.
  • Fig. 9 is a schematic structural diagram of a tree model construction device provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of the structure of a device provided by an embodiment of the present application.
  • Words such as "optionally" or "exemplarily" are used to indicate an example, instance, or illustration. Any embodiment or design described as "optional" or "exemplary" in the embodiments of the present application should not be construed as more preferable or advantageous than other embodiments or designs; rather, the use of such words is intended to present related concepts in a concrete manner.
  • Primary-secondary relationship: under the same topology relationship, an optical-port receive link fault causes a Remote Radio Unit (RRU) link break, and under the same topology and service relationship, an RRU link break causes a Long Term Evolution (LTE) cell to go out of service. If an LTE cell out-of-service alarm and an RRU link-break alarm occur within 10 minutes, the LTE cell out-of-service alarm can be taken as the primary alarm and the RRU link break as the child alarm; the association between the two alarms is a primary-secondary relationship.
  • Same-origin relationship: the same fault cause produces multiple associated alarms on the same network element, or on multiple network elements connected within the same specialty, or across multiple specialties. For example, if a network element goes down, the "destination signaling point unreachable" alarms raised by other network elements toward that element can be associated as having the same origin.
  • Parent-child relationship: represents the primary-secondary nature of an alarm association relationship in the tree model.
  • In a primary-secondary relationship, the primary alarm can be taken as the parent alarm while the child alarm remains unchanged; that is, the primary-secondary relationship is converted into a parent-child relationship in the tree model.
  • In a same-origin relationship, the source alarm can be taken as the parent alarm and the other alarms as child alarms, so that the same-origin relationship is converted into a parent-child relationship in the tree model.
  • On the basis of the above concepts, an embodiment of the present application provides a tree model construction method. As shown in FIG. 1, the method includes the following steps:
  • S101: Preprocess the alarm data to obtain a vectorized set of alarm association relationships. A wireless network produces a large number of alarms for various reasons, and individual alarms may provide no reference for the association relationships of other alarms; therefore, the alarm data can be preprocessed by data cleaning.
  • In some examples, one implementation of preprocessing the alarm data is to mine it to obtain a candidate set of alarm association relationships, thereby cleaning the alarm data.
  • The alarm data in the mined candidate set can also be one-hot encoded, and the one-hot-encoded alarm data is then used as input to train an alarm-vector neural network, yielding the vectorized set of alarm association relationships.
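  • For illustration only (this sketch is not part of the original disclosure): one way to mine candidate alarm association pairs, together with their support and confidence, from windowed alarm transactions. The alarm names, window grouping, and thresholds are assumptions.

```python
# Minimal frequent-pair mining over alarm transactions (one transaction per time window).
from itertools import combinations
from collections import Counter

transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"B", "C"},
    {"A", "B", "C"},
]

min_support = 0.5      # assumed threshold
min_confidence = 0.6   # assumed threshold

n = len(transactions)
item_count = Counter(a for t in transactions for a in t)
pair_count = Counter(frozenset(p) for t in transactions for p in combinations(sorted(t), 2))

candidate_set = []
for pair, cnt in pair_count.items():
    support = cnt / n
    if support < min_support:
        continue
    a, b = tuple(pair)
    # Confidence of the rules a -> b and b -> a.
    conf_ab = cnt / item_count[a]
    conf_ba = cnt / item_count[b]
    if max(conf_ab, conf_ba) >= min_confidence:
        candidate_set.append({"pair": (a, b), "support": support,
                              "conf_a_to_b": conf_ab, "conf_b_to_a": conf_ba})

print(candidate_set)
```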
  • In some examples, the Apriori, FP-Growth, or FPMAX algorithm can be used for alarm mining in the above preprocessing, and the alarm-vector neural network can use a network model such as Skip-gram.
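  • For illustration only: a minimal sketch of training a Skip-gram model on alarm "sentences" (alarms raised together in a time window) to obtain alarm vectors whose cosine similarity is later used as a feature. It assumes the gensim library; the alarm names and hyperparameters are hypothetical.

```python
# A minimal Skip-gram sketch (assumptions: gensim is available; alarm names are illustrative).
from gensim.models import Word2Vec

# Each "sentence" is the list of alarm names raised together in one time window.
alarm_sentences = [
    ["optical_port_rx_link_fault", "rru_link_broken", "lte_cell_out_of_service"],
    ["rru_link_broken", "lte_cell_out_of_service"],
    ["destination_signaling_point_unreachable", "ne_down"],
]

# sg=1 selects the Skip-gram architecture mentioned in the text.
model = Word2Vec(sentences=alarm_sentences, vector_size=32, window=5,
                 min_count=1, sg=1, epochs=50, seed=0)

# Cosine similarity between two alarm vectors, later used as a first feature.
sim = model.wv.similarity("rru_link_broken", "lte_cell_out_of_service")
print(f"cosine similarity: {sim:.3f}")
```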
  • In some examples, the first feature in this step may include at least one of: alarm topology relationship, alarm service relationship, alarm level, alarm specialty relationship, alarm cosine similarity, alarm support, and alarm confidence.
  • In some examples, when the first feature includes at least one of the alarm topology relationship, alarm service relationship, alarm level, and alarm specialty relationship, that feature of each alarm association relationship can be normalized according to the actual network structure. For example, in the actual network structure, suppose alarm A and alarm B occur on a Building Base Band Unit (BBU) and a base station respectively, and take the topology relationship as the feature to be normalized: if A is the parent alarm of B and B is not the parent alarm of A, the dimension "A is the parent alarm of B" can be marked as 0 and the dimension "B is not the parent alarm of A" can be marked as 1.
  • Similarly, the alarm specialty relationship, alarm service relationship, and alarm level can also be normalized according to the actual network structure, where the normalized value of the alarm level can be the actual alarm level value.
  • When the first feature includes at least one of alarm support, alarm confidence, and alarm cosine similarity, that feature of each alarm association relationship can be normalized according to a normalization threshold. For example, taking alarm cosine similarity: suppose the cosine similarity of alarm A and alarm B is 0.9 and the cosine-similarity normalization threshold is 0.7; since the cosine similarity is greater than the corresponding threshold, that dimension can be set to 1, and conversely, if the cosine similarity were less than the threshold, the dimension would be set to 0.
  • The cosine similarity between alarm data A and alarm data B can be obtained from the alarm-vector neural network in step S101, and the alarm support and alarm confidence can be obtained during the alarm data mining process.
  • It should be noted that the alarm support, alarm confidence, and alarm cosine similarity may each correspond to their own normalization threshold; for example, the alarm confidence can be normalized according to its corresponding threshold. Of course, the normalization thresholds for different first features may be the same, which is not limited in the embodiments of the present application.
  • It can be understood that when the first feature includes at least one of the alarm topology relationship, alarm service relationship, alarm level, and alarm specialty relationship, together with at least one of alarm support, alarm confidence, and alarm cosine similarity, each corresponding first feature can be normalized using the two methods described above.
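  • For illustration only: a sketch of the two normalization methods described above, with structural features mapped to 0/1 dimensions from the network structure and score features (support, confidence, cosine similarity) thresholded against their own normalization thresholds. The feature layout and threshold values are assumptions.

```python
# Sketch of first-feature normalization for one alarm association relationship (A, B).
# Thresholds and the feature layout are assumptions for illustration.
THRESHOLDS = {"cosine_similarity": 0.7, "support": 0.5, "confidence": 0.6}

def normalize_structural(a_is_parent_of_b: bool, b_is_parent_of_a: bool, alarm_level: int):
    # Topology-direction dimensions are encoded as 0/1 from the actual network structure;
    # the alarm level keeps its actual value, as described in the text.
    return [0 if a_is_parent_of_b else 1,
            0 if b_is_parent_of_a else 1,
            alarm_level]

def normalize_scores(scores: dict):
    # Each score feature is compared with its own normalization threshold.
    return [1 if scores[name] >= THRESHOLDS[name] else 0 for name in THRESHOLDS]

features = normalize_structural(True, False, alarm_level=3) + \
           normalize_scores({"cosine_similarity": 0.9, "support": 0.62, "confidence": 0.71})
print(features)  # [0, 1, 3, 1, 1, 1]
```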
  • S103: Determine the label of each piece of alarm data in the set according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model.
  • The above classification model may be obtained through machine-learning training, for example, by training and optimizing on training data and test data, where the training data and test data can be selected from continuous alarm time periods. Of course, before training the classification model, frequent-set mining can be performed on the training data and test data with a mining algorithm to narrow the data range.
  • In some examples, while training the classification model on the training data and test data, its learning effect can be evaluated; for example, if an evaluation on the test data shows that the classification accuracy of the model is below 80%, the model can be optimized and retrained by adjusting its parameters. The parameters can be adjusted in two ways: redesigning the training data, for example when there is too little training data and the model easily overfits, or redefining the regularization, for example when the training results are good but the accuracy on the test data is low.
  • After the classification model is trained on the training data and test data, it can be combined with the vectorized set of alarm association relationships obtained through step S101 and the normalized first feature obtained through step S102 to perform classification and learning, determining the label of each piece of alarm data in the vectorized set.
  • The label of a piece of alarm data may indicate whether it is a parent node or a child node.
  • The classification model can be a common machine-learning classifier such as logistic regression or an SVM.
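  • For illustration only: a sketch of training and evaluating a logistic-regression classification model on normalized feature vectors, retraining when the accuracy falls below the 80% figure mentioned above. The scikit-learn usage is an assumed choice, and the training data shown are hypothetical.

```python
# Sketch: train a logistic-regression classifier on normalized first-feature vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical normalized feature vectors for alarm pairs and labels
# (1 = first alarm is the parent of the second, 0 = it is not).
X_train = np.array([[0, 1, 3, 1, 1, 1], [1, 0, 2, 1, 0, 1],
                    [1, 1, 1, 0, 0, 0], [0, 1, 4, 1, 1, 0]])
y_train = np.array([1, 0, 0, 1])
X_test, y_test = X_train, y_train  # placeholder; real test data would be separate

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
if accuracy < 0.8:
    # Adjust parameters (e.g. regularization strength C) and retrain, as the text suggests.
    clf = LogisticRegression(C=0.5, max_iter=1000).fit(X_train, y_train)

# The sigmoid output gives the probability that the first alarm is the parent of the second.
parent_prob = clf.predict_proba([[0, 1, 3, 1, 1, 1]])[0, 1]
print(f"P(parent) = {parent_prob:.2f}")
```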
  • S104: Build a tree model according to the labels of the alarm data in the set. After the labels of the alarm data in the vectorized set are obtained through the above steps, that is, after each piece of alarm data is determined to be a parent node or a child node, a tree model can be constructed according to those labels.
  • In some examples, when a knowledge graph (hereinafter, the original tree model) already exists, a subtree model can be constructed in this step according to the labels of the alarm data, and the constructed subtree model can be merged into the original tree model to build a new tree model. In this way, the association relationships between alarms can be looked up conveniently and quickly from the tree model, and the continuous growth of the tree model can be realized dynamically.
  • The embodiments of the present application provide a tree model construction method, specifically: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing the first feature of each alarm association relationship in the vectorized set; determining the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the trained classification model; and building a tree model according to the labels of the alarm data in the set.
  • In this way, the primary-secondary associations between alarms can be expressed through the tree model, the alarm data can be learned and analyzed based on the trained classification model, and the corresponding tree model can be generated dynamically, thereby realizing the continuous growth of the tree.
  • As shown in FIG. 2, when the alarm data is one-hot encoded during the preprocessing in step S101, only the dimension corresponding to the alarm itself may be set to 1 and all remaining positions set to 0; that is, when one-hot encoding is applied over the n types of possibly associated alarms obtained by mining, only the alarm itself is set to 1 and the remaining associated alarms are set to 0.
  • Correspondingly, when the data is saved, for example when the one-hot-encoded alarm association relationships are stored in a database, the storage format can be designed around the position at which the alarm is 1. As shown in FIG. 3, the position of 1 for "LTE cell out of service" in FIG. 2 is 6, so its one-hot code is stored as 6; the position of 1 for "RRU link broken" is 3, so its code is stored as 3; and the position of 1 for "optical-port receive link fault" is 2, so its code is stored as 2.
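  • For illustration only: a sketch of the one-hot encoding and index-based storage just described, using the positions from FIG. 2 and FIG. 3 (6 for "LTE cell out of service", 3 for "RRU link broken", 2 for "optical-port receive link fault") as the assumed layout.

```python
# One-hot encode n alarm types and store each alarm by the index of its 1, as in FIG. 2/FIG. 3.
alarm_positions = {            # assumed positions taken from the figures
    "lte_cell_out_of_service": 6,
    "rru_link_broken": 3,
    "optical_port_rx_link_fault": 2,
}
n = 8  # assumed total number of mined alarm types

def one_hot(alarm: str) -> list:
    vec = [0] * n
    vec[alarm_positions[alarm]] = 1
    return vec

def store_code(alarm: str) -> int:
    # The database only needs the position of the 1, not the whole vector.
    return alarm_positions[alarm]

print(one_hot("rru_link_broken"))    # [0, 0, 0, 1, 0, 0, 0, 0]
print(store_code("rru_link_broken"))  # 3
```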
  • As shown in FIG. 4, in one embodiment, the implementation of determining the label of each piece of alarm data in the vectorized set of alarm association relationships in step S103 may include, but is not limited to, the following steps:
  • S401: Determine the probability and cosine similarity corresponding to each alarm association relationship according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model.
  • The probability in this step indicates, for the corresponding alarm association relationship, the probability that one piece of alarm data is the parent node of the other; the cosine similarity indicates, for the corresponding alarm association relationship, the strength of the correlation between the two pieces of alarm data.
  • For example, vectorized training is carried out with the alarm-vector neural network, the cosine similarity between each pair of alarm data in the vectorized set is calculated, and the sigmoid function of the logistic-regression classification model is used to compute the probability that one piece of alarm data is the parent node of the other.
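  • For illustration only: computing the two quantities of step S401 for a single alarm pair, the cosine similarity from the alarm vectors and the parent probability from the logistic-regression sigmoid. The vectors and weights shown are hypothetical.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))

vec_a = np.array([0.2, 0.9, 0.1])        # hypothetical alarm vectors from the Skip-gram model
vec_b = np.array([0.25, 0.85, 0.05])
features = np.array([0, 1, 3, 1, 1, 1])  # normalized first features of the pair (A, B)
weights = np.array([0.4, -0.2, 0.3, 0.5, 0.6, 0.7])  # hypothetical trained weights
bias = -1.0

similarity = cosine_similarity(vec_a, vec_b)
p_a_parent_of_b = sigmoid(float(np.dot(weights, features) + bias))
print(f"cosine similarity = {similarity:.2f}, P(A is parent of B) = {p_a_parent_of_b:.2f}")
```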
  • S402: When the cosine similarity is greater than or equal to the similarity normalization threshold and the probability of the corresponding alarm association relationship is greater than a preset value, mark the label of one piece of alarm data in the relationship as the parent node and the label of the other as the child node.
  • In some examples, suppose the similarity normalization threshold is 0.5 and the computed cosine similarity between alarm data A and alarm data B is 0.9, which is greater than the threshold; it is then determined that a strong correlation exists between alarm data A and alarm data B. Correspondingly, if the probability that alarm data A is the parent node of alarm data B is 0.83, which is greater than the preset value of 0.45, the label of alarm data A is marked as the parent node and the label of alarm data B as the child node.
  • S403: When the cosine similarity is less than the similarity normalization threshold, mark the labels of both pieces of alarm data in the corresponding alarm association relationship as child nodes.
  • Likewise, suppose the similarity normalization threshold is 0.5 and the cosine similarity between alarm data A and alarm data B computed with the classification model is 0.35, which is less than the threshold; it is then determined that no strong correlation exists between alarm data A and alarm data B, so the labels of both alarm data A and alarm data B can be marked as child nodes. It can be understood that only one parent node exists at the very top of the tree model; since A and B have no strong correlation, they cannot both be that topmost parent, and even if each might be the parent of some other node, each must also be the child of some other node, so both labels can be marked as child nodes.
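  • For illustration only: a small function capturing the labeling rule of steps S402 and S403, using the 0.5 similarity threshold and 0.45 preset value from the examples above; how the reverse direction is handled is an assumption.

```python
# Label a pair (a, b) given P(a is parent of b) and their cosine similarity (S402/S403).
SIM_THRESHOLD = 0.5   # similarity normalization threshold from the example
PRESET_VALUE = 0.45   # preset probability value from the example

def label_pair(p_a_parent_of_b: float, cosine_similarity: float) -> dict:
    if cosine_similarity < SIM_THRESHOLD:
        # No strong correlation: both alarms are labeled as child nodes.
        return {"a": "child", "b": "child"}
    if p_a_parent_of_b > PRESET_VALUE:
        return {"a": "parent", "b": "child"}
    # Otherwise the direction is taken the other way: b is treated as the parent of a.
    return {"a": "child", "b": "parent"}

print(label_pair(0.83, 0.9))   # {'a': 'parent', 'b': 'child'}
print(label_pair(0.35, 0.35))  # {'a': 'child', 'b': 'child'}
```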
  • In one embodiment, building a tree model according to the labels of the alarm data in the set in the above step S104 may involve the following different cases.
  • When the labels of the alarm data do not conflict, a subtree model is constructed according to those labels. For example, suppose that between alarm data A and alarm data B, A is the parent node and B the child node, and between alarm data B and alarm data C, B is the parent node and C the child node; that is, A is the parent of B and B is the parent of C. The subtree model shown in FIG. 5 can then be constructed from the labels of alarm data A, B, and C.
  • When the labels of the alarm data conflict, the labels are updated according to the probabilities of the alarm association relationships to which the alarm data belong, and the subtree model is constructed from the updated labels. For example, suppose that between alarm data A and B, A is the parent node and B the child node; between B and C, B is the parent node and C the child node; and between A and C, C is the parent node and A the child node. The labels of A, B, and C therefore conflict and cannot form a subtree model, so they can be updated according to the probabilities of the corresponding alarm association relationships.
  • Suppose that in the association between A and B the probability that A is the parent of B is 53%, in the association between B and C the probability that B is the parent of C is 85%, and in the association between A and C the probability that C is the parent of A is 75%. The association whose probability is closest to 50%, namely that between A and B, can then be selected, and the probability that B is the parent of A is updated to 47%; between alarm data A and B, the label of A can then be updated to child node and the label of B to parent node. The updated labels are therefore: B is the parent of C, and C is the parent of A, and the subtree model shown in FIG. 6 can be constructed from them.
  • The above examples all assume a strong correlation between the alarm data. If there is no strong correlation between two pieces of alarm data, that is, their cosine similarity is below the similarity normalization threshold, the labels can be updated as follows. For example, suppose that in the association between A and C, alarm data A and C have no strong correlation, while the associations between A and B and between C and B are both strong; the probability that A is the parent of B is 53%, and the probability that B is the parent of C is 85%. Following the approach above, the labels of alarm data A and C, which have no strong correlation, can both be marked as child nodes.
  • Between alarm data A and B, A was the parent node and B the child node, and between B and C, B was the parent node and C the child node. Because the label of A is updated to child node, and the probability that A is the parent of B is 53%, the probability that B is the parent of A is 47%, and the label of B can be updated to parent node; that is, B serves as the parent node and A and C as its child nodes. The subtree model shown in FIG. 7 can then be constructed.
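  • For illustration only: a sketch of building a subtree from labeled parent-child edges and, when the edges form a cycle (a label conflict), reversing the edge whose probability is closest to 50%, as in the example with alarms A, B, and C. The data structures are assumptions, and the result keeps the reversed edge rather than pruning it as FIG. 6 does.

```python
# Build a parent->children mapping from directed edges (parent, child, probability).
# If the edges contain a cycle, reverse the edge whose probability is closest to 50%.
def has_cycle(edges):
    children = {}
    for p, c, _ in edges:
        children.setdefault(p, []).append(c)
    def dfs(node, seen):
        if node in seen:
            return True
        seen = seen | {node}
        return any(dfs(nxt, seen) for nxt in children.get(node, []))
    return any(dfs(p, set()) for p in children)

def build_subtree(edges):
    edges = list(edges)
    for _ in range(len(edges)):          # bounded number of reversals
        if not has_cycle(edges):
            break
        # Reverse the most uncertain edge (probability closest to 0.5).
        i = min(range(len(edges)), key=lambda k: abs(edges[k][2] - 0.5))
        p, c, prob = edges[i]
        edges[i] = (c, p, 1.0 - prob)
    tree = {}
    for p, c, _ in edges:
        tree.setdefault(p, []).append(c)
    return tree

# Example from the text: A->B (0.53), B->C (0.85), C->A (0.75) contains a cycle.
print(build_subtree([("A", "B", 0.53), ("B", "C", 0.85), ("C", "A", 0.75)]))
# {'B': ['A', 'C'], 'C': ['A']}  (FIG. 6 keeps only B -> C -> A)
```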
  • After the subtree model is constructed for the above cases, the tree model can be built from it; that is, the growth of the tree model is realized.
  • Likewise, the following different cases may arise in the process of building the tree model.
  • When the parent node of the subtree model exists in the original tree model, the node in the original tree model that matches the subtree model's parent node can be used as the target node, and the subtree model can be attached to that target node of the original tree model. As shown in FIG. 8, suppose the constructed subtree model is the part in the dashed box and its parent node is "LTE cell out of service"; since the original tree model contains a node identical to this parent node, "LTE cell out of service" in the original tree model is used as the target node and the subtree model is attached to it, realizing the growth of the tree model.
  • In the other case, when the parent node of the subtree model does not exist in the original tree model, the subtree model cannot be attached to it. The labels of the alarm data in the subtree model can then be updated according to the probabilities of the alarm association relationships to which they belong, for example using the label-update method described above, and the subtree model is reconstructed from the updated labels until the parent node of the reconstructed subtree model exists in the original tree model.
  • In this way, the node in the original tree model that matches the parent node of the reconstructed subtree model can be used as the target node, and the reconstructed subtree model can be attached to that target node of the original tree model to achieve the growth of the tree model.
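  • For illustration only: a sketch of attaching a subtree to the original tree model when the subtree's parent node already exists there, as described above. The nested-dict representation and node names are assumptions.

```python
# Trees are nested dicts: {node: {child: {...}, ...}}.
def find_node(tree: dict, name: str):
    # Depth-first search for the target node inside the original tree model.
    for node, children in tree.items():
        if node == name:
            return children
        found = find_node(children, name)
        if found is not None:
            return found
    return None

def attach_subtree(original: dict, subtree: dict) -> bool:
    # The subtree has a single root; attach its children under the matching target node.
    (root, children), = subtree.items()
    target = find_node(original, root)
    if target is None:
        return False  # root not in the original tree: relabel and rebuild the subtree instead
    target.update(children)
    return True

original_tree = {"network_fault": {"lte_cell_out_of_service": {}}}
new_subtree = {"lte_cell_out_of_service": {"rru_link_broken": {"optical_port_rx_link_fault": {}}}}

print(attach_subtree(original_tree, new_subtree))  # True
print(original_tree)
# {'network_fault': {'lte_cell_out_of_service': {'rru_link_broken': {'optical_port_rx_link_fault': {}}}}}
```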
  • FIG. 9 shows a tree model construction apparatus provided by an embodiment of the present application. As shown in FIG. 9, the apparatus includes a processing module 901, a determining module 902, and a building module 903;
  • the processing module is configured to preprocess alarm data to obtain the vectorized set of alarm association relationships;
  • the processing module is further configured to normalize the first feature of each alarm association relationship in the vectorized set;
  • the determining module is configured to determine the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the classification model;
  • the classification model is obtained through training;
  • the first feature may include at least one of the following: alarm topology relationship, alarm service relationship, alarm level, alarm specialty relationship, alarm cosine similarity, alarm support, and alarm confidence;
  • the building module is configured to build a tree model according to the labels of the alarm data in the set.
  • In one embodiment, the above processing module is configured to mine the alarm data to obtain a candidate set of alarm association relationships, perform one-hot encoding on the alarm data in the candidate set, and train the alarm-vector neural network on the one-hot-encoded alarm data to obtain the vectorized set of alarm association relationships.
  • In one embodiment, the normalization performed by the above processing module may be: when the first feature includes at least one of the alarm topology relationship, alarm service relationship, alarm level, and alarm specialty relationship, normalizing that feature of each alarm association relationship according to the actual network structure;
  • and/or, when the first feature includes at least one of alarm support, alarm confidence, and alarm cosine similarity, normalizing that feature of each alarm association relationship according to the normalization threshold;
  • where the alarm support, alarm confidence, and alarm cosine similarity each correspond to their own normalization threshold.
  • In one embodiment, the determining module may be configured to determine the probability and cosine similarity corresponding to each alarm association relationship according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model;
  • where the probability indicates, for the corresponding alarm association relationship, the probability that one piece of alarm data is the parent node of the other;
  • and the cosine similarity indicates, for the corresponding alarm association relationship, the strength of the correlation between the two pieces of alarm data;
  • when the cosine similarity is greater than or equal to the similarity normalization threshold and the probability of the corresponding alarm association relationship is greater than the preset value, the determining module may be configured to mark the label of one piece of alarm data in the relationship as the parent node and the label of the other as the child node; when the cosine similarity is less than the similarity normalization threshold, the determining module may be configured to mark the labels of both pieces of alarm data in the relationship as child nodes.
  • In one embodiment, the above building module may be configured to construct a subtree model from the labels of the alarm data when those labels do not conflict;
  • or, when the labels of the alarm data conflict, to update them according to the probabilities of the alarm association relationships to which the alarm data belong and construct the subtree model from the updated labels; and to build the tree model from the subtree model.
  • In some examples, building the tree model may include, but is not limited to, the following:
  • when the parent node of the subtree model exists in the original tree model, using the node in the original tree model that matches the subtree model's parent node as the target node and attaching the subtree model to that target node;
  • or, when the parent node of the subtree model does not exist in the original tree model, updating the labels of the alarm data in the subtree model according to the probabilities of their alarm association relationships, reconstructing the subtree model from the updated labels until its parent node exists in the original tree model, and then using the node in the original tree model that matches the parent node of the reconstructed subtree model as the target node and attaching the reconstructed subtree model to that target node.
  • The tree model construction apparatus provided in this embodiment is used to implement the tree model construction method of the embodiment shown in FIG. 1; its implementation principles and technical effects are similar and are not repeated here.
  • FIG. 10 is a schematic structural diagram of a device provided by an embodiment. As shown in FIG. 10, the device includes a processor 1001 and a memory 1002. The number of processors 1001 in the device may be one or more; one processor 1001 is taken as an example in FIG. 10. The processor 1001 and the memory 1002 in the device may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 10.
  • As a computer-readable storage medium, the memory 1002 can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the tree model construction method of the embodiment in FIG. 1 of the present application (for example, the processing module 901, the determining module 902, and the building module 903 in the tree model construction apparatus).
  • The processor 1001 implements the above tree model construction method by running the software programs, instructions, and modules stored in the memory 1002.
  • The memory 1002 may mainly include a program storage area and a data storage area.
  • The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created according to the use of a set-top box.
  • In addition, the memory 1002 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
  • The embodiments of the present application also provide a readable and writable storage medium for computer storage.
  • The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to perform a tree model construction method, the method including: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing the first feature of each alarm association relationship in the vectorized set; determining the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and a classification model, where the classification model is obtained through training; and building a tree model according to the labels of the alarm data in the set.
  • The embodiments of the present application provide a tree model construction method, apparatus, device, and storage medium, where the method includes: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing the first feature of each alarm association relationship in the vectorized set; determining the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the trained classification model; and building a tree model according to the labels of the alarm data in the set.
  • In this way, the primary-secondary associations between alarms can be expressed through the tree model, the alarm data can be learned and analyzed based on the trained classification model, and the corresponding tree model can be generated dynamically, thereby realizing the continuous growth of the tree.
  • In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components.
  • Some or all of the physical components can be implemented as software executed by a processor such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit.
  • Such software may be distributed on a computer-readable medium, and the computer-readable medium may include a computer storage medium (or non-transitory medium) and a communication medium (or transitory medium).
  • As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data).
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • In addition, a communication medium usually contains computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transmission mechanism, and may include any information delivery medium.

Abstract

A tree model construction method, apparatus, device, and storage medium. The method includes: preprocessing alarm data to obtain a vectorized set of alarm association relationships (S101); normalizing a first feature of each alarm association relationship in the vectorized set (S102); determining a label for each piece of alarm data in the set according to the vectorized set, the normalized first feature, and a classification model obtained by training (S103); and building a tree model according to the labels of the alarm data in the set (S104).

Description

Tree model construction method, apparatus, device and storage medium
Cross-Reference to Related Application
This application is filed on the basis of, and claims priority to, Chinese patent application No. 202010592638.4 filed on June 24, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data analysis and processing, and in particular to a tree model construction method, apparatus, device, and storage medium.
Background
In the communications field, communication equipment often produces a large number of alarms, some of which are associated with one another; these association relationships can be determined from topology links, service relationships, specialty relationships, alarm levels, and so on. In new network scenarios such as 5th-Generation (5G) mobile communications, the topology and service relationships have changed, and the effectiveness of the traditional approach of using topology links to help determine alarm root-cause relationships drops sharply.
The current solution expresses alarm association relationships in a knowledge graph that is iteratively updated. The knowledge graph is commonly updated either by a full update or an incremental update; a full update consumes considerable resources, while an incremental update requires semi-automatic, semi-manual intervention and a relatively large workload.
Summary
The embodiments of the present application propose a tree model construction method, apparatus, device, and storage medium.
In view of this, an embodiment of the present application provides a tree model construction method that includes the following steps: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing a first feature of each alarm association relationship in the vectorized set; determining a label for each piece of alarm data in the set according to the vectorized set, the normalized first feature, and a classification model, where the classification model is obtained through training; and building a tree model according to the labels of the alarm data in the set.
An embodiment of the present application also proposes a tree model construction apparatus, which includes: a processing module configured to preprocess alarm data to obtain a vectorized set of alarm association relationships; the processing module being further configured to normalize a first feature of each alarm association relationship in the vectorized set; a determining module configured to determine a label for each piece of alarm data in the set according to the vectorized set, the normalized first feature, and a classification model, where the classification model is obtained through training; and a building module configured to build a tree model according to the labels of the alarm data in the set.
An embodiment of the present application also proposes a device that includes a memory, a processor, a program stored in the memory and runnable on the processor, and a data bus for connecting the processor and the memory; when the program is executed by the processor, the steps of the foregoing method are implemented.
The present application provides a readable and writable storage medium for computer storage; the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the foregoing method.
Brief Description of the Drawings
FIG. 1 is a flowchart of a tree model construction method provided by an embodiment of the present application.
FIG. 2 is a schematic diagram of one-hot encoding provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of how one-hot-encoded alarm association relationships are stored, provided by an embodiment of the present application.
FIG. 4 is a flowchart of a method for determining the label of each piece of alarm data, provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of a subtree model provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of a subtree model provided by an embodiment of the present application.
FIG. 7 is a schematic diagram of a subtree model provided by an embodiment of the present application.
FIG. 8 is a schematic diagram of a tree model provided by an embodiment of the present application.
FIG. 9 is a schematic structural diagram of a tree model construction apparatus provided by an embodiment of the present application.
FIG. 10 is a schematic structural diagram of a device provided by an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in detail below with reference to the accompanying drawings. It should be noted that, where there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
In addition, in the embodiments of the present application, words such as "optionally" or "exemplarily" are used to indicate an example, instance, or illustration. Any embodiment or design described as "optional" or "exemplary" in the embodiments of the present application should not be construed as more preferable or advantageous than other embodiments or designs; rather, the use of such words is intended to present related concepts in a concrete manner.
To facilitate understanding of the solutions provided by the embodiments of the present application, explanations of some concepts related to the present application are given below for reference, as follows:
Primary-secondary relationship: under the same topology relationship, an optical-port receive link fault causes a Remote Radio Unit (RRU) link break, and under the same topology and service relationship, an RRU link break causes a Long Term Evolution (LTE) cell to go out of service ("LTE cell out of service" for short). If an LTE cell out-of-service alarm and an RRU link-break alarm occur within 10 minutes, the LTE cell out-of-service alarm can be taken as the primary alarm and the RRU link break as the child alarm; the association between the two alarms is a primary-secondary relationship.
Same-origin relationship: the same fault cause produces multiple associated alarms on the same network element, or on multiple network elements connected within the same specialty, or across multiple specialties. For example, if a network element goes down, the "destination signaling point unreachable" alarms raised by other network elements toward that element can be associated as having the same origin.
Parent-child relationship: represents the primary-secondary nature of an alarm association relationship in the tree model. In a primary-secondary relationship, the primary alarm can be taken as the parent alarm while the child alarm remains unchanged; that is, the primary-secondary relationship is converted into a parent-child relationship in the tree model. In a same-origin relationship, the source alarm can be taken as the parent alarm and the other alarms as child alarms, so that the same-origin relationship is converted into a parent-child relationship in the tree model.
On the basis of the above concepts, an embodiment of the present application provides a tree model construction method. As shown in FIG. 1, the method includes the following steps:
S101: Preprocess the alarm data to obtain a vectorized set of alarm association relationships.
A wireless network produces a large number of alarms for various reasons, and individual alarms may provide no reference for the association relationships of other alarms; therefore, the alarm data can be preprocessed by data cleaning.
In some examples, one implementation of preprocessing the alarm data provided by an embodiment of the present application is to mine the alarm data to obtain a candidate set of alarm association relationships, thereby cleaning the alarm data. In an embodiment of the present application, the alarm data in the mined candidate set can also be one-hot encoded, and the one-hot-encoded alarm data is then used as input to train an alarm-vector neural network, yielding the vectorized set of alarm association relationships.
In some examples, the Apriori, FP-Growth, or FPMAX algorithm can be used for alarm mining in the above preprocessing, and the alarm-vector neural network can use a network model such as Skip-gram.
S102: Normalize the first feature of each alarm association relationship in the vectorized set of alarm association relationships.
In some examples, the first feature in this step may include at least one of: alarm topology relationship, alarm service relationship, alarm level, alarm specialty relationship, alarm cosine similarity, alarm support, and alarm confidence.
In some examples, when the first feature includes at least one of the alarm topology relationship, alarm service relationship, alarm level, and alarm specialty relationship, that feature of each alarm association relationship can be normalized according to the actual network structure. For example, in the actual network structure, suppose alarm A and alarm B occur on a Building Base Band Unit (BBU) and a base station respectively, and take the topology relationship as the feature to be normalized: if A is the parent alarm of B and B is not the parent alarm of A, the dimension "A is the parent alarm of B" can be marked as 0 and the dimension "B is not the parent alarm of A" can be marked as 1.
Similarly, the alarm specialty relationship, alarm service relationship, and alarm level can also be normalized according to the actual network structure, where the normalized value of the alarm level can be the actual alarm level value.
When the first feature includes at least one of alarm support, alarm confidence, and alarm cosine similarity, that feature of each alarm association relationship can be normalized according to a normalization threshold. For example, taking alarm cosine similarity: suppose the cosine similarity of alarm A and alarm B is 0.9 and the cosine-similarity normalization threshold is 0.7; since the cosine similarity is greater than the corresponding threshold, that dimension can be set to 1, and conversely, if the cosine similarity were less than the threshold, the dimension would be set to 0.
The cosine similarity between alarm data A and alarm data B can be obtained from the alarm-vector neural network in step S101, and the alarm support and alarm confidence can be obtained during the alarm data mining process.
It should be noted that the alarm support, alarm confidence, and alarm cosine similarity may each correspond to their own normalization threshold; for example, the alarm confidence can be normalized according to its corresponding threshold. Of course, the normalization thresholds for different first features may be the same, which is not limited in the embodiments of the present application.
It can be understood that when the first feature includes at least one of the alarm topology relationship, alarm service relationship, alarm level, and alarm specialty relationship, together with at least one of alarm support, alarm confidence, and alarm cosine similarity, each corresponding first feature can be normalized using the two methods described above.
S103: Determine the label of each piece of alarm data in the set according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model.
The above classification model may be obtained through machine-learning training, for example, by training and optimizing on training data and test data, where the training data and test data can be selected from continuous alarm time periods. Of course, before training the classification model, frequent-set mining can be performed on the training data and test data with a mining algorithm to narrow the data range.
In some examples, while training the classification model on the training data and test data, its learning effect can be evaluated; for example, if an evaluation on the test data shows that the classification accuracy of the model is below 80%, the model can be optimized and retrained by adjusting its parameters.
In some examples, the parameters can be adjusted in two ways: first, redesigning the training data, for example when there is too little training data and the model easily overfits; second, redefining the regularization, for example when the training results are good but the accuracy on the test data is low.
After the classification model is trained on the training data and test data, it can be combined with the vectorized set of alarm association relationships obtained through step S101 and the normalized first feature obtained through step S102 to perform classification and learning, determining the label of each piece of alarm data in the vectorized set.
The label of a piece of alarm data may indicate whether it is a parent node or a child node.
Note that the classification model can be a common machine-learning classifier such as logistic regression or an SVM.
S104: Build a tree model according to the labels of the alarm data in the set.
After the labels of the alarm data in the vectorized set are obtained through the above steps, that is, after each piece of alarm data is determined to be a parent node or a child node, a tree model can be constructed according to those labels.
In some examples, when a knowledge graph (hereinafter, the original tree model) already exists, a subtree model can be constructed in this step according to the labels of the alarm data, and the constructed subtree model can be merged into the original tree model to build a new tree model. In this way, the association relationships between alarms can be looked up conveniently and quickly from the tree model, and the continuous growth of the tree model can be realized dynamically.
The embodiments of the present application provide a tree model construction method, specifically: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing the first feature of each alarm association relationship in the vectorized set; determining the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the trained classification model; and building a tree model according to the labels of the alarm data in the set. In this way, the primary-secondary associations between alarms can be expressed through the tree model, the alarm data can be learned and analyzed based on the trained classification model, and the corresponding tree model can be generated dynamically, thereby realizing the continuous growth of the tree.
As shown in FIG. 2, in one embodiment, when the alarm data is one-hot encoded during the preprocessing in step S101, only the dimension corresponding to the alarm itself may be set to 1 and all remaining positions set to 0; that is, when one-hot encoding is applied over the n types of possibly associated alarms obtained by mining, only the alarm itself is set to 1 and the remaining associated alarms are set to 0. Correspondingly, when the data is saved, for example when the one-hot-encoded alarm association relationships are stored in a database, the storage format can be designed around the position at which the alarm is 1. As shown in FIG. 3, the position of 1 for "LTE cell out of service" in FIG. 2 is 6, so its one-hot code is 6; the position of 1 for "RRU link broken" is 3, so its one-hot code is 3; and the position of 1 for "optical-port receive link fault" is 2, so its one-hot code is 2.
As shown in FIG. 4, in one embodiment, determining the label of each piece of alarm data in the vectorized set in step S103 may include, but is not limited to, the following steps:
S401: Determine the probability and cosine similarity corresponding to each alarm association relationship according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model.
The probability in this step indicates, for the corresponding alarm association relationship, the probability that one piece of alarm data is the parent node of the other; the cosine similarity indicates, for the corresponding alarm association relationship, the strength of the correlation between the two pieces of alarm data. For example, vectorized training is carried out with the alarm-vector neural network, the cosine similarity between each pair of alarm data in the vectorized set is calculated, and the sigmoid function of the logistic-regression classification model is used to compute the probability that one piece of alarm data is the parent node of the other.
S402: When the cosine similarity is greater than or equal to the similarity normalization threshold and the probability of the corresponding alarm association relationship is greater than a preset value, mark the label of one piece of alarm data in the relationship as the parent node and the label of the other as the child node.
In some examples, suppose the similarity normalization threshold is 0.5 and the computed cosine similarity between alarm data A and alarm data B is 0.9, which is greater than the threshold; it is then determined that a strong correlation exists between alarm data A and alarm data B. Correspondingly, if the probability that alarm data A is the parent node of alarm data B is 0.83, which is greater than the preset value of 0.45, the label of alarm data A is marked as the parent node and the label of alarm data B as the child node.
S403: When the cosine similarity is less than the similarity normalization threshold, mark the labels of both pieces of alarm data in the corresponding alarm association relationship as child nodes.
Likewise, suppose the similarity normalization threshold is 0.5 and the cosine similarity between alarm data A and alarm data B computed with the classification model is 0.35, which is less than the threshold; it is then determined that no strong correlation exists between alarm data A and alarm data B, so the labels of both can be marked as child nodes.
It can be understood that only one parent node exists at the very top of the tree model. Since alarm data A and alarm data B have no strong correlation, they cannot both be the topmost parent node of the tree model; even if A and B might each be the parent of other nodes, they must also be child nodes of other nodes, so the labels of alarm data A and B, which have no strong correlation, can both be marked as child nodes.
In one embodiment, building a tree model according to the labels of the alarm data in the set in the above step S104 may involve the following different cases.
When the labels of the alarm data do not conflict, a subtree model is constructed according to those labels. For example, suppose that between alarm data A and alarm data B, A is the parent node and B the child node, and between alarm data B and alarm data C, B is the parent node and C the child node; that is, A is the parent of B and B is the parent of C. The subtree model shown in FIG. 5 can then be constructed from the labels of alarm data A, B, and C.
When the labels of the alarm data conflict, the labels are updated according to the probabilities of the alarm association relationships to which the alarm data belong, and the subtree model is constructed from the updated labels. For example, suppose that between alarm data A and B, A is the parent node and B the child node; between B and C, B is the parent node and C the child node; and between A and C, C is the parent node and A the child node. The labels of A, B, and C therefore conflict and cannot form a subtree model, so they can be updated according to the probabilities of the corresponding alarm association relationships. Suppose that in the association between A and B the probability that A is the parent of B is 53%, in the association between B and C the probability that B is the parent of C is 85%, and in the association between A and C the probability that C is the parent of A is 75%. The association whose probability is closest to 50%, namely that between A and B, can then be selected, and the probability that B is the parent of A is updated to 47%; between alarm data A and B, the label of A can then be updated to child node and the label of B to parent node. The updated labels are therefore: B is the parent of C, and C is the parent of A, and the subtree model shown in FIG. 6 can be constructed from them.
Of course, the above examples all assume a strong correlation between the alarm data. If there is no strong correlation between two pieces of alarm data, that is, their cosine similarity is below the similarity normalization threshold, the labels can be updated as follows. For example, suppose that in the association between A and C, alarm data A and C have no strong correlation, while the associations between A and B and between C and B are both strong; the probability that A is the parent of B is 53%, and the probability that B is the parent of C is 85%. Following the approach above, the labels of alarm data A and C, which have no strong correlation, can both be marked as child nodes. Between alarm data A and B, A was the parent node and B the child node, and between B and C, B was the parent node and C the child node. Because the label of A is updated to child node, and the probability that A is the parent of B is 53%, the probability that B is the parent of A is 47%, and the label of B can be updated to parent node; that is, B serves as the parent node and A and C as its child nodes. The subtree model shown in FIG. 7 can then be constructed.
After the subtree model is constructed for the above cases, the tree model can be built from it; that is, the growth of the tree model is realized.
Likewise, the following different cases may arise in the process of building the tree model. For example, when the parent node of the subtree model exists in the original tree model, the node in the original tree model that matches the subtree model's parent node can be used as the target node, and the subtree model can be attached to that target node. As shown in FIG. 8, suppose the constructed subtree model is the part in the dashed box and its parent node is "LTE cell out of service"; since the original tree model contains a node identical to this parent node, "LTE cell out of service" in the original tree model is used as the target node and the subtree model in the dashed box is attached to that node, realizing the growth of the tree model.
In the other case, when the parent node of the subtree model does not exist in the original tree model, the subtree model cannot be attached to it. The labels of the alarm data in the subtree model can then be updated according to the probabilities of the alarm association relationships to which they belong, for example using the label-update method described above, and the subtree model is reconstructed from the updated labels until the parent node of the reconstructed subtree model exists in the original tree model. In this way, the node in the original tree model that matches the parent node of the reconstructed subtree model can be used as the target node, and the reconstructed subtree model can be attached to that target node to achieve the growth of the tree model.
FIG. 9 shows a tree model construction apparatus provided by an embodiment of the present application. As shown in FIG. 9, the apparatus includes a processing module 901, a determining module 902, and a building module 903;
the processing module is configured to preprocess alarm data to obtain a vectorized set of alarm association relationships;
the processing module is further configured to normalize the first feature of each alarm association relationship in the vectorized set;
the determining module is configured to determine the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the classification model;
where the classification model is obtained through training;
the first feature may include at least one of the following: alarm topology relationship, alarm service relationship, alarm level, alarm specialty relationship, alarm cosine similarity, alarm support, and alarm confidence;
the building module is configured to build a tree model according to the labels of the alarm data in the set.
In one embodiment, the above processing module is configured to mine the alarm data to obtain a candidate set of alarm association relationships, perform one-hot encoding on the alarm data in the candidate set, and train the alarm-vector neural network on the one-hot-encoded alarm data to obtain the vectorized set of alarm association relationships.
In one embodiment, the normalization performed by the above processing module may be: when the first feature includes at least one of the alarm topology relationship, alarm service relationship, alarm level, and alarm specialty relationship, normalizing that feature of each alarm association relationship according to the actual network structure;
and/or, when the first feature includes at least one of alarm support, alarm confidence, and alarm cosine similarity, normalizing that feature of each alarm association relationship according to the normalization threshold;
where the alarm support, alarm confidence, and alarm cosine similarity each correspond to their own normalization threshold.
In one embodiment, the determining module may be configured to determine the probability and cosine similarity corresponding to each alarm association relationship according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model;
where the probability indicates, for the corresponding alarm association relationship, the probability that one piece of alarm data is the parent node of the other, and the cosine similarity indicates, for the corresponding alarm association relationship, the strength of the correlation between the two pieces of alarm data;
when the cosine similarity is greater than or equal to the similarity normalization threshold and the probability of the corresponding alarm association relationship is greater than the preset value, the determining module may be configured to mark the label of one piece of alarm data in the relationship as the parent node and the label of the other as the child node; when the cosine similarity is less than the similarity normalization threshold, the determining module may be configured to mark the labels of both pieces of alarm data in the relationship as child nodes.
In one embodiment, the above building module may be configured to construct a subtree model from the labels of the alarm data when those labels do not conflict;
or, when the labels of the alarm data conflict, to update them according to the probabilities of the alarm association relationships to which the alarm data belong and construct the subtree model from the updated labels;
and to build the tree model from the subtree model.
In some examples, building the tree model may include, but is not limited to, the following:
when the parent node of the subtree model exists in the original tree model, using the node in the original tree model that matches the subtree model's parent node as the target node and attaching the subtree model to that target node;
or, when the parent node of the subtree model does not exist in the original tree model, updating the labels of the alarm data in the subtree model according to the probabilities of the alarm association relationships to which they belong;
reconstructing the subtree model from the updated labels until the parent node of the reconstructed subtree model exists in the original tree model;
and using the node in the original tree model that matches the parent node of the reconstructed subtree model as the target node and attaching the reconstructed subtree model to that target node.
The tree model construction apparatus provided in this embodiment is used to implement the tree model construction method of the embodiment shown in FIG. 1; its implementation principles and technical effects are similar and are not repeated here.
FIG. 10 is a schematic structural diagram of a device provided by an embodiment. As shown in FIG. 10, the device includes a processor 1001 and a memory 1002; the number of processors 1001 in the device may be one or more, and one processor 1001 is taken as an example in FIG. 10; the processor 1001 and the memory 1002 in the device may be connected by a bus or in other ways, and connection by a bus is taken as an example in FIG. 10.
As a computer-readable storage medium, the memory 1002 can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the tree model construction method of the embodiment in FIG. 1 of the present application (for example, the processing module 901, the determining module 902, and the building module 903 in the tree model construction apparatus). The processor 1001 implements the above tree model construction method by running the software programs, instructions, and modules stored in the memory 1002.
The memory 1002 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of a set-top box, and the like. In addition, the memory 1002 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
An embodiment of the present application also provides a readable and writable storage medium for computer storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to perform a tree model construction method, the method including:
preprocessing alarm data to obtain a vectorized set of alarm association relationships;
normalizing the first feature of each alarm association relationship in the vectorized set;
determining the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the classification model;
where the classification model is obtained through training;
and building a tree model according to the labels of the alarm data in the set.
The embodiments of the present application provide a tree model construction method, apparatus, device, and storage medium, where the method includes: preprocessing alarm data to obtain a vectorized set of alarm association relationships; normalizing the first feature of each alarm association relationship in the vectorized set; determining the label of each piece of alarm data in the set according to the vectorized set, the normalized first feature, and the trained classification model; and building a tree model according to the labels of the alarm data in the set. In this way, the primary-secondary associations between alarms can be expressed through the tree model, the alarm data can be learned and analyzed based on the trained classification model, and the corresponding tree model can be generated dynamically, thereby realizing the continuous growth of the tree.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the devices, may be implemented as software, firmware, hardware, and appropriate combinations thereof.
In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tapes, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media usually contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transmission mechanism, and may include any information delivery medium.
The above describes only some embodiments of the present application with reference to the accompanying drawings and does not thereby limit the scope of rights of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and essence of the present application shall fall within the scope of rights of the present application.

Claims (10)

  1. A tree model construction method, comprising the following steps:
    preprocessing alarm data to obtain a vectorized set of alarm association relationships;
    normalizing a first feature of each alarm association relationship in the vectorized set of alarm association relationships;
    determining a label of each piece of alarm data in the set according to the vectorized set of alarm association relationships, the normalized first feature, and a classification model;
    wherein the classification model is obtained through training; and
    building a tree model according to the labels of the pieces of alarm data in the set.
  2. The method according to claim 1, wherein preprocessing the alarm data to obtain the vectorized set of alarm association relationships comprises:
    mining the alarm data to obtain a candidate set of alarm association relationships;
    performing one-hot encoding on the alarm data in the candidate set of alarm association relationships; and
    training an alarm-vector neural network on the one-hot-encoded alarm data to obtain the vectorized set of alarm association relationships.
  3. The method according to claim 1, wherein the first feature comprises at least one of the following:
    alarm topology relationship, alarm service relationship, alarm level, alarm specialty relationship, alarm cosine similarity, alarm support, and alarm confidence.
  4. The method according to claim 3, wherein normalizing the first feature of each alarm association relationship in the vectorized set of alarm association relationships comprises:
    when the first feature comprises at least one of the alarm topology relationship, the alarm service relationship, the alarm level, and the alarm specialty relationship, normalizing at least one of the alarm topology relationship, the alarm service relationship, the alarm level, and the alarm specialty relationship of each alarm association relationship according to an actual network structure;
    and/or, when the first feature comprises at least one of the alarm support, the alarm confidence, and the alarm cosine similarity, normalizing at least one of the alarm support, the alarm confidence, and the alarm cosine similarity of each alarm association relationship according to a normalization threshold;
    wherein the alarm support, the alarm confidence, and the alarm cosine similarity each correspond to a respective normalization threshold.
  5. The method according to any one of claims 1 to 4, wherein determining the label of each piece of alarm data in the set according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model comprises:
    determining a probability and a cosine similarity corresponding to each alarm association relationship according to the vectorized set of alarm association relationships, the normalized first feature, and the classification model;
    wherein the probability indicates, for the corresponding alarm association relationship, the probability that one piece of alarm data is a parent node of the other piece of alarm data, and the cosine similarity indicates, for the corresponding alarm association relationship, the strength of the correlation between the one piece of alarm data and the other piece of alarm data;
    when the cosine similarity is greater than or equal to a similarity normalization threshold and the probability of the corresponding alarm association relationship is greater than a preset value, marking the label of the one piece of alarm data in the corresponding alarm association relationship as a parent node and marking the label of the other piece of alarm data as a child node; and
    when the cosine similarity is less than the similarity normalization threshold, marking the labels of both the one piece of alarm data and the other piece of alarm data in the corresponding alarm association relationship as child nodes.
  6. The method according to claim 1, wherein building the tree model according to the labels of the pieces of alarm data in the set comprises:
    when the labels of the pieces of alarm data do not conflict, constructing a subtree model according to the labels of the pieces of alarm data;
    or, when the labels of the pieces of alarm data conflict, updating the labels of the pieces of alarm data according to the probabilities of the alarm association relationships to which the pieces of alarm data belong, and constructing a subtree model according to the updated labels of the pieces of alarm data; and
    building the tree model according to the subtree model.
  7. The method according to claim 6, wherein building the tree model according to the subtree model comprises:
    when a parent node of the subtree model exists in an original tree model, using a node in the original tree model that is identical to the parent node of the subtree model as a target node, and attaching the subtree model to the target node of the original tree model;
    or, when the parent node of the subtree model does not exist in the original tree model, updating the labels of the pieces of alarm data in the subtree model according to the probabilities of the alarm association relationships to which the pieces of alarm data in the subtree model belong;
    reconstructing the subtree model according to the updated labels of the pieces of alarm data until the parent node of the reconstructed subtree model exists in the original tree model; and
    using a node in the original tree model that is identical to the parent node of the reconstructed subtree model as a target node, and attaching the reconstructed subtree model to the target node of the original tree model.
  8. A tree model construction apparatus, comprising:
    a processing module configured to preprocess alarm data to obtain a vectorized set of alarm association relationships;
    the processing module being further configured to normalize a first feature of each alarm association relationship in the vectorized set of alarm association relationships;
    a determining module configured to determine a label of each piece of alarm data in the set according to the vectorized set of alarm association relationships, the normalized first feature, and a classification model;
    wherein the classification model is obtained through training; and
    a building module configured to build a tree model according to the labels of the pieces of alarm data in the set.
  9. A device, comprising: a memory, a processor, a program stored in the memory and runnable on the processor, and a data bus for connecting and communicating between the processor and the memory, wherein the program, when executed by the processor, implements the tree model construction method according to any one of claims 1 to 7.
  10. A readable and writable storage medium for computer storage, wherein the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement the tree model construction method according to any one of claims 1 to 7.
PCT/CN2021/101572 2020-06-24 2021-06-22 树模型构建方法、装置、设备和存储介质 WO2021259273A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022580024A JP2023532013A (ja) 2020-06-24 2021-06-22 ツリーモデル構築方法、装置、機器および記憶媒体
EP21830013.5A EP4170975A4 (en) 2020-06-24 2021-06-22 TREE MODEL CONSTRUCTION METHOD, APPARATUS AND DEVICE AND STORAGE MEDIUM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010592638.4 2020-06-24
CN202010592638.4A CN113839802A (zh) 2020-06-24 2020-06-24 树模型构建方法、装置、设备和存储介质

Publications (1)

Publication Number Publication Date
WO2021259273A1 true WO2021259273A1 (zh) 2021-12-30

Family

ID=78965019

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/101572 WO2021259273A1 (zh) 2020-06-24 2021-06-22 树模型构建方法、装置、设备和存储介质

Country Status (4)

Country Link
EP (1) EP4170975A4 (zh)
JP (1) JP2023532013A (zh)
CN (1) CN113839802A (zh)
WO (1) WO2021259273A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115243286A (zh) * 2022-06-20 2022-10-25 中国联合网络通信集团有限公司 一种数据处理方法、装置及存储介质
CN115391151A (zh) * 2022-10-26 2022-11-25 腾云悦智科技(长沙)有限责任公司 一种基于对象关系进行智能发现告警标签的方法

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114760186A (zh) * 2022-03-23 2022-07-15 深信服科技股份有限公司 告警分析方法、装置、电子设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6253339B1 (en) * 1998-10-28 2001-06-26 Telefonaktiebolaget Lm Ericsson (Publ) Alarm correlation in a large communications network
CN102938708A (zh) * 2012-11-05 2013-02-20 国网电力科学研究院 基于告警传播模式的告警相关性分析系统及其分析方法
US20180046934A1 (en) * 2016-08-09 2018-02-15 International Business Machines Corporation Warning filter based on machine learning
CN109951306A (zh) * 2017-12-20 2019-06-28 中国移动通信集团湖北有限公司 告警的处理方法、装置、设备及介质
CN110929951A (zh) * 2019-12-02 2020-03-27 电子科技大学 一种用于电网告警信号的关联分析和预测方法
CN111209131A (zh) * 2019-12-30 2020-05-29 航天信息股份有限公司广州航天软件分公司 一种基于机器学习确定异构系统的故障的方法和系统

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111193627B (zh) * 2019-12-31 2022-08-12 中国移动通信集团江苏有限公司 信息处理方法、装置、设备及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6253339B1 (en) * 1998-10-28 2001-06-26 Telefonaktiebolaget Lm Ericsson (Publ) Alarm correlation in a large communications network
CN102938708A (zh) * 2012-11-05 2013-02-20 国网电力科学研究院 基于告警传播模式的告警相关性分析系统及其分析方法
US20180046934A1 (en) * 2016-08-09 2018-02-15 International Business Machines Corporation Warning filter based on machine learning
CN109951306A (zh) * 2017-12-20 2019-06-28 中国移动通信集团湖北有限公司 告警的处理方法、装置、设备及介质
CN110929951A (zh) * 2019-12-02 2020-03-27 电子科技大学 一种用于电网告警信号的关联分析和预测方法
CN111209131A (zh) * 2019-12-30 2020-05-29 航天信息股份有限公司广州航天软件分公司 一种基于机器学习确定异构系统的故障的方法和系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4170975A4 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115243286A (zh) * 2022-06-20 2022-10-25 中国联合网络通信集团有限公司 一种数据处理方法、装置及存储介质
CN115243286B (zh) * 2022-06-20 2024-05-03 中国联合网络通信集团有限公司 一种数据处理方法、装置及存储介质
CN115391151A (zh) * 2022-10-26 2022-11-25 腾云悦智科技(长沙)有限责任公司 一种基于对象关系进行智能发现告警标签的方法

Also Published As

Publication number Publication date
JP2023532013A (ja) 2023-07-26
EP4170975A4 (en) 2023-11-29
EP4170975A1 (en) 2023-04-26
CN113839802A (zh) 2021-12-24

Similar Documents

Publication Publication Date Title
WO2021259273A1 (zh) 树模型构建方法、装置、设备和存储介质
WO2021057576A1 (zh) 一种构造云化网络告警根因关系树模型方法、装置和存储介质
CN110609759B (zh) 一种故障根因分析的方法及装置
US20160105308A1 (en) System, apparatus and methods for adaptive data transport and optimization of application execution
WO2023045417A1 (zh) 一种故障知识图谱构建方法及装置
CN113632074A (zh) 用于数据库迁移的机器学习预测
EP4307634A1 (en) Feature engineering programming method and apparatus
CN109218080A (zh) 一种自动绘制网络拓扑架构的方法、监控系统及终端设备
US11140167B1 (en) System, method, and computer program for automatically classifying user accounts in a computer network using keys from an identity management system
CN112232524B (zh) 多标签信息的识别方法、装置、电子设备和可读存储介质
US11411835B2 (en) Cognitive model determining alerts generated in a system
WO2022111284A1 (zh) 一种数据标注处理方法、装置、存储介质及电子装置
CN108664607A (zh) 一种基于迁移学习的电力通信网数据质量提升方法
WO2023143570A1 (zh) 一种连接关系预测方法及相关设备
US20230401829A1 (en) Training machine learning models based on unlabeled data
US20230132213A1 (en) Managing bias in federated learning
US20230209367A1 (en) Telecommunications network predictions based on machine learning using aggregated network key performance indicators
CN116094907A (zh) 投诉信息的处理方法、装置及存储介质
US20220222486A1 (en) Data Source Evaluation Platform for Improved Generation of Supervised Learning Models
US20220222568A1 (en) System and Method for Ascertaining Data Labeling Accuracy in Supervised Learning Systems
BR102022001453A2 (pt) Método e sistema para fornecer uma abordagem generalizada para mapeamento de cultura em regiões com várias características
WO2022052199A1 (zh) 数据标注方法、网络设备、终端、系统及存储介质
US20210329436A1 (en) Bluetooth device networking system and method based on ble
CN113609317A (zh) 一种图像库构建方法、装置及电子设备
CN112866120A (zh) 基于分类搜索的sdn流表无环一致性更新方法和系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21830013

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022580024

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021830013

Country of ref document: EP

Effective date: 20230123