CN114765575B

CN114765575B - Network fault cause prediction method and device and electronic equipment

Info

Publication number: CN114765575B
Application number: CN202110001432.4A
Authority: CN
Inventors: 周永庆; 花小磊; 朱琳
Original assignee: China Mobile Communications Group Co Ltd; Research Institute of China Mobile Communication Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; Research Institute of China Mobile Communication Co Ltd
Priority date: 2021-01-04
Filing date: 2021-01-04
Publication date: 2024-06-11
Anticipated expiration: 2041-01-04
Also published as: CN114765575A

Abstract

The present invention provides a method, a device and electronic equipment for predicting the cause of a network fault, addressing the low accuracy of existing network fault cause prediction. The method comprises: obtaining a classification feature vector from a fault work order, the classification feature vector comprising a first-type feature vector and a second-type feature vector; obtaining, according to the first-type feature vector and a first classification prediction model, a target fault cause category to which the fault work order belongs; and obtaining, according to the second-type feature vector and a second classification prediction model corresponding to the target fault cause category, a target fault cause subcategory of the fault work order within the target fault cause category. The invention adopts a two-step prediction method: it first predicts the major category of the fault cause and then predicts the subdivided category within that major category, which reduces the number of categories predicted at each step and improves the accuracy of the prediction result.

Description

A network failure cause prediction method, device and electronic equipment

Technical Field

The present invention relates to the field of artificial intelligence, and in particular to a method, a device and electronic equipment for predicting the cause of a network fault.

Background Art

A network system contains many types of network elements and has a complex structure, so faults inevitably occur while the network is running. After a fault occurs, operation and maintenance personnel must troubleshoot it, find its cause, and take corresponding measures to restore the live network.

Specifically, when a fault occurs on the live network, the network equipment generates alarms and reports them to the network management system. The network management system dispatches a work order to operation and maintenance personnel based on the received alarms and certain dispatching rules. The personnel investigate the cause of the fault using the alarms and other information, then take corresponding measures. After the fault is resolved, they record the fault cause and the measures taken on the corresponding work order.

In existing fault cause prediction schemes, there are many fault cause categories, some of which are quite similar, so predicting the cause directly yields low accuracy.

Summary of the Invention

The purpose of the present invention is to provide a method, a device and electronic equipment for predicting the cause of a network fault, so as to solve the problem that existing network fault cause prediction has low accuracy.

To achieve the above purpose, the present invention provides a network fault cause prediction method, comprising:

obtaining a classification feature vector from a fault work order, the classification feature vector including a first-type feature vector and a second-type feature vector;

obtaining, according to the first-type feature vector and a first classification prediction model, a target fault cause category to which the fault work order belongs;

obtaining, according to the second-type feature vector and a second classification prediction model corresponding to the target fault cause category, a target fault cause subcategory of the fault work order within the target fault cause category.

Wherein obtaining the classification feature vector from the fault work order comprises:

obtaining a pending fault work order, the fields of which include an alarm title, a network element name, a network element type and a fault occurrence time;

extracting the classification feature vector from the fault work order based on a correspondence between the fields of the fault work order and feature vectors and/or a feature extraction model.

Wherein the classification feature vector includes:

a first feature vector characterizing the alarm title;

a second feature vector characterizing the fault cause category corresponding to the alarm title;

a third feature vector characterizing the network element type;

a fourth feature vector characterizing the fault cause category corresponding to the network element type;

a fifth feature vector characterizing the alarm information associated with the fault work order;

a sixth feature vector characterizing the fault cause subcategory corresponding to the alarm title; and a seventh feature vector characterizing the fault cause subcategory corresponding to the network element type;

wherein the first-type feature vector includes the first, second, third, fourth and fifth feature vectors;

and the second-type feature vector includes the first, second, third, fourth, fifth, sixth and seventh feature vectors.
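The composition of the two model inputs can be illustrated with a minimal sketch. The text only states which feature vectors each input "includes"; joining them by simple concatenation, the function name `build_model_inputs`, and the toy vector lengths are assumptions made for illustration.

```python
def build_model_inputs(v1, v2, v3, v4, v5, v6, v7):
    """Assemble the two model inputs from the seven feature vectors.

    Assumption: "includes" is realized as plain concatenation of lists.
    """
    first_type = v1 + v2 + v3 + v4 + v5      # input to the first model
    second_type = first_type + v6 + v7       # input to a second model
    return first_type, second_type


# Toy vectors with made-up lengths, only to show the resulting sizes.
toy_vectors = [[0.0] * n for n in (3, 2, 3, 2, 4, 5, 5)]
first_type, second_type = build_model_inputs(*toy_vectors)
```

Under this reading, the second-type input is a superset of the first-type input, extended by the two subcategory-related vectors.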

Wherein obtaining, according to the first-type feature vector and the first classification prediction model, the target fault cause category to which the fault work order belongs comprises:

classifying the first-type feature vector with the first classification prediction model to obtain a probability value for each fault cause category;

determining the fault cause category with the largest probability value as the target fault cause category to which the fault work order belongs.

Wherein obtaining, according to the second-type feature vector and the second classification prediction model corresponding to the target fault cause category, the target fault cause subcategory of the fault work order within the target fault cause category comprises:

classifying the second-type feature vector with the second classification prediction model to obtain a probability value for each fault cause subcategory of the fault work order within the target fault cause category;

determining the fault cause subcategory with the largest probability value as the target fault cause subcategory.

Wherein the method further comprises:

obtaining a plurality of historical fault work orders and a plurality of pieces of historical alarm information, where the fields of each historical fault work order include an alarm title, a network element name, a network element type, a fault occurrence time, a fault cause category and a fault cause subcategory under that fault cause category, and the fields of each piece of historical alarm information include an alarm title, a network element name and an alarm start time;

obtaining a classification feature vector according to the fields of the historical fault work orders and the fields of the historical alarm information, the classification feature vector including: a first feature vector characterizing the alarm title, a second feature vector characterizing the fault cause category corresponding to the alarm title, a third feature vector characterizing the network element type, a fourth feature vector characterizing the fault cause category corresponding to the network element type, a fifth feature vector characterizing the alarm information associated with the fault work order, a sixth feature vector characterizing the fault cause subcategory corresponding to the alarm title, and a seventh feature vector characterizing the fault cause subcategory corresponding to the network element type;

performing model training according to the first, second, third, fourth and fifth feature vectors and the category labels of the fault cause categories to obtain the first classification prediction model.

Wherein, after obtaining the classification feature vector according to the fields of the historical fault work orders and the fields of the historical alarm information, the method further comprises:

grouping the plurality of historical fault work orders by fault cause category to obtain a plurality of groups of historical fault work order data;

performing model training separately on each group of historical fault work order data, according to the category labels of the fault cause subcategories and the classification feature vectors of that group, to obtain a plurality of second classification prediction models.
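The grouping step and the per-group training can be sketched as follows. This is only an illustration under stated assumptions: the record keys (`features`, `category`, `subcategory`), the function names, and the toy category names are hypothetical, and `majority_trainer` is a deliberately trivial stand-in for whatever classifier is actually trained, which the text does not specify.

```python
from collections import Counter, defaultdict


def train_second_stage_models(records, train_fn):
    """Group historical work orders by fault cause category and train
    one subcategory model per group. `train_fn` is any callable that
    maps a list of (features, subcategory_label) samples to a model."""
    groups = defaultdict(list)
    for record in records:
        groups[record["category"]].append(
            (record["features"], record["subcategory"])
        )
    return {category: train_fn(samples) for category, samples in groups.items()}


def majority_trainer(samples):
    """Placeholder trainer: always predicts the group's most common label."""
    label = Counter(lbl for _, lbl in samples).most_common(1)[0][0]
    return lambda features: label


# Invented toy records, for illustration only.
records = [
    {"features": [1.0], "category": "power", "subcategory": "battery fault"},
    {"features": [2.0], "category": "power", "subcategory": "battery fault"},
    {"features": [3.0], "category": "power", "subcategory": "mains failure"},
    {"features": [4.0], "category": "transmission", "subcategory": "fiber cut"},
]
models = train_second_stage_models(records, majority_trainer)
```

Each resulting model only ever sees, and only ever has to distinguish, the subcategories of its own fault cause category.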

The present invention also provides a network fault cause prediction device, comprising:

a first acquisition module, configured to obtain a classification feature vector from a fault work order, the classification feature vector including a first-type feature vector and a second-type feature vector;

a first fault cause prediction module, configured to obtain, according to the first-type feature vector and a first classification prediction model, a target fault cause category to which the fault work order belongs;

a second fault cause prediction module, configured to obtain, according to the second-type feature vector and a second classification prediction model corresponding to the target fault cause category, a target fault cause subcategory of the fault work order within the target fault cause category.

The present invention also provides electronic equipment comprising a processor and a transceiver, the transceiver receiving and sending data under the control of the processor, and the processor being configured to perform the following operations:

obtaining, according to the second-type feature vector and the second classification prediction model, a target fault cause subcategory of the fault work order within the target fault cause category.

Wherein the processor is further configured to:

a sixth feature vector characterizing the fault cause subcategory corresponding to the alarm title; and

a seventh feature vector characterizing the fault cause subcategory corresponding to the network element type;

Wherein the processor is further configured to:

The present invention also provides electronic equipment comprising a memory, a processor, and a program stored in the memory and executable on the processor; when the processor executes the program, the network fault cause prediction method described above is implemented.

The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the network fault cause prediction method described above.

The above technical solution of the present invention has at least the following beneficial effects:

In the embodiments of the present invention, a classification feature vector is obtained from a fault work order, the classification feature vector including a first-type feature vector and a second-type feature vector; a target fault cause category to which the fault work order belongs is obtained according to the first-type feature vector and a first classification prediction model; and a target fault cause subcategory of the fault work order within the target fault cause category is obtained according to the second-type feature vector and a second classification prediction model corresponding to the target fault cause category. This two-step prediction method, which first predicts the major category of the fault cause and then predicts the subdivided category within that major category, reduces the number of categories predicted at each step and improves the accuracy of the prediction result.

Brief Description of the Drawings

FIG. 1 is a first schematic flowchart of the network fault cause prediction method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the model training process for the first classification prediction model and the second classification prediction model according to an embodiment of the present invention;

FIG. 3 is a second schematic flowchart of the network fault cause prediction method according to an embodiment of the present invention;

FIG. 4 is a schematic module diagram of the network fault cause prediction device according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of the electronic equipment according to an embodiment of the present invention.

Detailed Description

To make the technical problem to be solved, the technical solution and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.

To address the low accuracy of existing network fault cause prediction, the present invention provides a network fault cause prediction method, device and electronic equipment.

FIG. 1 is a schematic flowchart of the network fault cause prediction method provided by an embodiment of the present invention. The method specifically includes:

Step 101: obtaining a classification feature vector from a fault work order, the classification feature vector including a first-type feature vector and a second-type feature vector;

In this step, the fault work order is generated from received alarm information according to preset dispatching rules. It is a new work order, meaning that the fault cause has not yet been recorded on it.

Step 102: obtaining, according to the first-type feature vector and a first classification prediction model, a target fault cause category to which the fault work order belongs;

In this step, the first classification prediction model is a pre-trained model. The first-type feature vector of the fault work order is fed into the first classification prediction model, which outputs the target fault cause category of the fault work order, i.e. the major category of fault cause to which it belongs.

The first classification prediction model is trained on historical fault work orders and alarm data.

Step 103: obtaining, according to the second-type feature vector and a second classification prediction model corresponding to the target fault cause category, a target fault cause subcategory of the fault work order within the target fault cause category.

In this step, the second classification prediction model is a pre-trained model.

Here, the second-type feature vector of the fault work order is fed into the second classification prediction model, which outputs the target fault cause subcategory of the fault work order within the target fault cause category, i.e. the subdivided category of fault cause within the major category.

It should be noted that historical fault work orders contain the fault causes and handling measures recorded by operation and maintenance personnel after faults were resolved, and therefore embody a great deal of operation and maintenance experience. The information in the historical fault work orders is mined to train machine learning models, yielding the first classification prediction model and the second classification prediction model. The fault cause is then predicted automatically from the work order information and alarm information of a newly dispatched order, saving the time that operation and maintenance personnel spend troubleshooting.

In the network fault cause prediction method of the embodiment of the present invention, a classification feature vector is obtained from a fault work order, the classification feature vector including a first-type feature vector and a second-type feature vector; a target fault cause category to which the fault work order belongs is obtained according to the first-type feature vector and the first classification prediction model; and a target fault cause subcategory of the fault work order within the target fault cause category is obtained according to the second-type feature vector and the second classification prediction model corresponding to the target fault cause category. This two-step prediction method, which first predicts the major category of the fault cause and then predicts the subdivided category within that major category, reduces the number of categories predicted at each step and improves the accuracy of the prediction result.

As an optional implementation, step 101 of the embodiment of the present invention may specifically include:

In this step, alarm information is generated after a network fault occurs. Based on the received alarm information, the electronic equipment generates a pending fault work order according to preset dispatching rules.

Generally, the pending fault work order includes, but is not limited to, fields such as alarm title, network element name, network element type and fault occurrence time. It should be noted that at this point the pending fault work order records neither the cause of the network fault nor the measures taken to handle it.

Optionally, the classification feature vector includes:

The first feature vector, the second feature vector and the sixth feature vector are all obtained from the correspondence between the alarm title of the fault work order and feature vectors.

That is, the alarm title has a first correspondence with the feature vector characterizing the alarm title itself, and the first feature vector of the fault work order is obtained through this first correspondence; the alarm title has a second correspondence with the feature vector characterizing the fault cause category corresponding to the alarm title, and the second feature vector is obtained through this second correspondence; the alarm title has a third correspondence with the feature vector characterizing the fault cause subcategory corresponding to the alarm title, and the sixth feature vector is obtained through this third correspondence.

The third feature vector, the fourth feature vector and the seventh feature vector are all obtained from the correspondence between the network element type of the fault work order and feature vectors.

That is, the network element type has a fourth correspondence with the feature vector characterizing the network element type itself, and the third feature vector of the fault work order is obtained through this fourth correspondence; the network element type has a fifth correspondence with the feature vector characterizing the fault cause category corresponding to the network element type, and the fourth feature vector is obtained through this fifth correspondence; the network element type has a sixth correspondence with the feature vector characterizing the fault cause subcategory corresponding to the network element type, and the seventh feature vector is obtained through this sixth correspondence.

It should be pointed out that the fifth feature vector is obtained with feature extraction models. Specifically, the alarm information associated with the fault work order is first extracted, giving m alarms. Each alarm is passed through a first feature extraction model (for example, the CBOW model of word2vec) to obtain a corresponding vector, and the first mean vec_cbow of the m vectors is computed. Each alarm is then passed through a second feature extraction model (for example, the skip-gram model of word2vec) to obtain a corresponding vector, and the second mean vec_sg of the m vectors is computed. Finally, the first mean vec_cbow and the second mean vec_sg are concatenated to obtain the fifth feature vector.
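The averaging and concatenation that produce the fifth feature vector can be sketched as below. The two embedding functions are passed in as parameters; here they are toy lookup tables with invented values, standing in for trained CBOW and skip-gram models (with a library such as gensim, these would typically be two `Word2Vec` models trained with `sg=0` and `sg=1`). The function names and the alarm strings are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    m = len(vectors)
    return [sum(column) / m for column in zip(*vectors)]


def fifth_feature_vector(alarms, embed_cbow, embed_sg):
    """Average the m alarm vectors from each model, then concatenate."""
    vec_cbow = mean_vector([embed_cbow(alarm) for alarm in alarms])
    vec_sg = mean_vector([embed_sg(alarm) for alarm in alarms])
    return vec_cbow + vec_sg  # list concatenation


# Toy lookup tables standing in for the two trained embedding models.
toy_cbow = {"link down": [1.0, 3.0], "power loss": [3.0, 5.0]}
toy_sg = {"link down": [0.0, 2.0], "power loss": [2.0, 4.0]}

v5 = fifth_feature_vector(["link down", "power loss"], toy_cbow.get, toy_sg.get)
```

The sketch assumes at least one associated alarm (m ≥ 1); how a work order with no associated alarms is handled is not stated in the text.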

It should be noted that a piece of alarm information is determined to be associated with the fault work order when it satisfies both a first condition and a second condition.

The first condition is that the alarm start time t₂ of the alarm information lies in the interval (t₁ - t_A, t₁ + t_B), where t₁ is the fault occurrence time of the fault work order, and t_A and t_B are preset time values.

That is, the alarm start time falls between a preset period before and a preset period after the fault occurrence time of the fault work order.

The second condition is that the network element name in the alarm information is the same as the network element name in the fault work order.
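The two association conditions can be expressed as a short predicate. This is a minimal sketch: the dictionary keys (`ne_name`, `fault_time`, `start_time`), the function name, and the sample values of t_A and t_B are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def is_associated(alarm, work_order, t_a, t_b):
    """True when the alarm meets both association conditions."""
    t1 = work_order["fault_time"]
    t2 = alarm["start_time"]
    in_window = t1 - t_a < t2 < t1 + t_b          # first condition (open interval)
    same_element = alarm["ne_name"] == work_order["ne_name"]  # second condition
    return in_window and same_element


work_order = {"ne_name": "NE-1", "fault_time": datetime(2021, 1, 4, 10, 0)}
alarms = [
    {"ne_name": "NE-1", "start_time": datetime(2021, 1, 4, 9, 45)},   # inside window
    {"ne_name": "NE-1", "start_time": datetime(2021, 1, 4, 10, 15)},  # too late
    {"ne_name": "NE-2", "start_time": datetime(2021, 1, 4, 9, 50)},   # other element
]
t_a, t_b = timedelta(minutes=30), timedelta(minutes=10)  # assumed preset values
associated = [a for a in alarms if is_associated(a, work_order, t_a, t_b)]
```

Only the first alarm is kept: the second starts after t₁ + t_B, and the third comes from a different network element.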

Here, word2vec uses an optimized training model to express a word as a vector quickly and effectively based on a given corpus, providing a new tool for applied research in natural language processing. word2vec builds neural word embeddings using either the skip-gram model or the continuous bag-of-words (CBOW) model.

As an optional implementation, step 102 of the method of the embodiment of the present invention, obtaining the target fault cause category to which the fault work order belongs according to the first-type feature vector and the first classification prediction model, may include:

Here, the fault cause categories are the candidate fault cause categories to which the fault work order may belong, i.e. the candidate major categories of fault cause.

It should be noted that classifying the first-type feature vector with the first classification prediction model may yield several possible major categories of fault cause for the fault work order; the most likely one is determined by comparing the probability values.

Through this implementation, the target fault cause category to which the fault work order belongs can be predicted.

As an optional implementation, step 103 of the method of the embodiment of the present invention, obtaining the target fault cause subcategory of the fault work order within the target fault cause category according to the second-type feature vector and the second classification prediction model corresponding to the target fault cause category, includes:

In this step, it should be noted that the second classification prediction model is determined by the target fault cause category to which the fault work order belongs. In other words, different fault cause categories correspond to different second classification prediction models.

This implementation is similar to the determination of the target fault cause category described above: the most likely fault cause subcategory is determined by comparing the probability values.
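The two-step inference described in steps 102 and 103 can be sketched as follows. The models are represented as plain callables that return a mapping from label to probability; the function names, the toy category and subcategory names, and the probability values are all invented for illustration and do not come from the text.

```python
def argmax_label(probabilities):
    """Return the label whose probability value is largest."""
    return max(probabilities, key=probabilities.get)


def predict_fault_cause(first_type_vec, second_type_vec,
                        category_model, subcategory_models):
    """Step 1: pick the major category. Step 2: pick the subcategory
    using the model that corresponds to that category."""
    category = argmax_label(category_model(first_type_vec))
    subcategory_model = subcategory_models[category]
    subcategory = argmax_label(subcategory_model(second_type_vec))
    return category, subcategory


# Toy stand-ins for trained models; the outputs are fixed for the demo.
toy_category_model = lambda vec: {"power": 0.7, "transmission": 0.3}
toy_subcategory_models = {
    "power": lambda vec: {"mains failure": 0.2, "battery fault": 0.8},
    "transmission": lambda vec: {"fiber cut": 0.9, "board fault": 0.1},
}
result = predict_fault_cause([0.0], [0.0],
                             toy_category_model, toy_subcategory_models)
```

Because the second model is selected by the first model's output, each model only has to separate a small number of classes, which is the stated motivation for the two-step design.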

As the above description shows, the accuracy of predicting the fault cause of a fault work order depends chiefly on the classification prediction models. To train good classification prediction models, as an optional implementation, the method of the embodiment of the present invention may further include:

Here, "a plurality of" historical fault work orders and pieces of historical alarm information can be understood as a large number. It should be noted that these historical fault work orders and historical alarm information are valid data, meaning that none of them contains an empty field.

From the large number of historical fault work orders and historical alarm information, records containing empty fields are first removed. Next, regular-expression matching is applied to the alarm titles in the remaining historical fault work orders and historical alarm information to extract the alarm content; for example, from a title of the form "xxx" experiencing a "yyy alarm", the "yyy alarm" part is extracted. Finally, the fault occurrence time and the alarm start time are converted to a unified time format, such as datetime64.
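The three cleaning steps (dropping records with empty fields, extracting the alarm content with a regular expression, unifying the time format) might look like the sketch below. The title layout, the regular expression, the accepted time formats and the field names are all assumptions, since the text does not give them; the sketch also uses the standard library's `datetime` in place of the datetime64 type mentioned in the text.

```python
import re
from datetime import datetime

# Hypothetical title layout: '"<element>" raised "<something> alarm"'.
ALARM_PATTERN = re.compile(r'"([^"]*alarm)"')
# Assumed input formats; real data may use others.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M")


def parse_time(text):
    """Convert a time string in any accepted format to a datetime."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized time format: {text!r}")


def clean_record(record, time_field):
    """Return a cleaned copy of the record, or None if a field is empty."""
    if any(value in (None, "") for value in record.values()):
        return None
    cleaned = dict(record)
    match = ALARM_PATTERN.search(record["alarm_title"])
    if match:  # keep only the alarm content; leave the title as-is otherwise
        cleaned["alarm_title"] = match.group(1)
    cleaned[time_field] = parse_time(record[time_field])
    return cleaned


raw = {"alarm_title": '"BTS-7" raised "link down alarm"',
       "ne_name": "BTS-7", "fault_time": "2021/01/04 10:00"}
cleaned = clean_record(raw, "fault_time")
incomplete = clean_record(
    {"alarm_title": "", "ne_name": "BTS-7", "fault_time": "2021-01-04 10:00:00"},
    "fault_time",
)
```

The same function can be applied to alarm records by passing the name of their time field (for example an assumed `start_time`) as `time_field`.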

本步骤中，根据历史故障工单的告警标题字段，得到第一特征向量、第二特征向量和第六特征向量。In this step, the first feature vector, the second feature vector and the sixth feature vector are obtained according to the alarm title field of the historical fault work order.

具体的，对历史故障工单做如下处理：Specifically, historical fault work orders are processed as follows:

1)对历史故障工单的告警标题进行one-hot编码，得到one-hot向量；将该告警标题与相应的one-hot向量作为字典的key和value存储起来，字典记为dict_1。1) Perform one-hot encoding on the alarm title of the historical fault work order to obtain a one-hot vector; store the alarm title and the corresponding one-hot vector as the key and value of the dictionary, which is recorded as dict_1.

这里的one-hot向量即为用于表征告警标题的第一特征向量。The one-hot vector here is the first feature vector used to characterize the alarm title.
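As a minimal sketch of how a dictionary such as dict_1 might be built, the following maps each distinct alarm title to a one-hot list. The function name, the sample titles and the choice of sorted order for the vector positions are assumptions for illustration; the text does not prescribe them.

```python
def build_one_hot_dict(values):
    """Map each distinct value to a one-hot list; positions follow sorted order."""
    vocabulary = sorted(set(values))
    size = len(vocabulary)
    return {
        value: [1 if position == index else 0 for position in range(size)]
        for index, value in enumerate(vocabulary)
    }

# Hypothetical alarm titles taken from historical fault work orders.
titles = ["link down alarm", "power alarm", "link down alarm", "high temperature alarm"]
dict_1 = build_one_hot_dict(titles)
```

The same helper would serve for the network element type dictionary (dict_3), since only the key field differs.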

2)求得历史故障工单中的告警标题上对应出现的每种故障原因大类的次数，并进行归一化处理，得到告警标题对应的向量；将告警标题和对应的向量作为字典的key和value存储起来，字典记为dict_2。2) Obtain the number of times each major category of fault cause appears in the alarm title in the historical fault work order, and perform normalization to obtain the vector corresponding to the alarm title; store the alarm title and the corresponding vector as the key and value of the dictionary, and the dictionary is recorded as dict_2.

这里，该向量即为用于表征告警标题对应的故障原因类别的第二特征向量。Here, the vector is the second feature vector used to characterize the fault cause category corresponding to the alarm title.

For example, among all the work orders, four work orders contain alarm title A. The alarm titles and major fault cause categories of these four work orders are as follows:

Alarm title A, major fault cause category 1

Alarm title A, major fault cause category 1

Alarm title A, major fault cause category 3

Alarm title A, major fault cause category 4

The vector corresponding to alarm title A is therefore [2/4, 0, 1/4, 1/4, 0, ···], and the dimension of the vector is the number of fault cause categories.
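The counting and normalization can be sketched as below, using the alarm title A example with category 1 accounting for two of the four work orders, as the vector [2/4, 0, 1/4, 1/4, 0, ···] implies. The function name, the extra title "B" and the choice of five categories are assumptions for illustration.

```python
from collections import Counter, defaultdict

def build_distribution_dict(keys, labels, num_classes):
    """Map each key to the normalized frequency of labels 1..num_classes."""
    counts = defaultdict(Counter)
    for key, label in zip(keys, labels):
        counts[key][label] += 1
    return {
        key: [counter[c] / sum(counter.values()) for c in range(1, num_classes + 1)]
        for key, counter in counts.items()
    }

# Four work orders carry title A (categories 1, 1, 3, 4); title B is hypothetical.
titles = ["A", "A", "A", "A", "B"]
major_categories = [1, 1, 3, 4, 2]
dict_2 = build_distribution_dict(titles, major_categories, num_classes=5)
```

Passing network element types as keys, or fault cause subcategories as labels, would give the analogous dictionaries dict_4, dict_5 and dict_6.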

3)对历史故障工单的网元类型进行one-hot编码，得到one-hot向量；将该网元类型与相应的one-hot向量作为字典的key和value存储起来，字典记为dict_3。3) One-hot encode the network element type of the historical fault work order to obtain a one-hot vector; store the network element type and the corresponding one-hot vector as the key and value of the dictionary, and the dictionary is recorded as dict_3.

The one-hot vector here is the third feature vector, used to characterize the network element type.

4)求得历史故障工单中的网元类型上对应出现的每种故障原因大类的次数，并进行归一化处理，得到网元类型对应的向量；将网元类型和其对应的向量作为字典的key和value存储起来，字典记为dict_4。4) Obtain the number of times each major type of fault cause appears on the network element type in the historical fault work order, and perform normalization processing to obtain the vector corresponding to the network element type; store the network element type and its corresponding vector as the key and value of the dictionary, and the dictionary is recorded as dict_4.

Here, this vector is the fourth feature vector, used to characterize the fault cause category corresponding to the network element type.

For example, among all the work orders, four work orders contain network element type A. The network element types and major fault cause categories of these four work orders are as follows:

Network element type A, major fault cause category 1

Network element type A, major fault cause category 1

Network element type A, major fault cause category 3

Network element type A, major fault cause category 4

The vector corresponding to network element type A is therefore [2/4, 0, 1/4, 1/4, 0, ···], and the dimension of the vector is the number of fault cause categories.

5) Obtain the fifth feature vector, which is used to characterize the alarm information associated with the fault work order.

这里的故障工单指的是历史故障工单。The fault tickets here refer to historical fault tickets.

首先，对于每条历史故障工单，提取其关联到的告警信息。First, for each historical fault work order, extract the associated alarm information.

这里，当告警信息满足第一条件和第二条件的情况下，确定该告警信息为该历史故障工单关联到的告警信息。Here, when the alarm information satisfies the first condition and the second condition, it is determined that the alarm information is the alarm information associated with the historical fault work order.

Here, the first condition is that the alarm start time t_2 of the alarm information lies within the interval (t_1 - t_A, t_1 + t_B), where t_1 is the fault occurrence time of the historical fault work order, and t_A and t_B are preset time values.

也就是说，告警信息的告警开始时间在该历史故障工单的故障发生时间的前一段时间和后一段时间之间。That is to say, the alarm start time of the alarm information is between a period of time before and a period of time after the fault occurrence time of the historical fault work order.

第二条件为告警信息中的网元名称与该历史故障工单中的网元名称相同。The second condition is that the network element name in the alarm information is the same as the network element name in the historical fault work order.
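The two association conditions can be sketched as a simple filter. The record layout, the field names and the 15-minute values chosen for t_A and t_B are assumptions for illustration; the text only states that t_A and t_B are preset.

```python
from datetime import datetime, timedelta

def associated_alarms(ticket, alarms, t_a, t_b):
    """Keep alarms on the ticket's network element (second condition) whose
    start time lies strictly inside (t_1 - t_A, t_1 + t_B) (first condition)."""
    t_1 = ticket["fault_time"]
    return [
        alarm for alarm in alarms
        if alarm["ne_name"] == ticket["ne_name"]
        and t_1 - t_a < alarm["start_time"] < t_1 + t_b
    ]

# Hypothetical work order and alarm records.
ticket = {"ne_name": "NE-01", "fault_time": datetime(2021, 1, 4, 8, 30)}
alarms = [
    {"alarm_title": "power alarm", "ne_name": "NE-01", "start_time": datetime(2021, 1, 4, 8, 25)},
    {"alarm_title": "link down alarm", "ne_name": "NE-01", "start_time": datetime(2021, 1, 4, 8, 20)},
    {"alarm_title": "link down alarm", "ne_name": "NE-02", "start_time": datetime(2021, 1, 4, 8, 29)},
    {"alarm_title": "fan alarm", "ne_name": "NE-01", "start_time": datetime(2021, 1, 4, 10, 0)},
]
matched = associated_alarms(ticket, alarms,
                            t_a=timedelta(minutes=15), t_b=timedelta(minutes=15))
```

Here the alarm on NE-02 fails the second condition and the 10:00 alarm fails the first, so only two alarms are associated with the work order.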

之后，将每条历史故障工单关联到的告警信息，按照告警开始时间的先后进行排序，提取每条告警信息的告警标题，将其作为一个词，组成一句告警语句。Afterwards, the alarm information associated with each historical fault work order is sorted in the order of the alarm start time, and the alarm title of each alarm information is extracted and used as a word to form an alarm sentence.

这里，该告警语句用来描述该历史故障工单故障发生的一段时间内网元上产生的一系列有序的告警信息。Here, the alarm statement is used to describe a series of ordered alarm information generated on the network element within a period of time when the historical fault work order fault occurs.
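A minimal sketch of forming the alarm sentence follows: the associated alarms are ordered by start time and each alarm title is treated as one word. The record layout and sample alarms are assumptions for illustration.

```python
from datetime import datetime

def build_alarm_sentence(matched_alarms):
    """Sort associated alarms by start time and return their titles in order."""
    ordered = sorted(matched_alarms, key=lambda alarm: alarm["start_time"])
    return [alarm["alarm_title"] for alarm in ordered]

# Hypothetical alarms already associated with one historical fault work order.
matched = [
    {"alarm_title": "power alarm", "start_time": datetime(2021, 1, 4, 8, 25)},
    {"alarm_title": "link down alarm", "start_time": datetime(2021, 1, 4, 8, 20)},
    {"alarm_title": "fan alarm", "start_time": datetime(2021, 1, 4, 8, 40)},
]
sentence = build_alarm_sentence(matched)
corpus = [sentence]  # one sentence per historical fault work order
```

Keeping each title as a single list element, rather than splitting it on spaces, matches the idea that a whole alarm title is one word of the sentence.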

Then, the alarm sentences of all historical fault work orders are combined into one document that serves as the corpus. Two word2vec models, a CBOW model and a Skip-gram model, are each trained on this corpus, yielding two vector representations for each alarm (that is, for each alarm title used as a word). The trained CBOW model model_cbow and Skip-gram model model_sg are saved.

Finally, for each historical fault work order, the alarm vector of every alarm title in its alarm sentence is looked up, and these alarm vectors are averaged. This lookup and averaging is performed separately with the CBOW model and with the Skip-gram model. The two resulting mean vectors are concatenated to form the feature vector of the alarm information matched to the historical fault work order, that is, the fifth feature vector.
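The averaging and concatenation can be sketched as below. Plain dictionaries of made-up two-dimensional vectors stand in for the trained model_cbow and model_sg lookups; these stand-ins, the function names and the sample values are assumptions for illustration, and the sketch assumes at least one alarm is associated with the work order.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (at least one is required)."""
    size = len(vectors[0])
    return [sum(vector[k] for vector in vectors) / len(vectors) for k in range(size)]

def fifth_feature_vector(sentence, cbow_vectors, sg_vectors):
    """Average the per-title vectors of each model, then concatenate the means."""
    vec_cbow = mean_vector([cbow_vectors[title] for title in sentence])
    vec_sg = mean_vector([sg_vectors[title] for title in sentence])
    return vec_cbow + vec_sg  # list concatenation

# Hypothetical embedding lookups standing in for model_cbow and model_sg.
cbow_vectors = {"link down alarm": [1.0, 0.0], "power alarm": [3.0, 2.0]}
sg_vectors = {"link down alarm": [0.0, 4.0], "power alarm": [2.0, 0.0]}
vec_5 = fifth_feature_vector(["link down alarm", "power alarm"], cbow_vectors, sg_vectors)
```

With d-dimensional embeddings in each model, the resulting fifth feature vector has 2d entries regardless of how many alarms were associated.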

6) Count the number of times each fault cause subcategory (that is, each subdivided fault cause) occurs with each alarm title in the historical fault work orders, and normalize the counts to obtain the vector corresponding to the alarm title; store the alarm title and its vector as the key and value of a dictionary, denoted dict_5.

这里，该向量即为用于表征告警标题对应的故障原因子类别的第六特征向量。Here, the vector is the sixth feature vector used to characterize the fault cause subcategory corresponding to the alarm title.

7) Count the number of times each fault cause subcategory (that is, each subdivided fault cause) occurs with each network element type in the historical fault work orders, and normalize the counts to obtain the vector corresponding to the network element type; store the network element type and its vector as the key and value of a dictionary, denoted dict_6.

Here, this vector is the seventh feature vector, used to characterize the fault cause subcategory corresponding to the network element type.

需要说明的是，可选地，故障原因类别的类别标签通过对历史故障工单的故障原因类别进行数字化编码得到。即用数字1到N标识N类故障原因类别，作为类别标签。It should be noted that, optionally, the category label of the fault cause category is obtained by digitally encoding the fault cause category of the historical fault work order, that is, using numbers 1 to N to identify N types of fault cause categories as category labels.
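The optional digital encoding of category labels can be sketched as follows. The category names and the use of sorted order to assign the numbers 1 to N are assumptions for illustration; the text only requires that N categories be identified by the numbers 1 to N.

```python
def encode_labels(categories):
    """Assign the numbers 1..N to the N distinct category names."""
    names = sorted(set(categories))
    label_of = {name: index for index, name in enumerate(names, start=1)}
    return label_of, [label_of[category] for category in categories]

# Hypothetical fault cause categories of four historical fault work orders.
categories = ["power", "transmission", "power", "hardware"]
label_of, labels = encode_labels(categories)
```

The returned mapping would need to be kept alongside the trained model so that predicted numbers can be turned back into category names.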

Here, the first, second, third, fourth and fifth feature vectors are input into a preset classification model, which outputs a classification result, that is, a category label. The category label in the classification result is compared with the corresponding actual category label, and the parameters of the preset classification model are adjusted iteratively to reduce the difference between the two, until the difference falls within a preset range or the preset classification model converges.

这里，模型训练时所使用的预设分类模型为GBDT模型或者XGBoost模型。Here, the preset classification model used in model training is the GBDT model or the XGBoost model.

这里，GBDT(Gradient Boosting Decision Tree，梯度提升决策树)模型是一个加法模型，它串行地训练一组CART(Classification and Regression Trees，分类与回归树)，最终对所有回归树的预测结果加和，由此得到一个强学习器，每一颗新树都拟合当前损失函数的负梯度方向。Here, the GBDT (Gradient Boosting Decision Tree) model is an additive model that trains a set of CART (Classification and Regression Trees) in series, and finally sums up the prediction results of all regression trees to obtain a strong learner. Each new tree fits the negative gradient direction of the current loss function.
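The additive structure described above can be written compactly as follows. The symbols used here (F_m for the model after m trees, f_m for the m-th tree, L for the loss function and r_{i,m} for the fitting target of sample i) are notation chosen for this sketch and do not appear in the original text.

```latex
% Final model: an initial prediction plus the sum of M serially trained trees
F_M(x) = F_0(x) + \sum_{m=1}^{M} f_m(x)

% Target fitted by the m-th tree: the negative gradient of the loss,
% evaluated at the current model F_{m-1}
r_{i,m} = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}}

% Special case, squared-error loss L = \tfrac{1}{2}\,(y_i - F(x_i))^2:
r_{i,m} = y_i - F_{m-1}(x_i)
```

In the squared-error case the negative gradient is simply the residual, which is why each new tree can be read as correcting the errors left by the trees before it.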

XGBoost(Extreme Gradient Boosting，梯度提升树)模型，同样是串行地生成模型，取所有模型的和为输出。The XGBoost (Extreme Gradient Boosting) model also generates models serially and takes the sum of all models as output.

这里，将训练好的第一分类预测模型model_1进行保存。Here, the trained first classification prediction model model_1 is saved.

进一步地，在根据所述历史故障工单的字段和所述历史告警信息的字段，得到分类特征向量之后，所述方法还包括：Furthermore, after obtaining the classification feature vector according to the fields of the historical fault work order and the fields of the historical alarm information, the method further includes:

这里，将所有的数据，即所有的历史故障工单按照故障原因类别进行分组，得到多组历史故障工单数据。Here, all data, that is, all historical fault work orders are grouped according to the fault cause categories to obtain multiple groups of historical fault work order data.

这里，不同组的历史故障工单数据对应不同的故障原因类别。Here, different groups of historical fault work order data correspond to different fault cause categories.

It should be noted that, assuming there are N fault cause categories A^1, ···, A^N, all work orders in the first group of data have the fault cause category (that is, the major fault cause category) A^1, and the labels of the first group of data are its fault cause subcategories. Assuming that major category A^1 has n_1 fault cause subcategories (that is, subdivided fault causes), the first group of data has n_1 distinct labels in total.
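The grouping described above, together with the per-group label sets, can be sketched as follows. The record layout, the field names (`major_category`, `subcategory`) and the sample values are assumptions for illustration.

```python
from collections import defaultdict

def group_by_major_category(tickets):
    """Split historical fault work orders into one group per major category."""
    groups = defaultdict(list)
    for ticket in tickets:
        groups[ticket["major_category"]].append(ticket)
    return dict(groups)

def subcategory_labels(group):
    """Distinct subcategory labels occurring in one group."""
    return sorted({ticket["subcategory"] for ticket in group})

# Hypothetical historical fault work orders with encoded labels.
tickets = [
    {"id": 1, "major_category": 1, "subcategory": 2},
    {"id": 2, "major_category": 2, "subcategory": 1},
    {"id": 3, "major_category": 1, "subcategory": 1},
    {"id": 4, "major_category": 1, "subcategory": 2},
]
groups = group_by_major_category(tickets)
labels_per_group = {category: subcategory_labels(group) for category, group in groups.items()}
```

Each group would then be used to train its own second-stage model, so a model for one major category only ever has to distinguish that category's subcategories.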

这里，分类特征向量具体指的是第一特征向量、第二特征向量、第三特征向量、第四特征向量、第五特征向量、第六特征向量和第七特征向量。Here, the classification feature vector specifically refers to a first feature vector, a second feature vector, a third feature vector, a fourth feature vector, a fifth feature vector, a sixth feature vector, and a seventh feature vector.

通过每组历史故障工单数据，即每组历史故障工单中的分类特征向量分别训练一个模型，模型训练时所使用的预设分类模型为GBDT模型或者XGBoost模型。A model is trained using each set of historical fault work order data, that is, the classification feature vectors in each set of historical fault work orders. The preset classification model used during model training is the GBDT model or the XGBoost model.

这里，将训练好的N个第二分类预测模型model_2_1到model_2_N进行保存。Here, the trained N second classification prediction models model_2_1 to model_2_N are saved.

这里，第一分类预测模型和第二分类预测模型的具体训练过程可参考图2。Here, the specific training process of the first classification prediction model and the second classification prediction model can refer to Figure 2.

An example is given below, with reference to FIG. 3, to illustrate the implementation of the method of the embodiment of the present invention.

S1：接收到待预测的工单。S1: Receive the work order to be predicted.

这里，该待预测工单为一条新的工单，该待预测工单包括告警标题、网元名称、网元类型和故障发生时间四个字段。Here, the work order to be predicted is a new work order, and the work order to be predicted includes four fields: alarm title, network element name, network element type, and fault occurrence time.

需要说明的是，基于上述四个字段以及工单关联到的告警信息预测该待预测工单的故障原因子类别(即故障原因细分类别)这一字段。It should be noted that the field of the fault cause subcategory (ie, fault cause subdivision category) of the work order to be predicted is predicted based on the above four fields and the alarm information associated with the work order.

S2：提取该工单的一级分类的特征向量。S2: Extract the feature vector of the first-level classification of the work order.

具体的，1)告警标题的one-hot特征：将该工单中的告警标题当做key，在字典dict_1中查询告警标题对应的特征向量vec_1，即第一特征向量。Specifically, 1) one-hot feature of the alarm title: take the alarm title in the work order as the key, and query the feature vector vec_1 corresponding to the alarm title in the dictionary dict_1, that is, the first feature vector.

2)告警标题的故障原因大类分布特征：将该工单中的告警标题当做key，在字典dict_2中查询告警标题对应的特征向量vec_2，即第二特征向量。2) Fault cause distribution characteristics of alarm titles: The alarm title in the work order is used as the key, and the feature vector vec_2 corresponding to the alarm title, that is, the second feature vector, is searched in the dictionary dict_2.

3)网元类型的one-hot特征：将该条工单的网元类型当做key，在字典dict_3中查询网元类型对应的特征向量vec_3，即第三特征向量。3) One-hot feature of network element type: Use the network element type of the work order as the key and query the feature vector vec_3 corresponding to the network element type in the dictionary dict_3, that is, the third feature vector.

4)网元类型的故障原因大类分布特征：将该条工单的网元类型当做key，在字典dict_4中查询网元类型对应的特征向量vec_4，即第四特征向量。4) Distribution characteristics of major fault causes by network element type: The network element type of the work order is used as the key, and the feature vector vec_4 corresponding to the network element type, i.e., the fourth feature vector, is searched in the dictionary dict_4.

5)工单关联到告警的word2vec特征：5) Word2vec features of work orders associated with alarms:

First, the alarm information associated with the work order is extracted, using the same extraction method as in the training phase. Assuming that h alarms are associated, the word2vec CBOW model is used to obtain the vector of each alarm, and the mean vec_cbow of the h vectors is computed. Then, the Skip-gram model is used to obtain the vector of each alarm, and the mean vec_sg of the h vectors is computed. Finally, vec_cbow and vec_sg are concatenated to obtain the feature vector vec_5, that is, the fifth feature vector.

6)告警标题上故障原因细分的分布特征：将该工单中的告警标题当做key，在字典dict_5中查询告警标题对应的特征向量vec_6。6) Distribution characteristics of fault cause segmentation on alarm title: Take the alarm title in the work order as the key and query the feature vector vec_6 corresponding to the alarm title in the dictionary dict_5.

7)网元类型上故障原因细分的分布特征：将该工单中的网元类型当做key，在字典dict_6中查询网元类型对应的特征向量vec_7。7) Distribution characteristics of fault cause segmentation by network element type: The network element type in the work order is used as the key, and the feature vector vec_7 corresponding to the network element type is searched in the dictionary dict_6.

这里，上述vec_1、vec_2、vec_3、vec_4和vec_5属于一级分类的特征向量，即上述实施例中的第一类特征向量。Here, the above vec_1, vec_2, vec_3, vec_4 and vec_5 belong to the feature vectors of the first level classification, that is, the first type of feature vectors in the above embodiment.

S3：将一级分类的特征向量输入到model_1进行分类。S3: Input the feature vector of the first-level classification into model_1 for classification.

这里，将提取到的特征向量vec_1、vec_2、vec_3、vec_4、vec_5进行拼接，然后输入到model_1进行分类，得到属于各个故障原因大类的概率p_1，···，p_N。Here, the extracted feature vectors vec_1, vec_2, vec_3, vec_4, and vec_5 are concatenated and then input into model_1 for classification to obtain the probabilities p_1, ···, p_N belonging to each major category of fault causes.

S4：选取预测结果中概率最高的故障原因大类i。S4: Select the fault cause category i with the highest probability in the prediction results.

其中，预测结果即为属于各个故障原因大类的概率p_1，···，p_N，概率值最大的类别i(1≤i≤N)即为该工单在第一步分类时所属的类别。The prediction result is the probability p_1,...,p_N of belonging to each major category of fault causes. The category i (1≤i≤N) with the largest probability value is the category to which the work order belongs in the first classification step.

S5：提取该工单的二级分类的特征向量。S5: Extract the feature vector of the secondary classification of the work order.

上述vec_1、vec_2、vec_3、vec_4、vec_5、vec_6和vec_7属于二级分类的特征向量，即上述实施例中的第二类特征向量。具体的提取过程详见S2部分的阐述，这里不再赘述。The above vec_1, vec_2, vec_3, vec_4, vec_5, vec_6 and vec_7 belong to the feature vectors of the secondary classification, that is, the second type of feature vectors in the above embodiment. The specific extraction process is detailed in the description of part S2, which will not be repeated here.

S6：将二级分类的特征向量输入到model_2_i进行分类。S6: Input the feature vector of the secondary classification into model_2_i for classification.

这里，将提取到的特征向量vec_1、vec_2、vec_3、vec_4、vec_5、vec_6和vec_7进行拼接，然后输入到model_2_i进行分类，得到属于故障原因大类i中各个故障原因细分的概率 Here, the extracted feature vectors vec_1, vec_2, vec_3, vec_4, vec_5, vec_6 and vec_7 are concatenated and then input into model_2_i for classification to obtain the probability of each fault cause subdivision in the fault cause category i.

S7：选取预测结果中概率最高的故障原因细分j。S7: Select the fault cause segment j with the highest probability in the prediction results.

Here, the prediction result is the set of probabilities that the work order belongs to each subdivided fault cause within major fault cause category i. The category j (1≤j≤n_i) with the largest probability value is the category to which the work order belongs in the second classification step, that is, the final subdivided fault cause category.
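Steps S3 to S7 can be sketched end to end as follows. The trained models are replaced by stand-in functions that return fixed probabilities, and the feature vectors are made-up values; all names other than model_1, vec_1 to vec_7, i and j are assumptions for illustration.

```python
def argmax_1_based(probabilities):
    """Return the 1-based position of the largest probability."""
    return max(range(len(probabilities)), key=lambda k: probabilities[k]) + 1

def concatenate(vectors):
    """Join several feature vectors into one flat list."""
    return [x for vector in vectors for x in vector]

def predict_fault_cause(vecs, model_1, second_stage_models):
    """vecs holds vec_1..vec_7 in order; returns (major category i, subdivision j)."""
    p_major = model_1(concatenate(vecs[:5]))           # S3: vec_1..vec_5 into model_1
    i = argmax_1_based(p_major)                        # S4: most probable major category
    p_sub = second_stage_models[i](concatenate(vecs))  # S6: vec_1..vec_7 into model_2_i
    j = argmax_1_based(p_sub)                          # S7: most probable subdivision
    return i, j

# Hypothetical stand-ins for model_1 and model_2_1..model_2_3.
model_1 = lambda features: [0.2, 0.7, 0.1]
second_stage_models = {
    1: lambda features: [1.0],
    2: lambda features: [0.1, 0.3, 0.6],
    3: lambda features: [0.5, 0.5],
}
vecs = [[1, 0], [0.5, 0.5], [0, 1], [0.2, 0.8], [0.1, 0.2, 0.3, 0.4], [0.3, 0.7], [0.6, 0.4]]
i, j = predict_fault_cause(vecs, model_1, second_stage_models)
```

Because the second-stage model is selected by i, each step chooses among only N or n_i classes rather than among all subdivided fault causes at once.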

As shown in FIG. 4, an embodiment of the present invention further provides a network fault cause prediction device, the device comprising:

第一获取模块401，用于获取故障工单中的分类特征向量，所述分类特征向量包括第一类特征向量和第二类特征向量；A first acquisition module 401 is used to acquire a classification feature vector in a fault work order, wherein the classification feature vector includes a first-category feature vector and a second-category feature vector;

第一故障原因预测模块402，用于根据所述第一类特征向量和第一分类预测模型，得到所述故障工单所属的目标故障原因类别；A first fault cause prediction module 402, configured to obtain a target fault cause category to which the fault work order belongs according to the first type of feature vector and a first classification prediction model;

第二故障原因预测模块403，用于根据所述第二类特征向量以及与所述目标故障原因类别相对应的第二分类预测模型，得到所述故障工单在所述目标故障原因类别中的目标故障原因子类别。The second fault cause prediction module 403 is used to obtain a target fault cause subcategory of the fault work order in the target fault cause category according to the second type feature vector and a second classification prediction model corresponding to the target fault cause category.

可选地，所述第一获取模块401包括：Optionally, the first acquisition module 401 includes:

第一获取单元，用于获取待处理的故障工单，所述故障工单的字段包括告警标题、网元名称、网元类型和故障发生时间；A first acquisition unit is used to acquire a fault work order to be processed, wherein the fields of the fault work order include an alarm title, a network element name, a network element type, and a fault occurrence time;

特征提取单元，用于基于所述故障工单的字段与特征向量的对应关系和/或特征提取模型，提取所述故障工单中的分类特征向量。The feature extraction unit is used to extract the classification feature vector in the fault work order based on the correspondence between the fields of the fault work order and the feature vector and/or the feature extraction model.

可选地，所述分类特征向量包括：Optionally, the classification feature vector includes:

可选地，所述第一故障原因预测模块402包括：Optionally, the first fault cause prediction module 402 includes:

第一处理单元，用于通过所述第一分类预测模型对所述第一类特征向量进行分类，得到各个故障原因类别的概率值；A first processing unit, configured to classify the first type of feature vectors by using the first classification prediction model to obtain probability values of each fault cause category;

第二处理单元，用于将各个故障原因类别的概率值中最大概率值对应的故障原因类别，确定为所述故障工单所属的目标故障原因类别。The second processing unit is used to determine the fault cause category corresponding to the maximum probability value among the probability values of each fault cause category as the target fault cause category to which the fault work order belongs.

可选地，所述第二故障原因预测模块403包括：Optionally, the second fault cause prediction module 403 includes:

第三处理单元，用于通过所述第二分类预测模型对所述第二类特征向量进行分类，得到所述故障工单在所述目标故障原因类别中各个故障原因子类别的概率值；A third processing unit is used to classify the second type of feature vector by using the second classification prediction model to obtain the probability value of each fault cause subcategory of the fault work order in the target fault cause category;

第四处理单元，用于将各个故障原因子类别的概率值中最大概率值对应的故障原因子类别，确定为目标故障原因子类别。The fourth processing unit is configured to determine the fault cause subcategory corresponding to the maximum probability value among the probability values of the fault cause subcategories as the target fault cause subcategory.

可选地，所述装置还包括：Optionally, the device further comprises:

第二获取模块，用于获取多条历史故障工单及多条历史告警信息，每条所述历史故障工单的字段包括告警标题、网元名称、网元类型、故障发生时间、故障原因类别以及对应故障原因类别的故障原因子类别，每条所述历史告警信息的字段包括告警标题、网元名称和告警开始时间；A second acquisition module is used to acquire multiple historical fault work orders and multiple historical alarm information, wherein the fields of each of the historical fault work orders include an alarm title, a network element name, a network element type, a fault occurrence time, a fault cause category, and a fault cause subcategory corresponding to the fault cause category, and the fields of each of the historical alarm information include an alarm title, a network element name, and an alarm start time;

第一处理模块，用于根据所述历史故障工单的字段和所述历史告警信息的字段，得到分类特征向量，所述分类特征向量包括：用于表征所述告警标题的第一特征向量，用于表征所述告警标题对应的故障原因类别的第二特征向量，用于表征所述网元类型的第三特征向量，用于表征所述网元类型对应的故障原因类别的第四特征向量，用于表征所述故障工单关联到的告警信息的第五特征向量，用于表征所述告警标题对应的故障原因子类别的第六特征向量以及用于表征网元类型对应的故障原因子类别的第七特征向量；A first processing module, configured to obtain a classification feature vector according to a field of the historical fault work order and a field of the historical alarm information, wherein the classification feature vector includes: a first feature vector for characterizing the alarm title, a second feature vector for characterizing a fault cause category corresponding to the alarm title, a third feature vector for characterizing the network element type, a fourth feature vector for characterizing a fault cause category corresponding to the network element type, a fifth feature vector for characterizing the alarm information associated with the fault work order, a sixth feature vector for characterizing a fault cause subcategory corresponding to the alarm title, and a seventh feature vector for characterizing a fault cause subcategory corresponding to the network element type;

第一模型训练模块，用于根据所述第一特征向量、所述第二特征向量、所述第三特征向量、所述第四特征向量、所述第五特征向量以及故障原因类别的类别标签，进行模型训练，得到第一分类预测模型。The first model training module is used to perform model training according to the first feature vector, the second feature vector, the third feature vector, the fourth feature vector, the fifth feature vector and the category label of the fault cause category to obtain a first classification prediction model.

可选地，所述装置还包括：Optionally, the device further comprises:

第二处理模块，用于根据所述故障原因类别，对多条所述历史故障工单进行分组，得到多组历史故障工单数据；A second processing module is used to group the multiple historical fault work orders according to the fault cause category to obtain multiple groups of historical fault work order data;

第二模型训练模块，用于根据每组历史故障工单数据对应的故障原因子类别的类别标签以及分类特征向量，对各组历史故障工单数据分别进行模型训练，得到多个第二分类预测模型。The second model training module is used to perform model training on each group of historical fault work order data according to the category label and classification feature vector of the fault cause subcategory corresponding to each group of historical fault work order data, so as to obtain multiple second classification prediction models.

本发明实施例的网络故障原因预测装置，通过获取故障工单中的分类特征向量，所述分类特征向量包括第一类特征向量和第二类特征向量；根据所述第一类特征向量和第一分类预测模型，得到所述故障工单所属的目标故障原因类别；根据所述第二类特征向量以及与所述目标故障原因类别相对应的第二分类预测模型，得到所述故障工单在所述目标故障原因类别中的目标故障原因子类别，如此，通过两步预测方法，即先对故障原因大类进行预测，再对该故障原因大类中的故障原因细分的类别进行预测，能够有效地减少每一步预测的类别数，提升预测结果的准确率。The network fault cause prediction device of the embodiment of the present invention obtains the classification feature vector in the fault work order, and the classification feature vector includes a first type of feature vector and a second type of feature vector; according to the first type of feature vector and the first classification prediction model, the target fault cause category to which the fault work order belongs is obtained; according to the second type of feature vector and the second classification prediction model corresponding to the target fault cause category, the target fault cause subcategory of the fault work order in the target fault cause category is obtained. In this way, through a two-step prediction method, that is, first predicting the main category of fault causes, and then predicting the subdivided category of fault causes in the main category of fault causes, the number of categories predicted in each step can be effectively reduced, and the accuracy of the prediction result can be improved.

在此需要说明的是，本发明实施例提供的上述装置，能够实现上述方法实施例所实现的所有方法步骤，且能够达到相同的技术效果，在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。It should be noted here that the above-mentioned device provided in the embodiment of the present invention can implement all the method steps implemented in the above-mentioned method embodiment, and can achieve the same technical effect. The parts and beneficial effects that are the same as the method embodiment in this embodiment will not be described in detail here.

To better achieve the above purpose, as shown in FIG. 5, an embodiment of the present invention further provides an electronic device, including a processor 500 and a transceiver 510, wherein the transceiver 510 receives and sends data under the control of the processor, and the processor 500 is configured to perform the following process:

根据所述第二类特征向量以及与所述目标故障原因类别相对应的第二分类预测模型，得到所述故障工单在所述目标故障原因类别中的目标故障原因子类别。A target fault cause subcategory of the fault work order in the target fault cause category is obtained according to the second type of feature vector and a second classification prediction model corresponding to the target fault cause category.

可选地，所述处理器500还用于：Optionally, the processor 500 is further configured to:

本发明实施例的电子设备，通过获取故障工单中的分类特征向量，所述分类特征向量包括第一类特征向量和第二类特征向量；根据所述第一类特征向量和第一分类预测模型，得到所述故障工单所属的目标故障原因类别；根据所述第二类特征向量和第二分类预测模型，得到所述故障工单在所述目标故障原因类别中的目标故障原因子类别，如此，通过两步预测方法，即先对故障原因大类进行预测，再对该故障原因大类中的故障原因细分的类别进行预测，能够有效地减少每一步预测的类别数，提升预测结果的准确率。The electronic device of an embodiment of the present invention obtains a classification feature vector in a fault work order, wherein the classification feature vector includes a first-category feature vector and a second-category feature vector; based on the first-category feature vector and a first classification prediction model, a target fault cause category to which the fault work order belongs is obtained; based on the second-category feature vector and a second classification prediction model, a target fault cause subcategory of the fault work order in the target fault cause category is obtained. In this way, through a two-step prediction method, i.e., first predicting a major category of fault causes, and then predicting a subdivided category of fault causes in the major category of fault causes, the number of categories predicted in each step can be effectively reduced, thereby improving the accuracy of the prediction results.

本发明实施例还提供一种电子设备，包括存储器、处理器及存储在所述存储器上并可在所述处理器上运行的计算机程序，所述处理器执行所述程序时实现如上所述的网络故障原因预测方法实施例中的各个过程，且能达到相同的技术效果，为避免重复，这里不再赘述。An embodiment of the present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, each process in the network fault cause prediction method embodiment as described above is implemented, and the same technical effect can be achieved. To avoid repetition, it will not be described here.

本发明实施例还提供一种计算机可读存储介质，其上存储有计算机程序，该程序被处理器执行时实现如上所述的网络故障原因预测方法实施例中的各个过程，且能达到相同的技术效果，为避免重复，这里不再赘述。其中，所述的计算机可读存储介质，如只读存储器(Read-Only Memory，简称ROM)、随机存取存储器(Random Access Memory，简称RAM)、磁碟或者光盘等。The embodiment of the present invention also provides a computer-readable storage medium, on which a computer program is stored. When the program is executed by a processor, each process in the network fault cause prediction method embodiment described above is implemented, and the same technical effect can be achieved. To avoid repetition, it is not repeated here. The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

本领域内的技术人员应明白，本申请的实施例可提供为方法、系统或计算机程序产品。因此，本申请可采用完全硬件实施例、完全软件实施例或结合软件和硬件方面的实施例的形式。而且，本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可读存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment or an embodiment combining software and hardware. Moreover, the present application may adopt the form of a computer program product implemented on one or more computer-readable storage media (including but not limited to disk storage and optical storage, etc.) containing computer-usable program code.

本申请是参照根据本申请实施例的方法、设备(系统)和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其它可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其它可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或一个方框或多个方框中指定的功能的装置。The present application is described with reference to the flowchart and/or block diagram of the method, device (system) and computer program product according to the embodiment of the present application. It should be understood that each process and/or box in the flowchart and/or block diagram, and the combination of the process and/or box in the flowchart and/or block diagram can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the function specified in one process or multiple processes and/or one box or multiple boxes in the flowchart.

These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device that implements the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.

这些计算机程序指令也可装载到计算机或其它可编程数据处理设备上，使得计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理，从而在计算机或其他科编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing the computer or other programmable device to execute a series of operating steps to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

The above describes preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make a number of improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims

1. A method for predicting the cause of a network failure, comprising:

Obtaining a classification feature vector in a fault work order, wherein the classification feature vector includes a first type of feature vector and a second type of feature vector;

Obtaining a target fault cause category to which the fault work order belongs according to the first type of feature vector and a first classification prediction model;

Obtaining a target fault cause subcategory of the fault work order in the target fault cause category according to the second type of feature vector and a second classification prediction model corresponding to the target fault cause category;

The method further comprises:

Acquiring multiple historical fault work orders and multiple pieces of historical alarm information, where the fields of each of the historical fault work orders include an alarm title, a network element name, a network element type, a fault occurrence time, a fault cause category, and a fault cause subcategory corresponding to the fault cause category, and the fields of each piece of the historical alarm information include an alarm title, a network element name, and an alarm start time;

According to the fields of the historical fault work order and the fields of the historical alarm information, a classification feature vector is obtained, the classification feature vector comprising: a first feature vector for characterizing the alarm title, a second feature vector for characterizing the fault cause category corresponding to the alarm title, a third feature vector for characterizing the network element type, a fourth feature vector for characterizing the fault cause category corresponding to the network element type, a fifth feature vector for characterizing the alarm information associated with the fault work order, a sixth feature vector for characterizing the fault cause subcategory corresponding to the alarm title, and a seventh feature vector for characterizing the fault cause subcategory corresponding to the network element type;

Model training is performed based on the first feature vector, the second feature vector, the third feature vector, the fourth feature vector, the fifth feature vector, and the category label of the fault cause category to obtain a first classification prediction model.
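
The two-step prediction recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Model` interface (a callable that maps a feature vector to a dictionary of label probabilities), the function name `predict_fault_cause`, and the toy labels and probabilities are all assumptions made for illustration. The claims only require a first classification prediction model and one second classification prediction model per fault cause category.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical model interface: feature vector -> {label: probability}.
Model = Callable[[Sequence[float]], Dict[str, float]]


def predict_fault_cause(
    first_type_vector: Sequence[float],
    second_type_vector: Sequence[float],
    first_model: Model,
    second_models: Dict[str, Model],
) -> Tuple[str, str]:
    """Predict the fault cause category, then the subcategory within it."""
    # Step 1: the first model scores every fault cause category.
    category_probs = first_model(first_type_vector)
    category = max(category_probs, key=category_probs.get)

    # Step 2: only the second model belonging to that category is consulted.
    sub_probs = second_models[category](second_type_vector)
    subcategory = max(sub_probs, key=sub_probs.get)
    return category, subcategory


# Toy stand-ins for trained models (labels and probabilities are invented).
def toy_first_model(_vec: Sequence[float]) -> Dict[str, float]:
    return {"power": 0.7, "transmission": 0.3}


toy_second_models: Dict[str, Model] = {
    "power": lambda _vec: {"mains outage": 0.2, "battery failure": 0.8},
    "transmission": lambda _vec: {"fiber cut": 0.9, "board fault": 0.1},
}

result = predict_fault_cause(
    [0.1, 0.2], [0.1, 0.2, 0.3], toy_first_model, toy_second_models
)
print(result)
```

Because each second model only distinguishes the subcategories of a single category, the number of classes handled in either step is smaller than in a single flat classifier over all subcategories.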

2. The method according to claim 1, characterized in that the step of obtaining the classification feature vector in the fault work order comprises:

Obtaining a fault work order to be processed, wherein the fields of the fault work order include an alarm title, a network element name, a network element type, and a fault occurrence time;

Based on the correspondence between the fields of the fault work order and the feature vector and/or the feature extraction model, the classification feature vector in the fault work order is extracted.
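
The claims do not state how the correspondence between a field and a feature vector is built. One plausible reading, shown here purely as an assumption, is a lookup table computed from historical fault work orders: for each alarm title, the empirical share of each fault cause category. The field names `alarm_title` and `fault_cause_category`, the helper `category_distribution_by_field`, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def category_distribution_by_field(
    history: List[Dict[str, str]], field: str, categories: List[str]
) -> Dict[str, List[float]]:
    """Map each value of `field` to the share of each fault cause category."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for order in history:
        counts[order[field]][order["fault_cause_category"]] += 1

    table: Dict[str, List[float]] = {}
    for value, counter in counts.items():
        total = sum(counter.values())
        table[value] = [counter[c] / total for c in categories]
    return table


# Invented historical work orders, for illustration only.
history = [
    {"alarm_title": "link down", "fault_cause_category": "transmission"},
    {"alarm_title": "link down", "fault_cause_category": "transmission"},
    {"alarm_title": "link down", "fault_cause_category": "power"},
    {"alarm_title": "low voltage", "fault_cause_category": "power"},
]
categories = ["power", "transmission"]
table = category_distribution_by_field(history, "alarm_title", categories)

# Feature lookup for a work order to be processed; unseen titles get zeros.
pending = {"alarm_title": "link down"}
second_feature_vector = table.get(pending["alarm_title"], [0.0] * len(categories))
print(second_feature_vector)
```

The same helper could be applied to the network element type field; whether the patent builds its correspondence this way is not confirmed by the claims.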

3. The method according to claim 2, characterized in that the classification feature vector comprises:

A first feature vector for characterizing the alarm title;

A second feature vector for characterizing the fault cause category corresponding to the alarm title;

A third feature vector for characterizing the network element type;

A fourth feature vector for characterizing the fault cause category corresponding to the network element type;

A fifth feature vector for characterizing the alarm information associated with the fault work order;

A sixth feature vector for characterizing the fault cause subcategory corresponding to the alarm title; and

A seventh feature vector for characterizing the fault cause subcategory corresponding to the network element type;

The first type of feature vectors includes: the first feature vector, the second feature vector, the third feature vector, the fourth feature vector and the fifth feature vector;

The second type of feature vectors includes: the first feature vector, the second feature vector, the third feature vector, the fourth feature vector, the fifth feature vector, the sixth feature vector and the seventh feature vector.
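
As a small illustration of claim 3, the sketch below assembles the two types of feature vector from the seven component vectors. The claim only says what each type "includes"; joining the components by concatenation in a fixed order is an assumption, as are the function name `build_type_vectors` and the toy component values.

```python
from typing import Dict, List, Tuple


def build_type_vectors(
    parts: Dict[int, List[float]]
) -> Tuple[List[float], List[float]]:
    """Assemble the first and second types of feature vector.

    `parts` maps the component index (1 to 7) to that component vector.
    Assumption: "includes" is realised as concatenation in index order.
    """
    # First type: components 1 to 5.
    first_type = [x for i in range(1, 6) for x in parts[i]]
    # Second type: components 1 to 5 plus the subcategory components 6 and 7.
    second_type = first_type + parts[6] + parts[7]
    return first_type, second_type


# Invented component vectors, for illustration only.
parts = {
    1: [1.0, 0.0],  # alarm title
    2: [0.3, 0.7],  # category corresponding to the alarm title
    3: [0.0, 1.0],  # network element type
    4: [0.5, 0.5],  # category corresponding to the network element type
    5: [2.0],       # associated alarm information
    6: [0.1, 0.9],  # subcategory corresponding to the alarm title
    7: [0.6, 0.4],  # subcategory corresponding to the network element type
}
first_type, second_type = build_type_vectors(parts)
print(len(first_type), len(second_type))
```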

4. The method according to claim 1, characterized in that the step of obtaining the target fault cause category to which the fault work order belongs based on the first type of feature vector and the first classification prediction model comprises:

Classifying the first type of feature vectors by using the first classification prediction model to obtain probability values of each fault cause category;

The fault cause category corresponding to the maximum probability value among the probability values of each fault cause category is determined as the target fault cause category to which the fault work order belongs.

5. The method according to claim 1, characterized in that the step of obtaining the target fault cause subcategory of the fault work order in the target fault cause category according to the second type of feature vector and the second classification prediction model corresponding to the target fault cause category comprises:

Classifying the second type of feature vectors by using the second classification prediction model to obtain probability values of each fault cause subcategory of the fault work order in the target fault cause category;

The fault cause subcategory corresponding to the maximum probability value among the probability values of each fault cause subcategory is determined as the target fault cause subcategory.

6. The method according to claim 1, characterized in that after obtaining the classification feature vector according to the fields of the historical fault work order and the fields of the historical alarm information, the method further comprises:

According to the fault cause category, the plurality of historical fault work orders are grouped to obtain a plurality of groups of historical fault work order data;

According to the category labels and classification feature vectors of the fault cause subcategories corresponding to each group of historical fault work order data, model training is performed on each group of historical fault work order data to obtain multiple second classification prediction models.
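
The grouping and per-group training of claim 6 can be sketched as follows. The claims do not name a learning algorithm, so `train_frequency_model` is a deliberately trivial, hypothetical placeholder that ignores the feature vectors and returns the label frequencies of its group; any real classifier trainer with the same call shape could be substituted. The keys `category`, `subcategory`, and `features` are likewise assumed names.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Sequence

Model = Callable[[Sequence[float]], Dict[str, float]]


def group_by_category(orders: List[dict]) -> Dict[str, List[dict]]:
    """Group historical fault work orders by fault cause category."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for order in orders:
        groups[order["category"]].append(order)
    return dict(groups)


def train_frequency_model(
    vectors: List[Sequence[float]], labels: List[str]
) -> Model:
    """Hypothetical placeholder trainer: predicts label frequencies only."""
    counts = Counter(labels)
    probs = {label: n / len(labels) for label, n in counts.items()}
    return lambda _vec: probs


def train_second_models(
    orders: List[dict],
    trainer: Callable[[List[Sequence[float]], List[str]], Model] = train_frequency_model,
) -> Dict[str, Model]:
    """Train one subcategory model per fault cause category."""
    models: Dict[str, Model] = {}
    for category, group in group_by_category(orders).items():
        vectors = [o["features"] for o in group]
        labels = [o["subcategory"] for o in group]
        models[category] = trainer(vectors, labels)
    return models


# Invented historical work orders, for illustration only.
orders = [
    {"category": "power", "subcategory": "battery failure", "features": [0.1]},
    {"category": "power", "subcategory": "battery failure", "features": [0.2]},
    {"category": "power", "subcategory": "mains outage", "features": [0.3]},
    {"category": "transmission", "subcategory": "fiber cut", "features": [0.4]},
]
second_models = train_second_models(orders)
print(sorted(second_models))
```

The result is a dictionary keyed by fault cause category, so that at prediction time the model matching the predicted target category can be selected directly.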

7. A network failure cause prediction device, characterized by comprising:

A first acquisition module, used to acquire a classification feature vector in a fault work order, wherein the classification feature vector includes a first type of feature vector and a second type of feature vector;

A first fault cause prediction module, used for obtaining a target fault cause category to which the fault work order belongs according to the first type of feature vector and a first classification prediction model;

A second fault cause prediction module, configured to obtain a target fault cause subcategory of the fault work order in the target fault cause category according to the second type of feature vector and a second classification prediction model corresponding to the target fault cause category;

A second acquisition module, used to acquire multiple historical fault work orders and multiple pieces of historical alarm information, wherein the fields of each of the historical fault work orders include an alarm title, a network element name, a network element type, a fault occurrence time, a fault cause category, and a fault cause subcategory corresponding to the fault cause category, and the fields of each piece of the historical alarm information include an alarm title, a network element name, and an alarm start time;

A first processing module, configured to obtain a classification feature vector according to a field of the historical fault work order and a field of the historical alarm information, wherein the classification feature vector includes: a first feature vector for characterizing the alarm title, a second feature vector for characterizing a fault cause category corresponding to the alarm title, a third feature vector for characterizing the network element type, a fourth feature vector for characterizing a fault cause category corresponding to the network element type, a fifth feature vector for characterizing the alarm information associated with the fault work order, a sixth feature vector for characterizing a fault cause subcategory corresponding to the alarm title, and a seventh feature vector for characterizing a fault cause subcategory corresponding to the network element type;

A first model training module, used to perform model training according to the first feature vector, the second feature vector, the third feature vector, the fourth feature vector, the fifth feature vector and the category label of the fault cause category to obtain a first classification prediction model.

8. An electronic device, comprising a processor and a transceiver, wherein the transceiver receives and sends data under the control of the processor, wherein the processor is configured to perform the following operations:

According to the second type of feature vector and the second classification prediction model, a target fault cause subcategory of the fault work order in the target fault cause category is obtained;

The processor is further configured to:

9. The electronic device according to claim 8, wherein the processor is further configured to:

10. The electronic device according to claim 9, wherein the classification feature vector comprises:

A first feature vector for characterizing the alarm title;

A third feature vector for characterizing the network element type;

11. The electronic device according to claim 8, wherein the processor is further configured to:

12. The electronic device according to claim 8, wherein the processor is further configured to:

13. The electronic device according to claim 8, wherein the processor is further configured to:

14. An electronic device comprising a memory, a processor, and a program stored in the memory and executable on the processor; wherein the processor implements the network fault cause prediction method according to any one of claims 1 to 6 when executing the program.

15. A computer-readable storage medium having a computer program stored thereon, wherein when the program is executed by a processor, the program implements the steps in the network fault cause prediction method according to any one of claims 1 to 6.