WO2021185079A1 - Similar fault recommendation method and related device - Google Patents

Similar fault recommendation method and related device

Info

Publication number
WO2021185079A1
Authority
WO
WIPO (PCT)
Prior art keywords
fault
faults
dimensions
information
historical
Prior art date
Application number
PCT/CN2021/078895
Other languages
English (en)
French (fr)
Inventor
谢青
李立锟
田智勇
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21771510.1A (EP4102775A4)
Publication of WO2021185079A1
Priority to US17/946,788 (US11757701B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/0604 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Definitions

  • the invention relates to the technical field of operation and maintenance, in particular to a similar fault recommendation method and related equipment.
  • the current normal way of network operation and maintenance is to collect the alarms reported by the network through the monitoring network and display them on an alarm list page.
  • the monitoring personnel or semi-automatic tools identify the alarms that may be caused by a fault and dispatch a work order, and maintenance personnel go to the site to find the actual fault and fix it.
  • Common network faults are hardware faults, such as equipment power failure, fiber interruption, or hardware damage (such as single-board failure or optical module failure), or equipment aging and frequent alarms caused by long-term use or harsh environments.
  • a small number of faults are logic or software faults, such as too many services on the ring that exceed the upper limit of bandwidth, misconfiguration of service parameters, unreasonable division of network resources, and local congestion.
  • the intelligent operation and maintenance (artificial intelligence for IT operations, AIOps) management software provided by Moogsoft matches similarity according to the proportion of the number of alarms and recommends the historical faults with a higher matching proportion, so that the solution to a historical fault can be used as a reference for solving the existing fault.
  • the embodiment of the application discloses a similar fault recommendation method and related equipment.
  • the application implements the recommendation of similar faults through multi-dimensional comparison, which can improve the accuracy of the recommendation and provide an effective reference for resolving the fault to be handled, thereby improving the efficiency of network maintenance.
  • this application discloses a similar fault recommendation method, which includes:
  • this solution implements the recommendation of similar faults through multi-dimensional similarity comparison, which can improve the accuracy of the recommendation and provide an effective reference for resolving the fault to be processed, thereby improving the efficiency of network maintenance.
  • the method further includes:
  • This application sets the multiple dimensions used for similarity calculation based on the type of the network where the fault is located, which is well targeted and flexible and can further improve the accuracy of the recommendation.
  • the processing according to the fault diagnosis information of the M dimensions to obtain the characteristics of the fault to be processed includes:
  • the feature value of each dimension is combined to obtain the feature of the fault to be processed, and the feature of the fault to be processed is a feature vector.
  • This application combines features of multiple dimensions into a feature vector to calculate similarity, which is convenient for calculation and improves calculation efficiency.
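The combination of per-dimension feature values into a single feature vector can be illustrated with a minimal sketch. The source does not state how each dimension is encoded, so the one-hot encoding, the `VOCAB` table, and all value names below are assumptions made purely for illustration.

```python
from typing import Dict, List

# Hypothetical vocabularies per dimension; the value names are illustrative only.
VOCAB: Dict[str, List[str]] = {
    "network_location": ["network_side", "user_side"],
    "protection_type": ["unprotected", "aps", "tunnel"],
    "alarm_type": ["line", "board", "service"],
}


def encode_dimension(dim: str, value: str) -> List[float]:
    """One-hot encode the diagnosis value of a single dimension (assumed encoding)."""
    return [1.0 if value == known else 0.0 for known in VOCAB[dim]]


def build_feature_vector(diagnosis: Dict[str, str]) -> List[float]:
    """Concatenate the feature values of every dimension into one feature vector."""
    vector: List[float] = []
    for dim in VOCAB:  # dict insertion order fixes the layout of the vector
        vector.extend(encode_dimension(dim, diagnosis[dim]))
    return vector


pending_fault = {
    "network_location": "user_side",
    "protection_type": "aps",
    "alarm_type": "line",
}
print(build_feature_vector(pending_fault))
```

Keeping the dimension order fixed means that vectors built for the pending fault and for historical faults line up position by position, which is what makes a later similarity calculation meaningful.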
  • before the acquiring of fault diagnosis information of M dimensions according to the alarm information, the method further includes: adjusting one or more of the preset dimensions to obtain the M dimensions.
  • the similarity calculated by the combination of different dimensions is different.
  • the multiple dimensions used for similarity calculation can be flexibly adjusted, which helps to improve the accuracy of the final recommendation.
  • the filtering out similar faults from the multiple historical faults according to the multiple similarities and recommending the similar faults includes:
  • N historical faults are selected from the multiple historical faults according to the multiple similarities, and the N historical faults are recommended, where the N historical faults are the historical faults associated with the top N similarities when the multiple similarities are arranged in descending order, and N is a positive integer.
  • historical faults with a high degree of similarity are selected for recommendation, so as to provide an effective reference for the fault to be handled.
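The top-N selection can be sketched as follows. Figure 3 refers to a cosine included angle, so the sketch assumes cosine similarity between feature vectors; the function names, the fault identifiers, and the sample vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_top_n(
    pending: List[float], history: Dict[str, List[float]], n: int
) -> List[Tuple[str, float]]:
    """Score every historical fault, sort in descending order, keep the top N."""
    scored = [(fault_id, cosine_similarity(pending, vec)) for fault_id, vec in history.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]


# Hypothetical feature vectors for three historical faults.
history = {"F1": [1.0, 0.0, 1.0], "F2": [0.0, 1.0, 0.0], "F3": [1.0, 1.0, 1.0]}
top = recommend_top_n([1.0, 0.0, 1.0], history, n=2)
print([fault_id for fault_id, _ in top])
```

With these sample vectors, F1 points in the same direction as the pending fault and F2 is orthogonal to it, so F1 ranks first and F2 is dropped when N is 2.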
  • the method further includes:
  • the comparison information between the similar fault and the fault to be processed is recommended, where the comparison information includes a comparison of the similar fault with the fault to be processed in each of the M dimensions.
  • This application is helpful to provide further reference information for operation and maintenance personnel by comparing the details of the recommended faults.
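A per-dimension comparison of a recommended fault against the fault to be processed might be assembled as in the sketch below. The record layout, dimension keys, and sample values are assumptions, since the source does not describe the data format of the comparison information.

```python
from typing import Dict, List


def compare_by_dimension(
    pending: Dict[str, str], similar: Dict[str, str], dimensions: List[str]
) -> List[Dict[str, object]]:
    """Build one comparison row per dimension for display next to a recommendation."""
    rows: List[Dict[str, object]] = []
    for dim in dimensions:
        rows.append(
            {
                "dimension": dim,
                "pending": pending.get(dim),
                "similar": similar.get(dim),
                "match": pending.get(dim) == similar.get(dim),
            }
        )
    return rows


# Hypothetical diagnosis values for the pending fault and one recommended fault.
pending = {"network_location": "user_side", "alarm_type": "line"}
similar = {"network_location": "user_side", "alarm_type": "board"}
rows = compare_by_dimension(pending, similar, ["network_location", "alarm_type"])
for row in rows:
    print(row)
```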
  • the multiple historical faults are faults obtained after screening from a database.
  • the method further includes:
  • the evaluation information of the recommended historical faults in this application can be used as a reference for subsequent historical fault recommendations, which can further improve the efficiency of the recommendation and make the recommended historical faults more in line with the needs of users.
  • this application discloses a device, which includes:
  • the obtaining unit is used to obtain the alarm information of the fault to be processed
  • the acquiring unit is further configured to acquire fault diagnosis information of M dimensions according to the alarm information, the M dimensions include M different angles of fault diagnosis, and the M is an integer greater than 1;
  • a processing unit configured to process the fault diagnosis information of the M dimensions to obtain the characteristics of the fault to be processed
  • the calculation unit is configured to calculate multiple similarities based on the characteristics of the fault to be processed and the characteristics of multiple historical faults, and the multiple similarities represent the degree of similarity between the fault to be processed and the multiple historical faults, respectively ;
  • a screening unit configured to screen out similar faults from the multiple historical faults according to the multiple similarities
  • the recommendation unit is used to recommend the similar faults.
  • the obtaining unit is further configured to, after obtaining the alarm information of the fault to be processed and before obtaining the fault diagnosis information of M dimensions according to the alarm information, obtain, according to the alarm information, the network type of the network where the fault to be processed is located.
  • the device further includes a determining unit, configured to determine the M dimensions according to the network type.
  • the processing unit is specifically configured to:
  • the feature value of each dimension is combined to obtain the feature of the fault to be processed, and the feature of the fault to be processed is a feature vector.
  • the device further includes an adjustment unit configured to adjust one or more of the preset dimensions to obtain the M dimensions before the obtaining unit obtains the fault diagnosis information of M dimensions according to the alarm information.
  • the screening unit is specifically configured to: filter out N historical faults from the multiple historical faults according to the multiple similarities;
  • the recommendation unit is specifically configured to: recommend the N historical faults, where the N historical faults are the historical faults associated with the top N similarities when the multiple similarities are arranged in descending order, and N is a positive integer.
  • the recommendation unit is further configured to, after the screening unit screens out similar faults from the multiple historical faults according to the multiple similarities, recommend comparison information between the similar fault and the fault to be processed, where the comparison information includes a comparison of the similar fault with the fault to be processed in each of the M dimensions.
  • the multiple historical faults are faults obtained after screening from a database.
  • the device further includes a receiving unit configured to receive evaluation information of the recommended historical fault after the recommendation unit recommends the similar fault, where the evaluation information is used as reference information for subsequent historical fault recommendations.
  • the present application discloses a device that includes a processor, a communication interface, and a memory, where the memory is used to store computer programs and/or data, and the processor is used to execute the computer program stored in the memory, causing the device to perform the following operations:
  • the M dimensions include M different angles of fault diagnosis, and the M is an integer greater than 1;
  • the device after the obtaining the alarm information of the fault to be processed and before obtaining the fault diagnosis information of M dimensions according to the alarm information, the device further performs the following operations:
  • the M dimensions are determined according to the network type.
  • the processing according to the fault diagnosis information of the M dimensions to obtain the characteristics of the fault to be processed includes:
  • the feature value of each dimension is combined to obtain the feature of the fault to be processed; the feature of the fault to be processed is a feature vector.
  • before acquiring fault diagnosis information of M dimensions according to the alarm information, the device further performs the following operation: adjusting one or more of the preset dimensions to obtain the M dimensions.
  • the filtering out similar faults from the multiple historical faults according to the multiple similarities and recommending the similar faults includes:
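  • N historical faults are selected from the multiple historical faults according to the multiple similarities, and the N historical faults are recommended, where the N historical faults are the historical faults associated with the top N similarities when the multiple similarities are arranged in descending order, and N is a positive integer.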
  • the device further performs the following operations:
  • the comparison information between the similar fault and the fault to be processed is recommended, where the comparison information includes a comparison of the similar fault with the fault to be processed in each of the M dimensions.
  • the multiple historical faults are faults obtained after screening from a database.
  • the device further performs the following operations:
  • an embodiment of the present application discloses a computer-readable storage medium that stores a computer program, and the computer program is executed by a processor to implement the method described in any one of the implementations of the first aspect.
  • the embodiments of the present application disclose a computer program product; when the computer program product is read and executed by a computer, the method described in any one of the implementations of the first aspect is executed.
  • the embodiments of the present application disclose a computer program, when the computer program is executed on a computer, it will enable the computer to implement the method described in any one of the above-mentioned first aspects.
  • this solution implements the recommendation of similar faults through multi-dimensional similarity comparison, which can improve the accuracy of the recommendation and provide an effective reference for resolving the fault to be processed, thereby improving the efficiency of network maintenance.
  • Figure 1 is a schematic diagram of the system architecture of a similar fault recommendation method provided by an embodiment of the solution
  • Figure 2 is a schematic flowchart of a similar fault recommendation method provided by an embodiment of the solution
  • Fig. 3 is a schematic diagram of a cosine included angle provided by an embodiment of the solution.
  • Figure 4 is a schematic diagram of a historical fault recommendation page provided by an embodiment of the solution.
  • Figure 5 is a schematic diagram of another historical fault recommendation page provided by an embodiment of the solution.
  • Figure 6 is a schematic diagram of another historical fault recommendation page provided by an embodiment of the solution.
  • FIG. 7 is a schematic diagram of another flow chart of a similar fault recommendation method provided by an embodiment of the solution.
  • Figure 8 is a schematic diagram of the logical structure of a device provided by an embodiment of the solution.
  • Figure 9 is a schematic diagram of the hardware structure of a device provided by an embodiment of the solution.
  • FIG. 1 is a schematic diagram of a system architecture to which the similar fault recommendation method provided in an embodiment of the present application is applicable.
  • the system architecture can include one or more fault monitoring devices 100 (multiple devices can form a cluster) and one or more network nodes (or devices) 110.
  • the fault monitoring device 100 and the network node 110 can be connected and communicate with each other through the network, where:
  • the fault monitoring device 100 is mainly used to monitor whether a network node 110 and a line in a certain network are faulty.
  • the fault monitoring device 100 may be a device used by an operator to monitor whether a fault occurs in the network, or a device used in an enterprise to monitor whether the enterprise's own network fails, and so on. Which device in the network serves as the fault monitoring device 100 is determined according to the actual situation, and this solution does not impose restrictions on this.
  • the fault monitoring device 100 needs to run a corresponding application program to provide the services that the similar fault recommendation method relies on, such as database service, data calculation, and decision execution.
  • the fault monitoring device 100 may monitor the occurrence of a fault through a certain application program and then recommend similar historical faults for the newly occurring fault through a certain application program, so that the solutions to the similar historical faults can be used as a reference to resolve the new fault quickly.
  • the network node 110 may be a device on the network side or the user side in the network where the fault monitoring device 100 is located.
  • the network node 110 can install and run related applications (APP).
  • the network node 110 may send fault information or alarm information to the fault monitoring device 100 through a corresponding application.
  • the network node 110 may include, but is not limited to, any electronic product based on a smart operating system that can interact with users through input devices such as keyboards, virtual keyboards, touch panels, touch screens, and voice control devices, such as smart phones, tablet computers, desktop computers, handheld computers, and wearable electronic devices. The smart operating system includes, but is not limited to, any operating system that enriches device functions by providing various applications to the device, such as Android, iOS, Windows, and Mac systems.
  • the network node 110 may also include, but is not limited to, any network device in the network, such as servers, hubs, switches, bridges, routers, gateways, network interface cards, wireless access points, printers, modems, fiber optic transceivers, and other network equipment.
  • the system architecture to which the similar fault recommendation method provided in the embodiment of the present application is applicable is not limited to the system architecture shown in FIG. 1.
  • the following provides a similar fault recommendation method, which can be applied to the system architecture shown in FIG. 1.
  • the method includes but is not limited to the following steps:
  • Step 201 Obtain the alarm information of the fault to be processed.
  • when a fault occurs in the network (the fault can be called a pending fault), the network node generates alarms and then, according to the alarm conditions, sends the alarm list, the possible causes of the alarms, the location where the alarms may have occurred (such as which device or port), and fault repair suggestions to the fault monitoring device.
  • the information obtained by the fault monitoring device may be referred to as alarm information of the fault to be processed.
  • the alarm information is mainly the description information of the fault to be processed, and the information may also include other information such as possible fault types, etc. This solution does not impose restrictions on this.
  • the fault monitoring device can obtain the alarm information of the fault to be processed through the Kafka system.
  • Kafka can receive alarm information about pending faults sent by multiple network nodes and store the information in order, and the fault monitoring device can subscribe to this information, thereby obtaining the alarm information about pending faults.
  • Step 202 Obtain fault diagnosis information of M dimensions according to the above-mentioned alarm information.
  • the fault can be diagnosed from multiple angles, so as to quickly solve the fault. These fault diagnosis angles can be called dimensions. Then, the aforementioned M dimensions may include M different angles for fault diagnosis; the M is an integer greater than 1. Some commonly used dimensions of fault diagnosis are listed below, see Table 1, but they are not limited to the dimensions listed in Table 1.
  • Table 1 exemplarily lists multiple dimensions of fault diagnosis. These dimensions may include network location, network ring chain information, protection type, business information, device logs, performance index data, diagnosis command response, alarm type, and so on. Table 1 also gives a description of each dimension and the method of comparing similar faults from each dimension. The following is a detailed description:
  • the network location can include the network side or the user side.
  • the network location of the fault can be used to determine whether the two faults are likely to be similar. If the locations of the two faults are both on the network side or both on the user side, the two faults may be similar faults.
  • the network ring chain may include an access ring, an access chain, an aggregation ring, a core ring, and so on. Whether two faults are likely to be similar can be judged through the network ring chain where each fault is located. If two faults occur in the same ring chain, for example, both are located in the same access ring, the two faults may be similar faults. Or, if the two faults occur in the same type of ring chain, for example, both are located in a packet transport network (PTN) 950 access ring but not in the same access ring, then the two faults may also be similar faults.
  • Protection types can include unprotected, APS protection, Tunnel protection, and so on. These protections can be used to reduce or even avoid losses to the network when the network fails. It can be judged whether the two faults are likely to be similar by judging whether there are corresponding types of protective measures after the fault occurs. If two faults have the same type of protection measures, for example, both have APS protection or neither have protection, then the two faults may be similar faults.
  • the business information includes two aspects.
  • the first aspect includes the specific business involved and the business damage situation
  • the second aspect includes the specific business type involved and the business damage distribution corresponding to different business types.
  • for the first aspect, whether two faults are likely to be similar can be judged by calculating the proportion of services that the two faults have in common among the services involved in the two faults.
  • a percentage threshold may be preset, and when the ratio of the number of the same services to the total number of involved services is greater than the threshold, the two faults may be similar faults; for example, the threshold value is 0.6.
  • the judgment can also take the specific service damage into account, such as service degradation or service interruption. If the same services are also damaged in the same way, the two faults may be similar faults.
  • for the second aspect, whether two faults are likely to be similar can be judged by comparing the types of services involved in the two faults. If two faults involve the same or similar types of services, then the two faults may be similar faults.
  • the damage distribution of service types can also be integrated, for example, the service damage rates corresponding to the same or similar service types. If the service damage rates are similar, then the two faults may be similar faults.
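The service-overlap check described above (shared services compared against a preset threshold such as 0.6) can be sketched as follows. The source does not define "total involved services" precisely; this sketch assumes it means the union of the services involved in the two faults, and the service names are hypothetical.

```python
from typing import Set


def service_overlap_ratio(services_a: Set[str], services_b: Set[str]) -> float:
    """Ratio of shared services to all services involved in either fault.

    Assumption: the total is taken as the union of both faults' services.
    """
    total = services_a | services_b
    if not total:
        return 0.0
    return len(services_a & services_b) / len(total)


def may_be_similar(services_a: Set[str], services_b: Set[str], threshold: float = 0.6) -> bool:
    """Two faults may be similar when the overlap ratio is greater than the threshold."""
    return service_overlap_ratio(services_a, services_b) > threshold


fault_a = {"svc1", "svc2", "svc3"}
fault_b = {"svc1", "svc2", "svc3", "svc4"}
print(service_overlap_ratio(fault_a, fault_b), may_be_similar(fault_a, fault_b))
```

Here three of the four involved services are shared, giving a ratio of 0.75, which exceeds the example threshold of 0.6.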
  • the device log refers to the log of the device involved in the failure, and it can be judged whether the two failures are likely to be similar by comparing the logs of the device involved in the two failures. If the logs of the devices involved in two faults are similar, then the two faults may be similar faults.
  • the performance index data may include various parameter indexes for evaluating the performance of the equipment or the network, such as the packet loss rate and so on. It can be judged whether the two faults are likely to be similar by comparing the changing trends or dynamics of the performance indicators of the devices or networks involved in the two faults.
  • the change trend or change dynamics of the performance index may include, for example, a gentle rise/fall, a rapid rise/fall, a cliff-like fall, up-and-down oscillation, and a flat or fluctuating level. Exemplarily, if the packet loss rates of the devices involved in two faults both increase rapidly, then the two faults may be similar faults.
  • the diagnosis command response refers to the response of the equipment involved in the fault to the diagnosis command.
  • Diagnosis commands include commands to query the overall operating status of the device, and so on. Whether two faults are likely to be similar can be judged by comparing the response information of the devices involved in the two faults to the same diagnosis command. If the response information of the two devices is similar or the same, then the two faults may be similar.
  • There are generally a large number of alarms, and operation and maintenance personnel are accustomed to classifying them.
  • Common alarm types include line alarms, equipment decommissioning alarms, single-board alarms, service alarms, protection degradation alarms, interruption alarms, fan alarms, temperature alarms, traffic over-limit alarms, optical function abnormality alarms, error code alarms, switching alarms, and return alarms. Whether two faults are likely to be similar can be judged by comparing the alarm types of the alarm information corresponding to the two faults. If the alarm types corresponding to the two faults are similar or the same, then the two faults may be similar.
  • some of the above-mentioned information in multiple dimensions may be obtained from network nodes, such as alarm types; some may be obtained by the fault monitoring device by querying based on the above-mentioned alarm information, such as business information and equipment logs; others can be determined based on existing information, such as network location and network ring chain. How this information is acquired is determined according to the actual situation, and this solution does not impose restrictions on this.
  • the fault similarity comparison method described in Table 1 above compares fault similarity from a single dimension.
  • similar faults obtained by comparing from a single dimension may deviate considerably from the fault to be handled and have little reference value. Therefore, this solution compares fault similarity from multiple dimensions, which reduces the deviation between the similar faults obtained from the comparison and the fault to be handled and improves the accuracy of the recommendation of similar faults, thereby providing effective reference information for operation and maintenance personnel.
  • after the fault monitoring device obtains the above-mentioned alarm information, it can obtain the fault diagnosis information of the fault to be processed in various dimensions according to the alarm information. For example, it can obtain the fault diagnosis information of the multiple dimensions in Table 1 above.
  • the fault diagnosis information includes information that actually describes the fault to be processed in each dimension. For example, it may be the information in the dimension description in Table 1 above, which can be used to diagnose the fault type of the fault to be processed and the specific location where the fault occurs. Taking the dimension of network location as an example, if the fault to be processed occurs on the network side, then "network side" is the fault diagnosis information of the fault to be processed in the network location dimension.
  • the fault monitoring device can obtain the fault diagnosis information of the pending fault in various dimensions based on this information.
  • the specific alarm type can be determined according to the alarm list.
  • the specific network location can be determined according to the location where the alarm may occur, for example, it can be determined whether it is on the network side or on the user side. It is also possible to obtain the fault diagnosis information of the corresponding dimension by querying the database according to the alarm information, for example, obtaining specific business information and specific equipment logs related to the fault to be handled through the query. This is just an example. The acquisition of specific fault diagnosis information in these dimensions is determined according to the actual situation, and this solution does not impose restrictions on this.
  • the fault monitoring device may further determine the M dimensions used for similarity calculation.
  • the above-mentioned fault monitoring device may determine the network type of the network where the fault to be processed is located based on the received alarm information, and then may determine the M dimensions according to the network type of the network where the fault to be processed is located.
  • different network types can correspond to different M dimensions.
  • different network types can also correspond to the same M dimensions, which is specifically determined according to actual conditions.
  • the M dimensions corresponding to different network types may be pre-configured.
  • for example, the network types may include the packet transport network (PTN) and the optical transport network (OTN).
  • the M dimensions corresponding to the OTN network can be pre-configured to include network location, network ring chain, business information, performance index data, and alarm type.
  • the specific network classification is not limited and is determined according to the actual situation. For the classified network types, the M dimensions corresponding to each type of network can then be pre-configured.
  • the fault monitoring device can adaptively select, according to the characteristics of the network type, M dimensions from a preset list of dimensions (such as Table 1) as the dimensions for comparing similar faults.
  • Which dimensions to choose can be determined according to the characteristics of the network type, and this solution does not impose restrictions on this.
  • the above-mentioned M dimensions may not be determined according to the network type of the network where the fault is located, but the preset M dimensions are uniformly used as dimensions for comparison of similar faults. For example, regardless of the network type, the dimensions of network location, network link, protection type, business information, device logs, performance index data, diagnostic command response, and alarm type are used as dimensions for comparing similar faults.
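Selecting the M dimensions from a pre-configured mapping, with a uniform preset list as the fallback, might look like the sketch below. The dimension keys and the `select_dimensions` function are hypothetical; only the OTN dimension list is taken from the text, and no other network type is configured here because the source does not give a recoverable list for it.

```python
from typing import Dict, List

# Preset dimensions used uniformly when no per-network configuration exists.
DEFAULT_DIMENSIONS: List[str] = [
    "network_location",
    "network_ring_chain",
    "protection_type",
    "business_information",
    "device_logs",
    "performance_index_data",
    "diagnosis_command_response",
    "alarm_type",
]

# Pre-configured dimensions per network type (only OTN is stated in the text).
DIMENSIONS_BY_NETWORK_TYPE: Dict[str, List[str]] = {
    "OTN": [
        "network_location",
        "network_ring_chain",
        "business_information",
        "performance_index_data",
        "alarm_type",
    ],
}


def select_dimensions(network_type: str) -> List[str]:
    """Return the M dimensions for a network type, or the uniform preset list."""
    return list(DIMENSIONS_BY_NETWORK_TYPE.get(network_type, DEFAULT_DIMENSIONS))


print(select_dimensions("OTN"))
print(select_dimensions("unconfigured-network"))
```

Returning a copy of the configured list lets a caller add, delete, or modify dimensions for one comparison without changing the shared configuration.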
  • the M dimensions may also be determined based on the above-obtained alarm information.
  • for example, the alarm information includes an alarm list and the location where the alarm may occur, such as which device or port.
  • the alarm type can be determined according to the alarm list, and the network location can be determined according to the location where the alarm may occur, for example, whether it is on the network side or the user side.
  • the fault monitoring device can then use the two dimensions of network location and alarm type as the dimensions for comparing similar faults.
  • this is only an exemplary introduction.
  • the information specifically included in the alarm information and the dimension determined according to the information are determined according to actual conditions, and this solution does not impose restrictions on this.
  • the number of dimensions can be adjusted according to actual needs. Adjustments, such as adding, deleting, or modifying these preset dimensions. The specific added, reduced or modified dimensions are determined according to actual conditions, and this solution does not impose restrictions on this.
  • The fault monitoring device can obtain the fault diagnosis information of each dimension of the fault to be processed from the alarm information, and select from it the fault diagnosis information of the M dimensions.
  • That is, the fault monitoring device first obtains the fault diagnosis information of the fault to be processed in every dimension according to the alarm information, then determines the M dimensions, and then selects the fault diagnosis information of the M dimensions from the fault diagnosis information of all dimensions.
  • The fault monitoring device may instead first determine the M dimensions according to the alarm information, and then obtain the fault diagnosis information for only those M dimensions according to the alarm information. The specific implementation is determined according to the actual situation, and this solution does not impose restrictions on this.
  • Step 203: Process the fault diagnosis information of the M dimensions to obtain the feature of the fault to be processed.
  • The fault monitoring device can process the fault diagnosis information of the M dimensions to obtain the feature of the fault to be processed.
  • The processing may include extracting a feature from each of the M dimensions and combining the features extracted from the dimensions into the feature of the fault to be processed.
  • The following exemplarily introduces how the feature of the fault to be processed is obtained from the fault diagnosis information of the M dimensions.
  • Digitizing information, in simple terms, means using numbers to represent specific textual information.
  • Take the network location dimension as an example: the network location includes the user side and the network side. After the information of the network location dimension is digitized, one number such as 0 can represent the user side, and another number such as 1 can represent the network side.
  • Take the network ring/chain dimension as another example: the network ring/chain includes the access ring, access chain, convergence ring, and core ring. After the information of this dimension is digitized, 0, 1, 2, and 3 can denote the access ring, access chain, convergence ring, and core ring, respectively.
  • For the digitization of the other dimensions, refer to the digitization examples of these two dimensions; details are not repeated here. The specific numbers used to digitize the dimension information are determined according to the actual situation, and this solution does not impose restrictions on this.
  • For example, the user side is associated with the number 0, the network side is associated with the number 1, and so on.
  • Suppose the three dimensions of network location, network ring/chain, and alarm type are determined as the comparison dimensions of similar faults, and the fault diagnosis information obtained for the fault to be processed in these dimensions is network side, core ring, and line alarm, respectively.
  • The digitized number of the network side is 1, that of the core ring is 3, and that of the line alarm is 0. The feature values of the fault to be processed in the network location, network ring/chain, and alarm type dimensions are therefore 1, 3, and 0, respectively.
  • These feature values can be combined into a feature vector, which can be called the feature of the fault to be processed.
  • Because the feature values of the network location, network ring/chain, and alarm type dimensions of the fault to be processed are 1, 3, and 0, respectively, the feature vector can be written as A = [1,3,0].
  • The vector A can also be expressed as [3,1,0], and so on. That is, the order of the feature values corresponding to the M dimensions can be determined according to the actual situation, which is not limited in this solution.
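The digitization and combination steps above can be illustrated with a small lookup table. The sketch below is an assumption-laden example: the table `CODES`, the function `encode_fault`, and the dimension identifiers are hypothetical names, while the numbers themselves (user side 0, network side 1, access ring 0 through core ring 3, line alarm 0) are the ones used in the text's example.

```python
# Assumed code tables; the numbers follow the examples in the text.
CODES = {
    "network_location": {"user side": 0, "network side": 1},
    "network_ring_chain": {"access ring": 0, "access chain": 1,
                           "convergence ring": 2, "core ring": 3},
    "alarm_type": {"line alarm": 0, "device out-of-service alarm": 1},
}


def encode_fault(diagnosis, dimensions):
    """Map per-dimension diagnosis text to numbers.

    The order of `dimensions` fixes the order of the feature values, so the
    same order must be used for the pending fault and every historical fault.
    """
    return [CODES[dim][diagnosis[dim]] for dim in dimensions]


DIMENSIONS = ["network_location", "network_ring_chain", "alarm_type"]
pending = {
    "network_location": "network side",
    "network_ring_chain": "core ring",
    "alarm_type": "line alarm",
}
vector_a = encode_fault(pending, DIMENSIONS)
print(vector_a)  # [1, 3, 0]
```

Passing the dimensions in a different order, for example ring/chain first, yields [3, 1, 0], which matches the remark that the order is a free choice as long as it is applied consistently.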
  • Step 204: Calculate multiple similarities based on the feature of the fault to be processed and the features of multiple historical faults; the multiple similarities represent the degree of similarity between the fault to be processed and each of the multiple historical faults.
  • The database of the fault monitoring device stores information about historical faults.
  • This information may include, for each historical fault, information in multiple dimensions, details of the corresponding solution, and so on.
  • The fault monitoring device may obtain the information about each historical fault from the database and extract the feature of each historical fault according to the determined M dimensions.
  • The feature of a historical fault is extracted in the same way as the feature of the fault to be processed; refer to the corresponding description above, which is not repeated here. Note that when the feature values of the dimensions of a historical fault are combined into a feature vector, they are combined in the same order as the feature values of the fault to be processed. For ease of understanding, an example follows:
  • Suppose the information extracted from the network location, network ring/chain, and alarm type dimensions of a certain historical fault is network side, core ring, and device out-of-service alarm, respectively.
  • The numbers associated with the network side and the core ring are still 1 and 3, respectively, and the number associated with the device out-of-service alarm is 1.
  • The feature vector of this historical fault is then B = [1,3,1].
  • After extracting the feature vector of each historical fault in the database, the fault monitoring device calculates the similarity between the feature vector of the fault to be processed and the feature vector of each historical fault, obtaining multiple similarities.
  • The similarity between the fault to be processed and a historical fault may be calculated using algorithms such as cosine similarity or term frequency-inverse document frequency (TF-IDF).
  • The principle of the cosine similarity algorithm is to treat two feature vectors as two line segments in space.
  • Both line segments start from the origin and point in different directions, so an angle is formed between them.
  • FIG. 3 shows a schematic diagram of two feature vectors A and B.
  • The two vectors start from the same point and point in different directions, and the angle between them is θ.
  • The following is an example based on FIG. 3 and the preceding examples, with A = [1,3,0] and B = [1,3,1]: $\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{1 \times 1 + 3 \times 3 + 0 \times 1}{\sqrt{1^2 + 3^2 + 0^2} \, \sqrt{1^2 + 3^2 + 1^2}} = \frac{10}{\sqrt{10} \, \sqrt{11}} \approx 0.95$.
  • The closer the cosine value is to 1, the closer the angle is to 0 degrees, and the more similar the two vectors are. The cosine of the angle between the two vectors can therefore be used as their similarity, so the similarity of vector A and vector B is 0.95, or 95%. Since vector A represents the fault to be processed and vector B represents a certain historical fault, the similarity between the fault to be processed and that historical fault is 95%. Using this algorithm, the similarity between the fault to be processed and each historical fault in the database can be calculated to obtain multiple similarities.
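The cosine calculation described above can be reproduced in a few lines. This is a minimal sketch using the feature values 1, 3, 0 of the fault to be processed and the historical vector [1,3,1] from the text; the function name `cosine_similarity` and the handling of an all-zero vector are assumptions made for the illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an all-zero vector has no direction, so report 0.0.
        return 0.0
    return dot / (norm_a * norm_b)


similarity = cosine_similarity([1, 3, 0], [1, 3, 1])
print(round(similarity, 2))  # 0.95
```

Because the result depends on which numbers were assigned during digitization, the same code table has to be used for the fault to be processed and for every historical fault.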
  • If the historical fault information stored in the database of the fault monitoring device also includes the user's evaluation of the solution to the corresponding historical fault and/or the user's evaluation of the fault itself, the fault monitoring device can use this evaluation information to filter which historical faults take part in the similarity calculation with the fault to be processed.
  • Depending on the evaluation, the corresponding historical fault can be selected as one that takes part in the similarity calculation with the fault to be processed. If the evaluation indicates that the fault did not affect the processing of any service, for example a fault on unused ports or equipment on the user side, the corresponding fault may be excluded from the similarity calculation.
  • Step 205: Screen out similar faults from the multiple historical faults according to the multiple similarities, and recommend the similar faults.
  • The fault monitoring device can sort the similarities in descending order, select the top N similarities, and recommend the historical faults associated with them to the user; these N historical faults can be referred to as similar faults of the fault to be processed.
  • Manner 1: The top N similarities are all those greater than a preset threshold. For example, suppose the preset threshold is 50% and the calculated similarities are 95%, 80%, 65%, 57%, and 20%. After comparison with the threshold, the historical faults associated with the similarities 95%, 80%, 65%, and 57% can be recommended to the user. For example, see FIG. 4.
  • FIG. 4 exemplarily shows a page recommending the top N historical faults.
  • The recommendation page may include information such as similarity, fault level, fault number, fault type, maintenance suggestion, fault status, and corresponding creation time.
  • The page is not limited to this information; it can also include other information such as the number of alarms, the number of affected services, and the network location.
  • The specific information included is determined according to the specific situation.
  • The maintenance suggestions given for each recommended fault in FIG. 4 may be suggestions based on the actual solutions to the corresponding historical faults.
  • The details of a maintenance suggestion can be displayed in response to a click on the view control 401.
  • The details of the maintenance suggestion can also be displayed directly in FIG. 4, without the view control 401.
  • The actual solutions corresponding to each historical fault may also be displayed in FIG. 4 for the user's reference.
  • The fault monitoring device can also provide users with a query entry for the actual solutions of the recommended historical faults. Users can look up the actual solutions of these historical faults for reference, which greatly improves the efficiency of handling the fault to be processed.
  • The recommendation reason for each historical fault can also be shown on the recommendation page.
  • The recommendation reasons can be displayed on the page through the control 402, and can include content such as similarity analysis and handling-experience similarity analysis.
  • Manner 2: The value of N is preset.
  • For example, N can be preset to 10, 15, or 20. That is, the similarities obtained above are sorted in descending order, and the historical faults associated with the top N similarities, such as the top 10, top 15, or top 20, are selected and recommended to the user.
  • The specific value of N is not limited in this solution.
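The two screening manners (threshold-based and fixed N) can be sketched side by side. In this hedged example the function names `screen_by_threshold` and `screen_top_n` and the fault identifiers F1 to F5 are hypothetical; the similarity values and the 50% threshold are taken from the example in Manner 1.

```python
def screen_by_threshold(similarities, threshold):
    """Manner 1: keep faults whose similarity exceeds the threshold, highest first."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [(fault_id, score) for fault_id, score in ranked if score > threshold]


def screen_top_n(similarities, n):
    """Manner 2: keep the n faults with the highest similarity."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical fault identifiers mapped to the similarities from the text.
scores = {"F1": 0.95, "F2": 0.80, "F3": 0.65, "F4": 0.57, "F5": 0.20}

above_threshold = screen_by_threshold(scores, threshold=0.50)
top_two = screen_top_n(scores, n=2)
```

With these inputs, the threshold variant keeps the four faults above 50% and drops the 20% one, while the fixed-N variant returns exactly the requested number regardless of how low the tail similarities are.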
  • The fault monitoring device may also receive users' evaluation information on the recommended historical faults, which may be used as reference information for subsequent historical fault recommendations.
  • A user can evaluate a recommended historical fault through actions such as "like" or "dislike". "Like" means that the user is satisfied with the recommended historical fault and gives a positive review, and "dislike" means that the user is not satisfied with it and gives a negative review.
  • The user's evaluation of the recommended historical faults can also be given by scoring, giving "stars", and so on.
  • For scoring, the full score can be 10 points or 100 points; the higher the score, the higher the satisfaction.
  • For stars, the maximum can be five stars; the more stars given, the higher the satisfaction.
  • The user's evaluation of the recommended historical faults may also take other forms, which can be determined according to actual conditions.
  • After receiving the user's evaluation information, the fault monitoring device associates it with the corresponding historical fault and keeps statistics, that is, records the number of positive and negative reviews of that historical fault. In subsequent recommendations, historical faults with better evaluations, such as those with more positive reviews, can be displayed nearer the top of the recommendation page, while historical faults with poor evaluations, such as those with no positive reviews or with more negative reviews, can be displayed nearer the bottom of the page, or not recommended at all. The following takes the "like" evaluation method as an example; see FIG. 5.
  • The historical fault recommendation page shown in FIG. 5 also includes the number of likes 501 of each historical fault and the like operation control 502. Historical faults with more likes are displayed nearer the top of the page; that is, the fault monitoring device arranges the recommended historical faults on the page from top to bottom in descending order of their number of likes.
  • The like operation control 502 in FIG. 5 is used by the user to give a like, and the fault monitoring device receives the user's like information through it.
  • By using users' evaluations when recommending historical faults, the embodiment of the present application can incorporate user feedback to make more accurate recommendations, further improving the efficiency of handling the fault to be processed.
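The feedback loop described above, counting likes and dislikes per historical fault and ordering the page by likes, can be sketched as follows. The class `FeedbackStore`, its method names, and the fault numbers other than 984 are hypothetical; using similarity as a tie-breaker between faults with equal likes is an assumption, since the text only states that faults with more likes are shown first.

```python
from collections import Counter


class FeedbackStore:
    """Keeps per-fault counts of positive and negative reviews."""

    def __init__(self):
        self.likes = Counter()
        self.dislikes = Counter()

    def record(self, fault_id, liked):
        """Associate one like or dislike with a historical fault."""
        if liked:
            self.likes[fault_id] += 1
        else:
            self.dislikes[fault_id] += 1

    def order_for_display(self, recommended):
        """Order (fault_id, similarity) pairs by likes, then similarity, descending."""
        return sorted(
            recommended,
            key=lambda pair: (self.likes[pair[0]], pair[1]),
            reverse=True,
        )


store = FeedbackStore()
store.record("984", liked=True)
store.record("984", liked=True)
store.record("501", liked=True)
store.record("777", liked=False)

page = store.order_for_display([("777", 0.95), ("501", 0.80), ("984", 0.65)])
```

Here fault 984 moves to the top of the page despite having the lowest similarity, because it has collected the most likes. A deployment could equally use the dislike counts to push a fault down or suppress it, as the text allows.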
  • The fault monitoring device may also compare the fault to be processed with a historical fault one dimension at a time, based on the M dimensions. When the user wants the comparison details for a single dimension, the fault monitoring device can display the details of the single-dimension comparison result on the display interface for the user to read and refer to. An example follows for ease of understanding.
  • Suppose the fault to be processed is the fault numbered 150672, and the fault similar to it is the historical fault 984.
  • FIG. 6 exemplarily shows information about the fault to be processed 150672, such as the fault type, the number of alarms, the types of services affected, the protection type, the number of services affected, and the time when the fault was created. For the similar fault, it gives the fault number, the fault diagnosis dimensions, the similarity comparison details, the severity level of the historical fault, and the creation time of the similar historical fault.
  • The fault diagnosis dimension information indicates that the fault to be processed 150672 and the historical fault 984 are compared mainly in the three dimensions of network location, network ring/chain, and alarm type.
  • The similarity comparison details give the single-dimension comparison for these three dimensions, for example, "both involve the network side and the core ring; fault 150672 is related to a line alarm, and fault 984 is related to a device out-of-service alarm".
  • By providing single-dimension comparison details, the embodiment of the present application helps users quickly find a solution to the fault to be processed, further improving resolution efficiency.
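The single-dimension comparison can be illustrated by comparing the diagnosis information of two faults dimension by dimension. In this sketch the function `compare_by_dimension` and the dictionary layout are hypothetical; the values are those given in the text for fault 150672 and historical fault 984.

```python
def compare_by_dimension(pending, historical, dimensions):
    """Return one comparison record per dimension for two faults."""
    details = []
    for dim in dimensions:
        details.append({
            "dimension": dim,
            "pending": pending[dim],
            "historical": historical[dim],
            "same": pending[dim] == historical[dim],
        })
    return details


DIMS = ["network_location", "network_ring_chain", "alarm_type"]
fault_150672 = {"network_location": "network side",
                "network_ring_chain": "core ring",
                "alarm_type": "line alarm"}
fault_984 = {"network_location": "network side",
             "network_ring_chain": "core ring",
             "alarm_type": "device out-of-service alarm"}

details = compare_by_dimension(fault_150672, fault_984, DIMS)
for row in details:
    verdict = "same" if row["same"] else "different"
    print(f'{row["dimension"]}: {row["pending"]} vs {row["historical"]} ({verdict})')
```

The records show agreement on network location and ring/chain and a difference in alarm type, which is the content a page such as FIG. 6 would present as comparison details.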
  • FIG. 7 is a schematic diagram of the software architecture used in the fault monitoring device to provide similar fault recommendation.
  • The fault monitoring device includes a similar fault recommendation service and a front-end user interface (UI).
  • The recommendation service may include a recommendation component and a database, and the recommendation component may include a feature processing unit, a calculation unit, a sorting unit, a screening unit, and so on.
  • The database can be used to store a list of historical faults, a list of detailed information of historical faults, and a list of recommended historical faults.
  • The process of implementing similar fault recommendation based on this software architecture includes but is not limited to the following steps:
  • 1. The feature processing unit obtains alarm information from the Kafka component.
  • 2. The feature processing unit queries the historical fault list and the historical fault detail list in the database to find the historical faults, and their details, to be used for similarity comparison.
  • 3. The feature processing unit processes the information obtained in steps 1 and 2 to obtain the features of the fault to be processed and of the historical faults.
  • 4. The feature processing unit sends the features of the fault to be processed and of the historical faults to the calculation unit.
  • 5. The calculation unit calculates the similarity between the fault to be processed and each historical fault according to the features.
  • 6. The calculation unit sends the calculated similarities to the sorting unit.
  • 7. The sorting unit sorts the calculated similarities in descending order.
  • 8. The sorting unit sends the sorted similarities to the screening unit.
  • 9. The screening unit selects the top N similarities from the sorted similarities.
  • 10. The screening unit stores the N historical faults associated with the selected similarities in the database, obtaining the recommended historical fault list.
  • 11. The front-end UI queries the recommended historical fault list in the database and displays the information of the list on the user interface.
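The calculation, sorting, and screening units in the steps above can be sketched as one small pipeline. This is an illustration under stated assumptions rather than the architecture of FIG. 7: the Kafka component and the database are replaced by plain Python values, the function names are hypothetical, and the historical faults 1001 and 1002 and their vectors are made up for the example (984 and its vector come from the text).

```python
import math


def cosine(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def recommend(pending_vector, historical_vectors, top_n):
    """Run the calculation, sorting, and screening stages in order."""
    # Calculation unit: one similarity per historical fault.
    scored = [(fault_id, cosine(pending_vector, vector))
              for fault_id, vector in historical_vectors.items()]
    # Sorting unit: descending similarity.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Screening unit: keep the top N entries.
    return scored[:top_n]


# Stand-in for the historical fault list that would come from the database.
history = {
    "984": [1, 3, 1],
    "1001": [0, 1, 0],   # hypothetical
    "1002": [0, 0, 1],   # hypothetical
}

recommended = recommend([1, 3, 0], history, top_n=2)
```

The returned list plays the role of the recommended historical fault list that the screening unit stores and the front-end UI later reads.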
  • In summary, this solution recommends similar faults through multi-dimensional similarity comparison, which can improve the accuracy of recommendation and provide an effective reference for resolving the fault to be processed, thereby improving the efficiency of network maintenance.
  • Each device includes a hardware structure and/or software module corresponding to each function.
  • The present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
  • The embodiment of the present application may divide the fault monitoring device into functional modules according to the foregoing method examples.
  • For example, each functional module may correspond to one function, or two or more functions may be integrated into one module.
  • The integrated modules can be implemented in the form of hardware or of software functional modules. Note that the division of modules in the embodiments of the present application is illustrative and is only a division by logical function; other divisions are possible in actual implementation.
  • FIG. 8 shows a schematic diagram of a possible logical structure of the device, which may be the above-mentioned fault monitoring device.
  • The device 800 includes an obtaining unit 801, a processing unit 802, a calculation unit 803, a screening unit 804, and a recommending unit 805.
  • The obtaining unit 801 is configured to perform the steps of obtaining information in the foregoing method embodiment, for example, the steps of obtaining alarm information and fault diagnosis information.
  • The processing unit 802 is configured to perform the step, in the foregoing method embodiment, of processing the fault diagnosis information of the M dimensions to obtain the feature of the fault to be processed.
  • The calculation unit 803 is configured to perform the similarity calculation step in the foregoing method embodiment.
  • The screening unit 804 is configured to perform the screening step in the foregoing method embodiment, for example, the step of screening out similar historical faults.
  • The recommending unit 805 is configured to perform the recommendation step in the foregoing method embodiment, for example, the step of recommending similar historical faults.
  • The device 800 further includes a determining unit configured to perform the step of determining the M dimensions in the foregoing method embodiment.
  • The device 800 further includes an adjustment unit configured to perform the step of adjusting the dimensions in the foregoing method embodiment.
  • The device 800 further includes a receiving unit configured to perform the step of receiving evaluation information in the foregoing method embodiment.
  • FIG. 9 is a schematic diagram of a possible hardware structure of the device provided by this application.
  • The device may be the fault monitoring device described in the foregoing method embodiment.
  • The device 900 includes a processor 901, a memory 902, and a communication interface 903.
  • The processor 901, the communication interface 903, and the memory 902 may be connected to one another directly or through a bus 904.
  • The memory 902 is used to store computer programs and data of the device 900.
  • The memory 902 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM).
  • The communication interface 903 is used to support communication by the device 900, for example receiving or sending data.
  • The processor 901 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor.
  • The processor 901 may be used to read the program stored in the memory 902 and perform the operations performed by the fault monitoring device in the method described in FIG. 2 and its possible implementations.
  • The embodiment of the present application also discloses a computer-readable storage medium that stores a computer program; the computer program is executed by a processor to implement the method described in FIG. 2 and its possible implementations.
  • The embodiment of the present application also discloses a computer program product.
  • When the computer program product is read and executed by a computer, the method described in FIG. 2 and its possible implementations is executed.
  • The embodiment of the present application also discloses a computer program.
  • When the computer program is executed on a computer, it causes the computer to implement the method described in FIG. 2 and its possible implementations.
  • In summary, this solution recommends similar faults through multi-dimensional similarity comparison, which can improve the accuracy of recommendation and provide an effective reference for resolving the fault to be processed, thereby improving the efficiency of network maintenance.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

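Embodiments of this application provide a similar fault recommendation method and a related device. The method includes: obtaining alarm information of a fault to be processed; obtaining fault diagnosis information of M dimensions according to the alarm information, where the M dimensions include M different angles of fault diagnosis and M is an integer greater than 1; processing the fault diagnosis information of the M dimensions to obtain a feature of the fault to be processed; calculating multiple similarities according to the feature of the fault to be processed and features of multiple historical faults, where the multiple similarities represent the degree of similarity between the fault to be processed and each of the multiple historical faults; and screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults. The embodiments of this application can improve the accuracy of recommending similar faults.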

Description

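Similar fault recommendation method and related device
This application claims priority to Chinese Patent Application No. 202010192545.2, filed with the China National Intellectual Property Administration on March 18, 2020 and entitled "Similar fault recommendation method and related device", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of operations and maintenance (O&M) technologies, and in particular to a similar fault recommendation method and a related device.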
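Background
The conventional approach to network O&M today is to monitor the network, collect the alarms it reports, and display them on an alarm list page. Monitoring personnel or semi-automatic tools identify the alarms that may have been caused by faults and dispatch work orders, and maintenance personnel go on site to check the actual fault and repair it. The faults that commonly occur in a network are mostly hardware faults, such as device power outages, fiber cuts, hardware damage (such as board faults or optical module faults), or faults in which components age through long use or a harsh environment and therefore raise alarms frequently. A small number of faults are logical or software faults, for example too many services running on a ring so that the bandwidth limit is exceeded, incorrectly configured service parameters, or poorly partitioned network resources that make local congestion likely.
When the network is very large, similar faults may occur repeatedly at different locations in the network; for example, at different cell locations and at different times, lines are dug up or a poor power supply causes devices to lose power and go out of service. During fault monitoring (or alarm monitoring), monitoring personnel handle each actual fault in isolation. When the experience of similarity between faults is not effectively accumulated and summarized by the network management system into system knowledge, monitoring personnel keep repeating the same work.
In an existing technical solution, the artificial intelligence for IT operations (AIOps) management software provided by Moogsoft matches similarity according to the proportion of alarms and recommends the historical faults with higher matching proportions; the solution to such a historical fault can then be consulted to resolve the current fault. However, judging the degree of similarity of two faults only by comparing the proportion of alarms they have in common makes the recommended historical faults inaccurate, so that the solutions of the recommended historical faults are of little reference value.
In summary, how to solve the problem that recommended historical faults are inaccurate and their solutions are of little reference value is a problem that those skilled in the art urgently need to solve.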
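Summary
Embodiments of this application disclose a similar fault recommendation method and a related device. This application recommends similar faults through multi-dimensional comparison, which can improve recommendation accuracy and provide an effective reference for resolving the fault to be processed, thereby improving network maintenance efficiency.
According to a first aspect, this application discloses a similar fault recommendation method, including:
obtaining alarm information of a fault to be processed;
obtaining fault diagnosis information of M dimensions according to the alarm information, where the M dimensions include M different angles of fault diagnosis, and M is an integer greater than 1;
processing the fault diagnosis information of the M dimensions to obtain a feature of the fault to be processed;
calculating multiple similarities according to the feature of the fault to be processed and features of multiple historical faults, where the multiple similarities represent the degree of similarity between the fault to be processed and each of the multiple historical faults; and
screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults.
Compared with existing technical solutions that recommend similar faults by comparing a single dimension, this solution recommends similar faults through multi-dimensional similarity comparison, which can improve recommendation accuracy and provide an effective reference for resolving the fault to be processed, thereby improving network maintenance efficiency.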
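In a possible implementation, after the obtaining alarm information of a fault to be processed and before the obtaining fault diagnosis information of M dimensions according to the alarm information, the method further includes:
obtaining, according to the alarm information, the network type of the network where the fault to be processed is located; and determining the M dimensions according to the network type.
This application sets the multiple dimensions used for similarity calculation based on the type of the network where the fault is located, which is well targeted and flexible, and further helps improve recommendation accuracy.
In a possible implementation, the processing the fault diagnosis information of the M dimensions to obtain a feature of the fault to be processed includes:
performing feature extraction on each of the M dimensions according to the fault diagnosis information of the M dimensions, to obtain a feature value of each dimension; and
combining the feature values of the dimensions to obtain the feature of the fault to be processed, where the feature of the fault to be processed is a feature vector.
This application combines the features of multiple dimensions into one feature vector to calculate similarity, which simplifies calculation and improves calculation efficiency.
In a possible implementation, before the obtaining fault diagnosis information of M dimensions according to the alarm information, the method further includes: adjusting one or more of the preset dimensions to obtain the M dimensions.
Different combinations of dimensions yield different similarities. In this application, the dimensions used for similarity calculation can be adjusted flexibly, which helps improve the accuracy of the final recommendation.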
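In a possible implementation, the screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults includes:
screening out N historical faults from the multiple historical faults according to the multiple similarities, and recommending the N historical faults, where the N historical faults are the historical faults associated with the top N similarities among the multiple similarities, the multiple similarities are sorted in descending order, and N is a positive integer.
This application selects historical faults with higher similarity for recommendation, and can thus provide an effective reference for the fault to be processed.
In a possible implementation, after the screening out similar faults from the multiple historical faults according to the multiple similarities, the method further includes:
recommending comparison information between the similar faults and the fault to be processed, where the comparison information includes how the similar faults compare with the fault to be processed in each of the M dimensions.
By recommending the comparison details between faults, this application helps provide O&M personnel with further reference information.
In a possible implementation, the multiple historical faults are faults obtained after filtering from a database.
Because the number of historical faults stored in the database is huge, filtering before the similarity calculation saves computing resources and improves calculation efficiency, so that the user's needs can be responded to more quickly and the historical faults the user needs can be recommended.
In a possible implementation, after the screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults, the method further includes:
receiving evaluation information on the recommended historical faults, where the evaluation information is used as reference information for subsequent historical fault recommendation.
In this application, the evaluation information on the recommended historical faults can be used as a reference for subsequent historical fault recommendation, which can further improve recommendation efficiency and make the recommended historical faults better match the user's needs.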
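According to a second aspect, this application discloses a device, including:
an obtaining unit, configured to obtain alarm information of a fault to be processed;
where the obtaining unit is further configured to obtain fault diagnosis information of M dimensions according to the alarm information, the M dimensions include M different angles of fault diagnosis, and M is an integer greater than 1;
a processing unit, configured to process the fault diagnosis information of the M dimensions to obtain a feature of the fault to be processed;
a calculation unit, configured to calculate multiple similarities according to the feature of the fault to be processed and features of multiple historical faults, where the multiple similarities represent the degree of similarity between the fault to be processed and each of the multiple historical faults;
a screening unit, configured to screen out similar faults from the multiple historical faults according to the multiple similarities; and
a recommending unit, configured to recommend the similar faults.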
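In a possible implementation, the obtaining unit is further configured to, after obtaining the alarm information of the fault to be processed and before obtaining the fault diagnosis information of the M dimensions according to the alarm information, obtain, according to the alarm information, the network type of the network where the fault to be processed is located;
the device further includes a determining unit, configured to determine the M dimensions according to the network type.
In a possible implementation, the processing unit is specifically configured to:
perform feature extraction on each of the M dimensions according to the fault diagnosis information of the M dimensions, to obtain a feature value of each dimension; and
combine the feature values of the dimensions to obtain the feature of the fault to be processed, where the feature of the fault to be processed is a feature vector.
In a possible implementation, the device further includes an adjustment unit, configured to adjust one or more of the preset dimensions to obtain the M dimensions before the obtaining unit obtains the fault diagnosis information of the M dimensions according to the alarm information.
In a possible implementation, the screening unit is specifically configured to screen out N historical faults from the multiple historical faults according to the multiple similarities;
the recommending unit is specifically configured to recommend the N historical faults, where the N historical faults are the historical faults associated with the top N similarities among the multiple similarities, the multiple similarities are sorted in descending order, and N is a positive integer.
In a possible implementation, the recommending unit is further configured to, after the screening unit screens out similar faults from the multiple historical faults according to the multiple similarities,
recommend comparison information between the similar faults and the fault to be processed, where the comparison information includes how the similar faults compare with the fault to be processed in each of the M dimensions.
In a possible implementation, the multiple historical faults are faults obtained after filtering from a database.
In a possible implementation, the device further includes a receiving unit, configured to receive evaluation information on the recommended historical faults after the recommending unit recommends the similar faults, where the evaluation information is used as reference information for subsequent historical fault recommendation.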
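According to a third aspect, this application discloses a device, including a processor, a communication interface, and a memory, where the memory is configured to store a computer program and/or data, and the processor is configured to execute the computer program stored in the memory, so that the device performs the following operations:
obtaining alarm information of a fault to be processed;
obtaining fault diagnosis information of M dimensions according to the alarm information, where the M dimensions include M different angles of fault diagnosis, and M is an integer greater than 1;
processing the fault diagnosis information of the M dimensions to obtain a feature of the fault to be processed;
calculating multiple similarities according to the feature of the fault to be processed and features of multiple historical faults, where the multiple similarities represent the degree of similarity between the fault to be processed and each of the multiple historical faults; and
screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults.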
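In a possible implementation, after the obtaining alarm information of a fault to be processed and before the obtaining fault diagnosis information of M dimensions according to the alarm information, the device further performs the following operations:
obtaining, according to the alarm information, the network type of the network where the fault to be processed is located; and
determining the M dimensions according to the network type.
In a possible implementation, the processing the fault diagnosis information of the M dimensions to obtain a feature of the fault to be processed includes:
performing feature extraction on each of the M dimensions according to the fault diagnosis information of the M dimensions, to obtain a feature value of each dimension; and
combining the feature values of the dimensions to obtain the feature of the fault to be processed, where the feature of the fault to be processed is a feature vector.
In a possible implementation, before the obtaining fault diagnosis information of M dimensions according to the alarm information, the device further performs the following operation: adjusting one or more of the preset dimensions to obtain the M dimensions.
In a possible implementation, the screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults includes:
screening out N historical faults from the multiple historical faults according to the multiple similarities, and recommending the N historical faults, where the N historical faults are the historical faults associated with the top N similarities among the multiple similarities, the multiple similarities are sorted in descending order, and N is a positive integer.
In a possible implementation, after the screening out similar faults from the multiple historical faults according to the multiple similarities, the device further performs the following operation:
recommending comparison information between the similar faults and the fault to be processed, where the comparison information includes how the similar faults compare with the fault to be processed in each of the M dimensions.
In a possible implementation, the multiple historical faults are faults obtained after filtering from a database.
In a possible implementation, after the screening out similar faults from the multiple historical faults according to the multiple similarities, and recommending the similar faults, the device further performs the following operation:
receiving evaluation information on the recommended historical faults, where the evaluation information is used as reference information for subsequent historical fault recommendation.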
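According to a fourth aspect, an embodiment of this application discloses a computer-readable storage medium that stores a computer program, where the computer program is executed by a processor to implement the method according to any one of the implementations of the first aspect.
According to a fifth aspect, an embodiment of this application discloses a computer program product; when the computer program product is read and executed by a computer, the method according to any one of the implementations of the first aspect is executed.
According to a sixth aspect, an embodiment of this application discloses a computer program; when the computer program is executed on a computer, it causes the computer to implement the method according to any one of the implementations of the first aspect.
In summary, compared with existing technical solutions that recommend similar faults by comparing a single dimension, this solution recommends similar faults through multi-dimensional similarity comparison, which can improve recommendation accuracy and provide an effective reference for resolving the fault to be processed, thereby improving network maintenance efficiency.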
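Brief Description of Drawings
FIG. 1 is a schematic diagram of the system architecture of the similar fault recommendation method provided by an embodiment of this solution;
FIG. 2 is a schematic flowchart of the similar fault recommendation method provided by an embodiment of this solution;
FIG. 3 is a schematic diagram of a cosine angle provided by an embodiment of this solution;
FIG. 4 is a schematic diagram of a historical fault recommendation page provided by an embodiment of this solution;
FIG. 5 is a schematic diagram of another historical fault recommendation page provided by an embodiment of this solution;
FIG. 6 is a schematic diagram of another historical fault recommendation page provided by an embodiment of this solution;
FIG. 7 is another schematic flowchart of the similar fault recommendation method provided by an embodiment of this solution;
FIG. 8 is a schematic diagram of the logical structure of the device provided by an embodiment of this solution;
FIG. 9 is a schematic diagram of the hardware structure of the device provided by an embodiment of this solution.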
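Detailed Description
To better understand the similar fault recommendation method provided by the embodiments of this application, the following first describes, by way of example, a scenario to which the embodiments apply. Referring to FIG. 1, FIG. 1 is a schematic diagram of a system architecture to which the similar fault recommendation method provided by an embodiment of this application applies. As shown in FIG. 1, the system architecture may include one or more fault monitoring devices 100 (multiple devices may form a cluster) and one or more network nodes (or devices) 110, and the fault monitoring device 100 and the network nodes 110 can communicate with each other over a network. Specifically:
The fault monitoring device 100 is a device mainly used to monitor whether the network nodes 110 and lines in a network have faults. For example, the fault monitoring device 100 may be a device used by an operator to monitor whether faults occur in its network, or a device used in an enterprise to monitor whether faults occur in the enterprise's own network. Which network the fault monitoring device 100 belongs to is determined according to the actual situation, and this solution does not impose restrictions on this.
The fault monitoring device 100 needs to run corresponding applications to provide the similar fault recommendation method, such as database services, data computation, and decision execution. For example, in the embodiments of this application, the fault monitoring device 100 can detect a fault through one application and then, through another application, recommend similar historical faults for the newly occurring fault, so that the solutions of the similar historical faults can be used to resolve the new fault quickly.
The network node 110 may be a device on the network side or the user side of the network where the fault monitoring device 100 is located. The network node 110 can install and run related applications (APPs). In the embodiments of this application, when a fault or an alarm occurs, the network node 110 can send fault information, alarm information, or the like to the fault monitoring device 100 through the corresponding application.
The network node 110 may include, but is not limited to, any electronic product based on an intelligent operating system that can interact with a user through input devices such as a keyboard, a virtual keyboard, a touchpad, a touchscreen, or a voice-control device, for example a smartphone, a tablet computer, a desktop computer, a handheld computer, or a wearable electronic device. The intelligent operating system includes, but is not limited to, any operating system that enriches device functions by providing various applications to the device, such as Android, IOS, Windows, and MAC. The network node 110 may also include, but is not limited to, any network device in the network, such as a server, a hub, a switch, a bridge, a router, a gateway, a network interface card, a wireless access point, a printer, a modem, or a fiber-optic transceiver.
Note that the system architecture to which the similar fault recommendation method provided by the embodiments of this application applies is not limited to the system architecture shown in FIG. 1.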
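The following provides a similar fault recommendation method, which can be applied to the system architecture shown in FIG. 1. Referring to FIG. 2, the method includes but is not limited to the following steps:
Step 201: Obtain alarm information of a fault to be processed.
In a specific embodiment, when a fault occurs in the network (the fault may be referred to as the fault to be processed), a network node raises alarms. Based on the alarms, the network node then sends information such as the alarm list, the possible cause of the alarms, the possible location of the alarms (for example, which device or which port), and fault repair suggestions to the fault monitoring device. The information obtained by the fault monitoring device may be referred to as the alarm information of the fault to be processed. The alarm information is mainly a description of the fault to be processed; it may also include other information such as the possible fault type, and this solution does not impose restrictions on this.
Optionally, the fault monitoring device may obtain the alarm information of the fault to be processed through a Kafka system. For example, Kafka can receive the alarm information of faults to be processed sent by multiple network nodes and store it in order, and the fault monitoring device can subscribe to this information to obtain the alarm information of the faults to be processed.
Step 202: Obtain fault diagnosis information of M dimensions according to the alarm information.
First, the concept of a dimension is introduced. In a specific embodiment, a fault can be diagnosed from multiple angles so that it can be resolved quickly; these angles of fault diagnosis may be referred to as dimensions. The M dimensions may thus include M different angles of fault diagnosis, where M is an integer greater than 1. Some commonly used fault diagnosis dimensions are listed below in Table 1, but the dimensions are not limited to those listed in Table 1.
Table 1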
Figure PCTCN2021078895-appb-000001
表1中示例性列出了多个故障诊断的维度,这些维度例如可以包括网络位置、网络环链信息、保护类型、业务信息、设备日志、性能指标数据、诊断命令响应、告警类型等维度等。表1中还给出了各个维度的一些说明以及从各个维度分别进行故障相似比较的方法。下面详细说明:
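Network location may be, for example, the network side or the user side. Whether two faults may be similar can be judged from the network location where they occur: if both faults occur on the network side, or both occur on the user side, the two faults may be similar.
Network ring/chain may be an access ring, an access chain, an aggregation ring, a core ring, and so on. Whether two faults may be similar can be judged from the ring or chain where they occur. If the two faults occur in the same ring or chain, for example in the same access ring, they may be similar. Alternatively, if the two faults occur in rings or chains of the same type, for example both in access rings of the packet transport network (PTN) 950 type but not in the same access ring, they may also be similar.
Protection type may be no protection, APS protection, Tunnel protection, and so on. Protection is used to reduce or even avoid the loss that a fault causes to the network. Whether two faults may be similar can be judged from whether a given type of protection is in place after the fault occurs. If both faults have the same type of protection, for example both have APS protection or neither has protection, they may be similar.
Service information covers two aspects. The first covers the specific services involved and how they are impaired; the second covers the service types involved and the distribution of service impairment across those service types.
For the first aspect, whether two faults may be similar can be judged from the proportion of services that the two faults have in common. For example, a ratio threshold can be preset, and when the ratio of the number of shared services to the total number of involved services exceeds the threshold, the two faults may be similar. Suppose the threshold is 0.6, the first fault involves 10 services, the second fault involves 15 services, and 8 services are common to both. Counting each shared service once per fault, the ratio is (8+8)/(10+15)=0.64, which exceeds the threshold 0.6, so the two faults may be similar. The specific impairment of the services, for example service degradation versus service interruption, can also be taken into account: if the shared services are also impaired in the same way, the two faults may be similar.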
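The shared-service ratio in the worked example can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the `svc-N` identifiers and the handling of the empty case are assumptions made for the example; only the formula (8+8)/(10+15) and the 0.6 threshold come from the text.

```python
def shared_service_ratio(services_a, services_b):
    """Return the share of involved services that two faults have in common.

    Each shared service is counted once per fault, which matches the
    worked example (8 + 8) / (10 + 15).
    """
    a, b = set(services_a), set(services_b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # assumption: no services involved means no evidence of similarity
    return 2 * len(a & b) / total


def possibly_similar(services_a, services_b, threshold=0.6):
    """Apply the preset ratio threshold (0.6 in the worked example)."""
    return shared_service_ratio(services_a, services_b) > threshold


# Hypothetical service identifiers: 10 and 15 services, 8 of them shared.
first_fault = [f"svc-{i}" for i in range(10)]       # svc-0 .. svc-9
second_fault = [f"svc-{i}" for i in range(2, 17)]   # svc-2 .. svc-16
print(round(shared_service_ratio(first_fault, second_fault), 2))  # 0.64
print(possibly_similar(first_fault, second_fault))                # True
```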
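For the second aspect, whether two faults may be similar can be judged by comparing the types of the services they involve. If the service types are the same or similar, the two faults may be similar. The distribution of impairment across service types can also be taken into account, for example the service impairment rate for the same or similar service types: if the impairment rates are close, the two faults may be similar.
Device logs are the logs of the devices involved in a fault. Whether two faults may be similar can be judged by comparing the logs of the devices they involve. If the logs are similar, the two faults may be similar.
Performance indicator data may include various parameters that measure how well a device or network performs, such as the packet loss rate. Whether two faults may be similar can be judged by comparing how the performance indicators of the involved devices or networks change over time. Such trends include, for example, a gentle rise or fall, a rapid rise or fall, a cliff-like drop, oscillation, and a flat or fluctuating level. For example, if the packet loss rate of the devices involved in both faults rises rapidly, the two faults may be similar.
Diagnostic command response is how the devices involved in a fault respond to diagnostic commands. Diagnostic commands include, for example, commands that query the overall running state of a device. Whether two faults may be similar can be judged by comparing the responses of the devices involved in the two faults to the same diagnostic command. If the responses are similar or identical, the two faults may be similar.
Alarms are usually numerous, and operations and maintenance personnel customarily classify them. Common alarm types include line alarms, device out-of-service alarms, board alarms, service alarms, protection degradation alarms, interruption alarms, fan alarms, temperature alarms, traffic threshold-crossing alarms, optical function anomaly alarms, bit error alarms, switchover alarms and report-back alarms. Whether two faults may be similar can be judged by comparing the alarm types of their alarm information. If the alarm types are similar or identical, the two faults may be similar.
It should be noted that some of the information in these dimensions can be obtained from the network nodes (for example, the alarm type), some can be obtained by the fault monitoring device through queries based on the alarm information (for example, service information and device logs), and some can be inferred from existing information (for example, network location and network ring/chain). How this information is obtained depends on the actual situation and is not limited in this solution.
In addition, the comparison methods described for Table 1 each compare faults within a single dimension. Similar faults found from a single dimension may deviate considerably from the pending fault and offer little reference value. This solution therefore compares faults across multiple dimensions, which reduces the deviation between the similar faults found and the pending fault, improves the accuracy of similar-fault recommendation, and gives operations and maintenance personnel useful reference information.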
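With the dimensions introduced, the following describes how the fault diagnosis information in M dimensions is obtained according to the alarm information.
In a specific embodiment, after obtaining the alarm information, the fault monitoring device can use it to obtain the fault diagnosis information of the pending fault in each dimension, for example in the dimensions of Table 1. The fault diagnosis information is the information that actually describes the pending fault in each dimension, for example the information in the dimension descriptions of Table 1, and it can be used to diagnose the fault type and the specific location of the pending fault. Taking the network location dimension as an example, if the pending fault occurs on the network side, then "network side" is the fault diagnosis information of the pending fault in the network location dimension.
Because the alarm information includes the alarm list, the possible causes of the alarms, the possible location of the alarms (for example, which device or which port) and fault repair suggestions, the fault monitoring device can derive from it the fault diagnosis information of the pending fault in each dimension. For example, the specific alarm type can be determined from the alarm list, and the specific network location (network side or user side) can be determined from the possible location of the alarms. Fault diagnosis information for other dimensions can be obtained by querying a database according to the alarm information, for example the specific service information and device logs involved in the pending fault. These are only examples; how the fault diagnosis information in these dimensions is obtained depends on the actual situation and is not limited in this solution.
After obtaining the fault diagnosis information of the pending fault in each dimension, the fault monitoring device can further determine the M dimensions used for similarity calculation. Two ways of determining the M dimensions are described below by way of example.
Way 1: The fault monitoring device determines, based on the received alarm information, the network type of the network where the pending fault is located, and then determines the M dimensions according to that network type. Different network types may correspond to different M dimensions or to the same M dimensions, depending on the actual situation.
Optionally, the M dimensions for each network type may be preconfigured. For example, for the two network types packet transport network (PTN) and optical transport network (OTN), the M dimensions preconfigured for a PTN network may be network location, network ring/chain, service information and alarm type, and the M dimensions preconfigured for an OTN network may be network location, network ring/chain, service information, performance indicator data and alarm type. This is only an example. Other network types exist, the classification of networks is not limited and depends on the actual situation, and the M dimensions can be preconfigured for each classified network type.
Optionally, after determining the network type of the network where the pending fault is located, the fault monitoring device may adaptively select M dimensions from a preset dimension list (for example, Table 1) according to the characteristics of that network type and use them as the dimensions for similar-fault comparison. Which dimensions are selected depends on the characteristics of the network type and is not limited in this solution.
Way 2: The M dimensions are not determined per network type. Instead, a preset set of M dimensions is used uniformly for similar-fault comparison. For example, regardless of the network type, the dimensions network location, network ring/chain, protection type, service information, device logs, performance indicator data, diagnostic command response and alarm type are used.
Alternatively, the M dimensions may be determined from the obtained alarm information. For example, if the alarm information includes the alarm list and the possible location of the alarms (which device or which port), the alarm type can be determined from the alarm list and the network location (network side or user side) from the possible location of the alarms, so the fault monitoring device can use network location and alarm type as the two dimensions for similar-fault comparison. This is only an example; what the alarm information contains and which dimensions are determined from it depend on the actual situation and are not limited in this solution.
It should be noted that, whether the M dimensions are preset per network type, preset uniformly, or determined from the alarm information, the set of dimensions can be adjusted as needed, for example by adding, deleting or modifying preset dimensions. Which dimensions are added, deleted or modified depends on the actual situation and is not limited in this solution.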
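One way to picture the dimension selection is a small lookup with a uniform fallback and optional adjustments. The sketch below assumes a Python dictionary as the configuration store; the identifier names (`select_dimensions`, `network_ring_chain` and so on) are hypothetical labels for the dimensions named in the text, and the PTN and OTN entries mirror the preconfiguration example only.

```python
# Illustrative identifiers for the fault-diagnosis dimensions named in the text.
ALL_DIMENSIONS = [
    "network_location", "network_ring_chain", "protection_type", "service_info",
    "device_log", "performance_metrics", "diagnostic_command_response", "alarm_type",
]

# Way 1: dimensions preconfigured per network type (PTN/OTN example).
DIMENSIONS_BY_NETWORK_TYPE = {
    "PTN": ["network_location", "network_ring_chain", "service_info", "alarm_type"],
    "OTN": ["network_location", "network_ring_chain", "service_info",
            "performance_metrics", "alarm_type"],
}


def select_dimensions(network_type=None, add=(), remove=()):
    """Pick the M comparison dimensions, then apply optional adjustments.

    Unknown or missing network types fall back to the uniform preset list
    (Way 2). M must stay greater than 1.
    """
    base = DIMENSIONS_BY_NETWORK_TYPE.get(network_type, ALL_DIMENSIONS)
    selected = [d for d in base if d not in remove]
    selected += [d for d in add if d in ALL_DIMENSIONS and d not in selected]
    if len(selected) < 2:
        raise ValueError("M must be an integer greater than 1")
    return selected


print(select_dimensions("PTN"))
print(select_dimensions("OTN", remove=("performance_metrics",)))
```

Keeping the mapping as data rather than code makes it straightforward to add, delete or modify dimensions per network type without touching the comparison logic.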
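After determining the M dimensions, the fault monitoring device can select the fault diagnosis information of those M dimensions from the per-dimension fault diagnosis information of the pending fault obtained according to the alarm information.
Optionally, in the description above the fault monitoring device first obtains the fault diagnosis information of the pending fault in every dimension according to the alarm information, then determines the M dimensions, and then selects the fault diagnosis information of those M dimensions. In an actual implementation, however, the fault monitoring device may instead first determine the M dimensions according to the alarm information and then obtain the fault diagnosis information for just those M dimensions. The specific implementation depends on the actual situation and is not limited in this solution.
Step 203: Process the fault diagnosis information in the M dimensions to obtain a feature of the pending fault.
After determining the fault diagnosis information in the M dimensions, the fault monitoring device can process it to obtain the feature of the pending fault. The processing may include extracting a feature from each of the M dimensions and combining the extracted features into the feature of the pending fault. This process is described below by way of example.
First, the fault diagnosis information in the M dimensions is converted to numbers. Put simply, numericalization means using numbers to represent specific textual information. For example, the network location dimension includes the user side and the network side, so after numericalization one number, say 0, can represent the user side and another number, say 1, can represent the network side. Similarly, the network ring/chain dimension includes access ring, access chain, aggregation ring and core ring, so after numericalization 0, 1, 2 and 3 can represent access ring, access chain, aggregation ring and core ring respectively. Other dimensions can be numericalized in the same way as these two examples, and details are not repeated here. Which numbers are used for each dimension depends on the actual situation and is not limited in this solution.
After numericalization, the information in each dimension is mapped to its numeric value. For the network location dimension above, for example, the user side is mapped to 0 and the network side is mapped to 1.
Because the fault diagnosis information in the M dimensions has already been obtained, when the feature of the pending fault in each of the M dimensions needs to be extracted, the fault diagnosis information in each dimension is looked up in the corresponding mapping to find its number, and that number is used as the feature value of the dimension. An example follows for ease of understanding:
Suppose network location, network ring/chain and alarm type are chosen as the three dimensions for similar-fault comparison, and the fault diagnosis information of the pending fault in these dimensions is network side, core ring and line alarm respectively. According to the mapping after numericalization, the number for network side is 1, the number for core ring is 3, and the number for line alarm is 0. The feature values of the pending fault in the network location, network ring/chain and alarm type dimensions are therefore 1, 3 and 0.
After the feature values of the M dimensions are obtained, they can be combined into a feature vector, which may be called the feature of the pending fault. In the example above, the feature values of the pending fault in the network location, network ring/chain and alarm type dimensions are 1, 3 and 0, so they can be combined into the feature vector A=[1,3,0]. The vector A could equally be written as [3,1,0] and so on. That is, the order of the feature values of the M dimensions can be chosen according to the actual situation and is not limited in this solution.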
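The numericalization and vector assembly can be sketched in a few lines. The codes below (network side = 1, core ring = 3, line alarm = 0, device out-of-service alarm = 1) follow the examples in the text; the English dimension keys and the function name `feature_vector` are illustrative assumptions, and a real mapping would cover every value of every configured dimension.

```python
# Mapping from textual diagnosis information to numeric feature values.
ENCODING = {
    "network_location": {"user side": 0, "network side": 1},
    "network_ring_chain": {
        "access ring": 0, "access chain": 1, "aggregation ring": 2, "core ring": 3,
    },
    "alarm_type": {"line alarm": 0, "device out-of-service alarm": 1},
}

# The order must be the same for every fault that will be compared.
DIMENSION_ORDER = ("network_location", "network_ring_chain", "alarm_type")


def feature_vector(diagnosis, order=DIMENSION_ORDER):
    """Look up each dimension's value and combine the results in a fixed order."""
    return [ENCODING[dim][diagnosis[dim]] for dim in order]


pending = {
    "network_location": "network side",
    "network_ring_chain": "core ring",
    "alarm_type": "line alarm",
}
print(feature_vector(pending))  # [1, 3, 0]
```

Fixing `DIMENSION_ORDER` in one place is what guarantees that the pending fault and every historical fault place the same dimension at the same vector position.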
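Step 204: Calculate a plurality of similarities according to the feature of the pending fault and the features of a plurality of historical faults, where the plurality of similarities represent the degrees of similarity between the pending fault and each of the historical faults.
In a specific embodiment, the database of the fault monitoring device stores information about historical faults. This information may include, for each historical fault, its information in multiple dimensions and the details of its solution.
After determining the M dimensions, the fault monitoring device can obtain the information of each historical fault from the database and extract the feature of each historical fault according to those M dimensions. The features of historical faults are extracted in the same way as the feature of the pending fault; refer to the corresponding description above, which is not repeated here. It should be noted that when the feature values extracted for a historical fault are combined into a feature vector, they are combined in the same order as the feature values of the pending fault. An example follows for ease of understanding:
Again take network location, network ring/chain and alarm type as the M dimensions. The feature vector obtained above for the pending fault is A=[1,3,0], combined in the order network location, network ring/chain, alarm type. Suppose the features extracted for a certain historical fault in these three dimensions are network side, core ring and device out-of-service alarm. As before, network side and core ring map to 1 and 3, and device out-of-service alarm maps to 1. Combining the feature values of the historical fault in the order network location, network ring/chain, alarm type gives B=[1,3,1]. In other words, the two faults being compared must combine their feature values in the same order.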
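After extracting the feature vector of each historical fault in the database, the fault monitoring device calculates the similarity between the feature vector of the pending fault and the feature vector of each historical fault, obtaining a plurality of similarities.
Optionally, the similarity between the pending fault and a historical fault may be calculated using algorithms such as cosine similarity or term frequency-inverse document frequency (TF-IDF). Many algorithms can be used for this calculation, and any feasible similarity algorithm may be adopted; this solution does not limit the choice. The following uses the cosine similarity algorithm as an example.
The principle of the cosine similarity algorithm is to treat two feature vectors as two line segments in space. Both segments start from the origin and point in different directions, so they form an angle, as shown in FIG. 3. FIG. 3 shows two feature vectors A and B that start from the same point and point in different directions with an angle θ between them. The smaller the angle, the closer the directions of the two feature vectors, that is, the more similar the two vectors are. An example based on FIG. 3 and the preceding example follows.
Continuing the preceding example, suppose the feature vector of the pending fault is A=[1,3,0] and the feature vector of a certain historical fault is B=[1,3,1]. The cosine of the angle between the two feature vectors is calculated as follows:
$$\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{1 \times 1 + 3 \times 3 + 0 \times 1}{\sqrt{1^2 + 3^2 + 0^2} \times \sqrt{1^2 + 3^2 + 1^2}} = \frac{10}{\sqrt{10} \times \sqrt{11}} \approx 0.95$$
The closer the cosine is to 1, the closer the angle is to 0 degrees, that is, the more similar the two vectors are. The cosine of the angle between two vectors can therefore be used as their similarity, and the similarity of vectors A and B can be stated as 0.95, or 95%. Because vector A represents the pending fault and vector B represents a historical fault, the similarity between the pending fault and that historical fault is 95%. Using this algorithm, the similarity between the pending fault and each historical fault in the database can be calculated, giving a plurality of similarities.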
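The cosine calculation for A=[1,3,0] and B=[1,3,1] can be checked with a short standard-library sketch. The function name and the choice to return 0.0 for an all-zero vector are assumptions for illustration; the text does not say how a zero vector should be handled.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is treated as not similar
    return dot / (norm_a * norm_b)


A = [1, 3, 0]  # pending fault
B = [1, 3, 1]  # historical fault
print(round(cosine_similarity(A, B), 2))  # 0.95
```

Note that with plain integer codes a fault whose values all map to 0 yields a zero vector, for which the angle is undefined, so an implementation needs an explicit rule for that case.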
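In another possible implementation, because the database holds a large number of historical faults, comparing the pending fault with every historical fault each time would consume considerable computing resources and time. To save resources and improve efficiency, a subset of the historical faults can be filtered out of the database, and the similarity is then calculated only between the pending fault and the filtered historical faults. This saves computing resources, improves efficiency, and allows a faster response to the user.
Optionally, in a specific embodiment, the historical fault information stored in the database of the fault monitoring device also includes users' evaluations of the solution to each historical fault and/or users' evaluations of the fault itself. The fault monitoring device can use these evaluations to filter out the historical faults to be compared with the pending fault.
For example, filtering can be based on users' evaluations of the solution to a historical fault. If the evaluation indicates that the solution is useful and removes the fault well, the corresponding historical fault can be selected for similarity calculation with the pending fault. A historical fault whose solution is poorly rated is not used for similarity calculation with the pending fault.
As another example, filtering can be based on users' evaluations of the fault itself. If the evaluation indicates that the fault is a common fault, the corresponding historical fault can be selected for similarity calculation with the pending fault. If the evaluation indicates that the fault does not affect the processing of any service, for example because the fault occurred on an unused user-side port or device, the corresponding fault need not be used for similarity calculation with the pending fault.
The two kinds of evaluation can of course be combined for filtering, and details are not repeated here.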
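A pre-filter that combines the two kinds of evaluation might look like the sketch below. The record type, its field names and the rule "keep a fault only if its solution was rated helpful and the fault affects at least one service" are assumptions chosen for illustration; the text leaves the exact combination open.

```python
from dataclasses import dataclass


@dataclass
class HistoricalFault:
    fault_id: str
    solution_helpful: bool  # users rated the recorded solution as useful
    affects_service: bool   # False if users marked the fault as having no service impact


def select_candidates(history):
    """Keep only the historical faults worth comparing with the pending fault."""
    return [f for f in history if f.solution_helpful and f.affects_service]


history = [
    HistoricalFault("h1", solution_helpful=True, affects_service=True),
    HistoricalFault("h2", solution_helpful=False, affects_service=True),
    HistoricalFault("h3", solution_helpful=True, affects_service=False),
]
print([f.fault_id for f in select_candidates(history)])  # ['h1']
```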
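Step 205: Select similar faults from the plurality of historical faults according to the plurality of similarities, and recommend the similar faults.
In a specific embodiment, after obtaining the similarities between the pending fault and the historical faults that took part in the similarity calculation, the fault monitoring device can sort the similarities in descending order and recommend to the user the historical faults associated with the top N similarities. These N historical faults may be called the similar faults of the pending fault. Two ways of recommending historical faults are described below by way of example.
Way 1: The top N similarities are all greater than a preset threshold. For example, suppose the preset threshold is 50% and the calculated similarities are 95%, 80%, 65%, 57% and 20%. After comparison with the threshold, the historical faults associated with the similarities 95%, 80%, 65% and 57% can be recommended to the user. See FIG. 4 for an example.
FIG. 4 shows, by way of example, a page that recommends the top N historical faults. The recommendation page may include information such as similarity, fault severity level, fault number, fault type, maintenance suggestion, fault status and creation time. The page is not limited to this information and may also include, for example, the number of alarms, the number of affected services and the network location; the information included depends on the specific situation. The maintenance suggestion given for each recommended fault in FIG. 4 may be based on the actual solution to the corresponding historical fault. The detailed maintenance suggestion can be displayed in response to a click on the view control 401. Optionally, FIG. 4 may instead display the maintenance suggestion details directly, in which case the view control 401 is not needed.
Optionally, FIG. 4 may also display the actual solution to each historical fault for the user's reference. Alternatively, the fault monitoring device may provide the user with a query entry for the actual solutions to the recommended historical faults, so that the user can look up those solutions and draw on them, which greatly improves the efficiency of handling the pending fault.
In addition, the recommendation page in FIG. 4 can show why each historical fault was recommended. For example, the recommendation reason can be displayed on the page through control 402, and it may include content such as a similarity analysis and an analysis of similar handling experience.
Way 2: The value of N is preset, for example to 10, 15 or 20. That is, the similarities obtained above are sorted in descending order, and the historical faults associated with the top N similarities (for example, the top 10, top 15 or top 20) are recommended to the user. This solution does not limit the value of N.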
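Both selection modes reduce to "sort in descending order, then cut", which can be sketched as one helper. The function name, the dictionary input and the `h1`..`h5` fault identifiers are hypothetical; the similarity values and the 50% threshold come from the example above.

```python
def recommend(similarities, threshold=None, top_n=None):
    """Rank historical faults by similarity and keep the ones to recommend.

    similarities: mapping of fault id -> similarity in [0, 1].
    threshold:    keep only similarities strictly greater than this value (Way 1).
    top_n:        keep at most this many of the best matches (Way 2).
    """
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(fid, sim) for fid, sim in ranked if sim > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


scores = {"h1": 0.95, "h2": 0.80, "h3": 0.65, "h4": 0.57, "h5": 0.20}
print([fid for fid, _ in recommend(scores, threshold=0.5)])  # ['h1', 'h2', 'h3', 'h4']
print([fid for fid, _ in recommend(scores, top_n=2)])        # ['h1', 'h2']
```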
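In one possible implementation, after recommending the historical faults, the fault monitoring device can also receive the user's evaluation information on the recommended historical faults. This evaluation information can serve as reference information for subsequent historical-fault recommendation.
Optionally, the user can evaluate a recommended historical fault with actions such as "like" or "dislike". A "like" means that the user is satisfied with the recommended historical fault and rates it positively. A "dislike" means that the user is dissatisfied with the recommended historical fault and rates it negatively.
Optionally, the user can also evaluate a recommended historical fault by giving a score or awarding "stars". With scoring, the full score may be 10 or 100, and a higher score means higher satisfaction. With stars, the maximum may be five stars, and more stars mean higher satisfaction.
The user may of course evaluate the recommended historical faults in other ways, depending on the actual situation.
After receiving the user's evaluation information, the fault monitoring device associates it with the corresponding historical fault and counts it, that is, it records the number of positive and negative ratings for each historical fault. In subsequent recommendations, well-rated historical faults, for example those with many positive ratings, can be shown near the top of the recommendation page, while poorly rated historical faults, for example those with no positive ratings or many negative ratings, can be shown near the bottom of the page or not recommended at all. The following uses the "like" evaluation as an example; see FIG. 5.
Compared with FIG. 4, the historical fault recommendation page in FIG. 5 also includes the like count 501 of each historical fault and a like control 502. As can be seen, the more likes a historical fault has, the higher it appears on the recommendation page. That is, the fault monitoring device lists the recommended historical faults from top to bottom in descending order of like count. The like control 502 in FIG. 5 is what the user uses to give a like, and the fault monitoring device receives the user's like information through it.
By incorporating user evaluations into historical-fault recommendation, the embodiments of this application can use user feedback to make more precise recommendations, further improving the efficiency of handling the pending fault.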
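Ordering the recommended faults by like count can be sketched with a stable sort. The function name and the sample identifiers are hypothetical; faults with equal like counts keep their incoming order (for example, their similarity order), which is a design choice of this sketch rather than something the text specifies.

```python
def order_for_display(recommended, likes):
    """Return recommended fault ids ordered by like count, most liked first.

    recommended: fault ids, already ordered by similarity.
    likes:       mapping of fault id -> number of likes (missing means 0).
    """
    # sorted() is stable, so ties preserve the original similarity order.
    return sorted(recommended, key=lambda fid: likes.get(fid, 0), reverse=True)


recommended = ["h1", "h2", "h3"]
likes = {"h1": 1, "h2": 5, "h3": 5}
print(order_for_display(recommended, likes))  # ['h2', 'h3', 'h1']
```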
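In one possible implementation, the fault monitoring device can also compare the pending fault and a historical fault dimension by dimension, based on the M dimensions. Then, when the user wants to see the details of the single-dimension comparison, the fault monitoring device can display those details on the screen for the user to read. An example follows for ease of understanding.
Take network location, network ring/chain and alarm type as the M dimensions. Suppose the features extracted for the pending fault in these dimensions are network side, core ring and line alarm, and the features extracted for a certain historical fault are network side, core ring and device out-of-service alarm. Comparing the pending fault and the historical fault one dimension at a time across these three dimensions may then give the following result: both faults involve the network side and the core ring, but the alarm type of the pending fault is a line alarm, while the alarm type of the historical fault is a device out-of-service alarm. These comparison details are shown to the user when needed; see FIG. 6 for an example.
As FIG. 6 shows, the pending fault is the fault numbered 150672, and the fault similar to pending fault 150672 is historical fault 984. FIG. 6 gives, by way of example, information about pending fault 150672, such as the fault type, the number of alarms, the affected service types, the protection type, the number of affected services and the time the fault was created. Under similar faults, it gives the fault number of the similar historical fault, the fault-diagnosis dimensions, the similarity comparison details, the severity level of the historical fault and its creation time. The fault-diagnosis dimensions indicate that pending fault 150672 and historical fault 984 were compared mainly in the three dimensions network location, network ring/chain and alarm type. The similarity comparison details give the single-dimension comparison for these three dimensions, for example "both involve the network side and the core ring; fault 150672 involves a line alarm, whereas fault 984 involves a device out-of-service alarm".
By providing single-dimension comparison details, the embodiments of this application help the user quickly find a solution to the pending fault based on those details, further improving resolution efficiency.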
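The per-dimension comparison shown to the user can be sketched as a split into shared and differing dimensions. The function name and dictionary keys are illustrative assumptions; the values reproduce the example of the pending fault and the historical fault described above.

```python
def compare_by_dimension(pending, historical, dimensions):
    """Split the compared dimensions into those that match and those that differ."""
    shared = {d: pending[d] for d in dimensions if pending[d] == historical[d]}
    differing = {
        d: (pending[d], historical[d]) for d in dimensions if pending[d] != historical[d]
    }
    return shared, differing


dimensions = ("network_location", "network_ring_chain", "alarm_type")
pending = {
    "network_location": "network side",
    "network_ring_chain": "core ring",
    "alarm_type": "line alarm",
}
historical = {
    "network_location": "network side",
    "network_ring_chain": "core ring",
    "alarm_type": "device out-of-service alarm",
}
shared, differing = compare_by_dimension(pending, historical, dimensions)
print(shared)
print(differing)
```

The two dictionaries map directly onto the sentence shown in the comparison details: the shared entries become "both involve ..." and each differing entry becomes "... whereas ...".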
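To make the similar fault recommendation method provided in this application easier to understand, an example is described below with reference to FIG. 7.
FIG. 7 is a schematic diagram of the software architecture used in the fault monitoring device to provide similar-fault recommendation, and it also shows, by way of example, the procedure for implementing similar-fault recommendation on that architecture. As shown, the fault monitoring device includes a similar-fault recommendation service and a front-end user interface (UI). The recommendation service may include a recommendation component and a database, and the recommendation component may in turn include a feature processing unit, a computation unit, a sorting unit and a filtering unit. The database may store a historical fault list, a list of historical fault details and a recommended historical fault list. The procedure for implementing similar-fault recommendation on this software architecture includes, but is not limited to, the following steps:
① The feature processing unit obtains the alarm information from the Kafka component.
② The feature processing unit then queries the historical fault list and the list of historical fault details in the database to find the historical faults to be used for similarity comparison, together with their details.
③ The feature processing unit processes the information obtained in steps ① and ② to obtain the features of the pending fault and of the historical faults.
④ The feature processing unit sends the features of the pending fault and of the historical faults to the computation unit.
⑤ The computation unit calculates the similarity between the pending fault and each historical fault according to the features.
⑥ The computation unit sends the calculated similarities to the sorting unit.
⑦ The sorting unit sorts the calculated similarities in descending order.
⑧ The sorting unit sends the sorted similarities to the filtering unit.
⑨ The filtering unit selects the top N similarities from the sorted similarities.
⑩ The filtering unit stores the historical faults associated with those N similarities in the database, producing the recommended historical fault list.
⑪ The front-end UI then queries the recommended historical fault list in the database and displays the information in that list on the user interface.
For the specific implementation of the above steps, refer to the corresponding description in the method embodiment of FIG. 2; details are not repeated here.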
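The unit-by-unit flow above can be condensed into a single in-memory sketch: encode features, compute similarities, sort, and keep the top N. Everything here is a simplified stand-in under stated assumptions: there is no Kafka component or database, the function names are hypothetical, the encoding reuses the numeric codes from the earlier examples, and cosine similarity is just one of the permitted algorithms.

```python
import math


def encode(diagnosis, encoding, order):
    """Feature processing: map each dimension's text to its numeric code."""
    return [encoding[dim][diagnosis[dim]] for dim in order]


def cosine(a, b):
    """Computation: cosine similarity, 0.0 if either vector is all zeros (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend_similar(pending, history, encoding, order, top_n):
    target = encode(pending, encoding, order)
    scored = [(fid, cosine(target, encode(diag, encoding, order)))
              for fid, diag in history.items()]
    scored.sort(key=lambda item: item[1], reverse=True)  # sorting unit
    return scored[:top_n]                                # filtering unit


ORDER = ("network_location", "network_ring_chain", "alarm_type")
ENCODING = {
    "network_location": {"user side": 0, "network side": 1},
    "network_ring_chain": {"access ring": 0, "access chain": 1,
                           "aggregation ring": 2, "core ring": 3},
    "alarm_type": {"line alarm": 0, "device out-of-service alarm": 1},
}
pending = {"network_location": "network side", "network_ring_chain": "core ring",
           "alarm_type": "line alarm"}
history = {
    "hist-1": {"network_location": "network side", "network_ring_chain": "core ring",
               "alarm_type": "device out-of-service alarm"},
    "hist-2": {"network_location": "user side", "network_ring_chain": "access ring",
               "alarm_type": "device out-of-service alarm"},
}
print(recommend_similar(pending, history, ENCODING, ORDER, top_n=1))
```

In a deployment along the lines of FIG. 7, the returned list would be written to the recommended historical fault list for the front-end UI to query.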
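In summary, whereas existing technical solutions recommend similar faults by comparing a single dimension, this solution recommends similar faults by comparing similarity across multiple dimensions. This improves recommendation accuracy and provides an effective reference for resolving the pending fault, thereby improving network maintenance efficiency.
The foregoing mainly describes the similar fault recommendation method provided in the embodiments of this application. It can be understood that, to implement the corresponding functions, each device includes the hardware structures and/or software modules that perform those functions. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the embodiments of this application, the fault monitoring device may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of this application is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation.
Where each functional module corresponds to one function, FIG. 8 shows a possible logical structure of a device, which may be the fault monitoring device described above. The device 800 includes an obtaining unit 801, a processing unit 802, a calculation unit 803, a filtering unit 804 and a recommendation unit 805.
For example, the obtaining unit 801 is configured to perform the information-obtaining steps in the foregoing method embodiments, such as obtaining the alarm information and the fault diagnosis information. The processing unit 802 is configured to perform the step of processing the fault diagnosis information in the M dimensions to obtain the feature of the pending fault. The calculation unit 803 is configured to perform the similarity calculation step. The filtering unit 804 is configured to perform the filtering steps, such as selecting the similar historical faults. The recommendation unit 805 is configured to perform the recommendation steps, such as recommending the similar historical faults.
Optionally, the device 800 further includes a determining unit, configured to perform the step of determining the M dimensions in the foregoing method embodiments.
Optionally, the device 800 further includes an adjusting unit, configured to perform the step of adjusting the dimensions in the foregoing method embodiments.
Optionally, the device 800 further includes a receiving unit, configured to perform the step of receiving the evaluation information in the foregoing method embodiments.
For the specific operations and beneficial effects of the units in the device shown in FIG. 8, refer to the description of the method embodiment shown in FIG. 2; details are not repeated here.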
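FIG. 9 shows a possible hardware structure of a device provided in this application. The device may be the fault monitoring device described in the foregoing method embodiments. The device 900 includes a processor 901, a memory 902 and a communication interface 903. The processor 901, the communication interface 903 and the memory 902 may be connected to one another directly or through a bus 904.
For example, the memory 902 is configured to store the computer programs and data of the device 900. The memory 902 may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or compact disc read-only memory (CD-ROM). The communication interface 903 is configured to support communication by the device 900, for example receiving or sending data.
For example, the processor 901 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The processor 901 may be configured to read the program stored in the memory 902 and perform the operations performed by the fault monitoring device in the method of FIG. 2 and its possible implementations.
An embodiment of this application further discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method of FIG. 2 and its possible implementations.
An embodiment of this application further discloses a computer program product. When the computer program product is read and executed by a computer, the method of FIG. 2 and its possible implementations is performed.
An embodiment of this application further discloses a computer program. When the computer program is executed on a computer, the computer implements the method of FIG. 2 and its possible implementations.
In summary, whereas existing technical solutions recommend similar faults by comparing a single dimension, this solution recommends similar faults by comparing similarity across multiple dimensions. This improves recommendation accuracy and provides an effective reference for resolving the pending fault, thereby improving network maintenance efficiency.
Finally, it should be noted that the foregoing embodiments are intended only to describe the technical solutions of this application, not to limit them. Although this application is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.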

Claims (27)

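  1. A similar fault recommendation method, comprising:
    obtaining alarm information of a pending fault;
    obtaining fault diagnosis information in M dimensions according to the alarm information, wherein the M dimensions comprise M different perspectives of fault diagnosis, and M is an integer greater than 1;
    processing the fault diagnosis information in the M dimensions to obtain a feature of the pending fault;
    calculating a plurality of similarities according to the feature of the pending fault and features of a plurality of historical faults, wherein the plurality of similarities represent the degrees of similarity between the pending fault and each of the plurality of historical faults; and
    selecting similar faults from the plurality of historical faults according to the plurality of similarities, and recommending the similar faults.
  2. The method according to claim 1, wherein after the obtaining alarm information of a pending fault and before the obtaining fault diagnosis information in M dimensions according to the alarm information, the method further comprises:
    obtaining, according to the alarm information, a network type of a network in which the pending fault is located; and
    determining the M dimensions according to the network type.
  3. The method according to claim 1 or 2, wherein the processing the fault diagnosis information in the M dimensions to obtain a feature of the pending fault comprises:
    performing feature extraction separately for each of the M dimensions according to the fault diagnosis information in the M dimensions, to obtain a feature value of each dimension; and
    combining the feature values of the dimensions to obtain the feature of the pending fault, wherein the feature of the pending fault is a feature vector.
  4. The method according to any one of claims 1 to 3, wherein before the obtaining fault diagnosis information in M dimensions according to the alarm information, the method further comprises: adjusting one or more dimensions among preset dimensions to obtain the M dimensions.
  5. The method according to any one of claims 1 to 4, wherein the selecting similar faults from the plurality of historical faults according to the plurality of similarities, and recommending the similar faults comprises:
    selecting N historical faults from the plurality of historical faults according to the plurality of similarities, and recommending the N historical faults, wherein the N historical faults are the historical faults associated with the top N similarities among the plurality of similarities, the plurality of similarities are arranged in descending order, and N is a positive integer.
  6. The method according to any one of claims 1 to 5, wherein after the selecting similar faults from the plurality of historical faults according to the plurality of similarities, the method further comprises:
    recommending comparison information between the similar faults and the pending fault, wherein the comparison information comprises how each similar fault compares with the pending fault in each of the M dimensions.
  7. The method according to any one of claims 1 to 6, wherein the plurality of historical faults are faults obtained by filtering a database.
  8. The method according to any one of claims 1 to 7, wherein after the selecting similar faults from the plurality of historical faults according to the plurality of similarities, and recommending the similar faults, the method further comprises:
    receiving evaluation information on the recommended historical faults, wherein the evaluation information serves as reference information for subsequent historical-fault recommendation.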
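  9. A device, comprising:
    an obtaining unit, configured to obtain alarm information of a pending fault;
    wherein the obtaining unit is further configured to obtain fault diagnosis information in M dimensions according to the alarm information, the M dimensions comprise M different perspectives of fault diagnosis, and M is an integer greater than 1;
    a processing unit, configured to process the fault diagnosis information in the M dimensions to obtain a feature of the pending fault;
    a calculation unit, configured to calculate a plurality of similarities according to the feature of the pending fault and features of a plurality of historical faults, wherein the plurality of similarities represent the degrees of similarity between the pending fault and each of the plurality of historical faults;
    a filtering unit, configured to select similar faults from the plurality of historical faults according to the plurality of similarities; and
    a recommendation unit, configured to recommend the similar faults.
  10. The device according to claim 9, wherein the obtaining unit is further configured to: after obtaining the alarm information of the pending fault and before obtaining the fault diagnosis information in M dimensions according to the alarm information, obtain, according to the alarm information, a network type of a network in which the pending fault is located; and
    the device further comprises a determining unit, configured to determine the M dimensions according to the network type.
  11. The device according to claim 9 or 10, wherein the processing unit is specifically configured to:
    perform feature extraction separately for each of the M dimensions according to the fault diagnosis information in the M dimensions, to obtain a feature value of each dimension; and
    combine the feature values of the dimensions to obtain the feature of the pending fault, wherein the feature of the pending fault is a feature vector.
  12. The device according to any one of claims 9 to 11, wherein the device further comprises an adjusting unit, configured to adjust one or more dimensions among preset dimensions to obtain the M dimensions before the obtaining unit obtains the fault diagnosis information in M dimensions according to the alarm information.
  13. The device according to any one of claims 9 to 12, wherein the filtering unit is specifically configured to select N historical faults from the plurality of historical faults according to the plurality of similarities; and
    the recommendation unit is specifically configured to recommend the N historical faults, wherein the N historical faults are the historical faults associated with the top N similarities among the plurality of similarities, the plurality of similarities are arranged in descending order, and N is a positive integer.
  14. The device according to any one of claims 9 to 13, wherein the recommendation unit is further configured to: after the filtering unit selects the similar faults from the plurality of historical faults according to the plurality of similarities, recommend comparison information between the similar faults and the pending fault, wherein the comparison information comprises how each similar fault compares with the pending fault in each of the M dimensions.
  15. The device according to any one of claims 9 to 14, wherein the plurality of historical faults are faults obtained by filtering a database.
  16. The device according to any one of claims 9 to 15, wherein the device further comprises a receiving unit, configured to receive evaluation information on the recommended historical faults after the recommendation unit recommends the similar faults, wherein the evaluation information serves as reference information for subsequent historical-fault recommendation.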
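  17. A device, comprising a processor, a communication interface and a memory, wherein the memory is configured to store a computer program and/or data, and the processor is configured to execute the computer program stored in the memory, so that the device performs the following operations:
    obtaining alarm information of a pending fault;
    obtaining fault diagnosis information in M dimensions according to the alarm information, wherein the M dimensions comprise M different perspectives of fault diagnosis, and M is an integer greater than 1;
    processing the fault diagnosis information in the M dimensions to obtain a feature of the pending fault;
    calculating a plurality of similarities according to the feature of the pending fault and features of a plurality of historical faults, wherein the plurality of similarities represent the degrees of similarity between the pending fault and each of the plurality of historical faults; and
    selecting similar faults from the plurality of historical faults according to the plurality of similarities, and recommending the similar faults.
  18. The device according to claim 17, wherein after the obtaining alarm information of a pending fault and before the obtaining fault diagnosis information in M dimensions according to the alarm information, the device further performs the following operations:
    obtaining, according to the alarm information, a network type of a network in which the pending fault is located; and
    determining the M dimensions according to the network type.
  19. The device according to claim 17 or 18, wherein the processing the fault diagnosis information in the M dimensions to obtain a feature of the pending fault comprises:
    performing feature extraction separately for each of the M dimensions according to the fault diagnosis information in the M dimensions, to obtain a feature value of each dimension; and
    combining the feature values of the dimensions to obtain the feature of the pending fault, wherein the feature of the pending fault is a feature vector.
  20. The device according to any one of claims 17 to 19, wherein before the obtaining fault diagnosis information in M dimensions according to the alarm information, the device further performs the following operation: adjusting one or more dimensions among preset dimensions to obtain the M dimensions.
  21. The device according to any one of claims 17 to 20, wherein the selecting similar faults from the plurality of historical faults according to the plurality of similarities, and recommending the similar faults comprises:
    selecting N historical faults from the plurality of historical faults according to the plurality of similarities, and recommending the N historical faults, wherein the N historical faults are the historical faults associated with the top N similarities among the plurality of similarities, the plurality of similarities are arranged in descending order, and N is a positive integer.
  22. The device according to claim 21, wherein after the selecting similar faults from the plurality of historical faults according to the plurality of similarities, the device further performs the following operation:
    recommending comparison information between the similar faults and the pending fault, wherein the comparison information comprises how each similar fault compares with the pending fault in each of the M dimensions.
  23. The device according to any one of claims 17 to 22, wherein the plurality of historical faults are faults obtained by filtering a database.
  24. The device according to any one of claims 17 to 23, wherein after the selecting similar faults from the plurality of historical faults according to the plurality of similarities, and recommending the similar faults, the device further performs the following operation:
    receiving evaluation information on the recommended historical faults, wherein the evaluation information serves as reference information for subsequent historical-fault recommendation.
  25. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of claims 1 to 8.
  26. A computer program product, wherein when the computer program product is read and executed by a computer, the method according to any one of claims 1 to 8 is performed.
  27. A computer program, wherein when the computer program is executed on a computer, the computer implements the method according to any one of claims 1 to 8.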
PCT/CN2021/078895 2020-03-18 2021-03-03 相似故障推荐方法及相关设备 WO2021185079A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21771510.1A EP4102775A4 (en) 2020-03-18 2021-03-03 RECOMMENDATION PROCEDURES FOR SIMILAR FAULTS AND ASSOCIATED DEVICE
US17/946,788 US11757701B2 (en) 2020-03-18 2022-09-16 Method for recommending similar incident, and related device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010192545.2A CN113497716B (zh) 2020-03-18 2020-03-18 相似故障推荐方法及相关设备
CN202010192545.2 2020-03-18

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/946,788 Continuation US11757701B2 (en) 2020-03-18 2022-09-16 Method for recommending similar incident, and related device

Publications (1)

Publication Number Publication Date
WO2021185079A1 true WO2021185079A1 (zh) 2021-09-23

Family

ID=77771938

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/078895 WO2021185079A1 (zh) 2020-03-18 2021-03-03 相似故障推荐方法及相关设备

Country Status (4)

Country Link
US (1) US11757701B2 (zh)
EP (1) EP4102775A4 (zh)
CN (1) CN113497716B (zh)
WO (1) WO2021185079A1 (zh)

Also Published As

Publication number Publication date
US20230017653A1 (en) 2023-01-19
CN113497716A (zh) 2021-10-12
EP4102775A1 (en) 2022-12-14
US11757701B2 (en) 2023-09-12
CN113497716B (zh) 2023-03-10
EP4102775A4 (en) 2023-08-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21771510

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021771510

Country of ref document: EP

Effective date: 20220906

NENP Non-entry into the national phase

Ref country code: DE