CN111241145A - Self-healing rule mining method and device based on big data - Google Patents

Self-healing rule mining method and device based on big data

Info

Publication number
CN111241145A
Authority
CN
China
Prior art keywords
data
rule
mining
sample data
strong association
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811437497.8A
Other languages
Chinese (zh)
Inventor
王璇
舒锋
戴安妮
竺士杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201811437497.8A priority Critical patent/CN111241145A/en
Publication of CN111241145A publication Critical patent/CN111241145A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a big-data-based self-healing rule mining method and device. The method comprises: collecting service data, performance data and log data as sample data; preprocessing the sample data to convert it into a form suitable for data mining; performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, each of which includes at least service data and performance data; and verifying each strong association rule and, if the rule necessarily corresponds to the occurrence of an abnormal phenomenon, taking it as a self-healing rule. The embodiment greatly reduces dependence on human experience, reduces the effort required of specialists and saves a great deal of manpower.

Description

Self-healing rule mining method and device based on big data
Technical Field
The invention relates to the technical field of data mining, in particular to a self-healing rule mining method and device based on big data.
Background
To keep pace with the continuous growth in the number of users and the continuous expansion of service types in the telecommunications industry, service acceptance systems have evolved accordingly, and they now exhibit complex architectural relationships, large numbers of nodes and frequent system updates. Faced with such a complex, large and changeable "super" system, operation and maintenance personnel must still meet the goal of recovering quickly from anomalies, which is a great challenge. To locate and resolve slight anomalies quickly in such a system, the mainstream approach today is operation and maintenance automation, and that automation mainly relies on self-healing. The self-healing process is driven by an automated flow, and the rule basis for deciding when to self-heal is the core of the whole process. The common solutions for formulating such rules are as follows:
Scheme one: introduce general industry rules, that is, generate self-healing rules from operation indexes recognized across the industry, such as the CPU utilization and memory utilization of hardware. However, only rule indexes for general-purpose equipment such as hosts and networks are currently widely accepted in the industry, and anomalies outside this scope are blind spots.
Scheme two: derive rules manually. This is the most traditional method: rules are summarized from human experience, relying mainly on the historical experience, technical ability and inductive ability of personnel. However, this scheme depends heavily on individual ability, and manually derived rules are relatively simple. As the system grows larger and more complex, the output efficiency of manually derived rules is low and cannot keep pace with system development.
Scheme three: design an individual rule for each fault. After a fault has been recovered, its occurrence and handling are reviewed, the root cause is found, and the characteristics of the fault are extracted and solidified into a self-healing rule so that the same fault does not recur. This scheme cannot be extended to problems beyond the scope of that fault. Because each rule is specific to one fault, it cannot adapt quickly to system changes and easily becomes invalid; in terms of effectiveness, the accuracy of the rule cannot be guaranteed as the system changes.
In summary, the existing technical solutions all have serious disadvantages and can no longer meet the self-healing requirements of current system maintenance in terms of rule application range, rule output efficiency and rule effectiveness.
Disclosure of Invention
The present invention provides a big data based self-healing rule mining method and apparatus that overcomes or at least partially solves the above-mentioned problems.
In a first aspect, an embodiment of the present invention provides a self-healing rule mining method, including:
collecting service data, performance data and log data as sample data;
preprocessing the sample data, and converting the sample data into a form suitable for data mining;
performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule includes at least service data and performance data;
verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, taking the strong association rule as a self-healing rule.
In a second aspect, an embodiment of the present invention provides a mining device for self-healing rules, including:
the sample data acquisition module is used for acquiring the service data, the performance data and the log data as sample data;
the preprocessing module is used for preprocessing the sample data and converting the sample data into a form suitable for data mining;
the association rule mining module is used for performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule includes at least service data and performance data;
and the verification module is used for verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, the strong association rule is taken as a self-healing rule.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the method provided in the first aspect when executing the program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method as provided in the first aspect.
The big-data-based self-healing rule mining method and device provided by the embodiments of the invention collect performance data, log data and service data, so that the collected data is diversified and no longer limited to the few kinds of basic data common across the industry. Because service data is included, the association rule algorithm has more data to mine and can produce more forms of rules for handling anomalies, which breaks through the original limitation on rule construction and widens the range of application of the rules. In addition, the sample data is mined by an association rule mining algorithm, so the whole process is highly automated: rule output efficiency is significantly improved, dependence on human experience is greatly reduced, the effort required of specialists is reduced, and a large amount of manpower is saved. Furthermore, verifying the obtained strong association rules achieves a close match with actual requirements and a more accurate judgment of system anomalies, and because the rules are continuously trained and optimized as the system changes, the drawback of having to maintain the rules manually is overcome.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a self-healing rule mining method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a mining device for self-healing rules according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To overcome the above problems in the prior art, an embodiment of the present invention provides a self-healing rule mining method. The inventive concept is as follows: service data, performance data and log data are all brought into the mining scope, rather than only the few kinds of basic data common across the industry; the associations among the data are mined through association rules; and the mined strong association rules are verified to judge whether they are necessarily associated with abnormal phenomena, thereby obtaining self-healing rules. In this way the construction of self-healing rules breaks through the original limitation and has a wider range of application, a close match with actual requirements is achieved, and system anomalies are judged more accurately. At the same time, the self-healing rules are mined continuously as the system changes, which avoids the drawback of having to maintain the rules manually.
Fig. 1 is a schematic flow chart of a self-healing rule mining method provided in an embodiment of the present invention, as shown in the figure, including:
S101, collecting service data, performance data and log data as sample data.
It should be noted that, in the embodiments of the present invention, service data is collected from a service monitoring system, performance data is collected from application hosts/virtual machines, and log data is collected from a log management center. Table 1 lists several key indexes in the sample data collected in the embodiment of the present invention. As shown in Table 1, there are three acquisition types: log data, performance data and service data. Each acquisition type has several index items, each index item consists of several index sub-items, and each index sub-item is represented by a specific index value. For example, the log data includes the index item TOP5 error, which represents the 5 most frequently occurring errors. Its index sub-items are the keywords of these 5 types of error message, and each type of error message is characterized by the number of times its keyword occurs per minute. It can be understood that the index items, index sub-items and index values shown in Table 1 are only part of the sample data collected by the mining method of the embodiment of the present invention. When collecting sample data, the embodiment of the invention fully considers the growth of the user scale and the data scale and prepares for the accumulation of data assets: multiple data sources are supported through methods such as clients and program code instrumentation, data is collected in full by multiple methods, and sufficient and comprehensive sample data is collected throughout the whole life cycle of the user's use of the product.
TABLE 1 Table of key indicators in sample data
[Table 1 is provided as an image in the original publication.]
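The hierarchy of acquisition type, index item, index sub-item and index value can be pictured with a small data model. The following is only a minimal sketch and not the patent's implementation: the class `IndexValue`, the function `top_error_counts_per_minute` and the `(minute, keyword)` event format are assumptions introduced for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexValue:
    """One collected value: acquisition type -> index item -> index sub-item -> value."""
    acquisition_type: str   # "log", "performance" or "service"
    index_item: str         # e.g. "TOP5 error"
    index_sub_item: str     # e.g. an error keyword
    minute: str             # collection minute (hypothetical "HH:MM" format)
    value: float


def top_error_counts_per_minute(events, top_n=5):
    """Build TOP-n error index values from (minute, keyword) log events.

    The top_n keywords are chosen by total frequency; each one is then
    characterized by its number of occurrences per minute.
    """
    totals = Counter(keyword for _, keyword in events)
    top_keywords = [kw for kw, _ in totals.most_common(top_n)]
    per_minute = defaultdict(Counter)
    for minute, keyword in events:
        per_minute[minute][keyword] += 1
    return [
        IndexValue("log", f"TOP{top_n} error", kw, minute, float(counts[kw]))
        for minute, counts in sorted(per_minute.items())
        for kw in top_keywords
        if counts[kw] > 0
    ]


# Hypothetical log events: (minute, error keyword)
events = [("10:00", "Timeout"), ("10:00", "Timeout"),
          ("10:00", "NullPointer"), ("10:01", "Timeout")]
samples = top_error_counts_per_minute(events)
```

Performance data and service data would produce values of the same shape with a different `acquisition_type`.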
S102, preprocessing the sample data, and converting the sample data into a form suitable for data mining.
Data preprocessing is an important part of the mining process of the embodiment of the invention: clean, accurate and concise data must be provided in order to mine rich rules. By performing operations such as data cleaning, data reduction and data transformation on the collected sample data, the preprocessing improves the accuracy, integrity and consistency of the data.
S103, performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule includes at least service data and performance data.
It should be understood that the self-healing process includes a self-healing rule and a self-healing operation. The self-healing rule states the precondition for performing the self-healing operation. For example, "index value A in the service data is abnormal, error keyword B appears in the log, and performance data C appears" is a self-healing rule, and the base station performing a cell switch-out task once the rule is satisfied is a self-healing operation. A strong association rule is an implication of the form X → Y, where X and Y are called the antecedent, or left-hand side (LHS), and the consequent, or right-hand side (RHS), of the association rule, respectively. The rule X → Y is characterized by its support and its confidence. The embodiment of the invention converts the search for self-healing rules into the mining of strong association rules, so that existing association rule mining methods can be used. Association rule mining mainly comprises two stages: in the first stage, all frequent item sets are found in the data set; in the second stage, association rules are generated from these frequent item sets. In the embodiment of the present invention, the data set is the set composed of the sample data, and a frequent item set is an item set that includes at least service data and performance data; it can be understood that the service data and the performance data in a frequent item set are each represented by an index sub-item and the corresponding index value. It should be understood that the number of items of service data and performance data in a self-healing rule is not specifically limited.
S104, verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, taking the strong association rule as a self-healing rule.
Specifically, after a strong association rule is obtained, it needs to be verified. The verification is completed through simulation testing: when an abnormal phenomenon occurs, for example the front end reports a 503 error, simulated collection checks whether the index sub-items in the strong association rule have reached their corresponding index values. If a 503 error is necessarily reported whenever the index values are reached, the strong association rule is taken as a self-healing rule.
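One way to read the verification step is as a search for counterexamples: a rule is accepted only if every simulated observation in which it fires also shows the anomaly. The sketch below makes several assumptions that are not in the source: a rule is a mapping from index sub-item to threshold, "reaches" is interpreted as greater than or equal to, and the observation format and all names are hypothetical.

```python
def rule_matches(rule, indexes):
    """True when every index sub-item in the rule reaches its index value."""
    return all(indexes.get(name, float("-inf")) >= threshold
               for name, threshold in rule.items())


def is_self_healing_rule(rule, observations):
    """Accept the rule only if it fired at least once and every firing
    coincided with the abnormal phenomenon (no counterexample found)."""
    firings = [obs for obs in observations if rule_matches(rule, obs["indexes"])]
    return bool(firings) and all(obs["anomaly"] for obs in firings)


# Hypothetical strong association rule and simulated observations;
# "anomaly" stands for the front end reporting a 503 error.
rule = {"order_failure_rate": 0.2, "cpu_util": 0.6}
observations = [
    {"indexes": {"order_failure_rate": 0.25, "cpu_util": 0.70}, "anomaly": True},
    {"indexes": {"order_failure_rate": 0.05, "cpu_util": 0.90}, "anomaly": False},
    {"indexes": {"order_failure_rate": 0.30, "cpu_util": 0.65}, "anomaly": True},
]
accepted = is_self_healing_rule(rule, observations)
```

A single observation in which the rule fires without the anomaly is enough to reject it, which mirrors the requirement that the rule "necessarily" corresponds to the abnormal phenomenon.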
The embodiment of the invention collects performance data, log data and service data, so that the collected data is diversified and no longer limited to the few kinds of basic data common across the industry. Because service data is included, the association rule algorithm has more data to mine and can produce more forms of rules for handling anomalies, which breaks through the original limitation on rule construction and widens the range of application of the rules. In addition, the sample data is mined by an association rule mining algorithm, so the whole process is highly automated: rule output efficiency is significantly improved, dependence on human experience is greatly reduced, the effort required of specialists is reduced, and a large amount of manpower is saved. Furthermore, verifying the obtained strong association rules achieves a close match with actual requirements and a more accurate judgment of system anomalies, and because the rules are continuously trained and optimized as the system changes, the drawback of having to maintain the rules manually is overcome.
On the basis of the foregoing embodiments, as an optional embodiment, preprocessing the sample data further includes: tagging the collected index values through stream processing, and storing the sample data in the database table corresponding to the type of tag.
Correspondingly, performing association rule mining on the preprocessed sample data specifically comprises:
extracting, according to the tag selected by the user, sample data from the database table corresponding to that tag for association rule mining.
Specifically, the tags of the embodiment of the present invention may be system changes, time intervals, importance levels, and the like, and through tagging, when data mining is performed, the purpose of mining sample data with specific tags can be achieved, so that the mined self-healing rules are more targeted.
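The tag-based storage and extraction can be illustrated with an in-memory SQLite database. This is a sketch under stated assumptions: the tag names, the table naming scheme `samples_<tag>` and the two-column schema are hypothetical, and the stream-processing stage that assigns the tags is not modelled.

```python
import sqlite3

# Hypothetical tag set, loosely following the examples in the text.
ALLOWED_TAGS = {"system_change", "time_interval", "importance"}


def table_for(tag):
    """Map a tag to its table name, rejecting tags outside the whitelist."""
    if tag not in ALLOWED_TAGS:
        raise ValueError(f"unknown tag: {tag}")
    return f"samples_{tag}"


def store(conn, tag, sub_item, value):
    """Store one tagged index value in the table that belongs to its tag."""
    table = table_for(tag)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (sub_item TEXT, value REAL)")
    conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (sub_item, value))


def extract(conn, tag):
    """Return all sample rows stored under the tag selected by the user."""
    query = f"SELECT sub_item, value FROM {table_for(tag)} ORDER BY rowid"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
store(conn, "system_change", "cpu_util", 0.72)
store(conn, "importance", "order_failure_rate", 0.31)
store(conn, "system_change", "error:Timeout", 14)
rows = extract(conn, "system_change")
```

Only the rows in `rows` would then be passed to association rule mining, so the mined rules reflect the selected tag.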
On the basis of the above embodiments, as an alternative embodiment, the association rule mining algorithm is the Apriori algorithm.
Specifically, the Apriori algorithm is a frequent item set algorithm for mining association rules. Its core idea is to mine frequent item sets through two stages: candidate set generation and downward-closure checking. The Apriori algorithm searches candidate item sets level by level (level-wise) and finds frequent item sets by limiting candidate generation. The calculation formulas are as follows:
$$ s(x \rightarrow y) = \frac{\sigma(x \cup y)}{N} $$
$$ c(x \rightarrow y) = \frac{\sigma(x \cup y)}{\sigma(x)} $$
where x and y are disjoint item sets. The strength of an association rule is measured by its support (s) and its confidence (c); N is the total number of transactions in the historical time period, and σ(x ∪ y) is the support count, that is, the number of the N transactions in which x and y occur together.
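As a worked example, the standard definitions of support, σ(x ∪ y) / N, and confidence, σ(x ∪ y) / σ(x), can be evaluated on a handful of transactions. The item names below are hypothetical and only serve to show the arithmetic.

```python
def support_count(transactions, itemset):
    """sigma(itemset): number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(transactions, x, y):
    """s(x -> y) = sigma(x | y) / N."""
    return support_count(transactions, x | y) / len(transactions)


def confidence(transactions, x, y):
    """c(x -> y) = sigma(x | y) / sigma(x)."""
    return support_count(transactions, x | y) / support_count(transactions, x)


# Four hypothetical transactions (N = 4).
transactions = [
    frozenset({"cpu_high", "orders_failing", "err_503"}),
    frozenset({"cpu_high", "orders_failing", "err_503"}),
    frozenset({"cpu_high"}),
    frozenset({"orders_failing"}),
]
x = frozenset({"cpu_high", "orders_failing"})
y = frozenset({"err_503"})
s = support(transactions, x, y)      # 2 of the 4 transactions contain x and y
c = confidence(transactions, x, y)   # both transactions containing x also contain y
```

Here the rule x → y has support 0.5 and confidence 1.0: it covers half of the transactions, and y follows every time x is observed.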
The flow of the algorithm is generally as follows:
first, all frequent item sets are found, that is, item sets whose frequency of occurrence is at least the predefined minimum support;
second, strong association rules are generated from the frequent item sets; these rules must satisfy both the minimum support and the minimum confidence;
third, the desired rules are generated from the frequent item sets found in the first step: all rules containing only items of the set are generated, with exactly one item on the right-hand side of each rule, which is the definition of a rule adopted here. Once these rules are generated, only those whose confidence exceeds the given minimum confidence are kept. A recursive approach is used to generate all frequent item sets.
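The flow above can be sketched compactly. This is an illustrative Apriori implementation, not the patent's code: it uses an iterative level-wise loop instead of recursion, and the transactions and thresholds are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise search; returns {frequent item set: support}."""
    n = len(transactions)
    level = {frozenset([item]) for t in transactions for item in t}
    frequent, k = {}, 1
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        current = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(current)
        k += 1
        # Candidate generation: join frequent (k-1)-item sets, then prune every
        # candidate that has an infrequent (k-1)-subset (downward closure).
        joined = {a | b for a in current for b in current if len(a | b) == k}
        level = {c for c in joined
                 if all(frozenset(s) in current for s in combinations(c, k - 1))}
    return frequent


def strong_rules(frequent, min_confidence):
    """Rules with a single-item right-hand side that meet the minimum confidence."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            lhs = itemset - {item}
            conf = sup / frequent[lhs]   # lhs is frequent by downward closure
            if conf >= min_confidence:
                rules.append((lhs, frozenset([item]), sup, conf))
    return rules


transactions = [
    frozenset({"cpu_high", "orders_failing", "err_503"}),
    frozenset({"cpu_high", "orders_failing", "err_503"}),
    frozenset({"cpu_high"}),
    frozenset({"orders_failing"}),
]
frequent = apriori(transactions, min_support=0.5)
rules = strong_rules(frequent, min_confidence=1.0)
```

The method described here additionally requires every strong association rule to include both service data and performance data; that filter is omitted from the sketch.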
On the basis of the foregoing embodiments, as an optional embodiment, verifying the strong association rule specifically comprises verifying the strong association rule by using a random forest model.
A random forest model is a classifier that uses multiple trees to train on and predict samples. In general, the processing flow of the random forest model is as follows:
1. generating n samples from the sample set by means of resampling;
2. assuming that the number of features of the samples is a, selecting k of the a features for the n samples, and obtaining the optimal split point by building a decision tree;
3. repeating the above steps m times to generate m decision trees.
In the embodiment of the present invention, the sample set is the frequent item sets, a sample is an item in a frequent item set, a feature is an index sub-item, and the optimal split point is a specific value. For example, in "CPU utilization greater than 60%", CPU utilization is an index sub-item and thus a feature, and "greater than 60%" is the optimal split point. A decision tree is the process of performing a simulation test on the frequent item sets to obtain a result, and can be understood as a generated strong association rule.
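The three generic steps (resampling, choosing k of a features, repeating m times) can be sketched with the standard library alone. To stay short, the sketch uses depth-1 trees (decision stumps) instead of full decision trees, and the feature names, labels and data are hypothetical; how the model is applied to frequent item sets in practice is not specified beyond the description above.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def best_stump(rows, labels, features):
    """Find the (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[f] <= thr]
            right = [l for r, l in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, thr, majority(left), majority(right))
    return best


def train_forest(rows, labels, k, m, seed=0):
    """m bootstrap samples, k randomly chosen features each, one stump per sample."""
    rng = random.Random(seed)
    n, feature_names = len(rows), sorted(rows[0])
    forest = []
    for _ in range(m):
        idx = [rng.randrange(n) for _ in range(n)]   # resampling with replacement
        feats = rng.sample(feature_names, k)         # k of the a features
        stump = best_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        if stump is not None:
            forest.append(stump)
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= thr else right for _, f, thr, left, right in forest]
    return majority(votes)


normal = [(10, 20), (20, 25), (30, 30), (40, 35), (50, 40)]
abnormal = [(70, 60), (80, 70), (85, 75), (90, 80), (95, 90)]
rows = [{"cpu_util": c, "mem_util": mem} for c, mem in normal + abnormal]
labels = [False] * 5 + [True] * 5   # True marks an abnormal phenomenon
forest = train_forest(rows, labels, k=1, m=15)
```

Each stump stores a feature and a threshold, which corresponds to a split point such as "CPU utilization greater than 60%" in the text.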
On the basis of the foregoing embodiments, as an optional embodiment, the sample data is preprocessed, specifically:
performing data cleaning on the sample data, including: deleting repeated data and irrelevant data, smoothing noise data, and interpolating abnormal data and missing data.
Specifically, in the embodiment of the present invention, noise data is smoothed by a binning method, which smooths a stored data value by consulting its "neighbors" (the surrounding values). "Bin depth" refers to the number of data values in a bin, which is the same for every bin in equal-depth binning, and "bin width" refers to the value range covered by each bin. Binning can remove noise, discretize continuous data and increase granularity. The binning method in the embodiment of the invention may be equal-depth binning, equal-width binning, the minimum entropy method or a user-defined interval method.
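Equal-depth and equal-width binning, followed by smoothing by bin mean, can be sketched as follows. The readings are hypothetical index values, and the function names are introduced only for this illustration.

```python
from statistics import mean


def equal_depth_bins(values, depth):
    """Sort the values and cut them into bins that hold `depth` values each."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def equal_width_bins(values, width):
    """Group the values into bins that each span an interval of size `width`."""
    low = min(values)
    bins = {}
    for v in sorted(values):
        bins.setdefault(int((v - low) // width), []).append(v)
    return [bins[key] for key in sorted(bins)]


def smooth_by_mean(bins):
    """Replace every value with the mean of its bin."""
    return [[mean(b)] * len(b) for b in bins]


readings = [4, 8, 15, 21, 21, 24, 25, 28, 34]   # hypothetical noisy index values
depth_bins = equal_depth_bins(readings, depth=3)
smoothed = smooth_by_mean(depth_bins)
width_bins = equal_width_bins(readings, width=10)
```

With a depth of 3 the readings fall into three bins whose means are 9, 22 and 29, so each reading is replaced by one of these three values.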
Performing dimension reduction on the cleaned sample data and discarding part of the sample data.
It should be noted that the purpose of the dimension reduction is to reduce the cleaned data: while preserving the original character of the data as far as possible, the time spent on data exchange and subsequent data mining is reduced by reducing the data volume. Dimension reduction reduces the number of features of the data, discarding unimportant features and describing the data with as few key features as possible.
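The text does not say which dimension reduction technique is used. As one possible illustration, the sketch below drops features whose variance is close to zero, on the assumption that a feature that barely changes carries little information; the threshold, the feature names and the function name are all hypothetical.

```python
from statistics import pvariance


def drop_low_variance_features(rows, min_variance):
    """Keep only the features whose population variance reaches min_variance."""
    names = list(rows[0])
    kept = [n for n in names if pvariance([r[n] for r in rows]) >= min_variance]
    return [{n: r[n] for n in kept} for r in rows], kept


rows = [
    {"cpu_util": 0.20, "mem_util": 0.50, "disk_count": 4},
    {"cpu_util": 0.90, "mem_util": 0.52, "disk_count": 4},
    {"cpu_util": 0.40, "mem_util": 0.51, "disk_count": 4},
]
reduced, kept = drop_low_variance_features(rows, min_variance=0.01)
```

Because variance depends on the scale of a feature, a real pipeline would normally bring the features to a common scale before applying such a filter.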
Converting the sample data remaining after the dimension reduction into a form suitable for data mining by means of smoothing, aggregation and data generalization.
Specifically, the smoothing may be smoothing by mean, smoothing by boundary value or smoothing by median. Real-valued data is transformed through concept hierarchies and discretization.
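Discretization is what turns a real-valued index into an item that an association rule miner can count. The following sketch assumes a two-cut concept hierarchy for CPU utilization and a simple record layout; the cut points, labels and field names are hypothetical.

```python
import bisect

# Hypothetical concept hierarchy: cut points and the labels of the resulting intervals.
CPU_CUTS = [0.3, 0.6]
CPU_LABELS = ["low", "medium", "high"]


def discretize(value, cuts, labels):
    """Map a real value to an interval label; a value equal to a cut point
    falls into the upper interval."""
    return labels[bisect.bisect_right(cuts, value)]


def to_transaction(record):
    """Turn one sample record into a set of 'sub_item=label' items for rule mining."""
    items = {f"cpu_util={discretize(record['cpu_util'], CPU_CUTS, CPU_LABELS)}"}
    items.update(f"error={kw}" for kw in record.get("error_keywords", []))
    return frozenset(items)


record = {"cpu_util": 0.72, "error_keywords": ["Timeout"]}
transaction = to_transaction(record)
```

Each transaction produced this way is a set of discrete items, which is the input form that frequent item set mining expects.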
Fig. 2 is a schematic structural diagram of a mining device for self-healing rules provided in an embodiment of the present invention. As shown in Fig. 2, the mining device includes: a sample data acquisition module 201, a preprocessing module 202, an association rule mining module 203 and a verification module 204, wherein:
the sample data acquiring module 201 is configured to acquire service data, performance data, and log data as sample data.
It should be noted that the sample data acquisition module in the embodiment of the present invention collects service data from the service monitoring system, performance data from application hosts/virtual machines, and log data from the log management center. The module fully considers the growth of the user scale and the data scale and prepares for the accumulation of data assets: multiple data sources are supported through methods such as clients and program code instrumentation, data is collected in full by multiple methods, and sufficient and comprehensive sample data is collected throughout the whole life cycle of the user's use of the product.
The preprocessing module 202 is configured to preprocess the sample data and convert the sample data into a form suitable for data mining.
Data preprocessing is an important part of the mining process of the embodiment of the invention: clean, accurate and concise data must be provided in order to mine rich rules. By performing operations such as data cleaning, data reduction and data transformation on the collected sample data, the preprocessing module improves the accuracy, integrity and consistency of the data.
The association rule mining module 203 is configured to perform association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, where each strong association rule at least includes service data and performance data.
It should be understood that the self-healing process includes a self-healing rule and a self-healing operation. The self-healing rule states the precondition for performing the self-healing operation. For example, "index value A in the service data is abnormal, error keyword B appears in the log, and performance data C appears" is a self-healing rule, and the base station performing a cell switch-out task once the rule is satisfied is a self-healing operation. A strong association rule is an implication of the form X → Y, where X and Y are called the antecedent, or left-hand side (LHS), and the consequent, or right-hand side (RHS), of the association rule, respectively. The rule X → Y is characterized by its support and its confidence. The embodiment of the invention converts the search for self-healing rules into the mining of strong association rules, so that existing association rule mining methods can be used. Association rule mining mainly comprises two stages: in the first stage, all frequent item sets are found in the data set; in the second stage, association rules are generated from these frequent item sets. In the embodiment of the present invention, the data set is the set composed of the sample data, and a frequent item set is an item set that includes at least service data and performance data; it can be understood that the service data and the performance data in a frequent item set are each represented by an index sub-item and the corresponding index value. It should be understood that the number of items of service data and performance data in a self-healing rule is not specifically limited.
The verification module 204 is configured to verify the strong association rule, and if the strong association rule inevitably corresponds to an abnormal phenomenon, the strong association rule is used as a self-healing rule.
Specifically, after a strong association rule is obtained, it needs to be verified by the verification module. The verification is completed through simulation testing: when an abnormal phenomenon occurs, for example the front end reports a 503 error, simulated collection checks whether the index sub-items in the strong association rule have reached their corresponding index values. If a 503 error is necessarily reported whenever the index values are reached, the strong association rule is taken as a self-healing rule.
The mining device provided in the embodiment of the present invention executes the flows of the mining method embodiments described above; for details, refer to those embodiments, which are not repeated here. The embodiment of the invention collects performance data, log data and service data, so that the collected data is diversified and no longer limited to the few kinds of basic data common across the industry. Because service data is included, the association rule algorithm has more data to mine and can produce more forms of rules for handling anomalies, which breaks through the original limitation on rule construction and widens the range of application of the rules. In addition, the sample data is mined by an association rule mining algorithm, so the whole process is highly automated: rule output efficiency is significantly improved, dependence on human experience is greatly reduced, the effort required of specialists is reduced, and a large amount of manpower is saved. Furthermore, verifying the obtained strong association rules achieves a close match with actual requirements and a more accurate judgment of system anomalies, and because the rules are continuously trained and optimized as the system changes, the drawback of having to maintain the rules manually is overcome.
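The way the four modules hand data to one another can be sketched as a small pipeline. This only illustrates the data flow 201 → 202 → 203 → 204; the class `MiningDevice`, its field names and the stub modules are hypothetical and stand in for the real modules.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class MiningDevice:
    """Four modules wired in the order 201 -> 202 -> 203 -> 204."""
    acquire: Callable[[], Iterable[dict]]           # sample data acquisition module 201
    preprocess: Callable[[Iterable[dict]], list]    # preprocessing module 202
    mine: Callable[[list], List[dict]]              # association rule mining module 203
    verify: Callable[[dict], bool]                  # verification module 204

    def run(self) -> List[dict]:
        samples = self.acquire()
        prepared = self.preprocess(samples)
        strong_rules = self.mine(prepared)
        # Only strong association rules that pass verification become self-healing rules.
        return [rule for rule in strong_rules if self.verify(rule)]


# Stub modules, used only to show the data flow.
device = MiningDevice(
    acquire=lambda: [{"cpu_util": 0.9}, {"cpu_util": 0.9}],
    preprocess=lambda samples: list(samples),
    mine=lambda prepared: [
        {"if": {"cpu_util": 0.6}, "passed_simulation": True},
        {"if": {"cpu_util": 0.1}, "passed_simulation": False},
    ],
    verify=lambda rule: rule["passed_simulation"],
)
self_healing_rules = device.run()
```

Keeping each module behind a plain callable makes it possible to replace one stage, for example the mining algorithm, without touching the others.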
On the basis of the above embodiments, the preprocessing module of the embodiment of the present invention includes: the data processing system comprises a data cleaning unit, a data reduction unit and a change unit, wherein:
the data cleaning unit is used for performing data cleaning on the sample data, and comprises: deleting repeated data and irrelevant data, smoothing noise data, and interpolating abnormal data and missing data.
Specifically, the data cleaning unit smooths the noise data by using a binning method, in which the values of the stored data are smoothed by considering their "neighbors" (the surrounding values). In equal-depth binning, every bin holds the same number of data values; in equal-width binning, every bin covers a value interval of the same width. Binning can remove noise, discretize continuous data and coarsen its granularity. The binning method in the embodiment of the invention may be equal-depth binning, equal-width binning, the minimum entropy method or a user-defined interval method.
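The difference between equal-depth and equal-width binning can be shown with a short Python sketch. The function names and the sample values are assumptions made for illustration and do not come from the embodiment; the minimum entropy and user-defined interval methods are not covered here.

```python
def equal_depth_bins(values, n_bins):
    """Split the sorted values into n_bins bins holding (almost) the
    same number of values each."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        # The first `extra` bins take one additional value.
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


def equal_width_bins(values, n_bins):
    """Split the values into n_bins bins that each cover a value
    interval of the same width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1.0  # avoid zero width for constant data
    bins = [[] for _ in range(n_bins)]
    for v in sorted(values):
        # The maximum value is placed in the last bin.
        index = min(int((v - low) / width), n_bins - 1)
        bins[index].append(v)
    return bins


response_times = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # hypothetical index values
print(equal_depth_bins(response_times, 3))
print(equal_width_bins(response_times, 3))
```

With these nine values, equal-depth binning puts three values in every bin, while equal-width binning produces bins of unequal size because each bin spans an interval of width 10.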
The data reduction unit is used for carrying out dimension reduction processing on the cleaned sample data and discarding part of the sample data.
The data reduction unit reduces the cleaned data: by cutting the data volume while preserving the original character of the data as far as possible, it shortens the time needed for data exchange and subsequent data mining. Dimension reduction reduces the number of features of the data, discards unimportant features, and describes the data with as few key features as possible.
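The embodiment does not name a specific dimension reduction technique. As one possible illustration, the Python sketch below treats a feature as unimportant when its variance across the samples is below a threshold and discards it. The variance criterion, the function name and the feature names are all assumptions, not part of the described method.

```python
from statistics import pvariance


def drop_low_variance_features(rows, min_variance):
    """Keep only the features whose population variance across all rows
    reaches min_variance (an assumed importance criterion)."""
    names = list(rows[0].keys())
    kept = [n for n in names
            if pvariance([row[n] for row in rows]) >= min_variance]
    return [{n: row[n] for n in kept} for row in rows]


# Hypothetical cleaned samples: "region_id" never changes, so it carries
# no information for mining and is discarded.
rows = [
    {"cpu_usage": 10, "region_id": 1},
    {"cpu_usage": 50, "region_id": 1},
    {"cpu_usage": 90, "region_id": 1},
]
print(drop_low_variance_features(rows, min_variance=1.0))
```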
And the change unit is used for converting the sample data remaining after the dimension reduction processing into a form suitable for data mining by means of smoothing, aggregation and data generalization.
Specifically, when performing smoothing, the change unit may smooth by the mean, by the boundary values or by the median. Real-valued data is transformed through concept hierarchies and discretization.
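The three smoothing options and the generalization of real-valued data can be sketched in Python as follows. Each smoothing function operates on the values of a single bin; the function names, the interval bounds and the labels (`low`, `medium`, `high`) are hypothetical.

```python
from statistics import mean, median


def smooth_by_mean(bin_values):
    """Replace every value in the bin with the bin mean."""
    m = mean(bin_values)
    return [m] * len(bin_values)


def smooth_by_median(bin_values):
    """Replace every value in the bin with the bin median."""
    m = median(bin_values)
    return [m] * len(bin_values)


def smooth_by_boundaries(bin_values):
    """Replace every value with the nearer of the bin's two boundaries
    (ties go to the lower boundary)."""
    low, high = min(bin_values), max(bin_values)
    return [low if v - low <= high - v else high for v in bin_values]


def generalize(value, levels):
    """Map a real value to the label of the first interval whose upper
    bound exceeds it; `levels` is a list of (upper_bound, label) pairs."""
    for upper, label in levels:
        if value < upper:
            return label
    return levels[-1][1]


one_bin = [4, 8, 15]
cpu_levels = [(60, "low"), (85, "medium"), (float("inf"), "high")]
print(smooth_by_mean(one_bin), smooth_by_median(one_bin),
      smooth_by_boundaries(one_bin))
print(generalize(93.0, cpu_levels))
```

Generalizing a raw reading such as 93.0 to a label like `high` turns continuous values into discrete items, which is the form an association rule mining algorithm operates on.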
Fig. 3 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 3, the electronic device may include: a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with each other via the communication bus 340. The processor 310 may invoke a computer program stored on the memory 330 and executable on the processor 310 to perform the mining methods provided by the various embodiments described above, including, for example: collecting service data, performance data and log data as sample data; preprocessing the sample data to convert it into a form suitable for data mining; performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule comprises at least service data and performance data; and verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, taking the strong association rule as a self-healing rule.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solutions, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Embodiments of the present invention further provide a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the mining method provided in the foregoing embodiments, the mining method including: collecting service data, performance data and log data as sample data; preprocessing the sample data to convert it into a form suitable for data mining; performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule comprises at least service data and performance data; and verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, taking the strong association rule as a self-healing rule.
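The association rule mining step can be sketched with a compact Apriori implementation in Python, the algorithm named in claim 3. This is an illustrative sketch, not the program of the embodiment: the item prefixes `svc:`, `perf:` and `log:`, the item names, and the support and confidence thresholds are assumptions introduced here to mark which kind of data each item came from.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    n = len(transactions)

    def support(items):
        return sum(items <= t for t in transactions) / n

    current = {frozenset([i]) for t in transactions for i in t}
    frequent, k = {}, 1
    while current:
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for every rule whose
    confidence reaches min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for left in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[left]
                if confidence >= min_confidence:
                    rules.append((left, itemset - left, confidence))
    return rules


def mixes_service_and_performance(rule):
    """Keep only rules that contain both service and performance items."""
    items = rule[0] | rule[1]
    return (any(i.startswith("svc:") for i in items)
            and any(i.startswith("perf:") for i in items))


# Hypothetical preprocessed samples, one set of items per observation.
transactions = [
    {"svc:order_peak", "perf:cpu_high", "log:503"},
    {"svc:order_peak", "perf:cpu_high", "log:503"},
    {"svc:order_peak", "perf:cpu_high", "log:503"},
    {"svc:order_normal", "perf:cpu_low"},
    {"svc:order_normal", "perf:cpu_high"},
]
frequent = frequent_itemsets(transactions, min_support=0.6)
rules = [r for r in strong_rules(frequent, min_confidence=0.9)
         if mixes_service_and_performance(r)]
for left, right, confidence in rules:
    print(sorted(left), "->", sorted(right), confidence)
```

The final filter reflects the requirement that each strong association rule contain at least service data and performance data; rules that pass it would then go on to the verification step.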
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A self-healing rule mining method, characterized by comprising the following steps:
collecting service data, performance data and log data as sample data;
preprocessing the sample data, and converting the sample data into a form suitable for data mining;
performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule comprises at least service data and performance data;
and verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, taking the strong association rule as a self-healing rule.
2. The mining method of claim 1, wherein the pre-processing the sample data further comprises:
labeling the acquired index value through stream processing, and storing sample data in a corresponding database table according to the type of labeling;
correspondingly, the mining of the association rule is performed on the preprocessed sample data, and specifically comprises the following steps:
according to the label determined by the user, extracting sample data from the database table corresponding to the determined label to carry out association rule mining.
3. The mining method according to claim 1, wherein the association rule mining algorithm is Apriori algorithm.
4. The mining method according to claim 1, wherein the strong association rules are verified by using a random forest model.
5. The mining method according to claim 1, wherein the preprocessing of the sample data is specifically:
performing data cleaning on the sample data, including: deleting repeated data and irrelevant data, smoothing noise data and interpolating abnormal data and missing data;
carrying out dimension reduction processing on the cleaned sample data, and discarding part of the sample data;
and converting the sample data remaining after the dimension reduction processing into a form suitable for data mining by means of smoothing, aggregation and data generalization.
6. A self-healing rule mining device, comprising:
the sample data acquisition module is used for acquiring the service data, the performance data and the log data as sample data;
the preprocessing module is used for preprocessing the sample data and converting the sample data into a form suitable for data mining;
the association rule mining module is used for performing association rule mining on the preprocessed sample data to obtain a certain number of strong association rules, wherein each strong association rule comprises at least service data and performance data;
and the verification module is used for verifying the strong association rule, and if the strong association rule necessarily corresponds to the occurrence of an abnormal phenomenon, the strong association rule is used as a self-healing rule.
7. The mining device according to claim 6, wherein the preprocessing module comprises:
the data cleaning unit is used for performing data cleaning on the sample data, and comprises: deleting repeated data and irrelevant data, smoothing noise data and interpolating abnormal data and missing data;
the data reduction unit is used for carrying out dimension reduction processing on the cleaned sample data and discarding part of the sample data;
and the change unit is used for converting the sample data remaining after the dimension reduction processing into a form suitable for data mining by means of smoothing, aggregation and data generalization.
8. An electronic device, comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the mining method of any one of claims 1 to 5.
9. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the mining method of any one of claims 1 to 5.
CN201811437497.8A 2018-11-28 2018-11-28 Self-healing rule mining method and device based on big data Pending CN111241145A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811437497.8A CN111241145A (en) 2018-11-28 2018-11-28 Self-healing rule mining method and device based on big data

Publications (1)

Publication Number Publication Date
CN111241145A true CN111241145A (en) 2020-06-05

Family

ID=70863736

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811437497.8A Pending CN111241145A (en) 2018-11-28 2018-11-28 Self-healing rule mining method and device based on big data

Country Status (1)

Country Link
CN (1) CN111241145A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination