CN112905671A - Time series exception handling method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112905671A
Authority
CN
China
Prior art keywords
data
abnormal
time
time series
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110313319.XA
Other languages
Chinese (zh)
Inventor
张文池
王泓琳
陈哲康
周波
王勇
刘大鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
Beijing Bishi Technology Co ltd
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bishi Technology Co ltd, National Computer Network and Information Security Management Center filed Critical Beijing Bishi Technology Co ltd
Priority to CN202110313319.XA priority Critical patent/CN112905671A/en
Publication of CN112905671A publication Critical patent/CN112905671A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a time series exception handling method and device, electronic equipment and a computer readable storage medium. The time series exception handling method comprises the following steps: acquiring time series data, training on the time series data, and constructing a model; detecting, according to the model, whether abnormal data exists in time series data obtained in real time, and if so, recommending part of the abnormal data; judging whether the recommended part of the abnormal data is reasonable, and feeding back the judgment result; and optimizing the model according to the judgment result, and then continuing to detect the real-time time series data. The method shows no obvious bias toward particular data, can adapt to metrics with scenario-specific semantics, can meet operation and maintenance requirements outside the traditional internet field, has higher extensibility and generality, and can give specific causes for the anomalies it reports.

Description

Time series exception handling method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing a time series exception, an electronic device, and a computer-readable storage medium.
Background
Modern software enterprises often rely on a large number of application services deployed on a large infrastructure of physical machines, virtual machines, and containers. To ensure the reliability of these services and systems, operation and maintenance personnel need to monitor and check the operating condition of the infrastructure. In routine operation and maintenance work, an engineer typically monitors and collects various performance metrics for the infrastructure. For example, a machine usually exposes metrics such as memory utilization, CPU utilization, and disk utilization. In actual operation, faults caused by external attacks, aging disk media, sustained overload, and the like severely threaten machine availability, and the monitored metrics then exhibit anomalies. Anomaly detection on time series metrics is therefore very important: it helps the operation and maintenance team discover faults as early as possible and shortens the time from fault occurrence to troubleshooting.
Anomaly detection on time series metrics has also attracted wide attention in academia, and many algorithms have been proposed in recent years. However, limited by their detection quality and performance, these methods still cannot meet the requirements of practical deployment. Because the number of metrics to be monitored in operation and maintenance work is extremely large, manual labeling of metrics is impractical, so supervised anomaly detection is difficult to apply and an unsupervised learning approach must be adopted. In addition, time series anomaly detection scenarios differ widely: the objects served and the loads of the services vary greatly, and the trends and characteristics of the metrics sometimes correlate strongly with the business. The anomaly detection method therefore needs to collect feedback from operation and maintenance experts efficiently so as to acquire their knowledge.
Table 1 below lists the most advanced unsupervised time series anomaly detection algorithms in academia at present. Most of them adopt deep learning models, which require huge computing resources for training, leave computing performance to be improved, and cannot directly apply user feedback to optimize the deep learning framework. Traditional unsupervised statistical learning methods need a large amount of manual parameter tuning and give uneven results. These algorithms also show an obvious bias toward particular data: each performs well on a specific data type but lacks generality, and the anomaly results they give are difficult to explain with specific causes.
Characteristic | Regression statistical learning | Traditional unsupervised learning | Unsupervised deep generative model
High capacity | Poor | Average | Excellent
No parameter tuning needed | Poor | Excellent | Average
No labels needed | Excellent | Excellent | Excellent
Fast detection | Excellent | Average | Average
Low training resources | Excellent | Average | Poor
Short training time | Excellent | Average | Poor
Supports manual adjustment | Poor | Average | Poor
TABLE 1
Disclosure of Invention
The present invention is directed to solve at least one of the problems in the background art and provides a time series exception handling method, a time series exception handling apparatus, an electronic device, and a computer-readable storage medium.
In order to achieve the above object, the present invention provides a method for processing time series exception, comprising the following steps:
acquiring time sequence data, training the time sequence data, and constructing a model;
detecting whether abnormal data exist in the time sequence data obtained in real time according to the model, and if so, recommending part of the abnormal data;
judging whether the recommended part of abnormal data is reasonable or not, and then feeding back a judgment result;
and optimizing the model according to the judgment result, and then continuously detecting the real-time sequence data.
According to one aspect of the invention, acquiring time series data comprises acquiring regular small-scale time series data and irregular large-scale time series data, clustering all time series data when acquiring irregular large-scale time series data, and then training various types of time series data to construct a model.
According to one aspect of the invention, the clustering process is to capture the correlation among the time sequence data to be trained through DBSCAN, and cluster the data with approximate shape and consistent periodicity.
According to an aspect of the present invention, in the clustering process, in calculating the approximation degree of the time-series data, the distance between the time-series data is calculated using DTW.
According to one aspect of the invention, according to the type of the time sequence data, feature data capable of representing the corresponding type of the time sequence data is selected for training, and a model is constructed.
According to one aspect of the invention, RRCF is adopted to select all the feature data for training, all the feature data are iterated to obtain a plurality of decision trees, the decision trees form a decision forest, and then whether abnormal data exist in the real-time sequence data is determined through voting of the decision forest.
According to one aspect of the invention, when constructing a decision tree, the RRCF selects a segmentation dimension along which to split the feature data. The probability $p_i$ that the RRCF selects feature $i$ as the segmentation dimension is
[Equation image BDA0002990156980000031: $p_i$ expressed in terms of $l_i$, $g_i$, $\sum_j l_j$ and $\sum_j g_j$]
with $g_i = \max_{x \in S}(x_j - x_{j-1})$. Here $i$ indexes a feature; $p_i$ is the probability that feature $i$ is selected, with a value between 0 and 1; $l_i$ is the difference between the maximum and minimum values of feature $i$ in the feature set calculated from the training sample set; $g_i$ is the largest difference between two adjacent values of feature $i$ after the values calculated from the training sample set are sorted by magnitude; $\sum_j g_j$ is the sum of $g_j$ over every feature dimension $j$; and $\sum_j l_j$ is the sum of $l_j$ over every feature dimension $j$.
According to one aspect of the invention, the RRCF divides the feature data in the segmentation dimension equally into $N$ intervals $[l_0, h_0], [l_1, h_1], \dots, [l_{N-1}, h_{N-1}]$ and calculates the density of each interval, $d_i = \mathrm{Count}(p,\ p \in [l_i, h_i])$. The probability that each interval is selected is
[Equation image BDA0002990156980000032: interval selection probability as a function of the densities $d_i$]
Finally, a cut point $X_i \sim \mathrm{Uniform}[l_i, h_i]$ is selected at random from the selected interval. Here $l_0$ is the minimum value and $h_{N-1}$ the maximum value of the feature in this dimension over the training set; the difference between the maximum and the minimum is divided by $N$ to obtain $N$ equal-width intervals.
According to one aspect of the present invention, when abnormal data exists, the anomaly score CoDisp of the abnormal data is calculated using the cut points. To calculate the score, the ratio $\mathrm{CoDisp}_{Node}$ between the amounts of data contained in the sibling subtree and the parent subtree of the cut point is calculated, and the largest ratio $\mathrm{CoDisp}_{Node}$ is selected. The anomaly score of abnormal data $x_i$ is
[Equation image BDA0002990156980000041: anomaly score $\mathrm{CoDisp}_{x_i}$]
According to one aspect of the invention, the recommending part of the abnormal data is to select a plurality of most abnormal segments in the abnormal data, and recommend after obtaining labels of the plurality of segments; or
Recommending partial abnormal data by selecting a plurality of uncertain segments in the abnormal data and recommending after obtaining labels of the segments; or
And the recommendation of the abnormal data of the part is to divide the abnormal data into a plurality of groups according to the abnormal scores, obtain a plurality of fragments in each group, and recommend after obtaining the labels of the fragments.
According to one aspect of the invention, after the model obtains labels for $n$ segments of abnormal data, the abnormal data and the $M$ decision trees in the decision forest of the model jointly form an anomaly score matrix $\mathrm{CoDisp\_M}[x_i][tree_j]$. For each abnormal datum $x_i$, if the fed-back judgment result is a true positive, the weight of decision tree $tree_j$ is updated as $tw_j = tw_j + \delta \times \mathrm{CoDisp\_M}[x_i][tree_j]$. Decision trees with higher weights are then selected according to the fed-back judgment results so as to optimize the model.
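The weight update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `update_tree_weights` and `top_trees`, the list-of-lists layout of the score matrix, and the "keep the k highest-weight trees" selection rule are hypothetical, and the text does not say how non-true-positive feedback is handled, so such rows are simply skipped here.

```python
def update_tree_weights(weights, codisp_matrix, is_true_positive, delta):
    """Apply tw_j = tw_j + delta * CoDisp_M[x_i][tree_j] for every labeled
    anomaly x_i that the feedback marked as a true positive.

    weights:          list of M tree weights tw_j
    codisp_matrix:    n rows (labeled anomalies) by M columns (trees)
    is_true_positive: n booleans taken from the feedback
    """
    new_weights = list(weights)
    for i, confirmed in enumerate(is_true_positive):
        if not confirmed:
            continue  # assumption: other feedback leaves the weights unchanged
        for j in range(len(new_weights)):
            new_weights[j] += delta * codisp_matrix[i][j]
    return new_weights


def top_trees(weights, k):
    """Hypothetical selection rule: indices of the k highest-weight trees."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return sorted(order[:k])


weights = update_tree_weights(
    weights=[1.0, 1.0, 1.0],
    codisp_matrix=[[4.0, 1.0, 2.0], [0.5, 3.0, 0.5]],
    is_true_positive=[True, False],
    delta=0.5,
)
kept = top_trees(weights, k=2)
```

Trees that scored a confirmed anomaly highly gain the most weight, so they are the ones retained when the forest is pruned.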
In order to achieve the above object, the present invention further provides a time-series exception handling apparatus, including:
the data processing module is used for acquiring time series data, training the time series data and constructing a model;
the abnormal data detection recommending module detects whether abnormal data exist in the time sequence data obtained in real time according to the model, and if the abnormal data exist, part of the abnormal data are recommended;
the abnormal data judgment feedback module judges whether the part of abnormal data is reasonable or not and then feeds back a judgment result;
and the model optimization module optimizes the model according to the feedback judgment result and then continuously detects the real-time sequence data.
According to an aspect of the invention, further comprising:
and the data classification processing module is used for acquiring irregular large-scale time sequence data, clustering all the time sequence data, training various time sequence data and constructing a model.
In order to achieve the above object, the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the above time-series exception handling method.
To achieve the above object, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above time-series exception handling method.
According to one scheme of the invention, as the number of time sequences to be monitored in a production environment is extremely large, each production unit can generate dozens or even hundreds of monitoring index data, the index data need to be monitored completely, if the time sequences are trained respectively in a targeted manner, the number of models and consumed resources are extremely large, and the existing operation and maintenance resources are difficult to support. Therefore, before the targeted training stage of the index data, the data are clustered, so that the detection processing time can be greatly reduced, and the abnormity can be quickly and accurately processed.
According to one scheme of the invention, a characteristic data selection stage is provided, and more appropriate characteristic data are extracted in a targeted manner according to the statistical information and characteristics of indexes, so that the accuracy of the model is improved.
According to one scheme of the invention, the 30 most anomalous segments are selected and their labels are acquired, which further confirms clear anomalies and reduces the false positive rate.
According to one scheme of the invention, the 30 most uncertain segments (that is, those near the anomaly determination threshold) are selected; their labels help the model sharpen its classification boundary, improving the identification accuracy for ambiguous anomalies.
According to one scheme of the invention, the abnormal data is divided into 10 groups according to anomaly score and at most 3 segments are taken from each group; the labels capture how the judgment feedback module judges anomalies of different severity, helping the model determine the best range for the threshold.
According to one scheme of the invention, the invention provides an unsupervised, white-box and accurate time series exception handling method which is matched with active learning and can actively and efficiently collect feedback information. On the basis of a traditional unsupervised learning frame, an active learning stage is introduced, abnormality is actively recommended to a judgment feedback part (such as a judgment feedback module or operation and maintenance personnel) and feedback is acquired, so that a model is corrected, and the accuracy is improved. The method reserves the advantages of the traditional unsupervised learning in the aspects of parameter adjustment and marking, designs the application strategy of marking feedback in a targeted manner, and further optimizes the recall rate, the detection speed and the capacity of the model.
According to one scheme of the invention, the processing method has no obvious bias on data, can adapt to indexes with specific scene semantics, can meet the operation and maintenance requirements in the field of non-traditional Internet, has higher expandability and universality, and can give specific abnormal reasons for the given abnormal result.
According to one aspect of the present invention, the present invention is able to accurately detect and interpret anomalies, testing on 1 public data set and 2 time series data of a commercial bank's actual production environment, ultimately reaching F1-score of 0.81 and 0.89 on both data sets. Compared with the traditional unsupervised exception handling method, the best F1-score is improved by 0.19-0.5 on two data sets, and the detection time is shortened by 58%.
Drawings
FIG. 1 schematically shows a flow diagram of a method for time series exception handling according to one embodiment of the present invention;
FIG. 2 schematically shows similar metric curves collected from the same switch;
FIGS. 3-5 schematically show three different schemes for actively recommending anomalous segments;
fig. 6 schematically shows a functional configuration diagram of a time-series abnormality processing apparatus according to an embodiment of the present invention.
Detailed Description
The content of the invention will now be discussed with reference to exemplary embodiments. It is to be understood that the embodiments discussed are merely intended to enable one of ordinary skill in the art to better understand and thus implement the teachings of the present invention, and do not imply any limitations on the scope of the invention.
As used herein, the term "include" and its variants are to be read as open-ended terms meaning "including, but not limited to". The term "based on" is to be read as "based, at least in part, on". The terms "one embodiment" and "an embodiment" are to be read as "at least one embodiment".
In view of the above-described drawbacks of the prior art in the background art, the present invention provides a batch task time monitoring method, which can predict the time of a batch task and detect an abnormality of the batch task, and update a task model or generate an alarm according to the prediction and detection results.
FIG. 1 schematically shows a flow diagram of a method for time series exception handling according to one embodiment of the present invention. As shown in fig. 1, a time-series exception handling method according to an embodiment of the present invention includes the following steps:
a. acquiring time sequence data, training the time sequence data, and constructing a model;
b. detecting whether abnormal data exist in the time sequence data obtained in real time according to the model, and if so, recommending part of the abnormal data;
c. judging whether the recommended part of abnormal data is reasonable or not, and then feeding back a judgment result;
d. and optimizing the model according to the judgment result, and then continuously detecting the real-time sequence data.
In practice, the time series data may be represented by $x = \{x_1, x_2, \dots, x_N\}$, where $N$ is the length of $x$ and the data point $x_t$ at any time $t$ is a specific data value. Time series may be collected from many sources, such as networks, transaction links, request logs, and the like. Time series from the same source are more likely to have similar characteristics.
Because the number of time sequences to be monitored in a production environment is extremely large, each production unit can generate dozens or even hundreds of monitoring index data, the index data need to be monitored completely, if the time sequences are trained respectively in a targeted manner, the number of models and consumed resources are extremely large, and the existing operation and maintenance resources are difficult to support. Therefore, before the targeted training stage of the index data, the data are clustered, so that the detection processing time can be greatly reduced, and the abnormity can be quickly and accurately processed.
Specifically, according to an embodiment of the present invention, in the clustering stage of step a, the algorithm uses DBSCAN to capture the correlation among the time series metrics to be trained and clusters metrics with similar shapes and consistent periodicity. When calculating metric similarity, the distance between metrics is calculated using DTW (Dynamic Time Warping). DBSCAN does not require predefined category information, and its clustering precision can be controlled by adjusting the clustering radius, so it is well suited to metric clustering.
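The clustering stage can be sketched as follows in plain Python: a textbook DTW distance feeds a precomputed distance matrix into a minimal DBSCAN. This is an illustrative sketch, not the patented implementation; the function names, the absolute-difference local cost, and the `eps` and `min_samples` values are assumptions chosen for the toy data.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |a_i - b_j| as local cost."""
    prev = [math.inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        cur = [math.inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def dbscan(dist, eps, min_samples):
    """Minimal DBSCAN over a precomputed distance matrix; label -1 means noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_samples:
            labels[p] = -1
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # former noise point becomes a border point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_samples:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
    return labels


def cluster_series(series, eps, min_samples):
    """Cluster time series by DTW distance (the clustering radius is eps)."""
    dist = [[dtw_distance(a, b) for b in series] for a in series]
    return dbscan(dist, eps, min_samples)


series = [
    [0, 1, 2, 3, 2, 1, 0],
    [0, 0, 1, 2, 3, 2, 1, 0],   # same shape, shifted by one step
    [10, 10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
    [50, 50, 50, 50, 50],       # unlike anything else
]
labels = cluster_series(series, eps=1.0, min_samples=2)
```

A smaller `eps` yields tighter clusters and more models to train; a larger one merges more metrics under a single model.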
FIG. 2 schematically shows similar metric curves collected from the same switch. As shown in FIG. 2, two network traffic curves for different ports of the same switch exhibit substantially the same trend and scale. In an actual production environment, data of the same type under the same monitoring unit also tends to cluster in this way. Exploiting this property greatly reduces the number of models generated in the training stage, reduces resource consumption, and improves the cost-effectiveness of the operation and maintenance tool. In some scenarios, however, the amount of data is small and the accuracy requirement is high; training on each series individually is then more cost-effective than pre-clustering, so the clustering stage is an optional step.
As can be seen from the above, in the present invention, acquiring time series data includes acquiring regular small-scale time series data and acquiring irregular large-scale time series data. Regular small-scale time series data is trained on directly, whereas irregular large-scale time series data is first clustered, after which each class of time series data is trained on and a model is constructed.
According to one embodiment of the invention, feature data that can represent the corresponding type of time series data is selected for training according to the type of the time series data, and a model is constructed. Different time series data have different characteristics. For example, percentage-type series tend to stay level, with short dips or spikes during a failure; business-related transaction series often show periodic peaks and valleys, with a small amount of fluctuation during a failure; and infrastructure series such as swap space may rise slowly over time. The invention therefore provides a feature selection stage in which more suitable feature data is extracted in a targeted manner according to the statistics and characteristics of the metrics, improving model accuracy. The specific extraction rules are shown in the following table:
Figure BDA0002990156980000091
TABLE 2
In this embodiment, table 2 contains simple and effective feature data that can cover the different features of most curves, and is easy to calculate and performs well.
According to one embodiment of the invention, RRCF is adopted to select all feature data for training, all feature data are iterated to obtain a plurality of decision trees, the decision trees form a decision forest, and then whether abnormal data exist in real-time sequence data is determined through voting of the decision forest.
When a decision tree is constructed, the RRCF selects a segmentation dimension along which to split the feature data. The probability $p_i$ that the RRCF selects feature $i$ as the segmentation dimension is
[Equation image BDA0002990156980000092: $p_i$ expressed in terms of $l_i$, $g_i$, $\sum_j l_j$ and $\sum_j g_j$]
with $g_i = \max_{x \in S}(x_j - x_{j-1})$. Here $i$ indexes a feature; $p_i$ is the probability that feature $i$ is selected, with a value between 0 and 1; $l_i$ is the difference between the maximum and minimum values of feature $i$ in the feature set calculated from the training sample set; $g_i$ is the largest difference between two adjacent values of feature $i$ after the values calculated from the training sample set are sorted by magnitude; $\sum_j g_j$ is the sum of $g_j$ over every feature dimension $j$; and $\sum_j l_j$ is the sum of $l_j$ over every feature dimension $j$.
Specifically, the basic unsupervised anomaly detection algorithm selected by the invention is RRCF (Robust Random Cut Forest). Its detection results are better than those of other unsupervised anomaly detection algorithms, but its accuracy still falls somewhat short of what practical deployment requires. The RRCF trains on all training sample feature data in batches. Each batch of feature data goes through multiple rounds of iteration to yield a decision tree, and all the decision trees finally form a decision forest, which decides by voting whether the data is abnormal. When constructing a decision tree, the feature to split on must be chosen from the multiple dimensions of the feature data. The RRCF assumes that splitting on a dimension that covers a larger data range separates the samples better, that is, the probability that feature $i$ is selected is
$$p_i = \frac{l_i}{\sum_j l_j}, \qquad l_i = \max_{x \in S} x_i - \min_{x \in S} x_i,$$
where $p_i$ is the probability that feature $i$ is selected, $l_i$ is the difference between the maximum and minimum values of feature $i$, $S$ is the training sample set, and $x_i$ is the value of feature $i$ calculated for one sample in $S$. However, this does not take into account the distribution of the data within each dimension. According to an embodiment of the invention, when a decision tree is constructed and the dimension to split on is selected, the maximum gap between adjacent data values is used as an additional factor besides the range covered by the dimension, that is, the probability of selecting feature $i$ in the invention is
[Equation image BDA0002990156980000102: $p_i$ expressed in terms of $l_i$, $g_i$, $\sum_j l_j$ and $\sum_j g_j$]
where $g_i = \max_{x \in S}(x_j - x_{j-1})$. Thus, the larger the maximum gap in the data distribution of a dimension, the higher the discrimination provided by splitting at that gap, so that the segmentation dimension is selected more effectively and model accuracy is improved.
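As a hedged illustration of the two quantities involved, the Python sketch below computes the range $l_i$ and the maximum adjacent gap $g_i$ for each feature, together with the range-proportional probability $l_i / \sum_j l_j$ of the baseline RRCF. The exact way the invention combines $l_i$ and $g_i$ is given only as an equation image, so the sketch deliberately stops at computing the ingredients; all function names are hypothetical.

```python
def feature_range(values):
    """l_i: difference between the largest and smallest value of a feature."""
    return max(values) - min(values)


def max_adjacent_gap(values):
    """g_i: largest difference between neighbouring values after sorting."""
    ordered = sorted(values)
    return max((b - a for a, b in zip(ordered, ordered[1:])), default=0.0)


def range_proportional_probs(columns):
    """Baseline RRCF rule: select dimension i with probability l_i / sum_j l_j."""
    ranges = [feature_range(col) for col in columns]
    total = sum(ranges)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)  # all features are constant
    return [r / total for r in ranges]


# Two hypothetical feature columns computed from the same training samples.
columns = [
    [0.0, 1.0, 2.0, 3.0],      # evenly spread values
    [0.0, 10.0, 11.0, 30.0],   # wide range with one large gap
]
probs = range_proportional_probs(columns)
gaps = [max_adjacent_gap(col) for col in columns]
```

In this toy data the second feature has both the larger range and the larger gap, so either criterion favours it as the segmentation dimension.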
Further, when a decision tree is constructed, after each iteration determines a segmentation dimension, a suitable cut point must be selected on the data of that dimension, and the left and right subtrees are divided according to it. In the original RRCF, a cut point is selected at random after the dimension data is divided equally, and the distribution characteristics of the dimension are not considered. According to one embodiment of the invention, the RRCF divides the feature data in the segmentation dimension equally into $N$ intervals $[l_0, h_0], [l_1, h_1], \dots, [l_{N-1}, h_{N-1}]$ and calculates the density of each interval, $d_i = \mathrm{Count}(p,\ p \in [l_i, h_i])$. The probability that each interval is selected is
[Equation image BDA0002990156980000103: interval selection probability as a function of the densities $d_i$]
Finally, a cut point $X_i \sim \mathrm{Uniform}[l_i, h_i]$ is selected at random from the selected interval. Here $l_0$ is the minimum value and $h_{N-1}$ the maximum value of the feature in this dimension over the training set; the difference between the maximum and the minimum is divided by $N$ to obtain $N$ equal-width intervals, and the left and right endpoints of the $i$-th interval are $l_i$ and $h_i$. This selection strategy identifies the sparse part of the segmentation dimension more accurately and thereby improves discrimination. In the present embodiment, $d_i$ is the density of an interval, meaning the number of samples within its range; since all intervals have the same width, more samples means higher density. Count denotes counting and $p$ denotes each sample falling in the interval, that is, the number of samples within $[l_i, h_i]$ is counted. $\mathrm{Uniform}[l_i, h_i]$ denotes the uniform distribution over $[l_i, h_i]$, and $X_i$ is a cut point drawn at random from it.
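The interval-based cut selection can be sketched in Python as below. The equal-width intervals, the densities $d_i$, and the uniform draw inside the chosen interval follow the text; the interval selection probability itself is only given as an equation image, so the sketch takes a caller-supplied `weight_fn`, and the example weight `1 / (1 + d)` is purely a hypothetical stand-in that favours sparse intervals. Function names are likewise assumptions.

```python
import random


def interval_densities(values, n_intervals):
    """Split [min, max] into equal-width intervals and count samples in each."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot cut a dimension whose values are all equal")
    width = (hi - lo) / n_intervals
    edges = [(lo + k * width, lo + (k + 1) * width) for k in range(n_intervals)]
    counts = [0] * n_intervals
    for v in values:
        k = min(int((v - lo) / width), n_intervals - 1)  # clamp the maximum
        counts[k] += 1
    return edges, counts


def choose_cut(values, n_intervals, weight_fn, rng):
    """Pick an interval with probability proportional to weight_fn(d_i),
    then draw the cut point X_i ~ Uniform[l_i, h_i] inside it."""
    edges, counts = interval_densities(values, n_intervals)
    weights = [weight_fn(d) for d in counts]
    k = rng.choices(range(n_intervals), weights=weights, k=1)[0]
    low, high = edges[k]
    return rng.uniform(low, high)


values = [0.0, 0.1, 0.2, 0.3, 0.4, 10.0]
edges, counts = interval_densities(values, n_intervals=4)

# Hypothetical weighting (not from the patent): prefer low-density intervals.
cut = choose_cut(values, 4, weight_fn=lambda d: 1.0 / (1 + d),
                 rng=random.Random(0))
```

With this toy data most of the weight falls on the two empty middle intervals, so the cut tends to land in the gap that separates the isolated value 10.0 from the rest.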
Further, when abnormal data exists, an abnormal score codip of the abnormal data is calculated using the dividing point (specific node), and when the abnormal score codip is calculated, the sibling subtree and father of the dividing point are calculatedProportion CoDisp of abnormal data quantity contained in subtreeNodeThe higher the ratio, the higher the outlier degree of the outlier data. Since the calculation process of each abnormal data involves a plurality of characteristic data, the model is gradually moved upwards from the initial node for detection, and after repeated multiple iterations, the largest proportion CoDisp is selectedNodeAbnormal data xiIs an abnormality score of
Figure BDA0002990156980000111
The anomaly score $\mathrm{CoDisp}_{x_i}$ denotes the degree of abnormality calculated for sample $x_i$. First, $x_i$ falls into a leaf of the decision tree, and the algorithm searches upwards from the leaf until a branch node Node is found such that the sample size of the subtree represented by Node is far smaller than that of its sibling subtree. The final CoDisp of sample $x_i$ is the average of the CoDisp values of the Node nodes corresponding to the sample in each tree of the whole forest. In the present embodiment, the largest ratio $\mathrm{CoDisp}_{Node}$ is selected in consideration of the depth at which the node is located, since samples at deeper nodes in the tree are more normal. The dividing point found in this way, at which the subtree containing $x_i$ is isolated from the other, much larger group of samples, is therefore more representative.
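The per-tree maximum followed by a forest-wide average can be sketched as below. This is an illustrative sketch only: the exact score formula is in the figure and is not reproduced here, so the ratio is taken, as an assumption, in the form used by the standard RRCF collusive displacement (size of the sibling subtree divided by the size of the subtree containing $x_i$). The input representation, a list of `(subtree_size, sibling_size)` pairs collected while walking from the leaf towards the root, is hypothetical.

```python
def codisp_in_tree(path):
    """Largest sibling/subtree size ratio met while walking up one tree.

    path: list of (subtree_size, sibling_size) pairs, one per ancestor level,
    where subtree_size counts the samples in the subtree containing x_i.
    ASSUMPTION: the ratio follows the standard RRCF definition.
    """
    return max(sibling / subtree for subtree, sibling in path)

def codisp(paths_per_tree):
    """Average of the per-tree maxima over the whole forest."""
    per_tree = [codisp_in_tree(path) for path in paths_per_tree]
    return sum(per_tree) / len(per_tree)
```

For example, a sample whose two-point subtree is split off from a six-point sibling contributes a ratio of 3 in that tree, which is then averaged with the maxima found in the other trees.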
Further, in step b, the recommending of part of the abnormal data comprises selecting a plurality of the most abnormal segments in the abnormal data, and recommending after obtaining the labels of these segments; or
selecting a plurality of the most uncertain segments in the abnormal data, and recommending after obtaining the labels of these segments; or
dividing the abnormal data into a plurality of groups according to the anomaly scores, acquiring a plurality of segments from each group, and recommending after obtaining the labels of these segments.
Figs. 3-5 schematically show three different schemes for actively recommending abnormal segments. As shown in fig. 3, according to an embodiment of the present invention, scheme A selects the 30 most abnormal segments; the labels of these segments further confirm the explicit anomalies and reduce the false positive rate.
According to another embodiment of the invention, as shown in fig. 4, scheme B selects the 30 most uncertain segments (i.e., those around the anomaly determination threshold); these labels help the model delineate the classification boundary clearly, thereby improving the identification accuracy for ambiguous anomalies.
As shown in fig. 5, according to a third embodiment of the present invention, scheme C divides the abnormal data into 10 groups according to the anomaly score and obtains at most 3 segments from each group; these labels capture, for example, the attitude of the judgment feedback module towards different degrees of abnormality, thereby helping the model determine the optimal threshold selection range.
In experiments on public data sets, the F1-score of scheme A was higher than that of the other two schemes, but each of the other two schemes has its own applicable scenarios.
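The three recommendation schemes can be sketched as simple selection functions over a list of anomaly scores, each returning the indices of the segments to be recommended. The function names are hypothetical. For scheme C the source does not state how the 10 groups are formed or which segments are taken from each group, so equal-width score bins and the highest-scoring members of each bin are used here as assumptions.

```python
def most_abnormal(scores, k=30):
    """Scheme A: indices of the k segments with the highest anomaly scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def most_uncertain(scores, threshold, k=30):
    """Scheme B: indices of the k segments closest to the decision threshold."""
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i] - threshold))
    return order[:k]

def stratified(scores, n_groups=10, per_group=3):
    """Scheme C: at most per_group segments from each of n_groups score bins.

    ASSUMPTION: groups are equal-width bins over the score range, and the
    highest-scoring members of each bin are the ones taken.
    """
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_groups or 1.0   # avoid zero width if all scores are equal
    groups = [[] for _ in range(n_groups)]
    for i, s in enumerate(scores):
        groups[min(int((s - lo) / width), n_groups - 1)].append(i)
    picked = []
    for members in groups:
        members.sort(key=lambda i: scores[i], reverse=True)
        picked.extend(members[:per_group])
    return picked
```

With the defaults, each function returns at most 30 indices, matching the 30 segments (or 10 groups of at most 3) given in the embodiments.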
Furthermore, the invention improves the processing efficiency of the model in the online detection stage through various techniques, and enables the model to adjust dynamically according to user feedback. In the online detection stage, only extreme abnormal values are selected as automatic feedback data for dynamically adjusting the RRCF model, so that the model updating frequency is reduced and the detection performance is improved. According to an embodiment of the invention, after the model obtains the abnormal data of n labeled segments, the abnormal data and the M trees in the decision forest of the model jointly form an anomaly score matrix $\mathrm{CoDisp\_M}[x_i][tree_j]$. For each piece of abnormal data $x_i$, if the user marks it as a true positive, the weight of $tree_j$ is updated as $tw_j = tw_j + \delta \times \mathrm{CoDisp\_M}[x_i][tree_j]$. This feedback-driven self-correction helps the model screen out decision trees of higher quality, which are given higher weights in subsequent anomaly judgments; selecting the decision trees with higher weights optimizes the model and improves the detection result.
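The weight update $tw_j = tw_j + \delta \times \mathrm{CoDisp\_M}[x_i][tree_j]$ can be sketched as follows. The function names, the list-of-lists layout of the score matrix, and the default value of `delta` are assumptions for illustration; the source specifies the update only for true positives, so other feedback leaves the weights unchanged in this sketch.

```python
def update_tree_weights(weights, codisp_matrix, labels, delta=0.1):
    """Return updated tree weights tw_j after one round of labeled feedback.

    weights[j]          : current weight tw_j of tree j
    codisp_matrix[i][j] : CoDisp_M[x_i][tree_j]
    labels[i]           : True if x_i was judged a true positive
    """
    new_weights = list(weights)
    for row, is_true_positive in zip(codisp_matrix, labels):
        if not is_true_positive:
            continue  # the source defines the update for true positives only
        for j, score in enumerate(row):
            new_weights[j] += delta * score
    return new_weights

def top_trees(weights, k):
    """Indices of the k trees with the highest weights."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:k]
```

Trees that scored confirmed anomalies highly accumulate weight fastest, which is what lets the later selection step prefer them.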
Furthermore, the present invention provides a time-series exception handling apparatus for implementing the time-series exception handling method, as shown in fig. 6, the apparatus including:
the data processing module is used for acquiring time series data, training the time series data and constructing a model;
the abnormal data detection recommending module detects whether abnormal data exist in the time sequence data obtained in real time according to the model, and if the abnormal data exist, part of the abnormal data are recommended;
the abnormal data judgment feedback module judges whether the part of abnormal data is reasonable or not and then feeds back a judgment result;
and the model optimization module optimizes the model according to the feedback judgment result and then continuously detects the real-time sequence data.
According to an embodiment of the present invention, further comprising:
and the data classification processing module is used for acquiring irregular large-scale time sequence data, clustering all the time sequence data, training various time sequence data and constructing a model.
In the invention, the data processing module acquires time sequence data, including acquiring regular small-scale time sequence data and irregular large-scale time sequence data, and when acquiring irregular large-scale time sequence data, all the time sequence data are clustered, and then various time sequence data are trained to construct a model.
The clustering process captures the association among the time series data to be trained through DBSCAN, and clusters together the data with similar shape and consistent periodicity.
In the clustering process, in calculating the approximation degree of the time-series data, the distance between the time-series data is calculated using Dynamic Time Warping (DTW).
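As a minimal sketch of the distance computation, the following pure-Python code computes the DTW distance between two series and the pairwise distance matrix for a set of series. The absolute difference as the local cost and the function names are assumptions. The resulting matrix could then be passed to a DBSCAN implementation that accepts precomputed distances (for example scikit-learn's `DBSCAN` with `metric="precomputed"`); that step is omitted here.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [0.0] + [inf] * m            # row 0 of the cumulative cost table
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]

def dtw_matrix(series):
    """Symmetric matrix of pairwise DTW distances between all series."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist
```

Because DTW allows one point to align with several points of the other series, two series of the same shape but slightly different timing obtain a small distance, which is what makes it suitable for grouping series by shape.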
And the data classification processing module selects characteristic data which can represent the time sequence data of the corresponding type according to the type of the time sequence data to train and construct a model.
According to one embodiment of the invention, the abnormal data detection recommendation module adopts RRCF to select all feature data for training, the feature data are iterated to obtain a plurality of decision trees, the decision trees form a decision forest, and then whether abnormal data exist in the real-time sequence data or not is determined through decision forest voting.
In this embodiment, when constructing the decision tree, the RRCF selects a segmentation dimension for segmenting the feature data, and the probability that the RRCF selects feature i as the segmentation dimension is
Figure BDA0002990156980000131
$g_i = \max_{x \in S}(x_j - x_{j-1})$; where i denotes the feature; $p_i$ represents the probability that feature i is selected, a value between 0 and 1; $l_i$ represents the difference between the maximum value and the minimum value of feature i in the training sample set; $g_i$ represents the maximum difference between two adjacent values of feature i after the values in the training sample set are sorted by size; $\sum g_j$ represents the sum of $g_j$ calculated over every feature dimension j, and $\sum l_j$ represents the sum of $l_j$ calculated over every feature dimension j.
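The quantities $l_i$ (value range) and $g_i$ (largest gap between adjacent sorted values) can be computed as in the sketch below. The formula that combines them into $p_i$ appears only in the figure and is not reproduced in this text, so the equal blend of $l_i / \sum l_j$ and $g_i / \sum g_j$ used as the default here is an assumption made only for illustration and can be replaced through the hypothetical `combine` parameter.

```python
def feature_range_and_gap(column):
    """Return (l_i, g_i): value range and largest gap between sorted neighbours."""
    s = sorted(column)
    l = s[-1] - s[0]
    g = max((b - a for a, b in zip(s, s[1:])), default=0.0)
    return l, g

def dimension_probabilities(columns, combine=None):
    """Selection probability p_i for each feature dimension.

    columns: one list of values per feature dimension.
    ASSUMPTION: the default `combine` (equal blend of normalised range and
    normalised gap) is a placeholder for the formula shown in the figure.
    """
    stats = [feature_range_and_gap(c) for c in columns]
    sum_l = sum(l for l, _ in stats)
    sum_g = sum(g for _, g in stats)
    if sum_l == 0:
        raise ValueError("all dimensions are constant; nothing to split on")
    if combine is None:
        combine = lambda l, g: 0.5 * l / sum_l + 0.5 * g / sum_g
    raw = [combine(l, g) for l, g in stats]
    total = sum(raw)
    return [r / total for r in raw]
```

Under this placeholder, a dimension containing one isolated extreme value obtains both a large range and a large gap, and is therefore chosen for splitting more often than a densely and evenly populated dimension.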
In this embodiment, the RRCF equally divides the feature data in the segmentation dimension into N intervals $[l_0, h_0], [l_1, h_1], \ldots, [l_{N-1}, h_{N-1}]$ and calculates the density of each interval, $d_i = \mathrm{Count}(p,\ p \in [l_i, h_i])$, wherein the probability that each interval is selected is
Figure BDA0002990156980000141
Finally, a cut point $X_i \sim \mathrm{Uniform}[l_i, h_i]$ is randomly selected from the selected interval, wherein $l_0$ and $h_{N-1}$ respectively represent the minimum value and the maximum value of the feature in the segmentation dimension over the training set; the difference between the maximum value and the minimum value is divided by N, giving N intervals of equal width.
When abnormal data exists, the anomaly score CoDisp of the abnormal data is calculated using the dividing point. When the anomaly score CoDisp is calculated, the ratio $\mathrm{CoDisp}_{Node}$ of the quantities of abnormal data contained in the sibling subtree and the parent subtree of the dividing point is calculated, and the largest ratio $\mathrm{CoDisp}_{Node}$ is selected. The anomaly score of abnormal data $x_i$ is
Figure BDA0002990156980000142
In the invention, the abnormal data detection recommending module recommends part of the abnormal data by selecting a plurality of the most abnormal segments in the abnormal data, and recommending after obtaining the labels of these segments; or
by selecting a plurality of the most uncertain segments in the abnormal data, and recommending after obtaining the labels of these segments; or
by dividing the abnormal data into a plurality of groups according to the anomaly scores, acquiring a plurality of segments from each group, and recommending after obtaining the labels of these segments.
According to an embodiment of the present invention, after the model obtains the abnormal data of n labeled segments, the abnormal data and the M decision trees in the decision forest of the model jointly form an anomaly score matrix $\mathrm{CoDisp\_M}[x_i][tree_j]$. For each piece of abnormal data $x_i$, if the fed-back judgment result is a true positive, the weight of decision tree $tree_j$ is updated as $tw_j = tw_j + \delta \times \mathrm{CoDisp\_M}[x_i][tree_j]$, and decision trees with higher weights are selected according to the fed-back judgment result so as to optimize the model.
To achieve the above object, the present invention also provides an electronic device, including: the time-series exception handling system comprises a processor, a memory and a computer program which is stored on the memory and can run on the processor, wherein the computer program realizes the time-series exception handling method when being executed by the processor.
In order to achieve the above object, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program is executed by a processor to implement the above time-series exception handling method.
According to the scheme, the invention provides an unsupervised, white-box and accurate time series exception handling method which is matched with active learning and can actively and efficiently collect feedback information. On the basis of a traditional unsupervised learning frame, an active learning stage is introduced, abnormality is actively recommended to a judgment feedback part (such as a judgment feedback module or operation and maintenance personnel) and feedback is acquired, so that a model is corrected, and the accuracy is improved. The method reserves the advantages of the traditional unsupervised learning in the aspects of parameter adjustment and marking, designs the application strategy of marking feedback in a targeted manner, and further optimizes the recall rate, the detection speed and the capacity of the model.
Moreover, the processing method has no obvious bias on data, can adapt to indexes with specific scene semantics, can meet the operation and maintenance requirements in the field of non-traditional Internet, has higher expandability and universality, and can give specific abnormal reasons to the given abnormal result.
Moreover, the present invention is able to accurately detect and interpret anomalies. Tested on 1 public data set and on time series data from the actual production environments of 2 commercial banks, it ultimately reached F1-scores of 0.81 and 0.89 on the two data sets. Compared with traditional unsupervised exception handling methods, the best F1-score is improved by 0.19-0.5 on the two data sets, and the detection time is shortened by 58%.
Those of ordinary skill in the art will appreciate that the modules and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and devices may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, each functional module in the embodiments of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the time series exception handling method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
It should be understood that the order of execution of the steps in the summary of the invention and the embodiments of the present invention does not absolutely imply any order of execution, and the order of execution of the steps should be determined by their functions and inherent logic, and should not be construed as limiting the process of the embodiments of the present invention.

Claims (15)

1. A time series exception handling method is characterized by comprising the following steps:
acquiring time sequence data, training the time sequence data, and constructing a model;
detecting whether abnormal data exist in the time sequence data obtained in real time according to the model, and if so, recommending part of the abnormal data;
judging whether the recommended part of abnormal data is reasonable or not, and then feeding back a judgment result;
and optimizing the model according to the judgment result, and then continuously detecting the real-time sequence data.
2. The method for processing the time series abnormality according to claim 1, wherein the acquiring of the time series data includes acquiring regular small-scale time series data and acquiring irregular large-scale time series data, and when acquiring the irregular large-scale time series data, all the time series data are clustered, and then each type of time series data is trained to construct a model.
3. The method for processing the time series abnormity according to claim 2, wherein the clustering process captures the association among the time series data to be trained through DBSCAN, and clusters together the data with similar shape and consistent periodicity.
4. The method according to claim 3, wherein in the clustering process, in calculating the approximation degree of the time-series data, a distance between the time-series data is calculated using a dynamic time warping algorithm.
5. The method according to claim 4, wherein feature data capable of representing the corresponding type of time series data is selected, according to the type of the time series data, for training to construct a model.
6. The method for processing the time series abnormity according to claim 5, wherein all the feature data are selected for training by adopting a Robust Random Cut Forest (RRCF), a plurality of decision trees are obtained by iteration over all the feature data, the decision trees form a decision forest, and whether abnormal data exist in the real-time series data is then determined by voting of the decision forest.
7. The method of claim 6, wherein the RRCF selects a segmentation dimension for segmenting the feature data when constructing the decision tree, and the probability that the RRCF selects feature i as the segmentation dimension is
Figure FDA0002990156970000021
$g_i = \max_{x \in S}(x_j - x_{j-1})$; where i denotes the feature; $p_i$ represents the probability that feature i is selected, a value between 0 and 1; $l_i$ represents the difference between the maximum value and the minimum value of feature i in the training sample set; $g_i$ represents the maximum difference between two adjacent values of feature i after the values in the training sample set are sorted by size; $\sum g_j$ represents the sum of $g_j$ calculated over every feature dimension j, and $\sum l_j$ represents the sum of $l_j$ calculated over every feature dimension j.
8. The method of processing time series exceptions of claim 7, wherein the RRCF equally divides the feature data in the segmentation dimension into N intervals $[l_0, h_0], [l_1, h_1], \ldots, [l_{N-1}, h_{N-1}]$ and calculates the density of each interval, $d_i = \mathrm{Count}(p,\ p \in [l_i, h_i])$, wherein the probability that each of the intervals is selected is
Figure FDA0002990156970000022
Finally, a cut point $X_i \sim \mathrm{Uniform}[l_i, h_i]$ is randomly selected from the selected interval; wherein $l_0$ and $h_{N-1}$ respectively represent the minimum value and the maximum value of the feature in the segmentation dimension over the training set, and the difference between the maximum value and the minimum value is divided by N, giving N intervals of equal width.
9. The method according to claim 8, wherein when the abnormal data exists, the dividing point is used to calculate an anomaly score CoDisp of the abnormal data; when the anomaly score CoDisp is calculated, the ratio $\mathrm{CoDisp}_{Node}$ of the quantities of abnormal data contained in the sibling subtree and the parent subtree of the dividing point is calculated, and the largest ratio $\mathrm{CoDisp}_{Node}$ is selected; the anomaly score of abnormal data $x_i$ is
Figure FDA0002990156970000023
T∈forest。
10. The time-series abnormality processing method according to claim 9, wherein the recommending of part of the abnormal data comprises selecting a plurality of segments of the abnormal data that are determined to be the most abnormal; or
selecting a plurality of the most uncertain segments in the abnormal data, and recommending after obtaining the labels of these segments; or
dividing the abnormal data into a plurality of groups according to the anomaly scores, obtaining a plurality of segments from each group, and recommending after obtaining the labels of these segments.
11. The method for processing time series exception according to claim 10, wherein after the model obtains the abnormal data of n labeled segments, the abnormal data and the M decision trees in the decision forest of the model jointly form an anomaly score matrix $\mathrm{CoDisp\_M}[x_i][tree_j]$; for each piece of abnormal data $x_i$, if the fed-back judgment result is a true positive, the weight of decision tree $tree_j$ is updated as $tw_j = tw_j + \delta \times \mathrm{CoDisp\_M}[x_i][tree_j]$; and decision trees with higher weights are selected according to the fed-back judgment result so as to optimize the model.
12. A time-series exception handling apparatus, comprising:
the data processing module is used for acquiring time series data, training the time series data and constructing a model;
the abnormal data detection recommending module detects whether abnormal data exist in the time sequence data obtained in real time according to the model, and if the abnormal data exist, part of the abnormal data are recommended;
the abnormal data judgment feedback module judges whether the part of abnormal data is reasonable or not and then feeds back a judgment result;
and the model optimization module optimizes the model according to the feedback judgment result and then continuously detects the real-time sequence data.
13. The time-series exception handling apparatus according to claim 12, further comprising
And the data classification processing module is used for acquiring irregular large-scale time sequence data, clustering all the time sequence data, training various time sequence data and constructing a model.
14. An electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the time series exception handling method of any one of claims 1 to 11.
15. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the time-series abnormality processing method according to any one of claims 1 to 11.
CN202110313319.XA 2021-03-24 2021-03-24 Time series exception handling method and device, electronic equipment and storage medium Pending CN112905671A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110313319.XA CN112905671A (en) 2021-03-24 2021-03-24 Time series exception handling method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112905671A true CN112905671A (en) 2021-06-04

Family

ID=76106631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110313319.XA Pending CN112905671A (en) 2021-03-24 2021-03-24 Time series exception handling method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112905671A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656271A (en) * 2021-08-10 2021-11-16 上海浦东发展银行股份有限公司 Method, device and equipment for processing user abnormal behaviors and storage medium
CN114066173A (en) * 2021-10-26 2022-02-18 福建正孚软件有限公司 Capital flow behavior analysis method and storage medium
CN115146174A (en) * 2022-07-26 2022-10-04 北京永信至诚科技股份有限公司 Key clue recommendation method and system based on multi-dimensional weight model

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130024172A1 (en) * 2010-03-30 2013-01-24 Kabushiki Kaisha Toshiba Anomaly detecting apparatus
CN105228175A (en) * 2015-09-17 2016-01-06 福建新大陆软件工程有限公司 A kind of base station energy consumption optimization method based on decision tree and system
CN109032829A (en) * 2018-07-23 2018-12-18 腾讯科技(深圳)有限公司 Data exception detection method, device, computer equipment and storage medium
CN109753049A (en) * 2018-12-21 2019-05-14 国网江苏省电力有限公司南京供电分公司 The exceptional instructions detection method of one provenance net load interaction industrial control system
CN109871401A (en) * 2018-12-26 2019-06-11 北京奇安信科技有限公司 A kind of time series method for detecting abnormality and device
CN110138745A (en) * 2019-04-23 2019-08-16 极客信安(北京)科技有限公司 Abnormal host detection method, device, equipment and medium based on data stream sequences
US20190288904A1 (en) * 2016-12-07 2019-09-19 Huawei Technologies Co., Ltd. Network Detection Method and Apparatus
US20200053110A1 (en) * 2017-03-28 2020-02-13 Han Si An Xin (Beijing) Software Technology Co., Ltd Method of detecting abnormal behavior of user of computer network system
CN110910204A (en) * 2019-10-24 2020-03-24 东莞市盟大塑化科技有限公司 User monitoring system based on artificial intelligence
CN111178456A (en) * 2020-01-15 2020-05-19 腾讯科技(深圳)有限公司 Abnormal index detection method and device, computer equipment and storage medium
CN111262722A (en) * 2019-12-31 2020-06-09 中国广核电力股份有限公司 Safety monitoring method for industrial control system network
CN111459778A (en) * 2020-03-12 2020-07-28 平安科技(深圳)有限公司 Operation and maintenance system abnormal index detection model optimization method and device and storage medium
CN111858231A (en) * 2020-05-11 2020-10-30 北京必示科技有限公司 Single index abnormality detection method based on operation and maintenance monitoring
CN111931868A (en) * 2020-09-24 2020-11-13 常州微亿智造科技有限公司 Time series data abnormity detection method and device
CN112084056A (en) * 2020-08-25 2020-12-15 腾讯科技(深圳)有限公司 Abnormality detection method, apparatus, device and storage medium
CN112381181A (en) * 2020-12-11 2021-02-19 桂林电子科技大学 Dynamic detection method for building energy consumption abnormity

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
YAN SU 等: "An Improved Random Forest Model for the Prediction of Dam Displacement", IEEE ACCESS, vol. 9, pages 9142, XP011831800, DOI: 10.1109/ACCESS.2021.3049578 *
YAO WANG: ""Practical and White-Box Anomaly Detection through Unsupervised and Active Learning"", 《2020 29TH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS》, pages 1 - 8 *
孟亦凡;李敬兆;张梅;: "基于LSTM边缘计算与随机森林雾决策的矿工状态监测设备", 煤矿机械, no. 11, pages 150 - 154 *
杨永娇 等: "基于Isolation Forest和Random Forest相结合的智能电网时间序列数据异常检测算法", 计算机与现代化, no. 03, pages 99 - 102 *
邓志赟 等: "基于PAM-RF的奶牛活动异常情况监测", 广东农业科学, vol. 42, no. 16, pages 122 - 129 *
闻克宇;赵国堂;何必胜;马剑;: "基于改进迁移学习的高速铁路短期客流时间序列预测方法", 系统工程, no. 03, pages 77 - 87 *
马超 等: "大数据环境下离散制造车间异常事件发现方法", 计算机应用与软件, vol. 34, no. 09, pages 288 - 293 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210810

Address after: 100029 Beijing city Chaoyang District Yumin Road No. 3

Applicant after: NATIONAL COMPUTER NETWORK AND INFORMATION SECURITY MANAGEMENT CENTER

Address before: 100083 4th floor, block a, Dongsheng building, No. 8, Zhongguancun East Road, Haidian District, Beijing

Applicant before: Beijing Bishi Technology Co.,Ltd.

Applicant before: NATIONAL COMPUTER NETWORK AND INFORMATION SECURITY MANAGEMENT CENTER