CN109543765A - A kind of industrial data denoising method based on improvement IForest - Google Patents

Publication number
CN109543765A
Authority
CN
China
Prior art keywords
data
iforest
abnormal
value
buffer area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811439128.2A
Other languages
Chinese (zh)
Inventor
孙杰
李鹏飞
丁有伟
陈智也
沈祥红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Sea Level Data Technology Co ltd
Original Assignee
Jiangsu Sea Level Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Sea Level Data Technology Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention, an industrial data denoising method based on an improved IForest, relates to industrial data denoising based on deep learning and belongs to application fields such as big data processing and machine learning. The method includes the following steps: 1) building the initial isolation forest IForest; 2) online anomaly detection; 3) judging whether the detector needs to be updated. Compared with the traditional IForest outlier detection method, the invention has the following advantages. First, the threshold used for outlier detection is set on the basis of the global isolation forest rather than as a single fixed numeric value. Second, a buffer stores sampled data so that the isolation forest can be updated periodically, which improves the accuracy of the trained model. Third, the training data in the buffer are refreshed at fixed intervals, evicting overly stale data and storing new data, which improves anomaly detection for upcoming data.

Description

An industrial data denoising method based on an improved IForest
Technical field
The present invention, an industrial data denoising method based on an improved IForest, relates to industrial data denoising based on deep learning and belongs to application fields such as big data processing and machine learning.
Background art
Big data and machine learning have become a new trend in computing research and industry, and major IT companies have released commercial platforms and big data processing solutions one after another. In large data sets, errors in data acquisition or recording, the intrinsic variability of the data, and unavoidable chance factors can all produce abnormal data. The telltale sign of noise in data is the presence of such abnormal records. Because of their particular nature, they are regarded not as random deviations but as data generated by a different mechanism from the rest. Detecting them makes it possible to find problems in current industrial data, to clean the data or raise timely warnings, and to reduce the risk of accidents in industrial equipment operation. It can also offer guidance on the future operating condition of equipment in use and in fields such as anti-cheating, pseudo base station detection, and financial fraud detection. Abnormal data must be handled during data preprocessing to ensure the accuracy of subsequent processing, so anomaly detection is also a key research direction in data mining.
Many data denoising algorithms exist, mainly statistics-based methods, proximity-based methods, clustering-based methods, and time-series-based methods, each with its own scope of application. Statistics-based methods train a probabilistic model on a given data set and treat objects that have low probability under the model as outliers. They can be subdivided into parametric and nonparametric methods. Parametric methods first assume that normal data are generated by some distribution governed by certain parameters; common examples are methods based on a Gaussian model and methods based on a regression model. Nonparametric methods do not assume in advance that the data fit a particular probabilistic model but build the final model from the input data; common examples are histogram-based methods and kernel-estimation-based methods. Proximity-based methods assume that normal objects lie in high-density neighborhoods, whereas abnormal data tend to lie in low-density neighborhoods and differ markedly from normal data. Such algorithms must search the neighborhood of every data point during detection, so their time complexity is high. Clustering-based methods group similar data into the same cluster according to data similarity, making the similarity within a cluster as large as possible and the similarity between clusters as small as possible. It is usually assumed that normal data belong to large clusters with high similarity, whereas abnormal data belong to small clusters with low similarity.
With this approach, an outlier degree is usually computed for each data point, and this degree often depends on the size of the cluster the point belongs to and on its distance from the cluster center. Time-series-based methods arrange data obtained over time into a sequence and process it with mathematical or data mining methods to uncover its implicit meaning and support follow-up work. However, temporal data often have pronounced temporal characteristics and a composition different from general data, typically showing seasonal variation, cyclic variation, or irregular variation.
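As a small illustration of the parametric, Gaussian-model approach mentioned in this survey, the sketch below flags values that lie far from the sample mean. The function name `gaussian_outliers` and the cutoff `z_threshold=3.0` are illustrative assumptions and do not come from the patent.

```python
import statistics


def gaussian_outliers(values, z_threshold=3.0):
    """Return the values whose z-score exceeds z_threshold (hypothetical helper)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical: nothing can be flagged
    # Under a Gaussian model, a large |z| means low probability, i.e. a likely outlier.
    return [v for v in values if abs(v - mean) / std > z_threshold]
```

A call such as `gaussian_outliers(readings)` returns the flagged readings; the result depends strongly on the assumed distribution, which is one reason the background goes on to consider other families of methods.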
IForest, which is based on statistical modeling, has linear time complexity and can be used on massive data sets. In general, the more trees there are, the more stable the algorithm. Because each tree is generated independently of the others, the algorithm can be deployed on large-scale distributed systems to speed up computation. The traditional IForest method is rarely applied to industrial data cleaning. When it handles other abnormal data, it merely sets a simple threshold on top of the constructed isolation forest to judge whether the current input is abnormal. It ignores the fact that, as time passes and new data keep arriving, the isolation forest built earlier should also change, so its results can become wrong.
In summary, although the traditional IForest method in current use can be applied to industrial data cleaning, the threshold it uses to judge anomalies is too simplistic, and it does not take into account that the model built earlier should be updated as new data arrive in order to keep the detection of abnormal data accurate.
In industrial equipment applications, data collected in real time usually contain some abnormal data. Before feature engineering and model training, the data involved must be analyzed and the abnormal data removed.
An industrial data denoising method that is both effective and efficient is therefore needed.
Summary of the invention
In view of the above shortcomings, the object of the invention is to provide an industrial data denoising method based on an improved IForest. The method sets a threshold related to the global isolation forest and updates the original IForest as time passes, so as to improve the accuracy of abnormal data detection and to denoise large-scale industrial data quickly and effectively. When abnormal data are present in industrial data, the method can identify and locate them, improving detection speed while maintaining detection accuracy.
The invention is realized by the following technical solution:
The industrial data denoising method based on an improved IForest includes the following steps:
1) Build the initial isolation forest IForest;
1-1) Build the trees (iTree);
An isolation forest, IForest (Isolation Forest), is composed of a large number of trees (iTree). An iTree is a random binary tree in which each node either has two children or is a leaf node. Given a historical data set D in which all attributes are continuous variables, an iTree is constructed as follows:
1-1-1) Randomly select an attribute Qi from the given historical data set D;
1-1-2) Randomly select a value q of that attribute, lying between its maximum and minimum values;
1-1-3) Partition the records by the attribute Qi selected in step 1-1-1): records whose Qi value is less than q are placed in the left child, and records whose Qi value is greater than or equal to q are placed in the right child;
1-1-4) Recursively construct the left and right children until the incoming data set contains only one record or several identical records, or the height of the tree has reached the height limit l; update the path count N of the isolation forest, whose initial value is 0.
1-2) Because industrial equipment produces data of many types, one iTree is generated for each type of data, and all the iTrees are combined to form the initial isolation forest IForest. This isolation forest serves as the initial anomaly detector.
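The construction steps 1-1-1) to 1-1-4) can be sketched in Python as follows. This is a minimal illustration and not the patent's implementation: the function names, the dictionary-based node layout, the per-tree random subsample, and the default values of `n_trees` and `sample_size` are all assumptions (the text itself describes one iTree per type of data).

```python
import math
import random


def build_itree(records, height, height_limit, rng):
    """Build one isolation tree from a list of equal-length numeric tuples."""
    # Stop on a single record, identical records, or the height limit l.
    if (len(records) <= 1 or height >= height_limit
            or all(r == records[0] for r in records)):
        return {"size": len(records)}  # leaf node
    # 1-1-1) pick a random attribute Qi; restricting the choice to attributes
    # that still vary is an assumption that avoids useless splits.
    varying = [i for i in range(len(records[0]))
               if min(r[i] for r in records) < max(r[i] for r in records)]
    attr = rng.choice(varying)
    # 1-1-2) pick a random split value q between the minimum and maximum.
    lo = min(r[attr] for r in records)
    hi = max(r[attr] for r in records)
    q = rng.uniform(lo, hi)
    # 1-1-3) records with Qi < q go left, records with Qi >= q go right.
    left = [r for r in records if r[attr] < q]
    right = [r for r in records if r[attr] >= q]
    # 1-1-4) recurse on both children.
    return {"attr": attr, "q": q,
            "left": build_itree(left, height + 1, height_limit, rng),
            "right": build_itree(right, height + 1, height_limit, rng)}


def build_iforest(data, n_trees=50, sample_size=64, seed=0):
    """Combine several iTrees into a forest (subsampling is an assumption)."""
    rng = random.Random(seed)
    size = min(sample_size, len(data))
    height_limit = max(1, math.ceil(math.log2(max(size, 2))))
    return [build_itree(rng.sample(data, size), 0, height_limit, rng)
            for _ in range(n_trees)]
```

Each internal node stores the chosen attribute index and split value, and each leaf stores how many records reached it, which is enough to compute traversal depths later.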
2) Online anomaly detection;
For each arriving data record, the values of its different data types are passed into the isolation forest built in step 1) to judge whether it is abnormal. An anomaly score is obtained from the average depth of the input data; if the score is higher than the preset threshold, the arriving data do not reach the general level and are abnormal data;
3) Judge whether the detector needs to be updated;
The check is made according to the application scale predefined by the user: if the buffer that stores samples is full, the detector is updated. By the principle of locality, new data are more valuable than old data for upcoming data. Old data in the buffer are therefore evicted at fixed intervals so that new data can be stored when they arrive, which improves the accuracy of anomaly detection.
In step 2), when a data record arrives, a decision based on the Poisson distribution is also made on whether to add the sample, as an update sample, to the buffer set up for storing sample data. Within the buffer, samples are arranged in chronological order.
Compared with the traditional IForest outlier detection method, the invention has the following advantages. First, the threshold used for outlier detection is set on the basis of the global isolation forest rather than as a single fixed numeric value. Second, a buffer stores sampled data so that the isolation forest can be updated periodically, which improves the accuracy of the trained model. Third, the training data in the buffer are refreshed at fixed intervals, evicting overly stale data and storing new data, which improves anomaly detection for upcoming data.
Brief description of the drawings
The invention is further described below with reference to the drawings:
Fig. 1 is a flow chart of the steps of the method of the invention;
Fig. 2 is a flow chart of the transformer data anomaly detection process in an embodiment of the method of the invention.
Detailed description of the embodiments
The implementation of the method of the invention is described in more detail below through an embodiment, with reference to the drawings.
Fig. 1 shows the steps of the method of the invention. The industrial data denoising method based on an improved IForest includes the following steps:
1) Build the initial isolation forest IForest;
2) Online anomaly detection;
For each arriving data record, the values of its different data types are passed into the isolation forest that has been built to judge whether it is abnormal. An anomaly score is obtained from the average depth of the input data; if the score is higher than the preset threshold, the arriving data do not reach the general level and are abnormal data;
The main task of this step is to compute the anomaly score of a data record x. The anomaly scoring function is:

$$\mathrm{score}(x) = 2^{-\frac{\mathrm{avgdepth}(x)}{c(M)}}$$

in which:

$$\mathrm{avgdepth}(x) = \frac{1}{N}\sum_{i=1}^{N}\mathrm{depth}(x, \mathrm{iTree}_i), \qquad c(M) = 2H(M-1) - \frac{2(M-1)}{M}$$

Here N is the number of root-to-leaf paths in the IForest, M is the size of the sample set, avgdepth(x) is the average isolation depth over the forest (the smaller the isolation depth, the more easily x is isolated, i.e. the more likely x is abnormal), depth(x, iTree) is the traversal depth of x from the root of the tree to the node where it is isolated, c(M) is the expected tree isolation depth for a data set of M samples, and H(M) is the harmonic number, which can be estimated with Euler's constant: H(M) ≈ ln(M) + 0.5772156649.
It can be seen that when $\mathrm{avgdepth}(x) \to 0$ (the average isolation depth over the forest approaches 0), $\mathrm{score}(x) \to 1$ (the anomaly score of data record x approaches 1), and x is abnormal data. When avgdepth(x) > c(M), score(x) < 0.5, and x is considered normal data. The threshold u can be set to 0.5, as determined by experiment.
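The scoring step can be checked numerically with a short sketch. It assumes the standard Isolation Forest form score(x) = 2^(-avgdepth(x)/c(M)) with c(M) = 2H(M-1) - 2(M-1)/M, which matches the limiting behaviour described in the text; the function names are illustrative, and the per-tree depths are taken as given inputs.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler's constant used to estimate H(M)


def harmonic(m):
    """Estimate the harmonic number H(m) as ln(m) + Euler's constant."""
    return math.log(m) + EULER_GAMMA


def c(m):
    """Expected isolation depth for a data set of m samples (m >= 2)."""
    if m < 2:
        raise ValueError("c(M) needs at least two samples")
    return 2.0 * harmonic(m - 1) - 2.0 * (m - 1) / m


def anomaly_score(depths, m):
    """Score a record from its traversal depths in each tree of the forest."""
    avg_depth = sum(depths) / len(depths)
    return 2.0 ** (-avg_depth / c(m))


def is_abnormal(depths, m, threshold=0.5):
    """Apply the threshold u = 0.5 quoted in the text."""
    return anomaly_score(depths, m) > threshold
```

For M = 256 this estimate gives c(M) of roughly 10.24, so a record isolated after about two splits on average scores well above 0.5, while a record whose average depth exceeds c(M) scores below 0.5.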
Note that when input data arrive, it must also be computed whether they follow a Poisson distribution. The formula is:

$$P(X = k) = \frac{\lambda^{k}}{k!}\,e^{-\lambda}, \qquad k = 0, 1, 2, \ldots$$

Here X is a random variable that takes only the non-negative integer values 0, 1, 2, ...; the parameter λ is the average rate at which the random event occurs per unit time; e is a constant, e ≈ 2.718281828459. The Poisson distribution describes the number of times a random event occurs per unit time.
When data arrive, a decision based on the Poisson distribution is made on whether to add the sample, as an update sample, to the buffer set up for storing sample data.
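The text does not spell out how the Poisson distribution decides whether a sample enters the buffer. One possible reading, sketched below purely as an assumption, draws k from a Poisson distribution with rate λ for each arriving record and admits the record when k > 0 and the buffer is not full. The class `SampleBuffer`, its parameters, and the admission rule itself are hypothetical.

```python
import math
import random
from collections import deque


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_draw(lam, rng):
    """Draw one Poisson sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


class SampleBuffer:
    """Time-ordered buffer of update samples (hypothetical structure)."""

    def __init__(self, capacity, lam=1.0, seed=0):
        self.capacity = capacity
        self.lam = lam
        self.rng = random.Random(seed)
        self.items = deque()  # (timestamp, record), oldest first

    def is_full(self):
        return len(self.items) >= self.capacity

    def offer(self, timestamp, record):
        """Admit the record when the buffer has room and the draw is k > 0."""
        if self.is_full() or poisson_draw(self.lam, self.rng) == 0:
            return False
        # Records arrive in time order, so appending keeps the buffer sorted.
        self.items.append((timestamp, record))
        return True
```

With λ = 1 the admission probability is 1 - e^(-1), about 0.63, so roughly two out of three arriving records would be kept as update samples until the buffer fills.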
3) Judge whether the detector needs to be updated;
By the principle of locality, new data are more valuable than old data for upcoming data. Old data in the buffer are therefore evicted at fixed intervals so that new data can be stored when they arrive;
3-1) Compute the anomaly rate according to the application scale predefined by the user; if the fixed update time has been reached or the sample buffer is full, update the detector;
3-2) Make the check according to the application scale predefined by the user; if the buffer that stores samples is full, update the detector;
Finally, return the updated anomaly detector, that is, the isolation forest IForest'.
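The update rule in steps 3-1) and 3-2) can be sketched as a small controller. Everything named here is an assumption for illustration: `rebuild` stands in for whatever routine constructs a new isolation forest from buffered records, timestamps are plain numbers, and clearing the buffer after a rebuild is a design choice that the text does not specify.

```python
from collections import deque


class DetectorUpdater:
    """Decide when to rebuild the detector from buffered samples (hypothetical)."""

    def __init__(self, rebuild, capacity, update_interval, max_age):
        self.rebuild = rebuild                  # callable: list of records -> detector
        self.capacity = capacity                # buffer size
        self.update_interval = update_interval  # fixed update period
        self.max_age = max_age                  # samples older than this are evicted
        self.buffer = deque()                   # (timestamp, record), oldest first
        self.last_update = 0.0
        self.detector = None

    def add(self, timestamp, record):
        if len(self.buffer) < self.capacity:
            self.buffer.append((timestamp, record))

    def evict_stale(self, now):
        # New data are worth more than old data, so drop overly old samples.
        while self.buffer and now - self.buffer[0][0] > self.max_age:
            self.buffer.popleft()

    def should_update(self, now):
        buffer_full = len(self.buffer) >= self.capacity
        time_is_up = now - self.last_update >= self.update_interval
        return buffer_full or time_is_up

    def maybe_update(self, now):
        """Rebuild the detector when the buffer is full or the update time has come."""
        self.evict_stale(now)
        if not self.buffer or not self.should_update(now):
            return False
        self.detector = self.rebuild([rec for _, rec in self.buffer])
        self.last_update = now
        self.buffer.clear()  # assumption: start collecting fresh samples
        return True
```

Passing the current time explicitly keeps the sketch deterministic; in a running system `now` would come from a clock, and `rebuild` would return the updated forest IForest'.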
Fig. 2 shows the flow of transformer data anomaly detection in step 2). The process includes the following steps:
2-1) First collect transformer historical data, which include voltage ratio, working frequency, inductance, degree of protection, fully loaded performance, and so on. These different types of data are organized and represented as numerical values;
An isolation forest is built from the above historical data according to the IForest algorithm and serves as the model for anomaly detection;
2-2) Set the anomaly score threshold, as determined by experiment, to 0.5. When equipment operating data arrive, compute the depth of each type of data in the isolation forest, take the mean, and compute the anomaly score with the anomaly scoring function. If the score exceeds the threshold 0.5, raise an early warning, since the arriving data are abnormal. If it does not exceed the threshold 0.5, wait for the next equipment operating data to arrive.
When operating data arrive, the distribution they follow should also be computed; if they follow a Poisson distribution and the buffer is not full, the data should be added to the buffer.
2-3) When the buffer is full, or the fixed time for updating the anomaly detection model has been reached, the anomaly detection model, that is, the IForest, must be updated to keep anomaly detection accurate.
By improving the traditional IForest method with the goal of more accurate abnormal data detection, the invention provides an industrial data denoising method based on an improved IForest that detects abnormal data quickly and effectively. The algorithm is effective and time-efficient, and can handle high-dimensional, massive data.

Claims (5)

1. An industrial data denoising method based on an improved IForest, characterized by comprising the following steps:
1) Build the initial isolation forest IForest;
1-1) Build the trees (iTree);
1-2) Combine all the iTrees to form the initial isolation forest IForest; this isolation forest serves as the initial anomaly detector;
2) Online anomaly detection;
For each arriving data record, the values of its different data types are passed into the isolation forest built in step 1) to judge whether it is abnormal; an anomaly score is obtained from the average depth of the input data, and if the score is higher than the preset threshold, the arriving data do not reach the general level and are abnormal data;
3) Judge whether the detector needs to be updated;
The check is made according to the application scale predefined by the user; if the buffer that stores samples is full, the detector is updated.
2. The industrial data denoising method based on an improved IForest according to claim 1, characterized in that the isolation forest IForest is composed of a large number of trees (iTree); an iTree is a random binary tree in which each node either has two children or is a leaf node; given a historical data set D in which all attributes are continuous variables, an iTree is constructed as follows:
1-1-1) Randomly select an attribute Qi from the given historical data set D;
1-1-2) Randomly select a value q of that attribute, lying between its maximum and minimum values;
1-1-3) Partition the records by the attribute Qi selected in step 1-1-1): records whose Qi value is less than q are placed in the left child, and records whose Qi value is greater than or equal to q are placed in the right child;
1-1-4) Recursively construct the left and right children until the incoming data set contains only one record or several identical records, or the height of the tree has reached the height limit l; update the path count N of the isolation forest, whose initial value is 0.
3. The industrial data denoising method based on an improved IForest according to claim 1, characterized in that, in step 2), when a data record arrives, a decision based on the Poisson distribution is made on whether to add the sample, as an update sample, to the buffer set up for storing sample data; within the buffer, samples are arranged in chronological order.
4. The industrial data denoising method based on an improved IForest according to claim 1, characterized in that the detection of transformer data anomalies in step 2) comprises the following steps:
2-1) First collect transformer historical data, which include voltage ratio, working frequency, inductance, degree of protection, and fully loaded performance; these different types of data are organized and represented as numerical values;
An isolation forest is built from the above historical data according to the IForest algorithm and serves as the model for anomaly detection;
2-2) Set the anomaly score threshold, as determined by experiment, to 0.5; when equipment operating data arrive, compute the depth of each type of data in the isolation forest, take the mean, and compute the anomaly score with the anomaly scoring function; if the score exceeds the threshold 0.5, raise an early warning, since the arriving data are abnormal; if it does not exceed the threshold 0.5, wait for the next equipment operating data to arrive;
When operating data arrive, the distribution they follow should also be computed; if they follow a Poisson distribution and the buffer is not full, the data should be added to the buffer;
2-3) When the buffer is full, or the fixed time for updating the anomaly detection model has been reached, the anomaly detection model, that is, the IForest, must be updated to keep anomaly detection accurate.
5. The industrial data denoising method based on an improved IForest according to claim 1, characterized in that step 3) judges whether the detector needs to be updated; by the principle of locality, new data are more valuable than old data for upcoming data, so old data in the buffer are evicted at fixed intervals so that new data can be stored when they arrive; the specific steps are as follows:
3-1) Compute the anomaly rate according to the application scale predefined by the user; if the fixed update time has been reached or the sample buffer is full, update the detector;
3-2) Make the check according to the application scale predefined by the user; if the buffer that stores samples is full, update the detector;
Finally, return the updated anomaly detector, that is, the isolation forest IForest'.
CN201811439128.2A 2018-08-23 2018-11-29 A kind of industrial data denoising method based on improvement IForest Pending CN109543765A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810965599 2018-08-23
CN2018109655990 2018-08-23

Publications (1)

Publication Number Publication Date
CN109543765A true CN109543765A (en) 2019-03-29

Family

ID=65850765

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811439128.2A Pending CN109543765A (en) 2018-08-23 2018-11-29 A kind of industrial data denoising method based on improvement IForest

Country Status (1)

Country Link
CN (1) CN109543765A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948738A (en) * 2019-04-11 2019-06-28 合肥工业大学 Energy consumption method for detecting abnormality, the apparatus and system of coating drying room
CN110119340A (en) * 2019-05-17 2019-08-13 北京字节跳动网络技术有限公司 Method for monitoring abnormality, device, electronic equipment and storage medium
CN110149258A (en) * 2019-04-12 2019-08-20 北京航空航天大学 A kind of automobile CAN-bus network data method for detecting abnormality based on isolated forest
CN110243599A (en) * 2019-07-02 2019-09-17 西南交通大学 Multidimensional peels off train EMU axle box bearing temperature anomaly state monitoring method
CN110888850A (en) * 2019-12-04 2020-03-17 国网山东省电力公司威海供电公司 Data quality detection method based on power Internet of things platform
CN111428886A (en) * 2020-04-10 2020-07-17 青岛聚好联科技有限公司 Fault diagnosis deep learning model self-adaptive updating method and device
CN111740856A (en) * 2020-05-07 2020-10-02 北京直真科技股份有限公司 Network communication equipment alarm acquisition abnormity early warning method based on abnormity detection algorithm
CN112990246A (en) * 2019-12-17 2021-06-18 杭州海康威视数字技术股份有限公司 Method and device for establishing isolated tree model
CN113112188A (en) * 2021-05-14 2021-07-13 北京邮电大学 Power dispatching monitoring data anomaly detection method based on pre-screening dynamic integration
CN113721000A (en) * 2021-07-16 2021-11-30 国家电网有限公司大数据中心 Method and system for detecting abnormity of dissolved gas in transformer oil
CN113762507A (en) * 2021-08-24 2021-12-07 浙江中辰城市应急服务管理有限公司 Semi-supervised deep learning arc voltage anomaly detection method based on phase space reconstruction
CN116842322A (en) * 2023-07-19 2023-10-03 深圳市精微康投资发展有限公司 Electric motor operation optimization method and system based on data processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106960358A (en) * 2017-01-13 2017-07-18 重庆小富农康农业科技服务有限公司 A kind of financial fraud behavior based on rural area electronic commerce big data deep learning quantifies detecting system
CN107657288A (en) * 2017-10-26 2018-02-02 国网冀北电力有限公司 A kind of power scheduling flow data method for detecting abnormality based on isolated forest algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
丁智国,莫毓昌,杨凡: ""一种新的在线流数据异常检测方法"", 《计算机科学》 *
单林森: ""配电变压器故障分析及随机森林诊断实现"", 《科技风》 *
李星南,施展,亢中苗等: ""基于孤立森林算法和BP神经网络算法的电力运维数据清洗方法"", 《电气应用》 *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948738A (en) * 2019-04-11 2019-06-28 合肥工业大学 Energy consumption method for detecting abnormality, the apparatus and system of coating drying room
CN110149258A (en) * 2019-04-12 2019-08-20 北京航空航天大学 A kind of automobile CAN-bus network data method for detecting abnormality based on isolated forest
CN110119340A (en) * 2019-05-17 2019-08-13 北京字节跳动网络技术有限公司 Method for monitoring abnormality, device, electronic equipment and storage medium
CN110243599A (en) * 2019-07-02 2019-09-17 西南交通大学 Multidimensional peels off train EMU axle box bearing temperature anomaly state monitoring method
CN110243599B (en) * 2019-07-02 2020-05-05 西南交通大学 Method for monitoring temperature abnormal state of multi-dimensional outlier train motor train unit axle box bearing
CN110888850A (en) * 2019-12-04 2020-03-17 国网山东省电力公司威海供电公司 Data quality detection method based on power Internet of things platform
CN112990246B (en) * 2019-12-17 2022-09-09 杭州海康威视数字技术股份有限公司 Method and device for establishing isolated tree model
CN112990246A (en) * 2019-12-17 2021-06-18 杭州海康威视数字技术股份有限公司 Method and device for establishing isolated tree model
CN111428886A (en) * 2020-04-10 2020-07-17 青岛聚好联科技有限公司 Fault diagnosis deep learning model self-adaptive updating method and device
CN111428886B (en) * 2020-04-10 2023-08-04 青岛聚好联科技有限公司 Method and device for adaptively updating deep learning model of fault diagnosis
CN111740856B (en) * 2020-05-07 2023-04-28 北京直真科技股份有限公司 Network communication equipment alarm acquisition abnormity early warning method based on abnormity detection algorithm
CN111740856A (en) * 2020-05-07 2020-10-02 北京直真科技股份有限公司 Network communication equipment alarm acquisition abnormity early warning method based on abnormity detection algorithm
CN113112188B (en) * 2021-05-14 2022-05-17 北京邮电大学 Power dispatching monitoring data anomaly detection method based on pre-screening dynamic integration
CN113112188A (en) * 2021-05-14 2021-07-13 北京邮电大学 Power dispatching monitoring data anomaly detection method based on pre-screening dynamic integration
CN113721000A (en) * 2021-07-16 2021-11-30 国家电网有限公司大数据中心 Method and system for detecting abnormity of dissolved gas in transformer oil
CN113721000B (en) * 2021-07-16 2023-02-03 国家电网有限公司大数据中心 Method and system for detecting abnormity of dissolved gas in transformer oil
CN113762507A (en) * 2021-08-24 2021-12-07 浙江中辰城市应急服务管理有限公司 Semi-supervised deep learning arc voltage anomaly detection method based on phase space reconstruction
CN113762507B (en) * 2021-08-24 2023-12-29 浙江中辰城市应急服务管理有限公司 Semi-supervised deep learning arc voltage anomaly detection method based on phase space reconstruction
CN116842322A (en) * 2023-07-19 2023-10-03 深圳市精微康投资发展有限公司 Electric motor operation optimization method and system based on data processing
CN116842322B (en) * 2023-07-19 2024-02-23 深圳市精微康投资发展有限公司 Electric motor operation optimization method and system based on data processing

Similar Documents

Publication Publication Date Title
CN109543765A (en) A kind of industrial data denoising method based on improvement IForest
CN104281674B (en) It is a kind of based on the adaptive clustering scheme and system that gather coefficient
CN106250513A (en) A kind of event personalization sorting technique based on event modeling and system
CN106547740A (en) Text message processing method and device
CN112640380A (en) Apparatus and method for anomaly detection of an input stream of events
CN110110322A (en) Network new word discovery method, apparatus, electronic equipment and storage medium
CN103761249B (en) Data lead-in method and system based on Data Matching
CN110471946A (en) A kind of LOF outlier detection method and system based on grid beta pruning
CN106294815B (en) A kind of clustering method and device of URL
CN106909669A (en) The detection method and device of a kind of promotion message
CN107679135A (en) The topic detection of network-oriented text big data and tracking, device
CN107623639A (en) Data flow distribution similarity join method based on EMD distances
CN105990170A (en) Wafer yield analysis method and device
CN103150470B (en) Data flow concept drift method for visualizing under a kind of dynamic data environment
CN110457595A (en) Emergency event alarm method, device, system, electronic equipment and storage medium
CN110717092A (en) Method, system, device and storage medium for matching objects for articles
CN109471934B (en) Financial risk clue mining method based on Internet
Rani Visual analytics for comparing the impact of outliers in k-means and k-medoids algorithm
CN107704872A (en) A kind of K means based on relatively most discrete dimension segmentation cluster initial center choosing method
CN108874974A (en) Parallelization Topic Tracking method based on frequent term set
CN111079089B (en) Base station data anomaly detection method based on interval division
CN109753497A (en) Data processing empty value method, apparatus and terminal device
CN107562714A (en) A kind of statement similarity computational methods and device
CN105930462A (en) Cloud computing platform based massive data processing method
Subhadra et al. Extended ACO based document clustering with hybrid distance metric

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190329
