CN115438035A - Data exception handling method based on KPCA and mixed similarity
- Publication number
- CN115438035A (application CN202211321839.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- dimensional data
- dimensional
- low
- kpca
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Complex Calculations (AREA)
Abstract
The invention discloses a data exception handling method based on KPCA and mixed similarity, comprising the following steps. S1: a terminal generates a task and uploads it to an edge terminal. S2: the edge terminal receives the task and divides the data related to the task into high-dimensional data and low-dimensional data. S3: the high-dimensional data and the low-dimensional data are processed. S4: the edge terminal uploads the processed data to the cloud. In this way, the method mines data features more completely and handles abnormal data more accurately, which improves the quality management of the data set and promotes the safe, stable and high-quality execution of tasks at the cloud and the edge.
Description
Technical Field
The invention relates to the field of big data processing, in particular to a data exception handling method based on KPCA and mixed similarity.
Background
In recent years, traditional industrial control systems have gradually been connected to the Internet and to cloud platforms, forming industrial Internet platforms. Meanwhile, with the rapid development of the Internet of Things and 5G technology, mobile terminal devices generate massive amounts of data. If all data collected by terminal devices is transmitted over the network to the cloud, and cleaning, mining and other work is carried out there, the heavy pressure on network bandwidth causes long delays and the computing resources of the cloud computing center are wasted. It is therefore necessary to clean the data appropriately at the edge and upload only clean data to the cloud for storage and use. The prior art often covers the detection and cleaning of abnormal values in industrial data, but rarely covers the deduplication of redundant data.
Chinese invention patent (application number: 201811519395.0, publication number: CN 109635958A) discloses an intelligent power data anomaly detection method that performs dimensionality reduction on valid offline data samples and computes a time-series sample sequence. It comprises: reducing the dimensionality of the valid offline data samples with Principal Component Analysis (PCA), removing the correlation among feature dimensions above three dimensions to obtain dimension-reduced offline data samples; and serializing the dimension-reduced offline data samples to obtain a time-series sample sequence. The scheme has the following defects: most traditional industrial data is high-dimensional data with strong nonlinearity, and PCA performs only moderately on nonlinear data; information is poorly preserved after dimensionality reduction, nonlinear features are difficult to obtain, and the accuracy of the data after anomaly detection is low.
Chinese invention patent (application number: 201911423436.0, publication number: CN 111275288A) discloses an XGBoost-based multidimensional data anomaly detection method and device, comprising: acquiring and cleaning data, standardizing the cleaned data and unifying the scales of data across different dimensions; extracting features and reducing dimensionality; constructing and training an anomaly detection model, in which the dimension-reduced data is trained with XGBoost to establish a prediction model of equipment anomalies; and performing online anomaly detection, where an anomaly is reported if a given threshold is exceeded. The defect of this scheme is that it considers only the Pearson correlation coefficient: it works well on strongly correlated data sets but poorly on industrial data with strong nonlinear relationships, so redundant data is detected with insufficient accuracy and the deduplication effect is poor.
Disclosure of Invention
To solve the above technical problem, or at least partially solve it, the present disclosure provides a data exception handling method based on KPCA and mixed similarity, comprising:
S1: the terminal generates a task and uploads the task to the edge terminal;
S2: the edge terminal receives the task and divides the data related to the task into high-dimensional data and low-dimensional data according to dimension;
S3: the high-dimensional data and the low-dimensional data are processed;
S4: the edge terminal uploads the processed data to the cloud.
Further, the high-dimensional data is data with a dimension >= 3;
the low-dimensional data is data with a dimension < 3.
Further, processing the high-dimensional data and the low-dimensional data comprises:
S31: performing anomaly detection on the high-dimensional data and the low-dimensional data to obtain a detection result;
S32: cleaning the detection result to obtain a cleaned data set;
S33: judging and processing redundant data in the cleaned data set.
Further, performing anomaly detection on the high-dimensional data and the low-dimensional data to obtain a detection result comprises:
S311: performing anomaly detection on the low-dimensional data with iForest to obtain the path length and the anomaly score of each low-dimensional data item;
S312: converting the high-dimensional data into feature data with the KPCA algorithm, and performing anomaly detection on the feature data with iForest to obtain the path length and the anomaly score of each high-dimensional data item.
Further, converting the high-dimensional data into feature data with the KPCA algorithm comprises:
establishing a high-dimensional data mapping database, and recording all original high-dimensional data and the corresponding feature data in it.
Further, cleaning the detection result comprises:
S321: obtaining the path lengths and anomaly scores of the high-dimensional data and the low-dimensional data, and calculating the average path length;
S322: treating data whose average path length is in the range 0 to 0.15 and whose anomaly score is in the range 0.85 to 1 as abnormal values, and cleaning that data.
Further, the detection result is cleaned separately for the high-dimensional data and the low-dimensional data, using the methods of S31, S32 and S33.
Further, judging and processing redundant data in the cleaned data set comprises:
S331: obtaining the data whose average path lengths and anomaly scores are similar, and regarding the obtained data as redundant data; step S331 is performed in this way for both the low-dimensional data and the high-dimensional data, separately and simultaneously;
S332: analyzing the data type of the obtained data; if it is low-dimensional redundant data, going to S333, and if it is high-dimensional redundant data, going to S334;
S333: obtaining the similarity H₁ of the low-dimensional redundant data using the Pearson correlation coefficient;
S334: obtaining from the high-dimensional data mapping database the original high-dimensional data corresponding to the obtained data, and obtaining the similarity H₂ of the high-dimensional redundant data using a hybrid similarity algorithm that combines the Spearman correlation coefficient of the data with its mutual information value, where μ is the weight of the Spearman correlation coefficient;
S335: comparing H₁ or H₂ with a preset threshold δ; if H₁ > δ or H₂ > δ, redundant data exists in the data, and it is cleared.
Further, the values of μ and the preset threshold δ are set manually. μ ranges from 0 to 1 and is preferably 0.5; δ does not exceed the calculated maximum similarity and is preferably set to 90% of the maximum similarity value.
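The formulas referred to in S333 and S334 are not legible in this text. As a hedged reconstruction only, the block below writes out the standard textbook definitions of the three named quantities for two data sequences X = (x_1, ..., x_n) and Y = (y_1, ..., y_n). The symbols x_i, y_i, d_i, ρ and I(X;Y) are assumptions introduced here, and the last line, a μ-weighted combination of ρ and I(X;Y), is an assumed form of the hybrid similarity that is merely consistent with μ being described as the weight of the Spearman coefficient in the range 0 to 1.

```latex
% Pearson correlation coefficient (standard definition), used for H_1
H_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}
           {\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}

% Spearman correlation coefficient (no tied ranks), d_i = rank(x_i) - rank(y_i)
\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n\,(n^{2}-1)}

% Mutual information with natural logarithm (standard definition)
I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\ln\frac{p(x,y)}{p(x)\,p(y)}

% ASSUMED form of the hybrid similarity, not confirmed by the text
H_2 = \mu\,\rho + (1-\mu)\,I(X;Y), \qquad 0 \le \mu \le 1
```

Under this assumed form H₂ is not bounded by 1, because mutual information has no fixed upper bound, which would fit the statement that δ is chosen relative to the calculated maximum similarity rather than as an absolute value.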
Compared with the prior art, the technical scheme provided by the invention has the following advantages:
The data exception handling method based on KPCA and mixed similarity provided by the invention analyzes the tasks generated by the terminal and uploaded to the edge terminal, divides the data related to the tasks into high-dimensional data and low-dimensional data, processes both, and has the edge terminal upload the processed data to the cloud. Because the dimensionality of industrial data varies widely, the data is divided into high-dimensional and low-dimensional types; the high-dimensional data is processed with the KPCA algorithm, which reduces the dimensionality of the data set through feature extraction, so that anomaly detection is achieved for both high-dimensional and low-dimensional data. Because the nonlinear features of industrial data are difficult to mine, the method detects redundant data by combining the Pearson correlation coefficient with a hybrid similarity algorithm: for high-dimensional data, which has nonlinear features and may have dependency relationships between items, the similarity is calculated by combining the Spearman correlation coefficient with the mutual information value. The method therefore mines data features more completely, and the proposed anomaly detection and deduplication scheme is more accurate, which improves the quality management of the data set and promotes the safe, stable and high-quality execution of tasks at the cloud and the edge.
Drawings
Fig. 1 is a flowchart of the data exception handling method based on KPCA and mixed similarity provided by the invention.
Fig. 2 is a flowchart of the processing of high-dimensional and low-dimensional data in the data exception handling method based on KPCA and mixed similarity provided by the invention.
Fig. 3 is a flowchart of abnormal data cleaning in the data exception handling method based on KPCA and mixed similarity provided by the invention.
Detailed Description
The following detailed description of the preferred embodiments of the present invention, taken in conjunction with the accompanying drawings, will make the advantages and features of the present invention more comprehensible to those skilled in the art, and will thus provide a clear and concise definition of the scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
Fig. 1 is a flowchart of the data exception handling method based on KPCA and mixed similarity provided by the invention. The method comprises:
S1: the terminal generates a task and uploads the task to the edge terminal;
S2: the edge terminal receives the task and divides the data related to the task into high-dimensional data and low-dimensional data according to dimension;
S3: the high-dimensional data and the low-dimensional data are processed;
S4: the edge terminal uploads the processed data to the cloud.
Further, the high-dimensional data is data with a dimension >= 3;
the low-dimensional data is data with a dimension < 3.
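The division in S2 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each data item is a sequence of numeric fields and that its dimension is the number of fields; the function name `split_by_dimension` and the sample records are hypothetical and not part of the described method.

```python
from typing import List, Sequence, Tuple

# From the text: dimension >= 3 is high-dimensional, dimension < 3 is low-dimensional.
HIGH_DIM_THRESHOLD = 3


def split_by_dimension(
    records: Sequence[Sequence[float]],
) -> Tuple[List[Sequence[float]], List[Sequence[float]]]:
    """Partition records into (high_dimensional, low_dimensional) lists."""
    high, low = [], []
    for record in records:
        # Assumption: the dimension of a record is its number of fields.
        if len(record) >= HIGH_DIM_THRESHOLD:
            high.append(record)
        else:
            low.append(record)
    return high, low


# Hypothetical sensor readings of mixed dimension.
records = [(21.5,), (21.7, 0.4), (21.6, 0.4, 101.3), (3.0, 1.0, 2.0, 9.0)]
high, low = split_by_dimension(records)
print("high-dimensional:", high)
print("low-dimensional:", low)
```

Keeping the threshold in a single named constant makes it easy to adjust if a deployment defines the boundary differently.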
Further, referring to Fig. 2, processing the high-dimensional data and the low-dimensional data comprises:
S31: performing anomaly detection on the high-dimensional data and the low-dimensional data to obtain a detection result;
S32: cleaning the detection result to obtain a cleaned data set;
S33: judging and processing redundant data in the cleaned data set.
Further, performing anomaly detection on the high-dimensional data and the low-dimensional data to obtain a detection result comprises:
S311: performing anomaly detection on the low-dimensional data with iForest to obtain the path length and the anomaly score of each low-dimensional data item;
S312: converting the high-dimensional data into feature data with the KPCA algorithm, and then performing anomaly detection on the feature data with iForest to obtain the path length and the anomaly score of each high-dimensional data item.
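To make the iForest step in S311 and S312 concrete, the following is a small pure-Python sketch of an isolation forest that returns a mean path length and an anomaly score for every point. It follows the commonly published iForest procedure (random axis-aligned splits, depth limit, normalization by the average path length c(n)) and is not taken from the patent; all function names, the tree count, the sample size and the synthetic data are assumptions. The same routine would be applied to low-dimensional data directly and to KPCA feature data for high-dimensional inputs.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler constant


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n samples."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(point, node, depth=0):
    """Depth at which the point reaches a leaf, plus c(leaf size) for unsplit leaves."""
    if "size" in node:
        return depth + c(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def isolation_forest_scores(points, n_trees=100, sample_size=256, seed=0):
    """Return a list of (mean path length, anomaly score) tuples, one per point."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(points, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    results = []
    for p in points:
        mean_len = sum(path_length(p, t) for t in trees) / n_trees
        results.append((mean_len, 2.0 ** (-mean_len / c(psi))))
    return results


# Synthetic data: 200 points in the unit square plus one obvious outlier.
data_rng = random.Random(1)
data = [(data_rng.random(), data_rng.random()) for _ in range(200)] + [(10.0, 10.0)]
results = isolation_forest_scores(data)
print("outlier (mean path length, score):", results[-1])
```

Points that are isolated after few splits receive short path lengths and scores close to 1, which is the property the later cleaning step relies on.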
Further, the path length is calculated from the number of samples and the Euler constant.
The anomaly score is calculated from the path length expectation, using a harmonic function.
The path length expectation is the expected path length of the data over all iTrees, and the iForest algorithm outputs a value from 0 to 1.
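The formula images for the path length and the anomaly score did not survive in this text. Assuming the method uses the standard iForest definitions, which match the quantities named above (number of samples, Euler constant, harmonic function, path length expectation), they can be written as follows. The symbols c(n), H(i), γ, h(x) and s(x, n) are the conventional ones and are assumptions here, not the patent's own notation.

```latex
% Average path length for n samples, with harmonic number H(i)
c(n) = 2\,H(n-1) - \frac{2\,(n-1)}{n}, \qquad
H(i) \approx \ln(i) + \gamma, \qquad \gamma \approx 0.5772156649

% Anomaly score of data item x, where E(h(x)) is the expected path
% length of x over all iTrees
s(x, n) = 2^{-\frac{E(h(x))}{c(n)}}
```

With these definitions s(x, n) lies between 0 and 1: it approaches 1 when E(h(x)) is much smaller than c(n) and equals 0.5 when E(h(x)) equals c(n).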
Further, converting the high-dimensional data into feature data with the KPCA algorithm comprises:
establishing a high-dimensional data mapping database, and recording all original high-dimensional data and the corresponding feature data in it.
It can be understood that the feature data is obtained by reducing the dimensionality of the original high-dimensional data. In the anomaly detection of the high-dimensional and low-dimensional data, the high-dimensional data has nonlinear features, so its feature data is obtained with the KPCA algorithm, which handles such features well, and the feature data is what is processed. In the judgment and processing of redundant data in the cleaned data set, the original high-dimensional data is processed instead, to preserve the completeness of the high-dimensional information. The high-dimensional data mapping database is built to store both the original high-dimensional data and the feature data, which makes the scheme more flexible and reliable.
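The role of the high-dimensional data mapping database can be sketched as a simple keyed store that links each original record to its feature data. The class `HighDimMappingStore`, the function `reduce_and_record` and the record ids are hypothetical names introduced for illustration. The reduction function passed in below is only a placeholder that keeps the first two coordinates; it is not KPCA, and a real KPCA projection would be supplied in its place.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


class HighDimMappingStore:
    """In-memory stand-in for the high-dimensional data mapping database."""

    def __init__(self) -> None:
        self._rows: Dict[int, Tuple[Vector, Vector]] = {}

    def record(self, record_id: int, original: Sequence[float],
               feature: Sequence[float]) -> None:
        """Store the original record together with its feature data."""
        self._rows[record_id] = (tuple(original), tuple(feature))

    def original_of(self, record_id: int) -> Vector:
        """Look up the original high-dimensional record (used for deduplication)."""
        return self._rows[record_id][0]

    def feature_of(self, record_id: int) -> Vector:
        """Look up the reduced feature data (used for anomaly detection)."""
        return self._rows[record_id][1]


def reduce_and_record(
    originals: Sequence[Sequence[float]],
    reduce_fn: Callable[[Sequence[Sequence[float]]], List[Sequence[float]]],
    store: HighDimMappingStore,
) -> List[Sequence[float]]:
    """Apply a dimensionality reduction and record every (original, feature) pair."""
    features = reduce_fn(originals)
    for record_id, (original, feature) in enumerate(zip(originals, features)):
        store.record(record_id, original, feature)
    return features


def placeholder_reduce(rows):
    # Placeholder only, NOT KPCA: keep the first two coordinates of each row.
    return [tuple(row[:2]) for row in rows]


store = HighDimMappingStore()
originals = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
features = reduce_and_record(originals, placeholder_reduce, store)
print(store.feature_of(1), "->", store.original_of(1))
```

Keeping the lookup by record id means that a later stage working on feature data can always recover the untouched original record.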
Further, cleaning the detection result comprises:
S321: obtaining the path lengths and anomaly scores of the high-dimensional data and the low-dimensional data, and calculating the average path length;
S322: treating data whose average path length is in the range 0 to 0.15 and whose anomaly score is in the range 0.85 to 1 as abnormal values, and cleaning that data.
In particular, a person skilled in the art can set these ranges according to the data characteristics and actual requirements; the values provided here are not limiting.
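The rule in S322 can be expressed as a small filter. This sketch assumes the average path length has been normalized to the interval 0 to 1, which the range 0 to 0.15 suggests but the text does not state; the `Detection` record, the function names and the sample values are hypothetical, and the two ranges are parameters because the text says they may be adjusted.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    record_id: int
    avg_path_length: float  # assumed to be normalized to [0, 1]
    anomaly_score: float    # iForest score in [0, 1]


PATH_RANGE = (0.0, 0.15)   # from S322
SCORE_RANGE = (0.85, 1.0)  # from S322


def is_abnormal(d: Detection,
                path_range: Tuple[float, float] = PATH_RANGE,
                score_range: Tuple[float, float] = SCORE_RANGE) -> bool:
    """A record is abnormal only when both values fall inside their ranges."""
    return (path_range[0] <= d.avg_path_length <= path_range[1]
            and score_range[0] <= d.anomaly_score <= score_range[1])


def clean(detections: Sequence[Detection]) -> Tuple[List[Detection], List[Detection]]:
    """Split detections into (kept, removed) according to the S322 rule."""
    kept = [d for d in detections if not is_abnormal(d)]
    removed = [d for d in detections if is_abnormal(d)]
    return kept, removed


detections = [
    Detection(0, 0.10, 0.90),  # short path and high score: abnormal
    Detection(1, 0.60, 0.45),  # ordinary record
    Detection(2, 0.12, 0.50),  # short path but low score: kept
]
kept, removed = clean(detections)
print("kept:", [d.record_id for d in kept])
print("removed:", [d.record_id for d in removed])
```

Requiring both conditions makes the rule conservative: a record with only one suspicious value stays in the cleaned data set.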
Further, the detection result is cleaned separately and simultaneously for the high-dimensional data and the low-dimensional data, using the methods of S31, S32 and S33 above. The high-dimensional data is processed in groups of equal dimension: for example, if the dimensions of the high-dimensional data are N_i (i = 0, 1, …, N), the N_i-dimensional data of each dimension is collected and processed with the above method, which is not repeated here.
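Grouping the high-dimensional data so that records of the same dimension are processed together can be sketched as below. The function name `group_by_dimension` and the sample records are illustrative assumptions, and the dimension of a record is again taken to be its number of fields.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def group_by_dimension(records: Sequence[Sequence[float]]) -> Dict[int, List[Sequence[float]]]:
    """Map each dimension N_i to the list of records having that dimension."""
    groups: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for record in records:
        groups[len(record)].append(record)
    return dict(groups)


high_dimensional = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (1.0, 2.0, 3.0, 4.0)]
groups = group_by_dimension(high_dimensional)
for dimension, members in sorted(groups.items()):
    # Each group would be passed through detection, cleaning and deduplication on its own.
    print(dimension, members)
```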
Further, referring to Fig. 3, judging and processing redundant data in the cleaned data set comprises:
S331: obtaining the data whose average path lengths and anomaly scores are similar, and regarding the obtained data as redundant data; step S331 is performed in this way for both the low-dimensional data and the high-dimensional data, separately and simultaneously;
S332: analyzing the data type of the obtained data; if it is low-dimensional redundant data, going to S333, and if it is high-dimensional redundant data, going to S334;
S333: obtaining the similarity H₁ of the low-dimensional redundant data using the Pearson correlation coefficient;
S334: obtaining from the high-dimensional data mapping database the original high-dimensional data corresponding to the obtained data, and obtaining the similarity H₂ of the high-dimensional redundant data using a hybrid similarity algorithm that combines the Spearman correlation coefficient of the data with its mutual information value, where μ is the weight of the Spearman correlation coefficient. The mutual information value is computed from the joint probability of the two data items and their respective marginal probabilities, and the base of the logarithm is usually taken as e. In this scheme, mutual information measures the degree of mutual dependence between two data items: the larger the mutual information value, the stronger the dependence between them;
S335: comparing H₁ or H₂ with a preset threshold δ; if H₁ > δ or H₂ > δ, redundant data exists in the data, and it is cleared.
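The similarity measures named in S333 and S334 can be sketched in pure Python as follows. The Pearson, Spearman and mutual information functions follow their standard definitions; the equal-width binning used to estimate probabilities and the default of four bins are assumptions. The function `hybrid_similarity` combines the Spearman coefficient and the mutual information as a μ-weighted sum, which is an assumed form consistent with μ being the weight of the Spearman coefficient, not a formula confirmed by the text.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = math.sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cov / norm if norm else 0.0


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank sequences."""
    return pearson(ranks(x), ranks(y))


def mutual_information(x, y, bins=4):
    """Mutual information (natural log) after equal-width binning (binning is an assumption)."""
    def discretise(values):
        lo, hi = min(values), max(values)
        if lo == hi:
            return [0] * len(values)
        width = (hi - lo) / bins
        return [min(int((v - lo) / width), bins - 1) for v in values]

    dx, dy = discretise(x), discretise(y)
    n = len(x)
    joint, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum((count / n) * math.log((count / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), count in joint.items())


def hybrid_similarity(x, y, mu=0.5):
    """ASSUMED hybrid form: mu * Spearman + (1 - mu) * mutual information."""
    return mu * spearman(x, y) + (1.0 - mu) * mutual_information(x, y)


a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 4.0, 9.0, 16.0]   # nonlinear but monotonic in a
h1 = pearson(a, [2.0, 4.0, 6.0, 8.0])
h2 = hybrid_similarity(a, b, mu=0.5)
print("H1:", h1)
print("H2:", h2)
```

The example pair `a` and `b` is related nonlinearly, so the rank-based Spearman coefficient still reports a perfect monotonic relationship where the Pearson coefficient would report less than 1.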
Further, the values of μ and the preset threshold δ can be determined according to the situation; μ is preferably 0.5.
In particular, a person skilled in the art can set the threshold δ according to the data characteristics and actual requirements; preferably, the manually set fixed threshold is 90% of the current upper limit of the similarity. The values provided here are for reference and are not limiting.
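The comparison in S335 with a threshold derived from the maximum similarity can be sketched as follows. This is one possible reading, assuming that similarity is computed pairwise between records, that δ defaults to 90% of the largest pairwise similarity, and that the later record of a redundant pair is the one removed; none of these choices is specified by the text. The function name `remove_redundant` and the sample records are hypothetical, and the Pearson coefficient is used as the similarity only to keep the example self-contained.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = math.sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cov / norm if norm else 0.0


def remove_redundant(records, similarity=pearson, delta=None, ratio=0.9):
    """Drop the later record of every pair whose similarity exceeds delta.

    If delta is None it is set to ratio * (maximum pairwise similarity).
    Returns (kept_records, delta).
    """
    pairs = {(i, j): similarity(records[i], records[j])
             for i, j in combinations(range(len(records)), 2)}
    if not pairs:
        return list(records), delta
    if delta is None:
        delta = ratio * max(pairs.values())
    dropped = set()
    for (i, j), h in sorted(pairs.items()):
        # Skip pairs whose first record was already removed as redundant.
        if h > delta and i not in dropped:
            dropped.add(j)
    kept = [r for k, r in enumerate(records) if k not in dropped]
    return kept, delta


records = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.1), (3.0, 1.0, 2.0)]
kept, delta = remove_redundant(records)
print("delta:", delta)
print("kept:", kept)
```

One consequence of deriving δ from the maximum is that, whenever the maximum similarity is positive, the most similar pair always exceeds δ; passing a fixed `delta` may therefore be preferable on data that is not expected to contain duplicates.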
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is noted that, in this document, relational terms are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
Claims (9)
1. A data exception handling method based on KPCA and mixed similarity, characterized by comprising the following steps:
S1: the terminal generates a task and uploads the task to the edge terminal;
S2: the edge terminal receives the task and divides the data related to the task into high-dimensional data and low-dimensional data;
S3: the high-dimensional data and the low-dimensional data are processed;
S4: the edge terminal uploads the processed data to the cloud.
2. The data exception handling method based on KPCA and mixed similarity according to claim 1, wherein
the high-dimensional data is data with a dimension >= 3;
the low-dimensional data is data with a dimension < 3.
3. The data exception handling method based on KPCA and mixed similarity according to claim 1, wherein
processing the high-dimensional data and the low-dimensional data comprises:
S31: performing anomaly detection on the high-dimensional data and the low-dimensional data to obtain a detection result;
S32: cleaning the detection result to obtain a cleaned data set;
S33: judging and processing redundant data in the cleaned data set.
4. The data exception handling method based on KPCA and mixed similarity according to claim 3, wherein
performing anomaly detection on the high-dimensional data and the low-dimensional data to obtain a detection result comprises:
S311: performing anomaly detection on the low-dimensional data with iForest to obtain the path length and the anomaly score of each low-dimensional data item;
S312: converting the high-dimensional data into feature data with the KPCA algorithm, and then performing anomaly detection on the feature data with iForest to obtain the path length and the anomaly score of each high-dimensional data item.
5. The data exception handling method based on KPCA and mixed similarity according to claim 4, wherein
converting the high-dimensional data into feature data with the KPCA algorithm comprises:
establishing a high-dimensional data mapping database, and recording all original high-dimensional data and the corresponding feature data in it.
6. The data exception handling method based on KPCA and mixed similarity according to claim 5, wherein
cleaning the detection result comprises:
S321: obtaining the path lengths and anomaly scores of the high-dimensional data and the low-dimensional data, and calculating the average path length;
S322: treating data whose average path length is in the range 0 to 0.15 and whose anomaly score is in the range 0.85 to 1 as abnormal values, and cleaning that data.
7. The data exception handling method based on KPCA and mixed similarity according to any one of claims 4-6, wherein
the detection result is cleaned separately for the high-dimensional data and the low-dimensional data using the methods of S31, S32 and S33, in each case selecting data of the same dimension for processing.
8. The data exception handling method based on KPCA and mixed similarity according to claim 6, wherein
judging and processing redundant data in the cleaned data set comprises:
S331: obtaining the data whose average path lengths and anomaly scores are similar, and regarding the obtained data as redundant data; step S331 is performed in this way for both the low-dimensional data and the high-dimensional data, separately and simultaneously;
S332: analyzing the data type of the obtained data; if it is low-dimensional redundant data, going to S333, and if it is high-dimensional redundant data, going to S334;
S333: obtaining the similarity H₁ of the low-dimensional redundant data using the Pearson correlation coefficient;
S334: obtaining from the high-dimensional data mapping database the original high-dimensional data corresponding to the obtained data, and obtaining the similarity H₂ of the high-dimensional redundant data using a hybrid similarity algorithm that combines the Spearman correlation coefficient of the data with its mutual information value, where μ is the weight of the Spearman correlation coefficient.
9. The data exception handling method based on KPCA and mixed similarity according to claim 8, wherein
μ and the preset threshold δ are set manually, μ ranges from 0 to 1, and δ does not exceed the calculated maximum similarity.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211321839.6A CN115438035B (en) | 2022-10-27 | 2022-10-27 | Data exception handling method based on KPCA and mixed similarity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211321839.6A CN115438035B (en) | 2022-10-27 | 2022-10-27 | Data exception handling method based on KPCA and mixed similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115438035A (en) | 2022-12-06 |
CN115438035B (en) | 2023-04-07 |
Family
ID=84252560
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202211321839.6A | Active | CN115438035B (en) | 2022-10-27 | 2022-10-27 | Data exception handling method based on KPCA and mixed similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115438035B (en) |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140162274A1 (en) * | 2012-06-28 | 2014-06-12 | Taxon Biosciences, Inc. | Compositions and methods for identifying and comparing members of microbial communities using amplicon sequences |
CN104091337A (en) * | 2014-07-11 | 2014-10-08 | 北京工业大学 | Deformation medical image registration method based on PCA and diffeomorphism Demons |
CN106709869A (en) * | 2016-12-25 | 2017-05-24 | 北京工业大学 | Dimensionally reduction method based on deep Pearson embedment |
CN106886601A (en) * | 2017-03-02 | 2017-06-23 | 大连理工大学 | A kind of Cross-modality searching algorithm based on the study of subspace vehicle mixing |
CN109214503A (en) * | 2018-08-01 | 2019-01-15 | 华北电力大学 | Project of transmitting and converting electricity cost forecasting method based on KPCA-LA-RBM |
CN110069467A (en) * | 2019-04-16 | 2019-07-30 | 沈阳工业大学 | System peak load based on Pearson's coefficient and MapReduce parallel computation clusters extraction method |
CN111275288A (en) * | 2019-12-31 | 2020-06-12 | 华电国际电力股份有限公司十里泉发电厂 | XGboost-based multi-dimensional data anomaly detection method and device |
CN111338897A (en) * | 2020-02-24 | 2020-06-26 | 京东数字科技控股有限公司 | Identification method of abnormal node in application host, monitoring equipment and electronic equipment |
US20200293554A1 (en) * | 2018-03-15 | 2020-09-17 | Alibaba Group Holding Limited | Abnormal sample prediction |
CN111931868A (en) * | 2020-09-24 | 2020-11-13 | 常州微亿智造科技有限公司 | Time series data abnormity detection method and device |
US20210200746A1 (en) * | 2019-12-30 | 2021-07-01 | Royal Bank Of Canada | System and method for multivariate anomaly detection |
CN113420691A (en) * | 2021-06-30 | 2021-09-21 | 昆明理工大学 | Mixed domain characteristic bearing fault diagnosis method based on Pearson correlation coefficient |
CN113901993A (en) * | 2021-09-24 | 2022-01-07 | 上海海事大学 | Fault diagnosis method based on PCCs secondary feature optimization |
CN114239807A (en) * | 2021-12-17 | 2022-03-25 | 山东省计算中心(国家超级计算济南中心) | RFE-DAGMM-based high-dimensional data anomaly detection method |
WO2022110557A1 (en) * | 2020-11-25 | 2022-06-02 | 国网湖南省电力有限公司 | Method and device for diagnosing user-transformer relationship anomaly in transformer area |
CN115150744A (en) * | 2022-08-02 | 2022-10-04 | 天津城建大学 | Indoor signal interference source positioning method for large conference venue |
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140162274A1 (en) * | 2012-06-28 | 2014-06-12 | Taxon Biosciences, Inc. | Compositions and methods for identifying and comparing members of microbial communities using amplicon sequences |
CN104091337A (en) * | 2014-07-11 | 2014-10-08 | 北京工业大学 | Deformation medical image registration method based on PCA and diffeomorphism Demons |
CN106709869A (en) * | 2016-12-25 | 2017-05-24 | 北京工业大学 | Dimensionally reduction method based on deep Pearson embedment |
CN106886601A (en) * | 2017-03-02 | 2017-06-23 | 大连理工大学 | A kind of Cross-modality searching algorithm based on the study of subspace vehicle mixing |
US20200293554A1 (en) * | 2018-03-15 | 2020-09-17 | Alibaba Group Holding Limited | Abnormal sample prediction |
CN109214503A (en) * | 2018-08-01 | 2019-01-15 | 华北电力大学 | Project of transmitting and converting electricity cost forecasting method based on KPCA-LA-RBM |
CN110069467A (en) * | 2019-04-16 | 2019-07-30 | 沈阳工业大学 | System peak load based on Pearson's coefficient and MapReduce parallel computation clusters extraction method |
US20210200746A1 (en) * | 2019-12-30 | 2021-07-01 | Royal Bank Of Canada | System and method for multivariate anomaly detection |
CN111275288A (en) * | 2019-12-31 | 2020-06-12 | 华电国际电力股份有限公司十里泉发电厂 | XGboost-based multi-dimensional data anomaly detection method and device |
CN111338897A (en) * | 2020-02-24 | 2020-06-26 | 京东数字科技控股有限公司 | Identification method of abnormal node in application host, monitoring equipment and electronic equipment |
CN111931868A (en) * | 2020-09-24 | 2020-11-13 | 常州微亿智造科技有限公司 | Time series data abnormity detection method and device |
WO2022110557A1 (en) * | 2020-11-25 | 2022-06-02 | 国网湖南省电力有限公司 | Method and device for diagnosing user-transformer relationship anomaly in transformer area |
CN113420691A (en) * | 2021-06-30 | 2021-09-21 | 昆明理工大学 | Mixed domain characteristic bearing fault diagnosis method based on Pearson correlation coefficient |
CN113901993A (en) * | 2021-09-24 | 2022-01-07 | 上海海事大学 | Fault diagnosis method based on PCCs secondary feature optimization |
CN114239807A (en) * | 2021-12-17 | 2022-03-25 | 山东省计算中心(国家超级计算济南中心) | RFE-DAGMM-based high-dimensional data anomaly detection method |
CN115150744A (en) * | 2022-08-02 | 2022-10-04 | 天津城建大学 | Indoor signal interference source positioning method for large conference venue |
Non-Patent Citations (4)
Title |
---|
NING ZHANG: "Magnetic Anomaly Detection Method Based on Feature Fusion and Isolation Forest Algorithm", 《IEEE ACCESS (VOLUME: 10)》 * |
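LI WEIZHOU: "Research on supervector dimensionality reduction based on deep belief networks in speaker recognition", 《Computer Knowledge and Technology》 * |
YANG YINGHUA ET AL.: "Process monitoring and fault diagnosis based on subspace mixed similarity", 《Chinese Journal of Scientific Instrument》 * |
CHEN MAO: "Research on big data cleaning algorithms based on edge computing in the industrial Internet of Things", 《CNKI Excellent Master's Theses Full-text Database》 * |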
Also Published As
Publication number | Publication date |
---|---|
CN115438035B (en) | 2023-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107563426B (en) | Method for learning locomotive running time sequence characteristics | |
WO2023134086A1 (en) | Convolutional neural network model pruning method and apparatus, and electronic device and storage medium | |
CN111414571A (en) | Atmospheric pollutant monitoring method | |
CN112381790A (en) | Abnormal image detection method based on depth self-coding | |
CN110750412B (en) | Log abnormity detection method | |
CN116148656B (en) | Portable analog breaker fault detection method | |
CN115130519B (en) | Hull structure fault prediction method using convolutional neural network | |
CN113704201A (en) | Log anomaly detection method and device and server | |
CN115452376A (en) | Bearing fault diagnosis method based on improved lightweight deep convolution neural network | |
CN114594398A (en) | Energy storage lithium ion battery data preprocessing method | |
CN115438035B (en) | Data exception handling method based on KPCA and mixed similarity | |
CN110990383A (en) | Similarity calculation method based on industrial big data set | |
CN116881798A (en) | Conditional gracile causal analysis method based on variable selection and reverse time lag feature selection for complex systems such as weather | |
CN117675230A (en) | Knowledge-graph-based oil well data integrity identification method | |
CN112950566B (en) | Windshield damage fault detection method | |
CN114756742A (en) | Information pushing method and device and storage medium | |
CN114155410A (en) | Graph pooling, classification model training and reconstruction model training method and device | |
CN113240213A (en) | Method, device and equipment for selecting people based on neural network and tree model | |
CN110321366B (en) | Statistical quantity determining method and system based on online learning | |
CN115827982A (en) | Big data information acquisition system based on computer | |
US20240272976A1 (en) | Abnormality detection device, abnormality detection method, and abnormality detection program | |
CN113887718B (en) | Channel pruning method and device based on relative activation rate and lightweight flow characteristic extraction network model simplification method | |
CN110728615B (en) | Steganalysis method based on sequential hypothesis testing, terminal device and storage medium | |
CN113878613B (en) | Industrial robot harmonic reducer early fault detection method based on WLCTD and OMA-VMD | |
CN115204671A (en) | Big data-based annual newspaper analysis system for listed companies |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||