CN113673606A - Intelligent identification method and system for safety monitoring data abnormity - Google Patents

Publication number
CN113673606A
Authority
CN
China
Prior art keywords
data
identification
monitoring data
algorithm
abnormity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110975189.6A
Other languages
Chinese (zh)
Inventor
金鑫鑫
韦耀国
黎利兵
姜云辉
卢正超
聂鼎
刘紫娥
黄子晗
商玉洁
李辉
张春雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Institute of Water Resources and Hydropower Research
Original Assignee
China Institute of Water Resources and Hydropower Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Institute of Water Resources and Hydropower Research
Priority to CN202110975189.6A
Publication of CN113673606A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Alarm Systems (AREA)

Abstract

The invention discloses a method and a system for intelligently identifying abnormalities in safety monitoring data, in the technical field of data anomaly identification. The method comprises the following steps: acquiring monitoring data; intelligently classifying and preprocessing the monitoring data to obtain first monitoring data; establishing an abnormal data identification model that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm; and inputting the first monitoring data into the abnormal data identification model to identify data anomalies. Identifying data anomalies in this way avoids the limitation of a traditional single algorithm and improves the efficiency and accuracy of anomaly identification in monitoring data.

Description

Intelligent identification method and system for safety monitoring data abnormity
Technical Field
The invention relates to the technical field of data anomaly identification, and in particular to an intelligent identification method and system for safety monitoring data abnormity.
Background
Engineering safety monitoring is an important means of understanding the operating state of an engineering structure, and an important basis for safety state evaluation and dynamic safety monitoring. Under the influence of environmental factors, external interference, human factors, instrument stability and the like, the monitoring data acquired by current automated and manual monitoring is mixed, dirty and complex. It contains a certain amount of abnormal data of two kinds: anomalies that are gross errors, and anomalies caused by real structural abnormality. Filtering the useless gross error data out of the large volume of raw data, while retaining the important data that truly reflects structural abnormality, is therefore a key problem and a necessary step in evaluating structural safety and stability.
Current data anomaly identification methods mainly comprise traditional statistical tests based on the data sequence, such as the Leidett criterion and the Romanovsky criterion, and methods based on distance, density, deviation and the like; gross error identification methods that introduce artificial intelligence algorithms include clustering algorithms, autoregressive algorithms and the like. However, the major drawbacks of current gross error identification methods include:
(1) some algorithms depend on a particular data distribution, and their recognition performance drops sharply when the data do not follow it;
(2) different models must be established for data sets with different change patterns and distribution characteristics;
(3) the threshold index of some algorithms is difficult to determine;
(4) data quality is uneven, which increases the difficulty of gross error identification.
Therefore, an urgent problem for those skilled in the art is how to perform gross error identification on multiple kinds of monitoring data, avoid the limitation of a single algorithm, and improve the efficiency and accuracy of monitoring data anomaly identification.
Disclosure of Invention
In view of the above, the invention provides an intelligent identification method and system for safety monitoring data abnormity, which can handle gross error identification for multiple kinds of monitoring data at the same time, avoid the limitation of a single algorithm, and greatly improve the efficiency and accuracy of monitoring data anomaly identification.
To achieve this purpose, the invention adopts the following technical scheme. In one aspect, an intelligent identification method for safety monitoring data abnormity is provided, with the following specific steps:
acquiring monitoring data;
intelligently classifying and preprocessing the monitoring data to obtain first monitoring data;
establishing an abnormal data identification model, the abnormal data identification model adopting a modeling method that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm;
and inputting the first monitoring data into the abnormal data identification model for data anomaly identification.
Preferably, an autoencoder is used to extract encoding features from the monitoring data, and a K-means model is trained on the encoding features to intelligently classify the monitoring data, the classification being based on the integrity and representativeness of the monitoring data.
This technical scheme has the following beneficial technical effect: classifying the monitoring data avoids the loss of computational efficiency caused by an excessive volume of data of the same feature type.
Preferably, the monitoring data is preprocessed, the preprocessing including missing value processing and normalization processing.
This technical scheme has the following beneficial technical effect: applying uniform normalization to the raw data improves the precision of the model.
Preferably, the step of performing data anomaly identification with the abnormal data identification model comprises:
performing data anomaly identification with the isolation forest algorithm, the elliptic envelope algorithm and the KNN algorithm respectively to obtain three groups of output values;
summing the three groups of output values to obtain a decision value;
if the decision value is greater than 0, the first monitoring data is normal;
and if the decision value is less than 0, the first monitoring data contains a gross error.
This technical scheme has the following beneficial technical effect: three unsupervised algorithms suited to different data types (isolation forest, elliptic envelope and KNN) are used for data training and testing, and the sum of the results of the three algorithms is taken as the comprehensive judgment result to complete the construction of the gross error model, thereby avoiding the limitation of a traditional single algorithm.
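The summing rule above can be sketched in a few lines. This is a minimal illustration, assuming each of the three algorithms reports +1 for a normal reading and -1 for an abnormal one (the convention used by scikit-learn outlier detectors; the text itself does not state the output encoding). The function names are hypothetical.

```python
from typing import Sequence

NORMAL, GROSS_ERROR = "normal", "gross error"


def decision_value(votes: Sequence[int]) -> int:
    """Sum the per-algorithm outputs (assumed: +1 = normal, -1 = abnormal)."""
    if any(v not in (1, -1) for v in votes):
        raise ValueError("each output is assumed to be +1 or -1")
    return sum(votes)


def classify(votes: Sequence[int]) -> str:
    """Apply the rule from the text: > 0 is normal, < 0 is a gross error."""
    value = decision_value(votes)
    if value > 0:
        return NORMAL
    if value < 0:
        return GROSS_ERROR
    raise ValueError("tie: the text defines no outcome for a decision value of 0")


# Outputs ordered as (isolation forest, elliptic envelope, KNN).
print(classify((1, 1, -1)))    # two of three algorithms say normal
print(classify((-1, 1, -1)))   # two of three algorithms say abnormal
```

Under the ±1 assumption the sum of three outputs is always odd, so the undefined case of a decision value equal to 0 cannot occur and the rule behaves as a majority vote.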
In another aspect, an intelligent identification system for safety monitoring data abnormity is provided, comprising a data acquisition module, a data processing module, an identification model establishing module and a data anomaly identification module; the data acquisition module is connected to the data processing module, the data processing module is connected to the identification model establishing module, and the identification model establishing module is connected to the data anomaly identification module;
the data acquisition module is used for acquiring monitoring data;
the data processing module is used for intelligently classifying and preprocessing the monitoring data to obtain first monitoring data;
the identification model establishing module is used for establishing an abnormal data identification model, the abnormal data identification model adopting a modeling method that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm;
and the data anomaly identification module is used for inputting the first monitoring data into the abnormal data identification model to perform data anomaly identification.
According to the above technical scheme, compared with the prior art, the invention provides an intelligent identification method and system for safety monitoring data abnormity. Based on data analysis and artificial intelligence technology, the data are classified by feature selection, dimension reduction and clustering, and three unsupervised algorithms suited to different data types (isolation forest, elliptic envelope and KNN) are used for data training and testing. The results of the three algorithms are summed as the comprehensive judgment result to complete the construction of the gross error model, and gross error data are identified on the basis of that model. The method can handle gross error identification for multiple kinds of monitoring data at the same time, avoids the limitation of a traditional single algorithm, and greatly improves the efficiency and accuracy of monitoring data anomaly identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a schematic diagram of the system of the present invention;
FIG. 3 is a graph of the test results of the present invention;
FIG. 4 is a diagram of the anomaly identification results for measured seepage pressure values according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In one aspect, the embodiment of the invention discloses an intelligent identification method for safety monitoring data abnormity. As shown in FIG. 1, the specific steps are as follows:
S1, acquiring monitoring data;
S2, intelligently classifying and preprocessing the monitoring data to obtain first monitoring data;
S3, establishing an abnormal data identification model, the abnormal data identification model adopting a modeling method that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm;
S4, inputting the first monitoring data into the abnormal data identification model for data anomaly identification.
Furthermore, an autoencoder is used to extract encoding features from the monitoring data, and a K-means model is trained on the encoding features to intelligently classify the monitoring data, the classification being based on the integrity and representativeness of the monitoring data.
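As an illustration of this classification step, the sketch below trains a very small autoencoder and clusters its hidden codes with K-means. It rests on several assumptions not stated in the text: the monitoring data are cut into fixed-length windows, the autoencoder is a one-hidden-layer network built from scikit-learn's MLPRegressor (trained to reproduce its own input), and two clusters are requested. The synthetic windows, the function name and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPRegressor


def encode_and_cluster(windows: np.ndarray, code_size: int = 2,
                       n_groups: int = 2, seed: int = 0) -> np.ndarray:
    """Train a small autoencoder, then cluster its hidden codes with K-means."""
    # Autoencoder: a one-hidden-layer network trained to reproduce its input.
    autoencoder = MLPRegressor(hidden_layer_sizes=(code_size,), activation="tanh",
                               max_iter=2000, random_state=seed)
    autoencoder.fit(windows, windows)
    # Encoding features are the hidden-layer activations of the trained network.
    codes = np.tanh(windows @ autoencoder.coefs_[0] + autoencoder.intercepts_[0])
    # K-means assigns each window to one of n_groups classes.
    return KMeans(n_clusters=n_groups, n_init=10, random_state=seed).fit_predict(codes)


# Synthetic stand-in data: 30 smooth windows and 30 noisy windows of length 8.
rng = np.random.default_rng(0)
smooth = np.sin(np.linspace(0.0, 1.0, 8)) + 0.01 * rng.standard_normal((30, 8))
noisy = rng.standard_normal((30, 8))
windows = np.vstack([smooth, noisy])

labels = encode_and_cluster(windows)
print(labels)
```

How the resulting classes map onto the integrity and representativeness of the data is not specified in the text, so the cluster labels here carry no meaning beyond grouping similar windows.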
Further, to meet the needs of subsequent anomaly diagnosis and analysis, the monitoring data is preprocessed as follows:
(1) missing value processing: missing values are uniformly deleted;
(2) normalization processing: the monitoring data is uniformly normalized, which improves the precision of the model.
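The two preprocessing steps can be sketched without any third-party library. This is a minimal illustration, assuming missing readings appear as None or NaN and that normalization is a linear min-max mapping onto (-1, 1), the range quoted in the experiment section; the function names and sample readings are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def drop_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Delete missing readings (None or NaN), as in step (1)."""
    return [v for v in values if v is not None and not math.isnan(v)]


def min_max_scale(values: Sequence[float],
                  low: float = -1.0, high: float = 1.0) -> List[float]:
    """Linearly map readings onto [low, high], as in step (2)."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series has no spread; place every reading mid-range.
        return [(low + high) / 2.0 for _ in values]
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]


raw = [101.2, None, 101.4, float("nan"), 101.0, 101.8]
clean = drop_missing(raw)
scaled = min_max_scale(clean)
print(clean)
print(scaled)
```

In practice the minimum and maximum would be taken from the training data and reused for the data to be tested, so that both are on the same scale.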
The abnormal data identification model is built by integrating several methods. To keep the model widely applicable, unsupervised algorithms are used, and the following three are selected:
(1) IsolationForest (isolation forest, outlier detection);
(2) EllipticEnvelope (elliptic model fitting, outlier detection);
(3) KNN (k-nearest neighbors algorithm, novelty detection).
Data anomaly identification with the abnormal data identification model includes the following steps:
performing data anomaly identification with the isolation forest algorithm, the elliptic envelope algorithm and the KNN algorithm respectively to obtain three groups of output values;
summing the three groups of output values to obtain a decision value;
if the decision value is greater than 0, the first monitoring data is normal;
if the decision value is less than 0, the first monitoring data contains a gross error.
In another aspect, an intelligent identification system for safety monitoring data abnormity is provided. As shown in FIG. 2, it comprises a data acquisition module, a data processing module, an identification model establishing module and a data anomaly identification module; the data acquisition module is connected to the data processing module, the data processing module is connected to the identification model establishing module, and the identification model establishing module is connected to the data anomaly identification module;
the data acquisition module is used for acquiring monitoring data;
the data processing module is used for intelligently classifying and preprocessing the monitoring data to obtain first monitoring data;
the identification model establishing module is used for establishing an abnormal data identification model, the abnormal data identification model adopting a modeling method that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm;
and the data anomaly identification module is used for inputting the first monitoring data into the abnormal data identification model to perform data anomaly identification.
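The chain of four connected modules can be pictured as a small pipeline object. The sketch below only illustrates how the modules hand data to one another; the class name, field names and the toy median-based model are hypothetical placeholders and are not the combined model described in the text.

```python
import statistics
from dataclasses import dataclass
from typing import Callable, List, Sequence

Readings = Sequence[float]
Model = Callable[[float], int]  # assumed to return +1 (normal) or -1 (abnormal)


@dataclass
class IdentificationSystem:
    acquire: Callable[[], Readings]            # data acquisition module
    process: Callable[[Readings], Readings]    # data processing module
    build_model: Callable[[Readings], Model]   # identification model establishing module

    def identify(self) -> List[int]:
        """Data anomaly identification module: run the chain end to end."""
        first_data = self.process(self.acquire())
        model = self.build_model(first_data)
        return [model(x) for x in first_data]


def toy_model(data: Readings) -> Model:
    """Placeholder model: flag readings more than 1.0 away from the median."""
    center = statistics.median(data)
    return lambda x: 1 if abs(x - center) <= 1.0 else -1


system = IdentificationSystem(
    acquire=lambda: [1.0, 1.1, 0.9, 9.0],
    process=lambda readings: list(readings),  # stand-in for classify and preprocess
    build_model=toy_model,
)
result = system.identify()
print(result)
```

Keeping each module behind a plain callable mirrors the one-way connections in FIG. 2 and lets any single stage be replaced without touching the others.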
Experiments were carried out in this example to verify the effectiveness of the method of the invention. Gross error identification for seepage pressure monitoring data from the South-to-North Water Diversion Project proceeds as follows:
(1) Data reading: read the required training data, namely the seepage pressure water level monitoring data for the whole of 2020, and the data to be tested, namely the data for the first half of 2021.
(2) Data normalization: normalize the value range to (-1, 1) with the MinMaxScaler function, so that data anomalies can be captured more effectively.
(3) Model training: call the three algorithms (isolation forest, elliptic model fitting and k-nearest neighbors) in turn to train the models, obtaining the model and model parameters with the best recognition accuracy. For example, in the elliptic envelope algorithm, the contamination is set to 0.005.
(4) Gross error judgment: apply the trained models to the data to be tested and output the result. For example, in the KNN algorithm, the call statements are
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=3, n_jobs=-1)
knn.fit(train_X, train_y)  # select historical data as the training set
preds = knn.predict(test_X)
The trained model is then used to predict gross errors in the new data set, where preds is the gross error identification result. The test results are shown in FIG. 3 and FIG. 4.
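The four experimental steps can be combined into one hedged end-to-end sketch using scikit-learn. Several points are assumptions rather than statements from the text: the value 0.005 quoted for the elliptic model is taken to be scikit-learn's contamination parameter and is reused for all three detectors; the KNN stage is replaced by an unsupervised stand-in (mean distance to the k nearest training points, thresholded at a training quantile) because KNeighborsClassifier needs labels; and the data are synthetic, not the 2020 and 2021 project readings.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import MinMaxScaler


def detect_gross_errors(train: np.ndarray, test: np.ndarray,
                        contamination: float = 0.005, k: int = 3) -> np.ndarray:
    """Return the summed decision value (< 0 means gross error) per test reading."""
    # Step (2): fit the scaler on training data only, map onto (-1, 1).
    scaler = MinMaxScaler(feature_range=(-1, 1)).fit(train)
    x_train, x_test = scaler.transform(train), scaler.transform(test)

    # Step (3): train the two scikit-learn outlier detectors.
    forest = IsolationForest(contamination=contamination, random_state=0).fit(x_train)
    envelope = EllipticEnvelope(contamination=contamination, random_state=0).fit(x_train)

    # Unsupervised KNN stand-in: mean distance to the k nearest training points.
    knn = NearestNeighbors(n_neighbors=k).fit(x_train)
    train_dist = knn.kneighbors()[0].mean(axis=1)  # each point excludes itself
    limit = np.quantile(train_dist, 1.0 - contamination)
    test_dist = knn.kneighbors(x_test)[0].mean(axis=1)
    knn_votes = np.where(test_dist > limit, -1, 1)

    # Step (4): each detector outputs +1 or -1; the sum is the decision value.
    return forest.predict(x_test) + envelope.predict(x_test) + knn_votes


# Synthetic stand-in: one year of readings near 100, then two test readings.
rng = np.random.default_rng(0)
train = 100.0 + 0.5 * rng.standard_normal((366, 1))
test = np.array([[100.0], [130.0]])  # a typical value and an obvious spike

decision = detect_gross_errors(train, test)
print(decision)
```

Because each detector votes +1 or -1, a negative sum means at least two of the three flagged the reading, which matches the summing rule described in the method.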
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. An intelligent identification method for safety monitoring data abnormity, characterized by comprising the following specific steps:
acquiring monitoring data;
intelligently classifying and preprocessing the monitoring data to obtain first monitoring data;
establishing an abnormal data identification model, the abnormal data identification model adopting a modeling method that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm;
and inputting the first monitoring data into the abnormal data identification model for data anomaly identification.
2. The intelligent identification method for safety monitoring data abnormity according to claim 1, wherein an autoencoder is used to extract encoding features from the monitoring data, and a K-means model is trained on the encoding features to intelligently classify the monitoring data according to the integrity and representativeness of the monitoring data.
3. The intelligent identification method for safety monitoring data abnormity according to claim 1, wherein the monitoring data is preprocessed, and the preprocessing comprises missing value processing and normalization processing.
4. The intelligent identification method for safety monitoring data abnormity according to claim 1, wherein the step of performing data anomaly identification with the abnormal data identification model comprises:
performing data anomaly identification with the isolation forest algorithm, the elliptic envelope algorithm and the KNN algorithm respectively to obtain three groups of output values;
summing the three groups of output values to obtain a decision value;
if the decision value is greater than 0, the first monitoring data is normal;
and if the decision value is less than 0, the first monitoring data contains a gross error.
5. An intelligent identification system for safety monitoring data abnormity, characterized by comprising a data acquisition module, a data processing module, an identification model establishing module and a data anomaly identification module; the data acquisition module is connected to the data processing module, the data processing module is connected to the identification model establishing module, and the identification model establishing module is connected to the data anomaly identification module;
the data acquisition module is used for acquiring monitoring data;
the data processing module is used for intelligently classifying and preprocessing the monitoring data to obtain first monitoring data;
the identification model establishing module is used for establishing an abnormal data identification model, the abnormal data identification model adopting a modeling method that combines an isolation forest algorithm, an elliptic envelope algorithm and a KNN algorithm;
and the data anomaly identification module is used for inputting the first monitoring data into the abnormal data identification model to perform data anomaly identification.
CN202110975189.6A 2021-08-24 2021-08-24 Intelligent identification method and system for safety monitoring data abnormity Pending CN113673606A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110975189.6A CN113673606A (en) 2021-08-24 2021-08-24 Intelligent identification method and system for safety monitoring data abnormity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110975189.6A CN113673606A (en) 2021-08-24 2021-08-24 Intelligent identification method and system for safety monitoring data abnormity

Publications (1)

Publication Number Publication Date
CN113673606A true CN113673606A (en) 2021-11-19

Family

ID=78545635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110975189.6A Pending CN113673606A (en) 2021-08-24 2021-08-24 Intelligent identification method and system for safety monitoring data abnormity

Country Status (1)

Country Link
CN (1) CN113673606A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160342903A1 (en) * 2015-05-21 2016-11-24 Software Ag Usa, Inc. Systems and/or methods for dynamic anomaly detection in machine sensor data
CN109032829A (en) * 2018-07-23 2018-12-18 腾讯科技(深圳)有限公司 Data exception detection method, device, computer equipment and storage medium
CN111709465A (en) * 2020-06-04 2020-09-25 中国电建集团华东勘测设计研究院有限公司 Intelligent identification method for rough difference of dam safety monitoring data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谭京京: "《基于Hadoop的桥梁监测数据孤立点挖掘研究》", 《中国优秀博硕士学位论文全文数据库(硕士)工程科技Ⅱ辑》 *

Similar Documents

Publication Publication Date Title
US8868985B2 (en) Supervised fault learning using rule-generated samples for machine condition monitoring
CN110596506A (en) Converter fault diagnosis method based on time convolution network
KR102291964B1 (en) Method for Fault Detection and Fault Diagnosis in Semiconductor Manufacturing Process
CN110389269A (en) Low-voltage platform area topological relation recognition methods and its device based on electric current Optimized Matching
CN114444620B (en) Indicator diagram fault diagnosis method based on generating type antagonistic neural network
CN113452018B (en) Method for identifying standby shortage risk scene of power system
CN116552306B (en) Monitoring system and method for direct current pile
CN115409131A (en) Production line abnormity detection method based on SPC process control system
CN111338972A (en) Machine learning-based software defect and complexity incidence relation analysis method
CN110879377A (en) Metering device fault tracing method based on deep belief network
CN109670549A (en) The data screening method, apparatus and computer equipment of fired power generating unit
CN116075733A (en) Battery management system for classifying battery modules
CN113112188B (en) Power dispatching monitoring data anomaly detection method based on pre-screening dynamic integration
CN114330486A (en) Power system bad data identification method based on improved Wasserstein GAN
CN113283546A (en) Furnace condition abnormity alarm method and system of heating furnace integrity management centralized control device
CN113673606A (en) Intelligent identification method and system for safety monitoring data abnormity
CN113485863B (en) Method for generating heterogeneous imbalance fault samples based on improved generation of countermeasure network
CN113705405B (en) Nuclear pipeline fault diagnosis method
CN115184054A (en) Mechanical equipment semi-supervised fault detection and analysis method, device, terminal and medium
CN115081514A (en) Industrial equipment fault identification method under data imbalance condition
CN114358160A (en) Data anomaly detection method in power system
CN113204894A (en) Construction method and application of electric energy metering abnormity diagnosis model
CN114964776A (en) Wheel set bearing fault diagnosis method based on MSE and PSO-SVM
CN112733878A (en) Transformer fault diagnosis method based on kmeans-SVM algorithm
CN112560674B (en) Method and system for detecting sound signal quality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211119
