CN112036075A - Abnormal data judgment method based on environmental monitoring data association relation - Google Patents

Info

Publication number
CN112036075A
Authority
CN
China
Prior art keywords
data
model
hidden layer
abnormal
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010801821.0A
Other languages
Chinese (zh)
Inventor
孙康
尤洋
郭月
郑皓皓
秦少立
汪太明
孟双双
张霞
杨子成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suncere Information Technology Co ltd
CHINA NATIONAL ENVIRONMENTAL MONITORING CENTRE
Original Assignee
Suncere Information Technology Co ltd
CHINA NATIONAL ENVIRONMENTAL MONITORING CENTRE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suncere Information Technology Co ltd, CHINA NATIONAL ENVIRONMENTAL MONITORING CENTRE filed Critical Suncere Information Technology Co ltd
Priority to CN202010801821.0A priority Critical patent/CN112036075A/en
Publication of CN112036075A publication Critical patent/CN112036075A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an abnormal data judgment method based on an environmental monitoring data association relation, which mainly comprises the following steps: first, the monitoring data are divided into training data, verification data and test data; a model is constructed using the training data, and the optimal parameters of the model are selected using the verification data according to the mean absolute error (MAE); after the model has been built, debugged and tested with the test data, it is embedded into an environmental monitoring platform. On the monitoring platform, a predicted value for the next moment is produced from the real-time monitoring data and the model; the absolute value bia of the difference between the predicted value and the true value is calculated and compared with MAE ± 30% of the true value to judge whether the measured value is abnormal. The method fully considers the influence of meteorological conditions on the monitoring data and the temporal continuity and variation characteristics of the monitoring data. It solves the problem that multi-source monitoring data lack automatic quality control means, realizes automatic and intelligent screening and judgment of suspicious data, guarantees the quality of the data, and provides strong support for later data use and environmental forecasting and early warning.

Description

Abnormal data judgment method based on environmental monitoring data association relation
Technical Field
The invention relates to the technical field of data quality control for real-time environmental monitoring, and is mainly used for automatically judging abnormal values in real-time monitoring data of particulate matter and gaseous pollutants.
Background
For quality control of atmospheric environmental monitoring data, most data screening methods currently in use are manual: abnormal fluctuations, the degree of outlying and the like of each monitoring index are judged by drawing daily-average and monthly-average graphs. This approach consumes a large amount of human resources, and manual review often misses problems when faced with massive monitoring data. Because the concentration indices output by environmental monitoring instruments are generally reported per minute or per hour, manual data auditing lags behind the data, whereas an automatic auditing mechanism can control data quality in real time.
To address the lack of automatic quality control means for atmospheric monitoring data, an algorithm is designed according to the technical scheme used for data monitoring and review at the environmental monitoring master station, realizing automatic and intelligent quality control of atmospheric environmental monitoring data. This solves the problem that multi-source monitoring data lack automatic quality control means, brings the quality control of atmospheric monitoring equipment under a single methodological system, and promotes the development of remote automatic quality control technology for monitoring equipment.
Disclosure of Invention
The invention aims to provide an abnormal data judgment method based on an environmental monitoring data association relation, so as to solve problems such as the lack of automatic quality control means for multi-source monitoring data.
An abnormal data judgment method based on an environmental monitoring data association relation comprises the following steps:
S1, preprocessing the historical data and the environmental monitoring data to be analyzed: missing values and abnormal values in the historical data and in the environmental monitoring data to be analyzed are identified by the data acquisition software and replaced;
S2, dividing the data into training data and verification data, and converting them into the sequence data required by the model; the training set and the verification set both contain normal data and data manually marked as abnormal; the causes of abnormality include sudden rises or drops in the data, absence of day-night variation, persistently low values and the like, and the judgment of abnormal data is related to the continuity of the preceding and following monitoring data; the ratio of training set to verification set data may be 7:3;
S3, constructing a model using the training data, and selecting the optimal parameters of the model using the verification data according to the mean absolute error (MAE); after the model has been built and debugged, it is embedded into an environmental monitoring platform, and the environmental monitoring data to be analyzed at time t-1, $c_{t-1,true}$, are used as input data to continuously obtain the predicted value at time t, $c_{t,pre}$;
S4, comparing the predicted value $c_{t,pre}$ with the environmental monitoring data to be analyzed at time t, $c_{t,true}$: the absolute value bia of their difference is compared with the empirical error MAE ± 30% of the true value to judge abnormality, and data exceeding this range are marked as abnormal data; because the MAE changes with the input data during this process, the threshold is a dynamic threshold;
where $bia = |c_{t,pre} - c_{t,true}|$.
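The judgment in step S4 can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the phrase "MAE ± 30% of the true value" is read here as the threshold MAE + 0.30 × |c_true|, which is one possible reading of the text. The function name `judge_abnormal` and the `tolerance` parameter are hypothetical and do not come from the patent.

```python
def judge_abnormal(c_pre, c_true, mae, tolerance=0.30):
    """Return (bia, threshold, is_abnormal) for a single observation.

    Assumption: the dynamic threshold is MAE + tolerance * |c_true|.
    The patent text does not spell out the exact combination.
    """
    bia = abs(c_pre - c_true)                  # bia = |c_pre - c_true|
    threshold = mae + tolerance * abs(c_true)  # changes with every measured value
    return bia, threshold, bia > threshold
```

Because the threshold depends on the measured value itself, the same deviation can be acceptable at high concentrations and abnormal at low ones.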
Preferably, the method for replacing the missing value and the abnormal value in step S1 includes:
S11, the interpolation function of linear interpolation is a first-order polynomial. First, assume that the values of the known function $y = f(x)$ at $(n+1)$ mutually distinct points $x_i$ $(i = 0, 1, 2, \ldots, n)$ in the interval $[a, b]$ are $y_i$, and solve for a polynomial
$$P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
that satisfies
$$P_n(x_i) = y_i, \quad i = 0, 1, 2, \ldots, n.$$
For the first-order case, from analytic geometry:
$$y = y_0 + \frac{y_1 - y_0}{x_1 - x_0}\,(x - x_0)$$
where $x_0$, $x_1$, $y_0$, $y_1$ are known data;
$x$ is any point between $x_0$ and $x_1$;
$y$ is the interpolated value corresponding to $x$;
S12, linear interpolation is carried out according to this formula using the data at the 2 moments immediately before and after the missing or abnormal value.
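As an illustration of S11 and S12, the following minimal sketch fills gaps in an hourly series by two-point linear interpolation. It assumes that missing or abnormal readings have already been set to `None` and that the time index is the list position; the name `fill_gaps` is hypothetical. Gaps at the very start or end of the series have no neighbour on one side and are left unfilled.

```python
def fill_gaps(values):
    """Replace None entries using y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)."""
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i - 1                      # last valid index before the gap
        end = i
        while end < n and filled[end] is None:
            end += 1                       # first valid index after the gap
        if start >= 0 and end < n:
            x0, y0 = start, filled[start]
            x1, y1 = end, filled[end]
            for x in range(i, end):
                filled[x] = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        i = end
    return filled
```

For example, `fill_gaps([10.0, None, 14.0])` places the midpoint of the two neighbours in the gap.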
Preferably, the environmental monitoring data include five meteorological parameters (air pressure, temperature, humidity, wind direction and wind speed) and pollutant concentrations.
Preferably, the historical data and the environmental monitoring data to be analyzed in step S1 are grouped according to seasons and regions, and a two-dimensional table corresponding to each pollutant is respectively constructed
Figure BDA0002627671220000024
The existing monitoring data are hourly. Because environmental monitoring data have strong regional and seasonal characteristics, the national data are divided by region (northeast, northwest, north China, south China and south China) and by season (spring, summer, autumn and winter). Abnormal values and missing values identified by the data acquisition software in the hourly monitoring data are automatically replaced by linear interpolation, and finally a two-dimensional table is constructed for each pollutant in each region and season and used to build the respective model.
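The grouping described above might be sketched as follows. The record layout (keys `time`, `region`, `pollutant`, `conc`, `meteo`) and the function name `build_tables` are hypothetical, and the month-to-season mapping is an assumption, since the patent does not state how seasons are delimited.

```python
from collections import defaultdict
from datetime import datetime

# Assumption: meteorological seasons of the northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

def build_tables(records):
    """Group hourly records into one table per (region, season, pollutant)."""
    tables = defaultdict(list)
    for rec in records:
        season = SEASON_BY_MONTH[rec["time"].month]
        key = (rec["region"], season, rec["pollutant"])
        # One row: time, concentration, then the five meteorological parameters.
        tables[key].append((rec["time"], rec["conc"], *rec["meteo"]))
    for rows in tables.values():
        rows.sort(key=lambda row: row[0])  # keep every table in time order
    return dict(tables)
```

Each resulting table would then feed a separate model, one per pollutant, region and season.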
Preferably, constructing the model with the training data in step S3 comprises the following steps: in the modeling process, the concentration of a given pollutant and the meteorological conditions at time t
Figure BDA0002627671220000031
are taken as input, and the concentration of that pollutant and the meteorological conditions at time t+1
Figure BDA0002627671220000032
are taken as output; by learning from the training set data
Figure BDA0002627671220000033
the pollutant concentration and meteorological conditions at time t+1 are obtained;
when the model is constructed, the data pass through an input layer, a hidden layer and an output layer; the hidden layer can selectively retain the information of times $t_1$, $t_2$, ..., $t$ and use it as input information acting on time t+1; the calculation steps and methods for each unit are as follows:
an input gate:
Figure BDA0002627671220000034
forget gate:
Figure BDA0002627671220000035
an output gate:
Figure BDA0002627671220000036
hidden layer candidate memory cell value at current time:
Figure BDA0002627671220000037
hidden layer memory cell state value at current time:
Figure BDA0002627671220000038
hidden layer output value at current moment:
Figure BDA0002627671220000039
$\iota$, $\phi$ and $\omega$ denote the input gate, the forget gate and the output gate respectively; h is the output of the hidden layer; c is the value of the hidden layer memory cell; $\theta$ and $\sigma$ denote the nonlinear activation functions, where $\theta$ is generally the tanh function and $\sigma$ is generally the logistic sigmoid function;
Figure BDA00026276712200000310
representing an input-input gate weight matrix,
Figure BDA00026276712200000311
representing the hidden layer unit-input gate weight matrix at the previous time,
Figure BDA00026276712200000312
representing the input gate-hidden layer memory cell weight matrix,
Figure BDA00026276712200000313
is the weight matrix of the input layer-forgetting gate,
Figure BDA00026276712200000314
is a weight matrix of a previous time hidden layer unit-forgetting gate,
Figure BDA00026276712200000315
is a weight matrix of a hidden layer memory unit-forgetting gate,
Figure BDA00026276712200000316
is an input layer-output gate weight matrix,
Figure BDA00026276712200000317
is the weight matrix of the hidden layer unit-output gate at the last moment,
Figure BDA00026276712200000318
is a weight matrix of output gates of hidden layer memory units,
Figure BDA00026276712200000319
is a weight matrix of the input layer-hidden layer memory cell,
Figure BDA00026276712200000320
is the weight matrix from the hidden layer unit to the hidden layer memory unit at the previous time,
Figure BDA00026276712200000321
are the biases of the input gate, the forget gate, the output gate and the hidden layer memory cell, respectively.
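Since the gate equations survive only as image placeholders, the following LaTeX block gives one common LSTM formulation with memory-cell (peephole) connections that is consistent with the weight matrices and biases listed above. It is a sketch, not a transcription of the original figures: the symbol names ($W_{x\iota}$, $W_{h\iota}$, $W_{c\iota}$, $b_\iota$ and so on) are assumptions, as is the use of $c_{t-1}$ in the output gate (many formulations use $c_t$ there).

```latex
\begin{aligned}
\iota_t     &= \sigma\!\left(W_{x\iota}\,x_t + W_{h\iota}\,h_{t-1} + W_{c\iota}\,c_{t-1} + b_\iota\right) \\
\phi_t      &= \sigma\!\left(W_{x\phi}\,x_t + W_{h\phi}\,h_{t-1} + W_{c\phi}\,c_{t-1} + b_\phi\right) \\
\omega_t    &= \sigma\!\left(W_{x\omega}\,x_t + W_{h\omega}\,h_{t-1} + W_{c\omega}\,c_{t-1} + b_\omega\right) \\
\tilde{c}_t &= \theta\!\left(W_{xc}\,x_t + W_{hc}\,h_{t-1} + b_c\right) \\
c_t         &= \phi_t \odot c_{t-1} + \iota_t \odot \tilde{c}_t \\
h_t         &= \omega_t \odot \theta\!\left(c_t\right)
\end{aligned}
```

Here $x_t$ is the input vector at time t and $\odot$ is the element-wise product; the forget gate $\phi_t$ controls how much of the earlier memory is retained, which corresponds to the selective retention of earlier time information described for the hidden layer.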
Considering the strong temporal correlation among the data in the samples, an RNN-LSTM model is selected from among several common machine learning algorithms; by using the temporal information of the input sequence, this model can improve the prediction accuracy for time series data. Compared with an ordinary multilayer BP neural network, lateral connections are added among the units of the hidden layer, so that the output of a neural unit at the previous time step can serve, through a weight matrix, as input to the current neural unit, which gives the neural network a memory function.
The optimized training set data are used as model input to complete model construction. The method for selecting the optimal parameters of the model using the verification data according to the mean absolute error (MAE) in step S3 is as follows: the verification data are used as model input, the model predicted value $c_{pre}$ is compared with the true value $c_{true}$, and the mean absolute error (MAE) is used for evaluation; the model parameters are optimal when the MAE is minimal;
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|c_{pre,i} - c_{true,i}\right|$$
where n is the number of samples, $c_{true}$ is the true value and $c_{pre}$ is the predicted value.
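The MAE-based selection can be illustrated with a short sketch. It assumes that each candidate parameter setting has already produced predictions on the verification data; the names `mean_absolute_error` and `select_best`, and the candidate labels used in the example, are hypothetical.

```python
def mean_absolute_error(c_true, c_pre):
    """MAE = (1/n) * sum(|c_pre_i - c_true_i|)."""
    if len(c_true) != len(c_pre) or not c_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(c_true, c_pre)) / len(c_true)

def select_best(candidates, c_true):
    """Return the label of the candidate with the smallest verification MAE.

    candidates maps a label (e.g. a hidden-node count) to that candidate's
    predictions on the verification data.
    """
    return min(candidates, key=lambda k: mean_absolute_error(c_true, candidates[k]))
```

For instance, with true values `[1, 2, 3]`, a candidate predicting `[1, 2, 4]` (MAE 1/3) would be preferred over one predicting `[2, 2, 5]` (MAE 1).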
Compared with existing abnormal data detection means, the method uses manually audited historical data as training data to construct a neural network model, and then selects the optimal parameters of the model using verification data according to the MAE. After the model has been built and debugged, it is embedded into the environmental monitoring platform, where it gives a predicted value for the next moment from the real-time monitoring data, and this prediction is used to judge whether the monitored value at the next moment is abnormal. The invention applies machine learning and data mining methods to the field of environmental monitoring, combining computer data science with environmental monitoring, and establishes an analysis system by learning from monitoring data that are large in scale and whose internal rules are highly complex. When the model is constructed, the influence of meteorological conditions on the monitoring data and the temporal continuity and variation characteristics of the monitoring data are fully considered, and the test results show that the model predictions fit the actual monitored values well. The method thus solves the problem that multi-source monitoring data lack automatic quality control means, realizes automatic and intelligent screening and judgment of suspicious data, brings the quality control of atmospheric monitoring equipment under a single methodological system, guarantees the quality of the monitoring data, promotes the development of remote automatic quality control technology for monitoring equipment, and provides strong support for later data use and environmental forecasting and early warning.
Drawings
FIG. 1 is a flow chart of an abnormal data determination method based on an environmental monitoring data association relationship according to the present invention;
FIG. 2 is a schematic diagram of RNN-LSTM model construction;
FIG. 3 is a diagram of the prediction effect.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
Example 1
This embodiment detects abnormalities in the November monitoring data based on the atmospheric pollutant monitoring data of September and October 2019 (see Table 1 for part of the pollutant data and meteorological data).
The model execution steps are as follows:
The first step: the atmospheric pollutant data and the corresponding meteorological data are preprocessed. Missing values and abnormal values (caused, for example, by insufficient sampling time) that are automatically identified by the software are replaced by interpolation, and the data manually audited as abnormal are marked; see Table 2 for details.
The automatic replacement of missing values and abnormal values comprises:
S11, first, assume that the values of the known function $y = f(x)$ at $(n+1)$ mutually distinct points $x_i$ $(i = 0, 1, 2, \ldots, n)$ in the interval $[a, b]$ are $y_i$, and solve for a polynomial
$$P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
that satisfies
$$P_n(x_i) = y_i, \quad i = 0, 1, 2, \ldots, n.$$
For the first-order case, from analytic geometry:
$$y = y_0 + \frac{y_1 - y_0}{x_1 - x_0}\,(x - x_0)$$
where $x_0$, $x_1$, $y_0$, $y_1$ are known data;
$x$ is any point between $x_0$ and $x_1$;
$y$ is the interpolated value corresponding to $x$.
The second step: the September data are taken as training data, the October data as verification data and the November data as test data, and the training and verification data are converted into the sequence data format required by the model.
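The split and the conversion to sequence data might look like the sketch below. It assumes each row is a `(timestamp, features)` pair whose feature vector holds the pollutant concentration first, followed by the five meteorological parameters, and it pairs the features at time t with the concentration at time t+1, matching the single output node of this example. The function names and default month tuples are hypothetical.

```python
from datetime import datetime

def split_by_month(rows, train=(9,), val=(10,), test=(11,)):
    """Split (timestamp, features) rows chronologically by calendar month."""
    parts = {"train": [], "val": [], "test": []}
    for ts, features in sorted(rows, key=lambda r: r[0]):
        if ts.month in train:
            parts["train"].append(features)
        elif ts.month in val:
            parts["val"].append(features)
        elif ts.month in test:
            parts["test"].append(features)
    return parts

def to_supervised(features, target_index=0):
    """Pair the feature vector at time t with the target value at time t+1."""
    return [(features[t], features[t + 1][target_index])
            for t in range(len(features) - 1)]
```

A chronological split rather than a random one keeps the time order intact, which matters because the judgment relies on the continuity of consecutive readings.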
The third step: a neural network based on RNN-LSTM is constructed and the training data are input into the model; the model automatically adjusts the network parameters (including the weights and thresholds of the network nodes) according to the prediction error on the training data until the training process is finished. A standard 3-layer network is used for training: 1 input layer (6 input nodes) + 1 hidden layer + 1 output layer (1 output node). Meanwhile, the model hyper-parameters (including the number of hidden layer nodes, the learning rate and the like) are adjusted according to the performance on the verification data; the number of hidden layer nodes is 500 for the PM10 model, 300 for the PM2.5 model, 200 for the CO model, and 100 each for the NO2, O3 and SO2 models. FIG. 2 shows the model construction flow.
The model formula is as follows:
an input gate:
Figure BDA0002627671220000061
forget gate:
Figure BDA0002627671220000062
an output gate:
Figure BDA0002627671220000063
hidden layer candidate memory cell value at current time:
Figure BDA0002627671220000064
hidden layer memory cell state value at current time:
Figure BDA0002627671220000065
hidden layer output value at current moment:
Figure BDA0002627671220000066
The fourth step: the November test data are input into the trained model and the predicted data are compared with the measured values; the effect for part of the test data is shown in FIG. 3.
The mean absolute error is used for evaluation: about 0.08 for CO, 6.85 for NO2, 10.45 for O3, 7.80 for PM10, 4.97 for PM2.5 and 3.38 for SO2. The magnitude of the MAE depends on the scale of the test data; overall, the test results on the November data are good.
The fifth step: data abnormality is judged according to the absolute value of the difference between the predicted value and the measured value; when this value exceeds the model prediction error MAE ± 30% of the measured value, the model judges the data to be abnormal. Part of the predicted values, bia values and MAE values output by the model are shown in Table 3, and part of the abnormality judgment results are shown in Table 4.
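A worked example may help make the fifth step concrete. It uses the NO2 MAE of about 6.85 reported above; the measured value 40 and predicted value 62 are hypothetical and are not taken from Table 3, and the threshold is read as MAE + 30% of the measured value, which is one possible interpretation of the text.

```latex
\begin{aligned}
bia &= \left|c_{pre} - c_{true}\right| = \left|62 - 40\right| = 22 \\
\text{threshold} &= MAE + 0.30 \times c_{true} = 6.85 + 0.30 \times 40 = 18.85 \\
bia &= 22 > 18.85 \;\Rightarrow\; \text{the measured value is marked as abnormal}
\end{aligned}
```

With the same measured value, a prediction of 50 would give bia = 10, which is below 18.85, so the reading would be accepted as normal.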
TABLE 1 Part of the pollutant data and meteorological data
Figure BDA0002627671220000067
Figure BDA0002627671220000071
Figure BDA0002627671220000081
Note: NA denotes missing data; RM denotes data manually audited as abnormal
TABLE 2 Data after preprocessing
Figure BDA0002627671220000082
Figure BDA0002627671220000091
Figure BDA0002627671220000101
Table 2 (continued)
Time SO2_MARK NO2_MARK O3_MARK CO_MARK PM10_MARK PM2.5_MARK
2019-09-13 01:00 0 0 0 0 0 0
2019-09-13 02:00 0 0 0 0 0 0
2019-09-13 03:00 0 0 0 0 0 0
2019-09-13 04:00 0 0 0 0 0 0
2019-09-13 05:00 0 0 0 0 0 0
2019-09-13 06:00 0 0 0 0 0 0
2019-09-13 07:00 0 0 0 0 0 0
2019-09-13 08:00 0 0 0 0 0 0
2019-09-13 09:00 0 0 0 0 0 0
2019-09-13 10:00 0 0 0 0 0 0
2019-09-13 11:00 0 0 0 0 0 0
2019-09-13 12:00 0 0 0 0 0 0
2019-09-13 13:00 0 0 0 0 0 0
2019-09-13 14:00 0 0 0 0 0 0
2019-09-13 15:00 0 0 0 0 0 0
2019-09-13 16:00 0 0 0 0 0 0
2019-09-13 17:00 0 0 1 0 0 0
2019-09-13 18:00 0 0 0 0 0 0
2019-09-13 19:00 0 0 0 0 0 0
2019-09-13 20:00 0 0 0 0 0 0
2019-09-13 21:00 0 0 0 0 0 0
2019-09-13 22:00 0 0 0 0 0 0
2019-09-13 23:00 0 0 0 0 0 0
2019-09-14 00:00 0 0 0 0 0 0
TABLE 3 Predicted data, measured data, MAE and bia values for NO2 at selected times, as judged by the model
Figure BDA0002627671220000102
Figure BDA0002627671220000111
Note: a value of 1 in the NO2_MARK column indicates data judged abnormal by the model
TABLE 4 Part of the abnormal pollutant data finally judged by the model
Figure BDA0002627671220000112
Figure BDA0002627671220000121

Claims (6)

1. An abnormal data judgment method based on an environmental monitoring data association relation is characterized by comprising the following steps:
S1, preprocessing the historical data and the environmental monitoring data to be analyzed: missing values and abnormal values in the historical data and in the environmental monitoring data to be analyzed are identified by the data acquisition software and replaced;
S2, dividing the processed historical data into training data and verification data, and converting the training data and the verification data into the sequence data required by the model;
S3, constructing a model using the training data, and selecting the optimal parameters of the model using the verification data according to the MAE; after the model has been built and debugged, it is embedded into an environmental monitoring platform, and the environmental monitoring data to be analyzed at time t-1, $c_{t-1,true}$, are used as input data to continuously obtain the predicted value at time t, $c_{t,pre}$;
S4, comparing the predicted value $c_{t,pre}$ with the environmental monitoring data to be analyzed at time t, $c_{t,true}$: the absolute value bia of their difference is compared with the empirical error MAE ± 30% of the true value to judge abnormality, and data exceeding this range are marked as abnormal data;
where $bia = |c_{t,pre} - c_{t,true}|$.
2. The abnormal data determination method based on the environmental monitoring data association relationship as claimed in claim 1, wherein the method for replacing the missing value and the abnormal value in step S1 includes:
S11, first, assume that the values of the known function $y = f(x)$ at $(n+1)$ mutually distinct points $x_i$ $(i = 0, 1, 2, \ldots, n)$ in the interval $[a, b]$ are $y_i$, and solve for a polynomial
$$P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
that satisfies
$$P_n(x_i) = y_i, \quad i = 0, 1, 2, \ldots, n.$$
For the first-order case, from analytic geometry:
$$y = y_0 + \frac{y_1 - y_0}{x_1 - x_0}\,(x - x_0)$$
where $x_0$, $x_1$, $y_0$, $y_1$ are known data;
$x$ is any point between $x_0$ and $x_1$;
$y$ is the interpolated value corresponding to $x$;
S12, interpolation is carried out according to this formula using the data at the 2 moments immediately before and after the missing or abnormal value.
3. The abnormal data determination method based on the environmental monitoring data association relation as claimed in claim 1, wherein the environmental monitoring data include five meteorological parameters (air pressure, temperature, humidity, wind direction and wind speed) and pollutant concentrations.
4. The abnormal data determination method based on the environmental monitoring data association relationship as claimed in claim 3, wherein the historical data and the environmental monitoring data to be analyzed in step S1 are grouped according to seasons and regions, and a two-dimensional table corresponding to each pollutant is respectively constructed
Figure FDA0002627671210000021
5. The abnormal data determination method based on the environmental monitoring data association relation as claimed in claim 4, wherein constructing the model with the training data in step S3 comprises the following steps: the concentration of a given pollutant and the meteorological conditions at time t
Figure FDA0002627671210000022
are taken as input, and the concentration of that pollutant and the meteorological conditions at time t+1
Figure FDA0002627671210000023
are taken as output; by learning from the training set data
Figure FDA0002627671210000024
the pollutant concentration and meteorological conditions at time t+1 are obtained;
when the model is constructed, the data pass through an input layer, a hidden layer and an output layer; the hidden layer can selectively retain the information of times $t_1$, $t_2$, ..., $t$ and use it as input information acting on time t+1; the calculation steps and methods for each unit are as follows:
an input gate:
Figure FDA0002627671210000025
forget gate:
Figure FDA0002627671210000026
an output gate:
Figure FDA0002627671210000027
hidden layer candidate memory cell value at current time:
Figure FDA0002627671210000028
hidden layer memory cell state value at current time:
Figure FDA0002627671210000029
hidden layer output value at current moment:
Figure FDA00026276712100000210
$\iota$, $\phi$ and $\omega$ denote the input gate, the forget gate and the output gate respectively; h is the output of the hidden layer; c is the value of the hidden layer memory cell; $\theta$ and $\sigma$ denote the nonlinear activation functions, where $\theta$ is generally the tanh function and $\sigma$ is generally the logistic sigmoid function;
Figure FDA00026276712100000211
representing an input-input gate weight matrix,
Figure FDA00026276712100000212
representing the hidden layer unit-input gate weight matrix at the previous time,
Figure FDA00026276712100000213
representing the input gate-hidden layer memory cell weight matrix,
Figure FDA00026276712100000214
is the weight matrix of the input layer-forgetting gate,
Figure FDA00026276712100000215
is a weight matrix of a previous time hidden layer unit-forgetting gate,
Figure FDA00026276712100000216
is a weight matrix of a hidden layer memory unit-forgetting gate,
Figure FDA00026276712100000217
is an input layer-output gate weight matrix,
Figure FDA00026276712100000218
is the weight matrix of the hidden layer unit-output gate at the last moment,
Figure FDA0002627671210000031
is a weight matrix of output gates of hidden layer memory units,
Figure FDA0002627671210000032
is a weight matrix of the input layer-hidden layer memory cell,
Figure FDA0002627671210000033
is the weight matrix from the hidden layer unit to the hidden layer memory unit at the previous time,
Figure FDA0002627671210000034
are the biases of the input gate, the forget gate, the output gate and the hidden layer memory cell, respectively.
6. The abnormal data determination method based on the environmental monitoring data association relation as claimed in claim 4, wherein the method for selecting the optimal parameters of the model using the verification data according to the MAE in step S3 comprises: the verification data are used as model input, the model predicted value $c_{pre}$ is compared with the true value $c_{true}$, and the MAE is used for evaluation; the model parameters are optimal when the MAE is minimal;
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|c_{pre,i} - c_{true,i}\right|$$
where n is the number of samples, $c_{true}$ is the true value and $c_{pre}$ is the predicted value.
CN202010801821.0A 2020-08-11 2020-08-11 Abnormal data judgment method based on environmental monitoring data association relation Pending CN112036075A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010801821.0A CN112036075A (en) 2020-08-11 2020-08-11 Abnormal data judgment method based on environmental monitoring data association relation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010801821.0A CN112036075A (en) 2020-08-11 2020-08-11 Abnormal data judgment method based on environmental monitoring data association relation

Publications (1)

Publication Number Publication Date
CN112036075A true CN112036075A (en) 2020-12-04

Family

ID=73578362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010801821.0A Pending CN112036075A (en) 2020-08-11 2020-08-11 Abnormal data judgment method based on environmental monitoring data association relation

Country Status (1)

Country Link
CN (1) CN112036075A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101149819A (en) * 2007-10-31 2008-03-26 山东省科学院海洋仪器仪表研究所 Meteorological element real time data singular value elimination method
US20120226653A1 (en) * 2009-09-24 2012-09-06 Mclaughlin Michael John Method of contaminant prediction
CN103336906A (en) * 2013-07-15 2013-10-02 哈尔滨工业大学 Sampling GPR method of continuous anomaly detection in collecting data flow of environment sensor
CN105303051A (en) * 2015-11-11 2016-02-03 中国科学院遥感与数字地球研究所 Air pollutant concentration prediction method
CN108900546A (en) * 2018-08-13 2018-11-27 杭州安恒信息技术股份有限公司 The method and apparatus of time series Network anomaly detection based on LSTM
CN109302410A (en) * 2018-11-01 2019-02-01 桂林电子科技大学 A kind of internal user anomaly detection method, system and computer storage medium
CN109738939A (en) * 2019-03-21 2019-05-10 蔡寅 A kind of Precursory Observational Data method for detecting abnormality
CN110008979A (en) * 2018-12-13 2019-07-12 阿里巴巴集团控股有限公司 Abnormal data prediction technique, device, electronic equipment and computer storage medium
CN111144286A (en) * 2019-12-25 2020-05-12 北京工业大学 Urban PM2.5 concentration prediction method fusing EMD and LSTM
CN111241673A (en) * 2020-01-07 2020-06-05 北京航空航天大学 Health state prediction method for industrial equipment in noisy environment

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613233A (en) * 2020-12-18 2021-04-06 中国环境监测总站 Algorithm for discovering environmental monitoring abnormal data based on single-classification support vector machine model
CN112767653A (en) * 2020-12-21 2021-05-07 武汉达梦数据技术有限公司 Geological disaster professional monitoring data acquisition method and system
CN114826988A (en) * 2021-01-29 2022-07-29 中国电信股份有限公司 Method and device for anomaly detection and parameter filling of time sequence data
CN113569324A (en) * 2021-08-03 2021-10-29 招商局重庆交通科研设计院有限公司 Slope deformation monitoring abnormal data analysis and optimization method
CN113434854A (en) * 2021-08-26 2021-09-24 中国电子信息产业集团有限公司 Method for generating data element based on sandbox environment and storage medium
CN114035553A (en) * 2021-11-16 2022-02-11 湖南机电职业技术学院 Control system fault detection method and device based on system identification and fitting degree
CN114035553B (en) * 2021-11-16 2023-11-24 湖南机电职业技术学院 Control system fault detection method and device based on system identification and fitting degree
CN114638290A (en) * 2022-03-07 2022-06-17 廖彤 Environment monitoring instrument fault prediction method based on edge calculation and BP neural network
CN114638290B (en) * 2022-03-07 2024-04-30 廖彤 Environment monitoring instrument fault prediction method based on edge calculation and BP neural network
CN115080909A (en) * 2022-07-15 2022-09-20 深圳市城市交通规划设计研究中心股份有限公司 Analysis method for influencing data of internet of things sensing equipment, electronic equipment and storage medium
CN115080909B (en) * 2022-07-15 2022-11-25 深圳市城市交通规划设计研究中心股份有限公司 Analysis method for influencing data of internet of things sensing equipment, electronic equipment and storage medium
CN116576553A (en) * 2023-07-11 2023-08-11 韦德电子有限公司 Data optimization acquisition method and system for air conditioner
CN116576553B (en) * 2023-07-11 2023-09-22 韦德电子有限公司 Data optimization acquisition method and system for air conditioner
CN117074627A (en) * 2023-10-16 2023-11-17 三科智能(山东)集团有限公司 Medical laboratory air quality monitoring system based on artificial intelligence
CN117074627B (en) * 2023-10-16 2024-01-09 三科智能(山东)集团有限公司 Medical laboratory air quality monitoring system based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN112036075A (en) Abnormal data judgment method based on environmental monitoring data association relation
CN108426812B (en) PM2.5 concentration value prediction method based on memory neural network
Lindemann et al. Anomaly detection and prediction in discrete manufacturing based on cooperative LSTM networks
CN111814956B (en) Multi-task learning air quality prediction method based on multi-dimensional secondary feature extraction
CN112733443B (en) Water supply network model parameter optimization checking method based on virtual monitoring points
CN111179591B (en) Road network traffic time sequence characteristic data quality diagnosis and restoration method
CN110716512A (en) Environmental protection equipment performance prediction method based on coal-fired power plant operation data
CN114169254A (en) Abnormal energy consumption diagnosis method and system based on short-term building energy consumption prediction model
CN115860286B (en) Air quality prediction method and system based on time sequence gate mechanism
CN113988210A (en) Method and device for restoring distorted data of structure monitoring sensor network and storage medium
CN114154619A (en) Ship track prediction method based on CNN and BILSTM
CN115759488A (en) Carbon emission monitoring and early warning analysis system and method based on edge calculation
CN110991776A (en) Method and system for realizing water level prediction based on GRU network
CN114662791A (en) Long time sequence PM2.5 prediction method and system based on space-time attention
CN114004397A (en) Multi-factor influence considered regional energy consumption situation prediction method and device
CN114217025B (en) Analysis method for evaluating influence of meteorological data on air quality concentration prediction
Wang et al. PM2.5 prediction based on neural network
CN118052326A (en) Campus water supply network water consumption prediction method based on neural network
CN114462717A (en) Small sample gas concentration prediction method based on improved GAN and LSTM
CN114648095A (en) Air quality concentration inversion method based on deep learning
CN114239990A (en) Time series data prediction method based on time series decomposition and LSTM
CN117578441A (en) Method for improving power grid load prediction precision based on neural network
CN112986393A (en) Bridge inhaul cable damage detection method and system
CN117350146A (en) GA-BP neural network-based drainage pipe network health evaluation method
CN112528566A (en) Real-time air quality data calibration method and system based on AdaBoost training model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination