CN113849479A - Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold - Google Patents
- Publication number
- CN113849479A (application CN202111044449.4A)
- Authority
- CN
- China
- Prior art keywords
- oil
- sample
- data
- query sample
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/211—Schema design and management
- G06F16/212—Schema design and management with details for data modelling support
- G—PHYSICS
- G01—MEASURING; TESTING
- G01M—TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
- G01M3/00—Investigating fluid-tightness of structures
- G01M3/02—Investigating fluid-tightness of structures by using fluid or vacuum
- G01M3/26—Investigating fluid-tightness of structures by using fluid or vacuum by measuring rate of loss or gain of fluid, e.g. by pressure-responsive devices, by flow detectors
- G01M3/32—Investigating fluid-tightness of structures by using fluid or vacuum by measuring rate of loss or gain of fluid, e.g. by pressure-responsive devices, by flow detectors for containers, e.g. radiators
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Examining Or Testing Airtightness (AREA)
Abstract
The invention discloses a comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold. The invention adopts a non-parametric instant learning modeling method: when a leakage judgment is needed on an online real-time data stream, a local data set that best matches the online data mode is retrieved from a historical database and used to establish a leakage detection model based on oil height soft measurement, and the detection threshold is adaptively updated according to the detection result so as to follow small-amplitude fluctuations of the data characteristics. After the method has run for a period of time, these steps are repeated to update the model, so that it follows changes in data characteristics caused by large changes in factors such as oil quantity and temperature. When a local model is established, the multiple prediction results of a query sample are fused in post-processing by an exponentially weighted average, which improves the oil height prediction performance. Finally, experiments on real oil tank data show that the method improves the accuracy, applicability and timeliness of leakage detection.
Description
Technical Field
The invention discloses a comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold. The invention belongs to the field of industrial system fault detection, and particularly relates to leakage fault detection for an oil storage tank system of a comprehensive energy supply station.
Background
In complex industrial processes, the safe operation of production equipment and systems bears on production efficiency, product quality, production process safety, and environmental and personal safety. With the development of modern industry, the operating conditions of industrial equipment have become more complex and the influencing factors more numerous, and the requirements on equipment safety and reliability have drawn much attention to fault detection technology. Existing industrial equipment fault diagnosis methods fall into three main classes: mechanism-model-based methods, knowledge-based methods and data-driven methods. Mechanism-based methods need an accurate mathematical model, but in actual industrial processes the many complex sources of uncertainty make such a model very difficult to establish. Knowledge-based methods such as neural networks and expert systems do not require accurate mathematical modeling, but in practice they often suffer from difficult parameter learning and poor adaptability. Data-driven methods transform the process operation data from the measurement space to a feature space and analyze them there, so no complex mathematical model needs to be established.
With the widespread use of sensors and intelligent instruments on industrial platforms, distributed control systems and computer storage technology, massive amounts of data have accumulated in industrial systems. Valuable information is generally difficult to extract from a small amount of data, but massive data contain great value that can be extracted through mining and analysis. Data-driven approaches are therefore one of the active directions of current research. A great deal of research has been conducted on data-based fault detection and fault diagnosis, and multivariate statistical analysis methods such as Principal Component Analysis (PCA), Partial Least Squares (PLS) and Linear Discriminant Analysis (LDA) have been widely used in data-based process monitoring.
Oil tank leakage detection at a comprehensive energy supply station is one application of industrial equipment fault detection. The method most commonly used at present to detect oil storage tank leakage at such stations is to use a sensor directly to detect whether liquid is present at the bottom of the oil tank and to raise a leakage alarm if it is. However, the liquid may simply be moisture introduced during oil unloading and fueling, which causes false alarms and increases maintenance cost. Some methods predict the oil height from stored historical data through a complex mechanism formula, whose coefficients and many parameters are often difficult to determine because of environmental changes, equipment degradation and the like. In addition, data-driven oil height soft measurement early warning methods have been proposed that use global modeling, but for a global model the structure is often difficult to determine, the parameters are difficult to optimize, the model is difficult to update, and the precision degrades because of data nonlinearity. Compared with global modeling and traditional local modeling, the basic modeling framework of instant learning (just-in-time learning) finds, in the accumulated historical data, the similar samples that best match the mode of the current query sample and uses them for local modeling, thereby obtaining better modeling precision. Moreover, most early warning thresholds in current use are fixed thresholds set according to a global model; they do not account for changes in online working conditions or the operating environment, so the probability of false alarms and missed alarms is high.
Disclosure of Invention
The invention aims to provide a comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold, addressing the problems of existing oil tank leakage detection methods based on oil height soft measurement, such as the high complexity of global modeling, the detection delay of fixed thresholds and the high missed detection rate. With the instant learning modeling method, the online real-time data stream is matched with the historical samples of the most similar mode, and a fault early warning model based on oil height soft measurement is established, realizing real-time early warning on online data. The invention also incorporates a self-adaptive threshold into the instant learning model, so that the threshold dynamically follows small-amplitude fluctuations of the data. After a period of operation, the similar samples and the local model are updated to follow changes in data characteristics caused by large changes in oil quantity, temperature and the like. The method simplifies the model, improves prediction precision and reduces the false alarms caused by a fixed threshold. Overall, it improves the applicability, accuracy and timeliness of fault detection.
The purpose of the invention is realized by the following technical scheme:
A comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold, comprising the following steps:
(1) Collect the current operating state data of the comprehensive energy supply station in real time, and collect and store the historical operating state data of its oil tank. The collected historical operating state data are operating data in a normal, leak-free steady state; they comprise multiple groups of oil height $H$, temperature $T$ and reading time $t$ from the liquid level meter system and the in-tank oil quantity $V$ from the station transaction information system. The collected current operating state data comprise the oil height, the temperature and the in-tank oil quantity at the current moment.
(2) Form a query sample set from $N_q$ consecutive real-time operating state data and, for each query sample, select from the historical operating state data $N_{train}$ similar samples that are close in time and in oil quantity to form a training set, from which an instant learning soft measurement model for oil height prediction is constructed. The oil height soft measurement model is specifically expressed as:
$H_q = f(T_q, H_j, T_j, V_q - V_j)$
where the subscript $q$ denotes the index of the query sample and $j$ denotes the index of the corresponding similar historical sample used for prediction; $H_q$, $T_q$, $V_q$ are the oil height, temperature and oil quantity of the query sample, and $H_j$, $T_j$, $V_j$ are the oil height, temperature and oil quantity of the similar sample.
(3) For each query sample, select from the historical operating state data a group of $N_{test}$ consecutive samples separated from the query sample by a period of time as test samples, and input them together with the corresponding query sample into the oil height soft measurement model to obtain $N_{test}$ predicted oil heights $\hat{H}_q^{i}$ for each query sample, where the superscript $i$ denotes the index of the test sample corresponding to the query sample. The $N_{test}$ predicted oil heights are fused in post-processing by an exponentially weighted average to give the final predicted oil height $\hat{H}_q$ of each query sample.
(4) The adaptive threshold updating is carried out according to each query sample, and the method specifically comprises the following substeps:
(4.1) Obtain the prediction residuals of the initial training set from the oil height soft measurement model, where a residual is the difference between the predicted oil height and the actual oil height. The residual vector can be expressed as:
$E_{train} = [e_1, e_2, \dots, e_{N_{train}}], \quad e_j = \hat{H}_j - H_j$
Perform Kernel Density Estimation (KDE) on the residual vector and take the lower quantile at a confidence level of 0.95 as the initial threshold $Thre_0$.
(4.2) Obtain from the oil height soft measurement model the residual $e_k$ of each query sample and the corresponding test set residuals, which are expressed as:
$E_{test,k} = [e_k^{1}, e_k^{2}, \dots, e_k^{N_{test}}]$
(4.3) Update the threshold for each query sample using the following update strategy:
If the residual $e_k$ of query sample $k$ is greater than the threshold $Thre_{k-1}$, the threshold is not updated, i.e. $Thre_k = Thre_{k-1}$, and an alarm is output. Here $Thre_k$ denotes the threshold corresponding to the $k$-th query sample and $Thre_0$ is the initial threshold.
If the residual $e_k$ of query sample $k$ is less than or equal to the threshold $Thre_{k-1}$, the threshold is updated:
$E_{test,k}$ and $E_{train}$ are combined into a new residual vector $E_k$, the probability density of the residual distribution is re-estimated, and the lower quantile at a confidence level of 0.95 is taken as the updated threshold $Thre_k$.
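The threshold update of step (4) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a Gaussian kernel with Silverman's rule-of-thumb bandwidth (the text names no kernel or bandwidth rule), reads the 0.95 quantile off the KDE's cumulative distribution by bisection, and uses the hypothetical helper names `kde_quantile` and `update_threshold`.

```python
import math
from statistics import stdev


def kde_quantile(residuals, level=0.95):
    """Quantile of a Gaussian kernel density estimate, found by bisection."""
    n = len(residuals)
    # Silverman's rule-of-thumb bandwidth: an assumption, not taken from the source.
    h = max(1.06 * stdev(residuals) * n ** (-0.2), 1e-9)

    def cdf(x):
        # CDF of the KDE = average of the Gaussian CDFs centred on each residual.
        return sum(0.5 * (1.0 + math.erf((x - r) / (h * math.sqrt(2.0))))
                   for r in residuals) / n

    lo, hi = min(residuals) - 10.0 * h, max(residuals) + 10.0 * h
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def update_threshold(e_train, e_query, e_test, thre_prev):
    """Return (threshold, alarm) for one query sample."""
    if e_query > thre_prev:
        return thre_prev, True            # alarm: keep the previous threshold
    pooled = list(e_train) + list(e_test)  # training residuals plus test-set residuals
    return kde_quantile(pooled), False


# Illustrative residuals only (not measured data).
e_train = [-0.4, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.5]
thre0 = kde_quantile(e_train)
thre1, alarm1 = update_threshold(e_train, 0.1, [0.0, 0.1, -0.1], thre0)
thre2, alarm2 = update_threshold(e_train, 5.0, [4.9, 5.1, 5.0], thre1)
```

In this sketch a normal query refreshes the threshold from the pooled residuals, while an alarming query leaves it untouched, so a leak cannot drag the threshold upward.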
Further, the step 1 specifically comprises:
Acquire the oil height $H$, temperature $T$ and reading time $t$ using the liquid level meter system; acquire the gun-lifting time $t_t$, the gun-hanging time $t_g$ and the sales volume of each fueling, i.e. the change $\Delta V$ in the oil quantity in the tank, using the station transaction information system.
Using the gun-lifting time $t_t$ and gun-hanging time $t_g$ from the station transaction information system, matched against the sampling times of the liquid level meter system, remove the fueling data, i.e. the data between the gun-lifting time and the gun-hanging time, from the liquid level meter data; remove the data segments of oil unloading operations, i.e. segments in which the oil height increases markedly; and remove outliers (abrupt changes in the data). This finally yields multiple segments of data in which the oil quantity is conserved, with the unloading and fueling segments removed, i.e. operating data in steady state.
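The removal of fueling data can be sketched as a simple interval filter. The tuple layouts and the function name `drop_fueling_readings` are assumptions made for illustration; the removal of unloading segments and outliers is not shown.

```python
def drop_fueling_readings(readings, transactions):
    """Return the readings whose time lies outside every fueling interval.

    readings:     list of (t, H, T) tuples from the liquid level meter
    transactions: list of (t_lift, t_hang) tuples from the transaction system
    """
    kept = []
    for reading in readings:
        t = reading[0]
        in_fueling = any(t_lift <= t <= t_hang for t_lift, t_hang in transactions)
        if not in_fueling:
            kept.append(reading)
    return kept


# Illustrative values only: one fueling from t = 8 to t = 22.
readings = [(0, 1500.0, 20.1), (10, 1499.8, 20.1), (20, 1497.0, 20.2), (30, 1496.9, 20.2)]
transactions = [(8, 22)]
steady = drop_fueling_readings(readings, transactions)
```

Only the readings taken before the gun was lifted and after it was hung up remain, which is the steady-state data the later steps work on.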
Divide the historical operating state data of the comprehensive energy supply station into several steady states $S_1, S_2, \dots, S_l$ according to oil quantity. Each steady state contains operating state data for one or more moments, and the oil quantity differs between steady states. The first steady state $S_1$ is taken as the reference starting point, with an oil quantity of zero given as its initial value; the oil quantity of the second steady state $S_2$ and, by analogy, of the $k$-th steady state $S_k$ is then computed relative to this starting point, where $m_k$ denotes the number of operating state data items contained in steady state $S_k$ and $k = 1, 2, \dots, l$. A characteristic state matrix is then reconstructed according to the steady state number $S$ and the time sequence; it contains the variables steady state number $S$, oil height $H$, temperature $T$ and oil quantity $V$ in the steady state.
Further, the step of removing the outlier specifically comprises:
(a) Partition the historical operating state data into $N_{group}$ groups.
(b) Perform KNN Euclidean distance sorting on the data of each group obtained in step (a):
The Euclidean distance vector between the $j$-th sample in the group and the remaining samples can be expressed as:
$D_j = [\, d_{j,i} \,]_{i \in -j}, \quad d_{j,i} = \lVert x_j - x_i \rVert_2$
where $j = 1, 2, \dots, N_{batch}$, $-j$ denotes all samples within the group except $j$, and $D_j$ has dimension $1 \times (N_{batch}-1)$.
(c) Extract noise with the 3σ criterion: calculate the mean and standard deviation of the set of distances in $D_j$:
$\mu_j = \frac{1}{N_{batch}-1} \sum_{i \in -j} d_{j,i}, \quad \sigma_j = \sqrt{\frac{1}{N_{batch}-1} \sum_{i \in -j} (d_{j,i} - \mu_j)^2}$
According to the 3σ criterion, if the set of distance data contains a value with $|d_{j,i} - \mu_j| > 3\sigma_j$, the $j$-th sample is an outlier and should be discarded.
Further, in step 2, selecting from the historical operating state data, for each query sample, $N_{train}$ similar samples that are close in time and in oil quantity is specifically as follows:
Similar samples whose oil quantity is close to that of the query sample are obtained from the Euclidean distance:
$d_{j,q} = |v_j - v_q|$
where $v_j$ denotes the oil quantity of the $j$-th historical sample and $v_q$ the oil quantity of the query sample; $N_v$ samples are selected according to a threshold as the similar samples close in oil quantity to the query sample.
The $N_t$ consecutive samples preceding the query sample are selected as the similar samples close in time.
$N_{train} = N_v + N_t$
Further, in step 3, the post-processing fusion of the $N_{test}$ predicted oil heights by an exponentially weighted average to obtain the final predicted oil height $\hat{H}_q$ of each query sample is specifically:
$\hat{H}_q = \dfrac{\sum_{i=1}^{N_{test}} \beta^{\,N_{test}-i}\, \hat{H}_q^{i}}{\sum_{i=1}^{N_{test}} \beta^{\,N_{test}-i}}$
where $\hat{H}_q$ denotes the final output for the query sample, $\beta$ is the weighting coefficient of $\hat{H}_q^{i}$ and an adjustable hyper-parameter, and $(N_{test}-i)$ is the exponent of the weight.
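The fusion step can be illustrated with a short sketch. It assumes that the weight of the $i$-th prediction is $\beta^{N_{test}-i}$ and that the weights are normalized to sum to one; the normalization and the default `beta=0.9` are assumptions made here, and `fuse_predictions` is a hypothetical name.

```python
def fuse_predictions(preds, beta=0.9):
    """Exponentially weighted average of N_test predicted oil heights.

    preds[i-1] is the prediction from the i-th test sample; with 0 < beta < 1
    the last prediction gets weight beta**0 = 1 and earlier ones get less.
    """
    n = len(preds)
    weights = [beta ** (n - i) for i in range(1, n + 1)]
    return sum(w * p for w, p in zip(weights, preds)) / sum(weights)


fused_equal = fuse_predictions([1500.0, 1500.0, 1500.0])   # identical inputs stay unchanged
fused_two = fuse_predictions([0.0, 1.0], beta=0.5)          # weights 0.5 and 1.0
```

With `beta` close to 1 the fusion approaches a plain mean; with a small `beta` it relies almost entirely on the last test sample.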
Further, in step 4, combining the test set residuals of the query sample with $E_{train}$ into a new residual vector and re-estimating the probability density of the residual distribution is specifically:
The residuals are approximated as normally distributed, and the mean and variance of the distribution are calculated recursively,
where $\mu_k$ and $\delta_k$ are the re-estimated mean and variance of the residual probability density for the $k$-th query sample; $\mu_0$ and $\delta_0$ are the mean and variance of the residual probability density obtained from the training set; and $\mu_{test,k}$ and $\delta_{test,k}$ are the mean and variance of the probability density of the residuals obtained from the test set of the $k$-th query sample.
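One standard way to update a mean and variance recursively is to merge the statistics of two groups without revisiting the raw residuals. The sketch below assumes this pooled update, which may differ from the recursion actually used by the method; the sample counts, the population-variance convention and the names `merge_stats` and `normal_threshold` are all assumptions. Under the normal approximation the 0.95 quantile then follows directly from the merged mean and variance.

```python
from statistics import NormalDist, mean, pvariance


def merge_stats(n_a, mean_a, var_a, n_b, mean_b, var_b):
    """Merge count, mean and population variance of two residual groups."""
    n = n_a + n_b
    delta = mean_b - mean_a
    merged_mean = mean_a + delta * n_b / n
    merged_var = (n_a * var_a + n_b * var_b + delta * delta * n_a * n_b / n) / n
    return n, merged_mean, merged_var


def normal_threshold(mu, var, level=0.95):
    """Quantile of a normal distribution with the given mean and variance (var > 0)."""
    return NormalDist(mu, var ** 0.5).inv_cdf(level)


# Illustrative residuals only.
prev = [1.0, 2.0, 3.0]      # residuals already summarized (e.g. training set)
new = [4.0, 5.0]            # residuals from the current test set
n, mu, var = merge_stats(len(prev), mean(prev), pvariance(prev),
                         len(new), mean(new), pvariance(new))
thre = normal_threshold(mu, var)
```

The merged values equal the mean and population variance computed over all five residuals at once, so the update can be chained query after query.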
The invention has the following beneficial effects. Existing oil tank leakage detection methods based on mechanism models, global models and fixed thresholds suffer from complex parameters, weak generalization capability, global nonlinearity and a high false alarm rate. To address these problems, the invention provides a comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold. The invention adopts an instant learning modeling method: when a leakage judgment is needed on an online real-time data stream, a local data set that best matches the online data mode is first retrieved from a historical database and used to establish a leakage detection model based on oil height soft measurement, and the detection threshold is adaptively updated according to the detection result so as to follow small-amplitude fluctuations of the data characteristics. In addition, when a local model is established, the multiple prediction results of a query sample are fused in post-processing by an exponentially weighted average, which improves the oil height prediction performance. Overall, the invention reduces the false alarm rate of oil tank leakage detection at the comprehensive energy supply station and improves the applicability, accuracy and timeliness of the detection method.
Drawings
FIG. 1 is an overall framework for tank leak detection based on instant learning and adaptive thresholds;
FIG. 2 is a diagram of the division of inspection data based on tank data from a certain integrated energy supply station;
FIG. 3 is a graph of leak detection results for a global model using fixed thresholds and adaptive thresholds;
FIG. 4 is a graph of leak detection results for the instant learning model using fixed and adaptive thresholds.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific examples of the actual data situation of the storage tank of the integrated energy supply station.
The safety of the oil storage tank bears on the overall safety of the comprehensive energy supply station. Tank leakage is the fault most likely to occur in an oil storage tank and is also the trigger for subsequent accidents such as fire and explosion, so detecting tank leakage is of great significance. The leakage detection strategy must also be real-time and accurate. The invention takes as an example the liquid level meter and station transaction system data from the distributed system of a comprehensive energy supply station in Zhejiang province, and extracts 7 key variables such as oil height, temperature, gun-lifting time, gun-hanging time and fueling amount.
As shown in FIG. 1, the invention is a comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold, comprising the following steps:
(1) Collect the current operating state data of the comprehensive energy supply station in real time, and collect and store its historical operating state data. The collected historical operating state data comprise multiple groups of oil height $H$, temperature $T$ and reading time $t$ from the liquid level meter system and the in-tank oil quantity $V$ from the station transaction information system. The collected current operating state data comprise the oil height (measured by a sensor), the temperature and the in-tank oil quantity at the current moment.
(2) Process the collected historical data of the comprehensive energy supply station: acquire the oil height $H$, temperature $T$ and reading time $t$ using the liquid level meter system; acquire the gun-lifting time $t_t$, the gun-hanging time $t_g$ and the sales volume of each fueling, i.e. the change $\Delta V$ in the oil quantity in the tank, using the station transaction information system.
Using the gun-lifting time $t_t$ and gun-hanging time $t_g$ from the station transaction information system, matched against the sampling times of the liquid level meter system, remove the fueling data, i.e. the data between the gun-lifting time and the gun-hanging time, from the liquid level meter data; remove the data segments of oil unloading operations, i.e. segments in which the oil height increases markedly;
Cleaning to remove abnormal points: because the raw data collected in the distributed operating system of the comprehensive energy supply station contain obvious abnormal values and outliers, a data noise reduction algorithm based on the KNN-3σ criterion is used to remove abnormal samples, in support of the subsequent feature engineering and to improve model precision. This step comprises the following substeps:
(2.1) Grouping: station-level historical data are large in volume and clearly periodic, so the data are grouped in sampling order and the noise reduction algorithm is applied to each group separately, which reduces the computation of the KNN algorithm. The grouping principle is to reduce the amount of computation and to reject abnormal data as early as possible. For example, if the number of samples in each group is $N_{batch} = 200$ and the total number of samples is $N = 10000$, the number of groups is $N_{group} = 50$.
(2.2) KNN Euclidean distance ranking: the data of each group obtained in substep (2.1) are:
$X_{group} = \{x_1, x_2, \dots, x_{N_{batch}}\}$
The Euclidean distance vector between the $j$-th sample and the remaining samples can be expressed as:
$D_j = [\, d_{j,i} \,]_{i \in -j}, \quad d_{j,i} = \lVert x_j - x_i \rVert_2$
where $j = 1, 2, \dots, N_{batch}$ and $-j$ denotes all samples within this group except $j$.
(2.3) Noise extraction with the 3σ criterion: the $D_j$ obtained in substep (2.2) is a $1 \times (N_{batch}-1)$ vector; the mean and standard deviation of this set of distances are calculated from the elements of $D_j$:
$\mu_j = \frac{1}{N_{batch}-1} \sum_{i \in -j} d_{j,i}, \quad \sigma_j = \sqrt{\frac{1}{N_{batch}-1} \sum_{i \in -j} (d_{j,i} - \mu_j)^2}$
According to the 3σ criterion, the values of the distance data are almost entirely concentrated in the interval $(\mu_j - 3\sigma_j, \mu_j + 3\sigma_j]$, and the probability of falling outside this range is only about 0.27%. If the set of distance data contains a value with $|d_{j,i} - \mu_j| > 3\sigma_j$, the measurement is an outlier and should be rejected.
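A KNN-based 3σ filter for one group can be sketched as below. The sketch uses a common variant rather than a literal transcription of the substeps: each sample is scored by the mean of its `k` smallest Euclidean distances to the other samples in the group, and a sample is rejected when its score exceeds the group mean score plus three standard deviations. The choice `k=5`, the one-sided test and the name `knn_3sigma_filter` are assumptions.

```python
import math
from statistics import mean, pstdev


def knn_3sigma_filter(group, k=5):
    """Split one group of samples into (kept, removed) lists."""
    scores = []
    for j, x_j in enumerate(group):
        # Sorted distances from sample j to every other sample in the group.
        d_j = sorted(math.dist(x_j, x_i) for i, x_i in enumerate(group) if i != j)
        scores.append(mean(d_j[:k]))          # mean distance to the k nearest neighbours
    mu, sigma = mean(scores), pstdev(scores)
    kept, removed = [], []
    for x, s in zip(group, scores):
        (removed if s > mu + 3.0 * sigma else kept).append(x)
    return kept, removed


# Illustrative (oil height, temperature) samples: 30 regular points and one spike.
group = [(1000.0 + 0.1 * i, 20.0) for i in range(30)] + [(1500.0, 20.0)]
kept, removed = knn_3sigma_filter(group)
```

Because oil height and temperature are on different scales, standardizing each variable before computing distances is usually advisable; that step is left out here for brevity.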
(3) Characteristic matrix: divide the historical operating state data of the comprehensive energy supply station into several steady states $S_1, S_2, \dots, S_l$ according to oil quantity. The division is based on the gun-lifting and gun-hanging times in the system: a state with no oil unloading or fueling operation is regarded as a steady state. Each steady state contains operating state data for one or more moments, and the oil quantity differs between steady states. The first steady state $S_1$ after an oil unloading operation is taken as the reference starting point, with an oil quantity of zero given as its initial value; the oil quantity of the second steady state $S_2$ and, by analogy, of the $k$-th steady state $S_k$ is then computed relative to this starting point. When an oil unloading operation is encountered, the current oil quantity plus the unloaded quantity is used as the current oil quantity. Here $m_k$ denotes the number of operating state data items contained in steady state $S_k$, and $k = 1, 2, \dots, l$. A characteristic state matrix is then reconstructed according to the steady state number $S$ and the time sequence; it contains the variables steady state number $S$, oil height $H$, temperature $T$ and oil quantity $V$ in the steady state. In the actual operation of the comprehensive energy supply station, the oil quantity is a positive value.
The run matrix at this time can be expressed as:
X={x1,x2,…,xN}
where each sample contains the oil height, temperature and in-tank oil quantity, and $N$ denotes the number of historical data samples obtained. Any two sample points form a prediction sample pair; for example, samples $x_1$ and $x_2$ form the feature matrix $\{H_2, T_2, H_1, T_1, V_2 - V_1\}$. The time sequence must be respected in actual online sample prediction.
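How a prediction sample pair maps onto the inputs and output of the soft measurement model $H_q = f(T_q, H_j, T_j, V_q - V_j)$ can be sketched as follows. The `Sample` type and the function name `pair_features` are hypothetical, and the numbers are illustrative only.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    H: float   # oil height
    T: float   # temperature
    V: float   # in-tank oil quantity


def pair_features(query, similar):
    """Return ((T_q, H_j, T_j, V_q - V_j), H_q) for one sample pair."""
    inputs = (query.T, similar.H, similar.T, query.V - similar.V)
    target = query.H
    return inputs, target


x1 = Sample(H=1500.0, T=20.0, V=0.0)      # earlier sample, used as the similar sample j
x2 = Sample(H=1490.0, T=20.5, V=-35.0)    # later sample, used as the query sample q
inputs, target = pair_features(x2, x1)
```

Keeping the later sample in the query role is one way to respect the time order required for online prediction.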
(4) The oil height prediction soft measurement model based on the instant learning is realized by the following substeps:
(4.1) Selection of similar samples that are close in oil quantity and in time.
First, similar samples with similar oil quantities are obtained by Euclidean distance; for a single variable the Euclidean distance is:
$$d_{j,q} = |v_j - v_q|$$
where $v_j$ is the oil quantity of the j-th historical sample and $v_q$ is the oil quantity of the query sample to be predicted. The smaller the distance $d_{j,q}$, the more similar the oil quantities of the two samples; $N_v$ samples are taken as the similar samples with similar oil quantities.
Second, similar samples close in time are selected: since the samples are arranged in time order, it suffices to take the $N_t$ consecutive samples immediately preceding the query sample $x_q$.
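The two selection rules above can be illustrated as follows. This is only a sketch under assumptions: the function name `select_similar`, the use of the $N_v$ smallest distances as the selection rule, and the example oil quantities are hypothetical, and samples are assumed to be stored in time order with the query at index `q_idx`.

```python
def select_similar(volumes, q_idx, n_v, n_t):
    """Pick similar historical samples for the query sample at index q_idx.

    volumes: oil quantities in time order; only indices before q_idx are history.
    Returns (indices closest in oil quantity, indices closest in time).
    """
    history = range(q_idx)
    # N_v historical samples with the smallest |v_j - v_q|.
    by_volume = sorted(history, key=lambda j: abs(volumes[j] - volumes[q_idx]))[:n_v]
    # N_t consecutive samples immediately preceding the query sample.
    by_time = list(range(max(0, q_idx - n_t), q_idx))
    return by_volume, by_time


# Hypothetical oil quantities; the last entry is the query sample.
volumes = [10.0, 9.0, 8.0, 7.0, 9.9, 6.0, 10.05]
by_volume, by_time = select_similar(volumes, q_idx=6, n_v=2, n_t=3)
```

The two index lists are returned separately so that the training set size stays $N_v + N_t$ even when a sample qualifies under both rules.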
Similar samples can now be expressed as:
(4.2) Local online modeling:
The similar samples obtained in sub-step (4.1) form a training set, on which a data-driven soft measurement model is built. The oil height soft measurement model is expressed as:
$$H_q = f(T_q, H_j, T_j, V_q - V_j)$$
where the subscript q denotes the index of the query sample and j denotes the index of the corresponding historical similar sample used for prediction; $H_q$, $T_q$, $V_q$ are the oil height, temperature, and oil quantity of the query sample, and $H_j$, $T_j$, $V_j$ are the oil height, temperature, and oil quantity of the similar sample. $f(\cdot)$ is an oil height soft measurement model based on partial least squares (PLS) regression.
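To make the input layout of $f(\cdot)$ concrete, the sketch below builds regression rows of the form $(T_q, H_j, T_j, V_q - V_j)$ with target $H_q$. It is an illustration only: the function name `build_pairs`, the pairing of consecutive samples, the dictionary keys, and the numeric values are assumptions, and the regression itself is left out.

```python
def build_pairs(samples):
    """Build (inputs, targets) for the oil height model from time-ordered samples.

    samples: list of dicts with keys 'H' (oil height), 'T' (temperature),
    'V' (oil quantity). Assumption: each sample is paired with the one
    immediately before it, which plays the role of the reference sample j.
    """
    rows, targets = [], []
    for ref, cur in zip(samples, samples[1:]):
        rows.append((cur["T"], ref["H"], ref["T"], cur["V"] - ref["V"]))
        targets.append(cur["H"])
    return rows, targets


# Hypothetical similar samples (units are illustrative).
samples = [
    {"H": 1500.0, "T": 20.0, "V": 1000.0},
    {"H": 1490.0, "T": 20.5, "V": 950.0},
    {"H": 1481.0, "T": 21.0, "V": 905.0},
]
rows, targets = build_pairs(samples)
```

Rows built this way could be handed to any PLS routine, for example scikit-learn's `PLSRegression`, to fit the local model.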
(4.3) Oil height prediction: once the local model has been built, the output of the query sample is predicted. Two points should be noted when making predictions:
a. Because the oil level in the tank changes slowly over a short period when no unloading operation takes place, several consecutive query samples may share the same similar samples; that is, $N_g$ consecutive query samples are input into the same oil height prediction model to obtain their predicted oil heights.
b. The influence of historical samples on the query sample decreases with their distance in time. However, if the input samples are too close to the query sample, the output simply follows the trend of the input data, so that when the input is fault data the fault cannot be detected in time. To balance these two requirements, the input variables are taken from a run of consecutive, temporally similar samples that is separated from the query sample by a period of time, for example a gap of 100 samples before the query followed by $N_{test} = 50$ samples used to construct the input variables. The predicted values of each query sample are then fused by an exponentially weighted average as post-processing to give the final prediction output, with the weighting formula as follows:
Here β is the weighting coefficient, an adjustable hyperparameter; the i-th term is the predicted value obtained with the i-th test sample as input; $N_{test}$ is the number of input samples; and the superscript $(N_{test} - i)$ is the exponent of the weight. An exponentially weighted average is in essence a moving average whose weights decay exponentially: the weight of each value decreases exponentially with its age, so more recent data are weighted more heavily while older data still retain some weight. This matches the behaviour of an actual tank: points closer to the query sample are considered more reliable, but the effect of samples further back in time cannot be ignored.
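A small sketch of the exponentially weighted fusion may help. The weight $\beta^{N_{test}-i}$ follows the description above; dividing by the sum of the weights is an assumption made here so that the result stays on the scale of the inputs, and the function name `exp_weighted_fusion` and the sample values are hypothetical.

```python
def exp_weighted_fusion(predictions, beta):
    """Fuse per-test-sample predictions into one value.

    predictions: predicted oil heights ordered from oldest (i = 1) to newest
    (i = N_test). The i-th prediction gets weight beta ** (N_test - i), so the
    newest prediction has weight 1 and older ones decay geometrically.
    Normalising by the weight sum is an assumption of this sketch.
    """
    n_test = len(predictions)
    weights = [beta ** (n_test - i) for i in range(1, n_test + 1)]
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)


# Hypothetical predictions: the newer value (10.0) dominates the older one (0.0).
fused = exp_weighted_fusion([0.0, 10.0], beta=0.5)
constant = exp_weighted_fusion([3.0, 3.0, 3.0], beta=0.9)
```

With 0 < β < 1 the fused value always lies closer to the most recent predictions, which is the behaviour the paragraph above describes.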
Once the output prediction is complete, the model is discarded; when the next set of query samples arrives, similar-sample selection and local modeling are performed again.
(5) The tank leakage detection method based on the adaptive threshold value updating is realized by the following sub-steps:
(5.1) The oil height soft measurement model yields the prediction residuals of the initial training set, where a residual is the difference between the predicted and the actual oil height. The residual vector can be expressed as:
Kernel density estimation (KDE) is performed on the residual vector, and the lower quantile at a confidence level of 0.95 is taken as the initial threshold $\mathrm{Thre}_0$. That is, 95% of the residual distribution lies within the confidence interval, and the 5% of samples beyond it are considered outliers.
To improve computational efficiency, the probability density can instead be estimated with a normal distribution, which gives the initial mean and variance $\theta_0 = \{\mu_0, \delta_0\}$.
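Under the normal approximation, the initial threshold reduces to a quantile of a fitted Gaussian. The sketch below uses Python's standard `statistics.NormalDist`; the function name `initial_threshold`, the use of signed residuals, and the toy residuals are assumptions for illustration. Note that `NormalDist` is parameterised by the standard deviation rather than the variance.

```python
from statistics import NormalDist, mean, pstdev


def initial_threshold(train_residuals, confidence=0.95):
    """Fit a normal distribution to training residuals and return (Thre_0, mu_0, sigma_0).

    Thre_0 is the `confidence` quantile of the fitted distribution, so about
    5% of normally distributed residuals would exceed it at confidence=0.95.
    """
    mu0 = mean(train_residuals)
    sigma0 = pstdev(train_residuals)  # population standard deviation
    thre0 = NormalDist(mu0, sigma0).inv_cdf(confidence)
    return thre0, mu0, sigma0


# Hypothetical training residuals (predicted minus actual oil height).
thre0, mu0, sigma0 = initial_threshold([-1.0, 0.0, 1.0])
```

A kernel density estimate would replace the Gaussian fit with a smoothed empirical density; the quantile step stays the same.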
(5.2) For each query sample, a set of test-set residuals is obtained before the exponentially weighted fusion is performed, which can be expressed as:
Similarly, the residual of the query sample is obtained by exponentially weighted average fusion. The residual vector of the query samples can be expressed as:
From the test-set residuals of the k-th query sample, $\theta_k = \{\mu_{test,k}, \delta_{test,k}\}$ can be calculated, where $\mu_{test,k}$ and $\delta_{test,k}$ are respectively the mean and variance of those residuals.
(5.3) The threshold is updated adaptively with each online data point (query sample). To prevent the threshold from adapting to fault data, the following update strategy is adopted:
If the residual of query sample k is greater than the threshold $\mathrm{Thre}_{k-1}$, the threshold is not updated, $\mathrm{Thre}_k = \mathrm{Thre}_{k-1}$, and an alarm is output. Here $\mathrm{Thre}_k$ denotes the threshold corresponding to the k-th sample and $\mathrm{Thre}_0$ is the initial threshold.
If the residual of query sample k is less than or equal to the threshold $\mathrm{Thre}_{k-1}$, the threshold is updated.
Taking the update of the first threshold as an example: if the weighted residual of the first query sample is greater than the initial threshold $\mathrm{Thre}_0$, the threshold is not updated, $\mathrm{Thre}_1 = \mathrm{Thre}_0$, and an alarm is output;
otherwise, the test-set residuals of the first query sample and $E_{train}$ form a new residual vector, the probability density of the residual distribution is re-estimated, and the lower quantile at a confidence level of 0.95 is taken as the updated threshold $\mathrm{Thre}_1$. For the k-th sample, its test-set residuals and $E_{train}$ likewise form a new residual vector, the probability density of the residual distribution is re-estimated, and the lower quantile at a confidence level of 0.95 is taken as the updated threshold $\mathrm{Thre}_k$.
To reduce storage space and computational effort, the residuals may be approximated as normally distributed and the mean and variance of the distribution calculated recursively:
The update for each subsequent online sample follows by analogy, which realizes real-time, adaptive threshold updating.
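The freeze-on-alarm update strategy of sub-step (5.3) can be sketched as below. This is an illustration under assumptions, not the recursion used in the source: the running mean and variance are maintained with Welford's incremental update, swapped in here as a standard stand-in, each accepted query residual is added one at a time, and the names `RunningNormal` and `detect` and the residual values are hypothetical.

```python
from statistics import NormalDist


class RunningNormal:
    """Running mean/variance of residuals (Welford's incremental update)."""

    def __init__(self, values):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        for v in values:
            self.add(v)

    def add(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    def quantile(self, p):
        sigma = (self.m2 / self.n) ** 0.5  # population standard deviation
        return NormalDist(self.mean, sigma).inv_cdf(p)


def detect(train_residuals, query_residuals, confidence=0.95):
    """Return (alarms, thresholds) for a stream of query residuals."""
    stats = RunningNormal(train_residuals)
    threshold = stats.quantile(confidence)  # initial threshold Thre_0
    alarms, thresholds = [], []
    for r in query_residuals:
        if r > threshold:
            # Suspected fault: raise an alarm and keep Thre_k = Thre_{k-1}.
            alarms.append(True)
        else:
            # Normal sample: absorb its residual and re-estimate the threshold.
            alarms.append(False)
            stats.add(r)
            threshold = stats.quantile(confidence)
        thresholds.append(threshold)
    return alarms, thresholds


alarms, thresholds = detect([-1.0, 0.0, 1.0], [0.5, 5.0, 0.2])
```

Because the threshold is frozen whenever an alarm fires, a sustained leak cannot drag the threshold upward and mask itself.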
(6) Evaluation of detection model performance: because fault data are scarce in the actual operating state of a comprehensive energy supply station, there is a pronounced class imbalance, and the quality of the model cannot be measured by the false alarm rate and missed alarm rate alone. The Matthews correlation coefficient (MCC) is therefore used. MCC is a correlation coefficient between the true and predicted values in binary classification; its range is [-1, 1], and the closer the value is to 1, the more accurate the model's predictions. It is calculated as follows:
$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
where TP is the number of positive samples predicted correctly, TN is the number of negative samples predicted correctly, FN is the number of positive samples predicted incorrectly, and FP is the number of negative samples predicted incorrectly. The Matthews correlation coefficient is widely used to evaluate classification problems in machine learning, particularly when positive and negative samples are imbalanced. Some scientists regard the Matthews correlation coefficient as the most informative single score for assessing the quality of a binary classifier's predictions in a confusion-matrix setting.
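For reference, the standard MCC definition can be computed directly from the four confusion-matrix counts. The function name `mcc` and the example counts are hypothetical, and returning 0 when the denominator vanishes is a common convention adopted here as an assumption.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: undefined cases (an empty row or column) are reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


perfect = mcc(tp=10, tn=90, fp=0, fn=0)    # every sample classified correctly
inverted = mcc(tp=0, tn=0, fp=5, fn=5)     # every sample classified wrongly
chance = mcc(tp=5, tn=5, fp=5, fn=5)       # no better than random guessing
```

Unlike accuracy, the score stays informative when leak samples are a small minority, since all four counts enter the formula.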
In the oil height prediction soft measurement model, the training-set regression index R² is 99.969% with the global model and 99.741% with the instant learning framework, and the RMSE is 2.410 and 1.732, respectively. The two are therefore comparable in oil height prediction accuracy.
Table 1 shows the detection results on data from the liquid level meter and station transaction system of the distributed system of a comprehensive energy supply station in Zhejiang Province. Because there is little or no fault data in actual operation, random noise is artificially added at the 1500th sample point of the test data to simulate a tank leakage scenario that might occur in practice, shown as the dashed line below the curve to the right of the vertical line in Fig. 2.
Figs. 3 and 4 show the detection curves of the global model and the instant learning model using fixed thresholds and adaptive thresholds, respectively. The curves show that the fault starts at the 1500th sample; the global model detects the fault only after some delay, whereas the instant learning framework detects it as soon as it occurs. This verifies the timeliness of fault detection under instant learning.
The curves also show that the adaptive threshold markedly reduces the high false alarm and missed alarm rates seen with the fixed threshold.
In addition, Table 1 gives statistics of the leakage fault detection results. The composite MCC indicator is best for the detection strategy that uses the adaptive threshold within the instant learning framework. When the global model uses the dynamic threshold, its false alarm rate is zero but its missed alarm rate is high: the global model has far more training data, so the training-set residuals dominate when the threshold is updated adaptively and the influence of the test-set residuals on the distribution is weakened. This also demonstrates the suitability of the instant learning model in this scenario, since it eases the trade-off between reduced model accuracy when the training set is too small and insensitive threshold updating when the training set is too large.
TABLE 1 leak Fault detection results
The advantages of the method are that the data-driven approach avoids the complexity of building a mechanistic model, and that the instant learning framework combined with the adaptive threshold fault detection strategy markedly improves the false alarm rate, the missed alarm rate, and the timeliness of oil tank leakage detection.
Claims (6)
1. A method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold, characterized by comprising the following steps:
(1) Acquire the current operating state data of the comprehensive energy supply station in real time, and store and collect historical operating state data of its oil tank. The collected historical operating state data are operating data in a normal, leak-free steady state and comprise multiple groups of oil height H, temperature T, and reading time t from the liquid level meter system, and the oil quantity V in the tank from the station transaction information system; the collected current operating state data comprise the oil height, temperature, and oil quantity in the tank at the current moment.
(2) Form a query sample set from $N_q$ consecutive real-time operating state data and, according to the query samples, select from the historical operating state data $N_{train}$ similar samples close in time and oil quantity to form a training set, on which an instant learning oil height prediction soft measurement model is built. The oil height soft measurement model is expressed as:
$$H_q = f(T_q, H_j, T_j, V_q - V_j)$$
where the subscript q denotes the index of the query sample and j denotes the index of the corresponding historical similar sample used for prediction; $H_q$, $T_q$, $V_q$ are the oil height, temperature, and oil quantity of the query sample, and $H_j$, $T_j$, $V_j$ are the oil height, temperature, and oil quantity of the similar sample.
(3) For each query sample, select from the historical operating state data a group of $N_{test}$ consecutive samples separated from the query sample by a period of time as test samples, and input them together with the corresponding query sample into the oil height soft measurement model to obtain $N_{test}$ predicted oil heights for each query sample, where the superscript i denotes the index of the test sample corresponding to the query sample; the $N_{test}$ predicted oil heights are fused by exponentially weighted averaging as post-processing to give the final predicted oil height of each query sample.
(4) Update the threshold adaptively for each query sample, specifically comprising the following sub-steps:
(4.1) The oil height soft measurement model yields the prediction residuals of the initial training set, where a residual is the difference between the predicted and the actual oil height; the residual vector can be expressed as:
Kernel density estimation (KDE) is performed on the residual vector, and the lower quantile at a confidence level of 0.95 is taken as the initial threshold $\mathrm{Thre}_0$.
(4.2) The oil height soft measurement model yields the residual of each query sample and the corresponding test-set residuals, where the test-set residuals are expressed as:
(4.3) Update the threshold for each query sample one by one using the following update strategy:
If the residual of query sample k is greater than the threshold $\mathrm{Thre}_{k-1}$, the threshold is not updated, $\mathrm{Thre}_k = \mathrm{Thre}_{k-1}$, and an alarm is output; $\mathrm{Thre}_k$ denotes the threshold corresponding to the k-th sample and $\mathrm{Thre}_0$ is the initial threshold.
If the residual of query sample k is less than or equal to the threshold $\mathrm{Thre}_{k-1}$, the threshold is updated:
2. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein step (1) is specifically:
Oil height H, temperature T, and reading time t are acquired with the liquid level meter system; the gun-lifting time $t_t$, the gun-hanging time $t_g$, and the sales volume of each refueling, i.e. the change ΔV in the oil quantity in the tank, are acquired with the station transaction information system.
Using the gun-lifting time $t_t$ and gun-hanging time $t_g$ from the station transaction information system, matched against the sampling times of the liquid level meter system, refueling data, i.e. data between the gun-lifting and gun-hanging times, are removed from the level meter data; data segments of oil-unloading operations, i.e. segments in which the oil height rises markedly, are removed; and outliers (abrupt changes in the data) are removed. What remains after the unloading and refueling segments are removed is multiple segments of oil-quantity-conserving data, i.e. operating data in a steady state.
The historical operating state data of the comprehensive energy supply station are divided by oil quantity into several steady states $S_1, S_2, \dots, S_l$. Each steady state contains the operating state data of one or more moments, and the oil quantity differs from one steady state to the next. The first steady state $S_1$ serves as the reference starting point, with its oil quantity set to an initial value of zero; the oil quantity of the second steady state $S_2$, and by analogy that of the k-th steady state $S_k$, is then derived from it, where $m_k$ denotes the number of operating state data items contained in steady state $S_k$, with $k = 1, 2, \dots, l$. A feature state matrix is reconstructed according to the steady state number S and the time order; it comprises the variables steady state number S, oil height H, temperature T, and oil quantity V in the steady state.
3. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 2, wherein the removal of outliers is specifically:
(a) Partition the historical operating state data into $N_{group}$ groups.
(b) For the data in each group obtained in step (a), perform KNN Euclidean distance ranking:
The vector of Euclidean distances between the j-th sample in a group and all other samples in that group can be expressed as:
where $j = 1, 2, \dots, N_{batch}$, the subscript $-j$ denotes all samples in the group except j, and $D_j$ has dimension $1 \times (N_{batch} - 1)$.
(c) Extract noise with the 3σ criterion: compute the standard deviation and mean of the set of distances from $D_j$:
4. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein in step (2), selecting from the historical operating state data, for each query sample, $N_{train}$ similar samples close in time and oil quantity is specifically:
Similar samples whose oil quantity is close to that of the query sample are obtained by Euclidean distance:
$$d_{j,q} = |v_j - v_q|$$
where $v_j$ is the oil quantity of the j-th historical sample and $v_q$ is the oil quantity of the query sample; $N_v$ samples are selected according to a threshold as similar samples whose oil quantity is close to that of the query sample.
The $N_t$ consecutive samples preceding the query sample are selected as similar samples close in time.
$$N_{train} = N_v + N_t$$
5. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein in step (3), fusing the $N_{test}$ predicted oil heights by exponentially weighted averaging as post-processing to give the final predicted oil height of each query sample is specifically:
6. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein in step (4), forming a new residual vector from the test-set residuals of the query sample and $E_{train}$ and re-estimating the probability density of the residual distribution is specifically:
The residuals are approximated as normally distributed, and the mean and variance of the distribution are calculated recursively:
where $\mu_k$ and $\delta_k$ are respectively the mean and variance of the re-estimated probability density of the k-th query sample residual; $\mu_0$ and $\delta_0$ are respectively the mean and variance of the probability density of all sample residuals obtained from the training set; and $\mu_{test,k}$ and $\delta_{test,k}$ are respectively the mean and variance of the probability density of the residuals obtained from the test set of the k-th query sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111044449.4A CN113849479A (en) | 2021-09-07 | 2021-09-07 | Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113849479A true CN113849479A (en) | 2021-12-28 |
Family
ID=78973319
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111044449.4A Pending CN113849479A (en) | 2021-09-07 | 2021-09-07 | Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113849479A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114459691A (en) * | 2022-01-05 | 2022-05-10 | 东北石油大学 | Method and system for evaluating leakage risk in carbon dioxide geological storage body |
CN114459691B (en) * | 2022-01-05 | 2024-03-15 | 东北石油大学 | Leakage risk evaluation method and system in carbon dioxide geological storage body |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||