CN113849479A - Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold - Google Patents

Publication number
CN113849479A
CN113849479A CN202111044449.4A CN202111044449A CN113849479A CN 113849479 A CN113849479 A CN 113849479A CN 202111044449 A CN202111044449 A CN 202111044449A CN 113849479 A CN113849479 A CN 113849479A
Authority
CN
China
Prior art keywords
oil
sample
data
query sample
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111044449.4A
Other languages
Chinese (zh)
Inventor
赵春晖
王应龙
常树超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202111044449.4A
Publication of CN113849479A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M3/00 Investigating fluid-tightness of structures
    • G01M3/02 Investigating fluid-tightness of structures by using fluid or vacuum
    • G01M3/26 Investigating fluid-tightness of structures by using fluid or vacuum by measuring rate of loss or gain of fluid, e.g. by pressure-responsive devices, by flow detectors
    • G01M3/32 Investigating fluid-tightness of structures by using fluid or vacuum by measuring rate of loss or gain of fluid, e.g. by pressure-responsive devices, by flow detectors for containers, e.g. radiators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Examining Or Testing Airtightness (AREA)

Abstract

The invention discloses an oil tank leakage detection method for comprehensive energy supply stations based on instant learning and a self-adaptive threshold. The invention adopts a non-parametric instant learning modeling approach: when a leakage judgment is required on an online real-time data stream, a local data set that best matches the mode of the online data is retrieved from a historical database and used to build a leakage detection model based on oil height soft measurement, and the detection threshold is adaptively updated according to the detection result so as to keep adapting to small fluctuations in the data characteristics. After the method has run for a period of time, these steps are repeated to update the model so that it adapts to changes in data characteristics caused by large changes in factors such as oil quantity and temperature. When the local model is built, the multiple predictions for each query sample are fused in post-processing by exponentially weighted averaging, which effectively improves the oil height prediction performance. Finally, experiments on real oil tank data show that the method effectively improves the accuracy, applicability and timeliness of leakage detection.

Description

Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold
Technical Field
The invention discloses an oil tank leakage detection method for comprehensive energy supply stations based on instant learning and a self-adaptive threshold. The invention belongs to the field of industrial system fault detection and relates in particular to leakage fault detection for the oil storage tank system of a comprehensive energy supply station.
Background
In complex industrial processes, the safe operation of production equipment and systems bears on production efficiency, product quality, process safety, and environmental and personal safety. As modern industry develops, the operating conditions of industrial equipment grow more complex and the influencing factors more numerous, and the demands on equipment safety and reliability have drawn wide attention to fault detection technology. Existing fault diagnosis methods for industrial equipment fall into three main classes: mechanism-model-based methods, knowledge-based methods and data-driven methods. Mechanism-model-based methods need an accurate mathematical model, but in real industrial processes the many complex sources of uncertainty make such a model very difficult to build. Knowledge-based methods, such as neural networks and expert systems, do not require accurate mathematical modeling, but in practice they often suffer from difficult parameter learning and poor adaptability. Data-driven methods transform process operation data from the measurement space to a feature space and then analyze it, so no complex mathematical model needs to be built.
With the widespread use of sensors, intelligent instruments, distributed control systems and computer storage technology on industrial platforms, industrial systems have accumulated massive amounts of data. Valuable information is generally hard to extract from a small amount of data, whereas massive data hold great value that can be extracted through mining and analysis. Data-driven methods are therefore one of the active directions of current research. Much prior work has addressed data-based fault detection and fault diagnosis, and multivariate statistical analysis methods such as Principal Component Analysis (PCA), Partial Least Squares (PLS) and Linear Discriminant Analysis (LDA) have been widely used in data-based process monitoring.
Oil tank leakage detection at comprehensive energy supply stations is one application of industrial equipment fault detection. The method most commonly used at such stations today is to detect directly with a sensor whether liquid is present at the bottom of the oil tank and to raise a leakage alarm if so. However, the liquid may simply be moisture introduced during oil unloading and fueling, which causes false alarms and raises maintenance costs. Some methods predict the oil height from stored historical data through a complex mechanism formula, but the coefficients and the large number of parameters in such a formula are often hard to determine because of environmental changes, equipment degradation and similar effects. Data-driven oil height soft measurement early-warning methods have also been proposed, but they use global modeling, and a global model often has a structure that is hard to determine, parameters that are hard to optimize, difficulty in being updated, and reduced accuracy caused by data nonlinearity. Compared with global modeling and traditional local modeling, the basic modeling framework of just-in-time learning retrieves from the accumulated historical data the similar samples that best match the mode of the current query sample and uses them for local modeling, which gives better modeling accuracy. In addition, most early-warning thresholds in current use are fixed thresholds set from a global model; they do not account for changes in online working conditions or the operating environment, so the probability of false alarms and missed alarms is high.
Disclosure of Invention
The invention aims to provide an oil tank leakage detection method for comprehensive energy supply stations based on instant learning and a self-adaptive threshold, addressing the high complexity of global modeling, the detection delay caused by fixed thresholds, and the high missed-detection rate of existing oil tank leakage detection methods based on oil height soft measurement. With the instant learning modeling method, the online real-time data stream is matched with the historical samples of most similar mode, and a fault early-warning model based on oil height soft measurement is built, giving real-time early warning on the online data. The invention also incorporates a self-adaptive threshold into the instant learning model, so that it dynamically adapts to small fluctuations in the data. After a period of operation, the similar samples and the local model are updated to adapt to changes in data characteristics caused by large changes in oil quantity, temperature and the like. The method simplifies the model, improves prediction accuracy and effectively reduces the false alarms caused by a fixed threshold. Overall, the applicability, accuracy and timeliness of fault detection are improved.
The purpose of the invention is realized by the following technical scheme:
A comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold, the method comprising the following steps:
(1) Current operating-state data of the comprehensive energy supply station are collected in real time, and historical operating-state data of the station's oil tank are collected and stored. The collected historical operating-state data are operating data in a normal, leak-free steady state and comprise multiple groups of oil height H, temperature T and reading time t from the liquid level meter system and the in-tank oil quantity V from the station transaction information system. The collected current operating-state data comprise the oil height, the temperature and the in-tank oil quantity at the current moment.
(2) A query sample set is formed from $N_q$ consecutive real-time operating-state samples. For each query sample, $N_{train}$ similar samples that are close in time and in oil quantity are selected from the historical operating-state data to form a training set, from which an instant learning soft measurement model for oil height prediction is built. The oil height soft measurement model is expressed as:
$H_q = f(T_q, H_j, T_j, V_q - V_j)$
where the subscript q denotes the index of the query sample, j denotes the index of the corresponding similar historical sample used for prediction, $H_q$, $T_q$, $V_q$ are the oil height, temperature and oil quantity of the query sample, and $H_j$, $T_j$, $V_j$ are the oil height, temperature and oil quantity of the similar sample.
(3) For each query sample, $N_{test}$ consecutive samples separated from the query sample by a period of time are selected from the historical operating-state data as test samples. Each test sample is input into the oil height soft measurement model together with the corresponding query sample, giving $N_{test}$ predicted oil heights $\hat{H}_q^i$ for each query sample, where the superscript i denotes the index of the test sample corresponding to the query sample. The $N_{test}$ predicted oil heights are fused in post-processing by exponentially weighted averaging to give the final predicted oil height $\hat{H}_q$ of each query sample.
(4) The adaptive threshold updating is carried out according to each query sample, and the method specifically comprises the following substeps:
(4.1) The oil height soft measurement model yields the initial prediction residuals on the training set. A residual is the difference between the predicted oil height and the actual oil height, and the residual vector can be expressed as:
$E_{train} = [e_1, e_2, \ldots, e_{N_{train}}]$
$e_j = \hat{H}_j - H_j$
$N_{train} = N_v + N_t$
Kernel Density Estimation (KDE) is performed on the residual vector, and the lower quantile at a confidence level of 0.95 is taken as the initial threshold $Thre_0$.
(4.2) The oil height soft measurement model yields the residual of each query sample and the corresponding test-set residuals, where the test-set residuals are expressed as:
Figure BDA0003250705870000035
where
Figure BDA0003250705870000036
the residual vector of the query sample can be expressed as:
Figure BDA0003250705870000037
(4.3) updating the threshold for each query sample using the following update strategy:
If the residual of query sample k is greater than the threshold $Thre_{k-1}$, the threshold is not updated, i.e. $Thre_k = Thre_{k-1}$, and an alarm is output. Here $Thre_k$ denotes the threshold corresponding to the k-th query sample and $Thre_0$ is the initial threshold.
If the residual of query sample k is less than or equal to the threshold $Thre_{k-1}$, the threshold is updated: the test-set residuals of query sample k are combined with $E_{train}$ to form a new residual vector, the probability density of the residual distribution is re-estimated, and the lower quantile at a confidence level of 0.95 is taken as the updated threshold $Thre_k$.
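The threshold logic of sub-steps (4.1) and (4.3) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the Gaussian kernel, the Silverman-style bandwidth, the reading of the "lower quantile at confidence level 0.95" as the value below which 95% of the estimated residual mass lies, and the function names `kde_quantile` and `update_threshold` are all choices made for this sketch and are not taken from the patent text.

```python
import math


def kde_quantile(residuals, q=0.95, bandwidth=None):
    """Quantile of a Gaussian-kernel density estimate, found by bisection on its CDF."""
    n = len(residuals)
    mean = sum(residuals) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in residuals) / max(n - 1, 1))
    # Assumption: Silverman-style rule of thumb for the bandwidth.
    h = bandwidth or max(1.06 * std * n ** (-0.2), 1e-9)

    def cdf(x):
        # CDF of the KDE is the average of the Gaussian kernel CDFs.
        return sum(0.5 * (1.0 + math.erf((x - r) / (h * math.sqrt(2.0))))
                   for r in residuals) / n

    lo, hi = min(residuals) - 10.0 * h, max(residuals) + 10.0 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def update_threshold(prev_threshold, query_residual, train_residuals,
                     test_residuals, q=0.95):
    """Return (threshold, alarm) for one query sample, following sub-step (4.3)."""
    if query_residual > prev_threshold:
        # Residual exceeds the threshold: keep the old threshold and alarm.
        return prev_threshold, True
    # Otherwise merge the residuals and re-estimate the threshold.
    merged = list(train_residuals) + list(test_residuals)
    return kde_quantile(merged, q), False
```

In use, `kde_quantile` would be called once on the training residuals to obtain the initial threshold, and `update_threshold` once per query sample thereafter.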
Further, step (1) specifically comprises:
The oil height H, temperature T and reading time t are acquired from the liquid level meter system; the gun-lifting time $t_t$, the gun-hanging time $t_g$ and the sales volume of each fueling, i.e. the change ΔV in the in-tank oil quantity, are acquired from the station transaction information system.
Using the gun-lifting time $t_t$ and gun-hanging time $t_g$ from the station transaction information system, matched against the sampling times of the liquid level meter system, fueling data, i.e. data between the gun-lifting time and the gun-hanging time, are removed from the liquid level meter data; data segments of oil unloading operations, i.e. segments in which the oil height rises markedly, are removed; and outliers (abrupt changes in the data) are removed. The result is multiple segments of data in which the oil quantity is conserved, with the unloading and fueling segments removed, i.e. operating data in steady state.
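As a minimal sketch of the fueling-data removal described above, the following Python drops every level-meter reading whose time falls between a gun-lifting time and the matching gun-hanging time. The tuple layouts, the function name `remove_fueling_readings` and the sample numbers are hypothetical and chosen only for illustration; the removal of unloading segments and outliers is not covered here.

```python
def remove_fueling_readings(readings, transactions):
    """Drop level-meter readings taken while a fueling gun was lifted.

    readings:     list of (t, H, T) tuples from the liquid level meter
    transactions: list of (t_lift, t_hang) tuples from the transaction system
    """
    def during_fueling(t):
        return any(t_lift <= t <= t_hang for t_lift, t_hang in transactions)

    return [r for r in readings if not during_fueling(r[0])]


# Hypothetical data: three readings, one fueling between t = 4 and t = 6.
readings = [(0, 1.500, 20.1), (5, 1.498, 20.1), (10, 1.496, 20.2)]
transactions = [(4, 6)]
steady = remove_fueling_readings(readings, transactions)
```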
The historical operating-state data of the comprehensive energy supply station are divided by oil quantity into several steady states $S_1, S_2, \ldots, S_l$. Each steady state contains operating-state data for one or more moments, and the oil quantity differs between steady states. Taking the first steady state $S_1$ as the reference starting point with an initial oil quantity of zero, the oil quantity of the second steady state $S_2$ is
Figure BDA0003250705870000041
By analogy, the oil quantity of the k-th steady state $S_k$ is
Figure BDA0003250705870000042
where $m_k$ denotes the number of operating-state data items contained in steady state $S_k$, and k = 1, 2, …, l. A feature state matrix is then reconstructed according to the steady-state number S and the time order; its variables are the steady-state number S, the oil height H, the temperature T and the oil quantity V in the steady state.
Further, the step of removing the outlier specifically comprises:
(a) The historical operating-state data are divided into $N_{group}$ groups.
(b) Performing KNN Euclidean distance sorting on the data of each subgroup obtained in the step (a):
Figure BDA0003250705870000043
the euclidean distance vectors of the jth sample in the set with the remaining samples except for itself can be expressed as:
Figure BDA0003250705870000044
where j = 1, 2, …, $N_{batch}$; $(N_{batch}) - j$ denotes all samples within the group except j; and $D_j$ has dimension $1 \times (N_{batch} - 1)$.
(c) Noise extraction by the 3σ criterion: the mean and standard deviation of the set of distances are calculated from the elements of $D_j$:
average value:
Figure BDA0003250705870000045
standard deviation:
Figure BDA0003250705870000046
According to the 3σ criterion, if the set of distance data contains a value satisfying
Figure BDA0003250705870000047
Then the jth sample is an outlier and should be discarded.
Further, in step (2), selecting $N_{train}$ similar samples that are close in time and in oil quantity from the historical operating-state data for each query sample is specifically:
Similar samples whose oil quantity is close to that of the query sample are obtained from the Euclidean distance:
$d_{j,q} = |v_j - v_q|$
where $v_j$ denotes the oil quantity of the j-th historical sample and $v_q$ denotes the oil quantity of the query sample; $N_v$ samples are selected according to a threshold as the similar samples close in oil quantity to the query sample.
The $N_t$ consecutive samples preceding the query sample are selected as the similar samples close in time.
Ntrain=Nv+Nt
Further, in step (3), the post-processing fusion of the $N_{test}$ predicted oil heights by exponentially weighted averaging, which gives the final predicted oil height $\hat{H}_q$ of each query sample, is specifically:
Figure BDA0003250705870000052
where $\hat{H}_q$ denotes the final output for the query sample, β denotes the weighting coefficient of $\hat{H}_q^i$ and is an adjustable hyper-parameter, and the superscript $(N_{test} - i)$ denotes the exponent of the weight.
Further, in step (4), combining the test-set residuals of the query sample with $E_{train}$ to form a new residual vector and re-estimating the probability density of the residual distribution is specifically:
The residuals are approximated as normally distributed, and the mean and variance of the distribution are calculated recursively:
Figure BDA0003250705870000057
Figure BDA0003250705870000058
where $\mu_k$ and $\delta_k$ are the mean and variance of the re-estimated probability density of the residuals for the k-th query sample; $\mu_0$ and $\delta_0$ are the mean and variance of the probability density of the residuals obtained from the training set; and $\mu_{test,k}$ and $\delta_{test,k}$ are the mean and variance of the probability density of the test-set residuals obtained for the k-th query sample.
The beneficial effects of the invention are as follows. Existing oil tank leakage detection methods based on mechanism models, global models with fixed thresholds and the like suffer from complex parameters, weak generalization, global nonlinearity and a high false alarm rate. To address these problems, the invention provides an oil tank leakage detection method for comprehensive energy supply stations based on instant learning and a self-adaptive threshold. The invention adopts an instant learning modeling method: when a leakage judgment is required on an online real-time data stream, a local data set that best matches the mode of the online data is first retrieved from a historical database and used to build a leakage detection model based on oil height soft measurement, and the detection threshold is adaptively updated according to the detection result so as to keep adapting to small fluctuations in the data characteristics. In addition, when the local model is built, the multiple predictions for each query sample are fused in post-processing by exponentially weighted averaging, which effectively improves the oil height prediction performance. Overall, the invention effectively reduces the false alarm rate of oil tank leakage detection at comprehensive energy supply stations and improves the applicability, accuracy and timeliness of the detection method.
Drawings
FIG. 1 is an overall framework for tank leak detection based on instant learning and adaptive thresholds;
FIG. 2 is a diagram of the division of inspection data based on tank data from a certain integrated energy supply station;
FIG. 3 is a graph of leak detection results for a global model using fixed thresholds and adaptive thresholds;
FIG. 4 is a graph of leak detection results for the instant learning model using fixed and adaptive thresholds.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings and a specific example based on actual oil storage tank data from a comprehensive energy supply station.
The safety of the oil storage tank bears on the overall safety of the comprehensive energy supply station. Leakage is the most likely fault of an oil storage tank and is also the trigger for subsequent accidents such as fire and explosion, so detecting oil tank leakage is of great importance. The leakage detection strategy must also be real-time and accurate. The invention takes as an example the liquid level meter and station transaction system data from the distributed system of a comprehensive energy supply station in Zhejiang Province, and extracts 7 key variables such as oil height, temperature, gun-lifting time, gun-hanging time and fueling quantity.
As shown in fig. 1, the invention is a method for detecting tank leakage of an integrated energy supply station based on instant learning and adaptive threshold, comprising the following steps:
(1) Current operating-state data of the comprehensive energy supply station are collected in real time, and historical operating-state data of the station are collected and stored. The collected historical operating-state data comprise multiple groups of oil height H, temperature T and reading time t from the liquid level meter system and the in-tank oil quantity V from the station transaction information system; the collected current operating-state data comprise the oil height (measured by a sensor), the temperature and the in-tank oil quantity at the current moment.
(2) The collected historical data of the comprehensive energy supply station are processed: the oil height H, temperature T and reading time t are acquired from the liquid level meter system; the gun-lifting time $t_t$, the gun-hanging time $t_g$ and the sales volume of each fueling, i.e. the change ΔV in the in-tank oil quantity, are acquired from the station transaction information system.
Using the gun-lifting time $t_t$ and gun-hanging time $t_g$ from the station transaction information system, matched against the sampling times of the liquid level meter system, fueling data, i.e. data between the gun-lifting time and the gun-hanging time, are removed from the liquid level meter data; data segments of oil unloading operations, i.e. segments in which the oil height rises markedly, are removed.
Cleaning to remove abnormal points: because the raw data collected from the distributed operating system of the comprehensive energy supply station contain obvious abnormal values and outliers, abnormal samples are removed with a data noise-reduction algorithm based on the KNN-3σ criterion, in preparation for subsequent feature engineering and to improve model accuracy. This step comprises the following sub-steps:
(2.1) Grouping: station-level historical data are large in volume and clearly periodic, so the data are grouped in sampling order and the noise-reduction algorithm is applied to each group separately, which reduces the computation of the KNN algorithm. The grouping principle is to reduce the amount of computation and to reject abnormal data as early as possible. For example, if each group contains $N_{batch} = 200$ samples and the total number of samples is N = 10000, the number of groups $N_{group}$ is 50.
(2.2) KNN Euclidean distance ranking: the data for each subgroup from the first step were:
Figure BDA0003250705870000071
the euclidean distance vector of the jth sample from the rest of the samples except itself can be expressed as:
Figure BDA0003250705870000072
where j = 1, 2, …, $N_{batch}$, and $(N_{batch}) - j$ denotes all samples within this group except j.
(2.3) Noise extraction by the 3σ criterion: the $D_j$ obtained in sub-step (2.2) is a $1 \times (N_{batch} - 1)$ vector; the mean and standard deviation of the set of distances are calculated from the elements of $D_j$:
average value:
Figure BDA0003250705870000073
standard deviation:
Figure BDA0003250705870000074
According to the 3σ criterion, the values of the distance data are almost entirely concentrated in the interval (μ − 3σ, μ + 3σ); the probability of falling outside this range is only about 0.27%. If the set of distance data contains a value satisfying
Figure BDA0003250705870000075
The measurement is an outlier and should be rejected.
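The formula images for this criterion are not legible in this text, so the Python below is a sketch of one plausible variant of KNN plus 3σ screening rather than the exact rule of the method: each sample in a group is scored by its mean Euclidean distance to its k nearest neighbours, and a sample is flagged when its score lies more than 3σ above the group's mean score. The parameter `k`, the scoring rule and the names `batches` and `knn_3sigma_outliers` are assumptions introduced for illustration.

```python
import math


def batches(samples, n_batch):
    """Split samples into consecutive groups of n_batch, in sampling order."""
    return [samples[i:i + n_batch] for i in range(0, len(samples), n_batch)]


def knn_3sigma_outliers(group, k=5):
    """Return indices in `group` flagged as outliers.

    Assumption: a sample's score is its mean distance to its k nearest
    neighbours; it is flagged if the score exceeds the group mean by 3 sigma.
    """
    n = len(group)
    if n < 2:
        return []
    scores = []
    for j in range(n):
        dists = sorted(math.dist(group[j], group[i]) for i in range(n) if i != j)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    mu = sum(scores) / n
    sigma = math.sqrt(sum((s - mu) ** 2 for s in scores) / n)
    return [j for j, s in enumerate(scores) if s - mu > 3.0 * sigma]


# Hypothetical (oil height, temperature) samples: 30 regular points, 1 spike.
group = [(1.0 + 0.001 * i, 20.0) for i in range(30)] + [(5.0, 20.0)]
flagged = knn_3sigma_outliers(group)
```

With the grouping figures quoted above, `batches` applied to 10000 samples with `n_batch=200` gives 50 groups, each screened independently.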
(3) Feature matrix: the historical operating-state data of the comprehensive energy supply station are divided by oil quantity into several steady states $S_1, S_2, \ldots, S_l$. The basis for the division is that, according to the gun-lifting and gun-hanging times in the system, any state without oil unloading or fueling operations is regarded as a steady state. Each steady state contains operating-state data for one or more moments, and the oil quantity differs between steady states. Taking the first steady state $S_1$ after an oil unloading operation as the reference starting point with an initial oil quantity of zero, the oil quantity of the second steady state $S_2$ is
Figure BDA0003250705870000076
By analogy, the oil quantity of the k-th steady state $S_k$ is
Figure BDA0003250705870000081
When an oil unloading operation is encountered, the oil quantity before unloading plus the unloaded quantity is taken as the current oil quantity. Here $m_k$ denotes the number of operating-state data items contained in steady state $S_k$, and k = 1, 2, …, l. A feature state matrix is then reconstructed according to the steady-state number S and the time order; its variables are the steady-state number S, the oil height H, the temperature T and the oil quantity V in the steady state. In actual operation of the comprehensive energy supply station,
Figure BDA0003250705870000082
The oil quantity is a positive value.
The run matrix at this time can be expressed as:
X={x1,x2,…,xN}
where each sample comprises the oil height, temperature and in-tank oil quantity, and N denotes the number of historical data samples obtained. Any two sample points form a prediction sample pair; for example, samples $x_1$ and $x_2$ form the feature matrix $\{H_2, T_2, H_1, T_1, V_2 - V_1\}$. In actual online prediction the time order of the samples must be taken into account.
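A prediction sample pair of this kind can be built as in the short Python sketch below, which follows the model form $H_q = f(T_q, H_j, T_j, V_q - V_j)$ used in the method. The dictionary layout, the function name `make_pair` and the numbers are hypothetical and only illustrate how the inputs and the target are arranged.

```python
def make_pair(query, ref):
    """Return (features, target) for H_q = f(T_q, H_j, T_j, V_q - V_j)."""
    features = (query["T"], ref["H"], ref["T"], query["V"] - ref["V"])
    return features, query["H"]


# Hypothetical samples: x1 is the earlier reference, x2 the later query.
x1 = {"H": 1.50, "T": 20.0, "V": 0.0}
x2 = {"H": 1.48, "T": 20.5, "V": -35.0}
features, target = make_pair(x2, x1)
```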
(4) The oil height prediction soft measurement model based on the instant learning is realized by the following substeps:
(4.1) selection based on similar samples with similar oil volumes and times.
First, similar samples with similar oil quantities are obtained from the Euclidean distance, which for a single variable is:
$d_{j,q} = |v_j - v_q|$
where $v_j$ denotes the oil quantity of the j-th historical sample and $v_q$ denotes the oil quantity of the query sample to be predicted. The smaller the distance $d_{j,q}$ between the oil quantities, the more similar the samples are in oil quantity, and the $N_v$ closest samples are taken as the similar samples with similar oil quantities.
Second, similar samples close in time are selected: since the samples are arranged in time order, it suffices to take the $N_t$ consecutive samples preceding the query sample $x_q$.
Similar samples can now be expressed as:
Figure BDA0003250705870000083
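The two selection rules can be sketched in Python as follows. This is an illustrative sketch: it assumes the historical oil quantities are held in a time-ordered list that ends just before the query sample, picks the $N_v$ nearest samples by a simple sort rather than by a distance threshold, and uses the hypothetical name `select_similar_samples`.

```python
def select_similar_samples(history_volumes, v_q, n_v, n_t):
    """Return (volume_neighbours, time_neighbours) as index lists.

    history_volumes: oil quantities of historical samples, in time order,
                     ending just before the query sample (assumption)
    v_q:             oil quantity of the query sample
    """
    n = len(history_volumes)
    # N_v samples with the smallest d_{j,q} = |v_j - v_q|.
    by_volume = sorted(range(n), key=lambda j: abs(history_volumes[j] - v_q))[:n_v]
    # N_t consecutive samples immediately preceding the query sample.
    by_time = list(range(max(0, n - n_t), n))
    return by_volume, by_time


by_volume, by_time = select_similar_samples([100, 90, 80, 70, 60], 79, 2, 2)
```

The training set then has $N_{train} = N_v + N_t$ entries, formed by concatenating the two index lists.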
(4.2) local online modeling:
The similar samples obtained in sub-step (4.1) form the training set, and a data-driven soft measurement model is built. The oil height soft measurement model is expressed as:
$H_q = f(T_q, H_j, T_j, V_q - V_j)$
where the subscript q denotes the index of the query sample, j denotes the index of the corresponding similar historical sample used for prediction, $H_q$, $T_q$, $V_q$ are the oil height, temperature and oil quantity of the query sample, and $H_j$, $T_j$, $V_j$ are the oil height, temperature and oil quantity of the similar sample. f(·) is an oil height soft measurement model based on Partial Least Squares (PLS) regression.
(4.3) Oil height prediction: after the local model is built, the output of the query sample is predicted. Two points should be noted when making predictions:
a. Because the tank oil level changes slowly over a short time when there is no discharge operation, several consecutive query samples can share the same set of similar samples; that is, $N_g$ consecutive query samples are input to the same oil height prediction model to obtain their predicted oil heights.
b. The influence of a historical sample on the query sample decreases as the sample lies further back in time. However, if the input samples are too close to the query sample, the output simply follows the trend of the input data, so that if the input is fault data, the fault cannot be detected in time. To balance these two effects, the input samples are taken at some distance in time from the query sample, as several consecutive samples; for example, starting 100 samples before the query time, $N_{test} = 50$ consecutive samples are used to construct the input variables. The predicted values obtained for each query sample are then fused by an exponentially weighted average as post-processing to give the final prediction output, with the weighting formula:
Figure BDA0003250705870000091
wherein
Figure BDA0003250705870000092
denotes the final output of the query sample, $\beta$ denotes the weighting coefficient of
Figure BDA0003250705870000093
and is an adjustable hyperparameter,
Figure BDA0003250705870000094
denotes the predicted value obtained with the ith test sample as input, $N_{test}$ denotes the number of input samples, and the superscript $(N_{test} - i)$ is the exponent of the weight. An exponentially weighted average is in essence a moving average whose weights decay exponentially: the weight of each value decreases exponentially with its age, so more recent data are weighted more heavily, while older data still receive some weight. This matches the prediction relationship in an actual oil tank: points closer in time to the query sample are considered more reliable, but the effect of samples further back in time cannot be ignored.
Once the output prediction is complete, the model is discarded; when the next set of query samples arrives, similar sample selection and local modeling are carried out again.
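By way of non-limiting illustration, the exponentially weighted fusion of sub-step (4.3) can be sketched as follows. The weight $\beta^{(N_{test}-i)}$ follows the text; dividing by the sum of the weights, the oldest-first ordering of the predictions, and the name `fuse_predictions` are assumptions, since the weighting formula itself is only available as an image.

```python
from typing import Sequence


def fuse_predictions(preds: Sequence[float], beta: float = 0.9) -> float:
    """Exponentially weighted average of N_test predictions.

    `preds` is assumed to be ordered from oldest (i = 1) to most
    recent (i = N_test), so the most recent prediction gets weight 1.
    """
    n = len(preds)
    if n == 0:
        raise ValueError("at least one prediction is required")
    # Weight of the i-th prediction is beta ** (N_test - i).
    weights = [beta ** (n - i) for i in range(1, n + 1)]
    # Normalising by the weight sum is an assumption of this sketch.
    return sum(w * p for w, p in zip(weights, preds)) / sum(weights)
```

With $\beta$ close to 1 the fusion approaches a plain average; smaller values concentrate the weight on the predictions nearest in time to the query sample.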
(5) The tank leakage detection method based on the adaptive threshold value updating is realized by the following sub-steps:
(5.1) The initial training-set prediction residuals are obtained from the oil height soft measurement model; each residual is the difference between the predicted oil height and the actual oil height, and the residual vector can be expressed as:
Figure BDA0003250705870000095
Figure BDA0003250705870000096
Kernel density estimation (KDE) is performed on the residual vector, and the lower quantile at a confidence level of 0.95 is taken as the initial threshold $Thre_0$. That is, 95% of the residual distribution lies within the confidence interval, and the 5% of samples beyond it are regarded as outliers.
To improve computational efficiency, the probability density can instead be approximated by a normal distribution, which gives the initial mean and variance $\theta_0 = \{\mu_0, \delta_0\}$.
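By way of non-limiting illustration, the normal-approximation variant of sub-step (5.1) can be sketched with the Python standard library. The name `initial_threshold` is hypothetical; the sketch returns the standard deviation rather than the variance, and it requires residuals that are not all identical.

```python
from statistics import NormalDist, fmean, pstdev
from typing import Sequence, Tuple


def initial_threshold(residuals: Sequence[float],
                      confidence: float = 0.95) -> Tuple[float, float, float]:
    """Return (mean, standard deviation, Thre_0) of training residuals.

    Uses the normal approximation in place of kernel density estimation.
    """
    mu = fmean(residuals)
    sigma = pstdev(residuals)  # population standard deviation
    # Quantile of the fitted normal distribution at the confidence level.
    threshold = NormalDist(mu, sigma).inv_cdf(confidence)
    return mu, sigma, threshold
```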
(5.2) For each query sample, a set of test-set residuals is obtained before the exponentially weighted fusion is performed, which can be expressed as:
Figure BDA0003250705870000101
wherein
Figure BDA0003250705870000102
Similarly, the residual of the query sample is obtained by exponentially weighted average fusion:
Figure BDA0003250705870000103
The residual vector of the query sample can be expressed as:
Figure BDA0003250705870000104
Figure BDA0003250705870000105
From
Figure BDA0003250705870000106
$\theta_k = \{\mu_{test,k}, \delta_{test,k}\}$ can be calculated, where $\mu_{test,k}$ and $\delta_{test,k}$ are respectively the mean and variance of
Figure BDA0003250705870000107
(5.3) The threshold is updated adaptively for each online data point (query sample). To prevent the threshold from adapting to fault data, the following update strategy is adopted:
If the residual of query sample k
Figure BDA0003250705870000108
is greater than the threshold $Thre_{k-1}$, the threshold is not updated: let $Thre_k = Thre_{k-1}$ and output an alarm. Here $Thre_k$ denotes the threshold corresponding to the kth sample, and $Thre_0$ is the initial threshold.
If the residual of query sample k is less than or equal to the threshold $Thre_{k-1}$, the threshold is updated.
Taking the update of the first threshold as an example: if the weighted residual of the first query sample is greater than the initial threshold, that is,
Figure BDA0003250705870000109
then the threshold is not updated: let $Thre_1 = Thre_0$ and output an alarm.
If
Figure BDA00032507058700001010
the threshold is updated as follows:
Figure BDA00032507058700001011
and $E_{train}$ are combined to form a new residual vector
Figure BDA00032507058700001012
The probability density of the residual distribution is re-estimated, and the lower quantile at a confidence level of 0.95 is taken as the updated threshold $Thre_1$. For the kth sample,
Figure BDA00032507058700001013
and $E_{train}$ are combined to form a new residual vector
Figure BDA00032507058700001014
The probability density of the residual distribution is re-estimated, and the lower quantile at a confidence level of 0.95 is taken as the updated threshold $Thre_k$.
To reduce storage space and computation, the residuals may be approximated by a normal distribution whose mean and variance are calculated recursively:
Figure BDA00032507058700001015
Figure BDA00032507058700001016
The same update strategy is applied to each subsequent online sample, which realizes real-time adaptive threshold updating.
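By way of non-limiting illustration, the update strategy of sub-step (5.3) can be sketched as follows. The recursive formulas of the text are only available as images, so the sketch substitutes the standard Welford recursion for the running mean and variance and updates with one fused residual per query sample; this is a simplification, not the recursion of the original. The class name and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


class AdaptiveThreshold:
    """Normal-approximation threshold, frozen whenever an alarm is raised."""

    def __init__(self, mean: float, variance: float, count: int,
                 confidence: float = 0.95) -> None:
        self.mean = mean
        self.count = count
        self.m2 = variance * count  # running sum of squared deviations
        self.z = NormalDist().inv_cdf(confidence)
        self.threshold = self._quantile()

    def _quantile(self) -> float:
        return self.mean + self.z * math.sqrt(self.m2 / self.count)

    def step(self, residual: float) -> bool:
        """Process one query residual; return True if an alarm is raised."""
        if residual > self.threshold:
            # Alarm: keep Thre_k = Thre_{k-1} so fault data cannot
            # drag the threshold upwards.
            return True
        # Normal sample: Welford update of mean and variance,
        # then recompute the quantile threshold.
        self.count += 1
        delta = residual - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (residual - self.mean)
        self.threshold = self._quantile()
        return False
```

The detector would be initialized with the mean, variance, and size of the training-set residuals and then called once per query sample.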
(6) Evaluating the performance of the detection model: fault data are scarce when a comprehensive energy supply station is in actual operation, so the classes are clearly imbalanced and the quality of the model cannot be judged from the false alarm rate and missed detection rate alone. The Matthews correlation coefficient (MCC) is therefore used. The MCC is the correlation coefficient between the true and predicted values in binary classification; it takes values in [-1, 1], and the closer it is to 1, the more accurate the model prediction. It is calculated as:
Figure BDA0003250705870000111
where TP is the number of positive samples predicted correctly, TN is the number of negative samples predicted correctly, FN is the number of positive samples predicted incorrectly, and FP is the number of negative samples predicted incorrectly. The Matthews correlation coefficient is widely used to evaluate classification problems in machine learning, especially when positive and negative samples are imbalanced. Some scientists claim that the Matthews correlation coefficient is the most informative single score for judging the quality of a binary classifier's predictions in a confusion-matrix setting.
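By way of non-limiting illustration, the MCC can be computed from the four confusion-matrix counts with its standard definition, $(TP \cdot TN - FP \cdot FN) / \sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}$. The function name `mcc` is hypothetical, and returning 0 when the denominator vanishes is a common convention adopted here as an assumption.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # One row or column of the confusion matrix is empty;
        # 0.0 is returned by convention (assumption of this sketch).
        return 0.0
    return (tp * tn - fp * fn) / denom
```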
For the training set of the oil height prediction soft measurement model, the global model and the instant learning framework achieve a regression index $R^2$ of 99.969% and 99.741%, respectively, and an RMSE of 2.410 and 1.732, respectively. The two are thus comparable in oil height prediction accuracy.
Table 1 shows the detection results on data from the liquid level meter and the station transaction system of the distributed system of a comprehensive energy supply station in Zhejiang Province. Because fault data are scarce or absent in actual operation, random noise is artificially added at the 1500th sample point of the test data to simulate a tank leakage scenario that might occur in practice, as shown by the dashed line below the curve to the right of the vertical line in Fig. 2.
Figs. 3 and 4 show the detection curves of the global model and the instant learning model using the fixed threshold and the adaptive threshold, respectively. The fault starts at the 1500th sample; the global model detects it only after some delay, whereas the instant learning framework detects it as soon as it occurs. This verifies the timeliness of fault detection under instant learning.
The curves also show that the adaptive threshold markedly reduces the high false alarm and missed detection rates observed with the fixed threshold.
In addition, the statistics of the leakage fault detection results in Table 1 show that the detection strategy using the adaptive threshold within the instant learning framework achieves the best MCC. When the global model uses the dynamic threshold, the false alarm rate is zero but the missed detection rate is high: the global model has much more training data, so the training-set residuals dominate when the threshold is updated adaptively and the influence of the test-set residuals on the distribution is weakened. This also supports the suitability of the instant learning model in this scenario, since it eases the trade-off between reduced model accuracy when the training set is too small and insensitive threshold updating when the training set is too large.
TABLE 1 leak Fault detection results
Figure BDA0003250705870000121
The advantages of the method are that the data-driven approach avoids the complexity of building a mechanism model, and that the instant learning framework combined with the adaptive-threshold fault detection strategy markedly improves the false alarm rate, the missed detection rate, and the timeliness of oil tank leakage detection.

Claims (6)

1. A method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold, characterized by comprising the following steps:
(1) Acquiring the current operating state data of the comprehensive energy supply station in real time, and storing and collecting the historical operating state data of its oil tank. The collected historical operating state data are operating data from normal, leak-free steady states and comprise multiple groups of oil height H, temperature T, and reading time t from the liquid level meter system, together with the oil volume V in the tank from the station transaction information system; the collected current operating state data comprise the oil height, temperature, and oil volume in the tank at the current moment.
(2) Forming a query sample set from $N_q$ consecutive real-time operating state data, and selecting, according to the query samples, $N_{train}$ similar samples close in time and in oil volume from the historical operating state data to form a training set, on which an instant-learning oil height prediction soft measurement model is built. The oil height soft measurement model is expressed as:
$H_q = f(T_q, H_j, T_j, V_q - V_j)$
where the subscript q denotes the index of the query sample and j denotes the index of the corresponding historical similar sample used for prediction; $H_q$, $T_q$, $V_q$ are respectively the oil height, temperature, and oil volume of the query sample, and $H_j$, $T_j$, $V_j$ are respectively the oil height, temperature, and oil volume of the similar sample.
(3) For each query sample, selecting from the historical operating state data a group of $N_{test}$ consecutive samples separated from the query sample by a period of time as test samples, and inputting them together with the corresponding query sample into the oil height soft measurement model for prediction, to obtain $N_{test}$ predicted oil heights for each query sample
Figure FDA0003250705860000011
where the superscript i denotes the index of the test sample corresponding to the query sample; the $N_{test}$ predicted oil heights are fused by an exponentially weighted average as post-processing to give the final predicted oil height of each query sample
Figure FDA0003250705860000012
(4) Updating the threshold adaptively according to each query sample, which specifically comprises the following substeps:
(4.1) The initial training-set prediction residuals are obtained from the oil height soft measurement model; each residual is the difference between the predicted oil height and the actual oil height, and the residual vector can be expressed as:
Figure FDA0003250705860000013
Figure FDA0003250705860000014
$N_{train} = N_v + N_t$
Kernel density estimation (KDE) is performed on the residual vector, and the lower quantile at a confidence level of 0.95 is taken as the initial threshold $Thre_0$.
(4.2) The residual of each query sample and the corresponding test-set residuals are obtained from the oil height soft measurement model, where the test-set residuals are expressed as:
Figure FDA0003250705860000015
wherein
Figure FDA0003250705860000021
The residual vector of the query samples can be expressed as:
Figure FDA0003250705860000022
(4.3) The threshold is updated for each query sample in turn, using the following update strategy:
If the residual of query sample k
Figure FDA0003250705860000023
is greater than the threshold $Thre_{k-1}$, the threshold is not updated: let $Thre_k = Thre_{k-1}$ and output an alarm; $Thre_k$ denotes the threshold corresponding to the kth sample, and $Thre_0$ is the initial threshold.
If the residual of query sample k is less than or equal to the threshold $Thre_{k-1}$, the threshold is updated:
Combine
Figure FDA0003250705860000024
and $E_{train}$ to form a new residual vector
Figure FDA0003250705860000025
then re-estimate the probability density of the residual distribution, and take the lower quantile at a confidence level of 0.95 as the updated threshold $Thre_k$.
2. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein step 1 is specifically as follows:
The oil height H, temperature T, and reading time t are acquired from the liquid level meter system; the gun-lifting time $t_t$, the gun-hanging time $t_g$, and the sales volume of each fueling, i.e. the change in oil volume in the tank $\Delta V$, are acquired from the station transaction information system.
Using the correspondence between the gun-lifting time $t_t$ and gun-hanging time $t_g$ from the station transaction information system and the sampling times of the liquid level meter system, the fueling data, i.e. the data between the gun-lifting time and the gun-hanging time, are removed from the liquid level meter data; the data segments of oil unloading operations, i.e. segments in which the oil height rises markedly, are removed; and outliers (abrupt changes in the data) are removed. What remains after the oil unloading and fueling segments are removed is multiple segments of data in which the oil volume is conserved, i.e. the operating data in steady states.
The historical operating state data of the comprehensive energy supply station are divided by oil volume into several steady states $S_1, S_2, \ldots, S_l$. Each steady state contains the operating state data of one or more moments, and the oil volume differs between steady states. Taking the first steady state $S_1$ as the reference starting point, with its oil volume given an initial value of zero, the oil volume of the second steady state $S_2$ is
Figure FDA0003250705860000026
By analogy, the oil volume of the kth steady state $S_k$ is
Figure FDA0003250705860000027
where $m_k$ denotes the number of operating state data items contained in steady state $S_k$, and k = 1, 2, …, l. A feature state matrix is then reconstructed according to the steady state number S and the time order; it comprises the variables steady state number S, oil height H, temperature T, and oil volume V within the steady state.
3. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 2, wherein the removal of outliers is specifically:
(a) Dividing the historical operating state data into $N_{group}$ groups.
(b) Performing KNN Euclidean distance sorting on the data of each group obtained in step (a):
Figure FDA0003250705860000028
The vector of Euclidean distances between the jth sample in the group and all other samples in the group can be expressed as:
Figure FDA0003250705860000031
where j = 1, 2, …, $N_{batch}$; the subscript −j denotes all samples in the group other than j; and $D_j$ has dimension $1 \times (N_{batch} - 1)$.
(c) Extracting noise with the 3σ criterion: the mean and standard deviation of the set of distances are calculated from $D_j$:
average value:
Figure FDA0003250705860000032
standard deviation:
Figure FDA0003250705860000033
According to the 3σ criterion, if the set of distance data contains
Figure FDA0003250705860000034
then the jth sample is an outlier and should be discarded.
4. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein in step 2, selecting $N_{train}$ similar samples close in time and in oil volume from the historical operating state data according to each query sample is specifically:
Samples whose oil volume is similar to that of the query sample are obtained by Euclidean distance:
$d_{j,q} = |v_j - v_q|$
where $v_j$ denotes the oil volume of the jth historical sample and $v_q$ denotes the oil volume of the query sample; $N_v$ samples are selected according to a threshold as the samples similar in oil volume to the query sample.
The $N_t$ consecutive samples preceding the query sample are selected as the samples similar in time.
$N_{train} = N_v + N_t$
5. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein in step 3, fusing the $N_{test}$ predicted oil heights by an exponentially weighted average as post-processing to give the final predicted oil height of each query sample
Figure FDA0003250705860000035
is specifically:
Figure FDA0003250705860000036
wherein
Figure FDA0003250705860000041
denotes the final output of the query sample, $\beta$ denotes the weighting coefficient of
Figure FDA0003250705860000042
and is an adjustable hyperparameter, and the superscript $(N_{test} - i)$ is the exponent of the weight.
6. The method for detecting oil tank leakage at a comprehensive energy supply station based on instant learning and an adaptive threshold according to claim 1, wherein in step 4, combining
Figure FDA0003250705860000043
and $E_{train}$ to form a new residual vector
Figure FDA0003250705860000044
and re-estimating the probability density of the residual distribution is specifically:
The residuals are approximated by a normal distribution, and the mean and variance of the distribution are calculated recursively:
Figure FDA0003250705860000045
Figure FDA0003250705860000046
where $\mu_k$ and $\delta_k$ are respectively the mean and variance of the probability density re-estimated for the kth query sample residual; $\mu_0$ and $\delta_0$ are respectively the mean and variance of the probability density of all sample residuals obtained from the training set; and $\mu_{test,k}$ and $\delta_{test,k}$ are respectively the mean and variance of the probability density of the residuals
Figure FDA0003250705860000047
obtained from the test set of the kth query sample.
CN202111044449.4A 2021-09-07 2021-09-07 Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold Pending CN113849479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111044449.4A CN113849479A (en) 2021-09-07 2021-09-07 Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold

Publications (1)

Publication Number Publication Date
CN113849479A true CN113849479A (en) 2021-12-28

Family

ID=78973319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111044449.4A Pending CN113849479A (en) 2021-09-07 2021-09-07 Comprehensive energy supply station oil tank leakage detection method based on instant learning and self-adaptive threshold

Country Status (1)

Country Link
CN (1) CN113849479A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114459691A (en) * 2022-01-05 2022-05-10 东北石油大学 Method and system for evaluating leakage risk in carbon dioxide geological storage body
CN114459691B (en) * 2022-01-05 2024-03-15 东北石油大学 Leakage risk evaluation method and system in carbon dioxide geological storage body

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination