CN112235043A - Distributed optical fiber abnormal data restoration model based on self-adaptive long-term and short-term memory - Google Patents
- Publication number
- CN112235043A CN202010959309.9A
- Authority
- CN
- China
- Prior art keywords
- optical fiber
- value
- data
- adaptive
- lstm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B10/00—Transmission systems employing electromagnetic waves other than radio-waves, e.g. infrared, visible or ultraviolet light, or employing corpuscular radiation, e.g. quantum communication
- H04B10/07—Arrangements for monitoring or testing transmission systems; Arrangements for fault measurement of transmission systems
- H04B10/075—Arrangements for monitoring or testing transmission systems; Arrangements for fault measurement of transmission systems using an in-service signal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides a distributed optical fiber abnormal sensing data restoration model based on adaptive long short-term memory (LSTM). The model mainly comprises a noise reduction preprocessing module and an adaptive LSTM prediction module, and is used to repair abnormal values in monitoring data. To suit the noise characteristics of the monitoring data, a data filtering method that fuses weighted differencing with a 3σ threshold criterion is adopted: time-sequence-variable filtering is combined with position-variable filtering, and the theoretically derived limit strain of the optical fiber serves as the defining threshold, so that abnormal values are eliminated. The null values in the optical fiber sensing data and the filtered-out abnormal values are then taken as the prediction targets of the LSTM model for data restoration. An adaptive loss function (Loss) is used during iterative training to improve the learning efficiency of the model and further reduce accumulated error. All monitoring sample points of the distributed optical fiber are traversed according to the spatial resolution to complete the repair, giving high efficiency and high monitoring precision.
Description
Technical Field
The invention relates to a missing-data restoration model, in particular a distributed optical fiber abnormal data restoration model based on adaptive Long Short-Term Memory (LSTM). It is used to repair sample point data that go missing during structural monitoring with distributed optical fibers because of factors such as external interference and jumps in the input signal.
Background
As a novel structural health monitoring technology, the distributed optical fiber sensor system offers high measurement precision, a wide measurement range, and high spatial resolution. During structural state monitoring, however, factors such as a harsh external environment, an unreasonable fiber layout, noise signals, and the structural characteristics of the monitored object can make the measured strain gradient jump beyond the demodulation range of the optical fiber demodulator, which greatly increases the probability of abnormal values in the sensing data. To solve this problem, a dedicated data restoration model matched to the characteristics of distributed optical fiber sensing data is needed.
A pronounced characteristic of the monitoring data collected by distributed optical fiber sensing is that it forms a time series with typical, regular temporal behavior. Common time series analysis methods include the autoregressive integrated moving average (ARIMA) method, time series differencing, deep neural networks, the Monte Carlo method, and long short-term memory (LSTM) recurrent networks. LSTM effectively overcomes the vanishing and exploding gradients of the recurrent neural network and its insufficient long-term memory, and is therefore better suited to training on long-range temporal information. An LSTM describes the current data together with the information already fed into the network, learns and continuously updates its state information through a deep network, and uses its strong memory capacity to predict the trend of subsequent data; it is widely applied to fault feature diagnosis, image analysis, speech recognition, and time series prediction such as financial stock markets. The continuously autocorrelated time series data collected by distributed optical fibers in structural monitoring suffer from local missing values and noise signals, which limits the application of distributed optical fiber sensor systems and has become a technical problem in urgent need of a solution.
Disclosure of Invention
To solve the problems of the prior art, the invention aims to overcome its shortcomings by providing a distributed optical fiber abnormal data restoration model based on adaptive long short-term memory. Specifically, it addresses the local missing values and noise signals in the continuously autocorrelated time series data collected by distributed optical fibers in structural monitoring. The optical fiber time series monitoring data restoration model is optimized by combining the LSTM algorithm with the characteristics of optical fiber monitoring data, yielding an optical fiber abnormal data restoration model composed of a noise reduction preprocessing module and an LSTM prediction module, which improves monitoring precision.
In order to achieve the purpose of the invention, the invention adopts the following inventive concept:
Based on the excellent performance of LSTM in time series prediction, the invention combines the LSTM algorithm with the characteristics of optical fiber monitoring data to construct an optical fiber abnormal data restoration model consisting of a noise reduction preprocessing module and an LSTM prediction module.
First, weighted differencing and the threshold method are fused through the 3σ criterion to complete noise preprocessing.
Then, because the positions and numbers of missing data differ between sample points, the invention proposes an adaptively iterated input cycle length based on the variation of the optical fiber sampling frequency, and defines a corresponding error loss function to constrain the learning efficiency of the LSTM network model. Because the predicted value at one moment becomes an independent variable at the next, iteration errors accumulate; the invention therefore sets an average correction coefficient δ to improve the fit between the predicted value and the measured value. Finally, data restoration is completed by traversing all sample points. The invention thus provides a distributed optical fiber abnormal data restoration model based on adaptive LSTM, combining the improved LSTM algorithm with distributed optical fiber data and improving the prediction precision of the LSTM algorithm for abnormal optical fiber data.
According to the inventive concept, the invention adopts the following technical scheme:
A distributed optical fiber abnormal data restoration model based on adaptive long short-term memory combines an adaptive LSTM algorithm with distributed optical fiber time series data, and completes abnormal data restoration by adaptively changing the length of the LSTM's optical fiber time series input period and the control parameters of the training process. The model mainly comprises a noise reduction preprocessing module and an adaptive LSTM data prediction module, and executes the following steps:
1) First, the noise reduction preprocessing module fuses a weighted difference algorithm based on the optical fiber time-sequence variable with a 3σ threshold noise reduction algorithm based on the position variable, improving the suppression of abnormal noise signals in the optical fiber data;
2) Then, an adaptive LSTM prediction module is constructed in which two adaptive parameters, the LSTM input cycle length L and the loss function Loss, are continuously updated, and an error correction coefficient δ is set, improving the repair precision of the model.
Preferably, step 1) is carried out as follows: a weighted disturbance factor $\alpha_i$ is defined in the noise reduction preprocessing module; on the basis of the 3σ threshold criterion for the position variable, a strain threshold derived from the mechanical properties of the optical fiber is defined to further constrain the effective range of the monitored values; and the time-sequence-variable weighted difference algorithm is combined with the position-variable 3σ threshold method to filter noise signals in preprocessing. Step 1) specifically comprises the following steps:
1-1) In the time-sequence-variable weighted difference method, the independent variable is the time sequence and the dependent variable is the measured strain value; the weighted first-order difference is calculated as follows:
where m is the length of the window used in computing $y_x$; $y_{mean}$ is the mean of the m samples within the window, that is, $y_{mean} = \frac{1}{m}\sum_{i} y_i$ summed over the window; i is the index of a sampling point; $y_i$ is the monitored value of that sample point; and $\alpha_i$ is a weighted disturbance factor, which satisfies:
where the influence factors grow linearly across the window: the first is 1/m, the second is 2/m, and so on, up to (m−1)/m;
1-2) In the position-variable 3σ threshold criterion, the threshold is set on the basis of the standard 3σ rule, mainly by deriving the effective strain monitoring range $[\varepsilon_{min}, \varepsilon_{max}]$ of the optical fiber from the ultimate stress $\sigma_u$ given by the fiber's mechanical strength. The engineering allowable stress $[\sigma]$ and the usable limit strain ε of the fiber then satisfy:
E(1+cε)ε≤[σ]
where c is a nonlinear balance constant with a value of 3.0 to 6.0, and n is the allowable stress balance constant with a value of 3.0 to 5.0. Combining the two relations gives the limit strain range of the optical fiber:
where E is the elastic modulus of the optical fiber and $\sigma_u$ is its ultimate breaking stress.
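A minimal sketch of how this strain range could be computed. It assumes the allowable stress is $[\sigma] = \sigma_u / n$, a common engineering definition that this text does not state explicitly, and solves the quadratic inequality $E(1 + c\varepsilon)\varepsilon \le [\sigma]$. The function name and the material values in the usage note are hypothetical.

```python
import math


def limit_strain_range(E, sigma_u, c=4.5, n=4.0):
    """Return (eps_min, eps_max) such that E*(1 + c*eps)*eps <= sigma_u / n.

    Assumption (not confirmed by the source): allowable stress = sigma_u / n.
    """
    allowable = sigma_u / n
    # The inequality is c*E*eps^2 + E*eps - allowable <= 0, which holds
    # between the two real roots of the corresponding quadratic equation.
    disc = math.sqrt(1.0 + 4.0 * c * allowable / E)
    eps_min = (-1.0 - disc) / (2.0 * c)
    eps_max = (-1.0 + disc) / (2.0 * c)
    return eps_min, eps_max
```

For example, `limit_strain_range(E=72e9, sigma_u=5e9)` uses illustrative values only. The negative root is the mathematical lower bound of the inequality; whether it is physically meaningful for a given fiber is a separate question.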
When the position-variable data are filtered, a limit threshold method is used to screen abnormal values. Based on the mean and standard deviation of the data sampled along the optical fiber at the same time, constraint coefficients $K_1$ and $K_2$ are defined to bound the width of the range, and a normal measurement value then satisfies:
where $K_1\varepsilon_{min}$ should approximate μ − kσ and $K_2\varepsilon_{max}$ should approximate μ + kσ; σ is the standard deviation of all sampled values in the value window, μ is the mean of the sample points, and k is the confidence range coefficient, k ∈ [0, 3].
The invention applies denoising preprocessing to the raw monitoring data, removing gross errors with a noise reduction algorithm that combines weighted differencing over a time-sequence sliding window with the position-variable 3σ threshold criterion. In the first stage the independent variable is time, determined by the sampling frequency, and the dependent variable is the measured strain: the difference between the data $y_x$ at time x and $y_{x-1}$ at time x−1 is compared with the average weighted error of the data over the current m time instants. After the weighted difference filtering is finished, the independent variable is changed to the position variable, with the measured strain still the dependent variable. The position-variable monitoring data are then screened with the strain limit threshold method, where the ultimate stress $\sigma_u$ of the optical fiber is obtained from its mechanical strength.
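The time-sequence stage described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's formula: the weights $\alpha_i$ are taken as linearly increasing and normalised to sum to 1, the "average weighted error" is taken as the weighted mean absolute deviation of the previous m accepted samples, and `factor` and `floor` are hypothetical tuning parameters. The first m samples are assumed to be clean.

```python
import numpy as np


def weighted_difference_filter(y, m=5, factor=3.0, floor=1e-9):
    """Mark y[x] as NaN when its jump from the last accepted value exceeds
    `factor` times the weighted mean deviation of the last m accepted values."""
    y = np.asarray(y, dtype=float)
    out = y.copy()
    alpha = np.arange(1, m + 1, dtype=float)
    alpha /= alpha.sum()                 # assumed normalisation: weights sum to 1
    accepted = list(y[:m])               # assumed clean start-up samples
    for x in range(m, len(y)):
        if np.isnan(y[x]):
            continue                     # already missing, nothing to test
        window = np.array(accepted[-m:])
        dev = np.sum(alpha * np.abs(window - window.mean()))
        if abs(y[x] - window[-1]) > factor * max(dev, floor):
            out[x] = np.nan              # jump too large: treat as abnormal
        else:
            accepted.append(y[x])        # only accepted values feed later windows
    return out
```

Keeping a buffer of accepted values, rather than sliding over the raw series, prevents a single spike from contaminating the windows that follow it.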
Preferably, step 2) is carried out as follows: in the adaptive LSTM prediction module, the training efficiency of the model is improved by using the adaptive input time period length L and the improved custom loss function Loss. Step 2) specifically comprises the following steps:
2-1) In time series prediction, if the time interval between the current prediction point and the previous prediction point is L, then after each window movement the improved adaptive window length L is:
where $d_k$ is the initial input period window for sampling point k, which satisfies:
where $L_k$ is the number of windows of consecutive missing data at sampling point k; $L_{sum}$ is the sum of the missing windows over all sampling points; $H_k$ is the number of abnormal values at sampling point k on the optical fiber; and T is the total sampling time;
2-2) On the basis of a standard regularization term, the segmented input cycle length L and the window step distance n are integrated using the idea of a two-dimensional array, and a training loss function Loss for the LSTM is defined to improve learning efficiency. L is the length of the current segmentation window, n is the number of sample point steps in the prediction set, and n < L, so the input and output of the hidden layer form a two-dimensional array of dimension (L−n, n). According to the error calculation formula, the loss function Loss of the training process is defined as:
where i is the index of a sampling point; the mean term is the average over the segmented window; $y_i$ is the predicted value; L is the input cycle length; n is the step distance of the prediction window; R(ω) is a regularization term that limits interference noise in model learning; and β is its controllable weight factor constant.
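One way to read the (L−n, n) array is that each window of length L supplies L−n input samples followed by n target samples. The sketch below builds such pairs and pairs them with a generic regularised squared-error loss. Both the window interpretation and the loss form (mean squared error plus β times an L2 penalty standing in for R(ω)) are assumptions for illustration; they are not the patent's exact definitions, and the function names are hypothetical.

```python
import numpy as np


def make_windows(series, L, n):
    """Split a 1-D series into inputs of length L-n and targets of length n.

    Windows that contain NaN (filtered or missing samples) are skipped.
    """
    series = np.asarray(series, dtype=float)
    if not 0 < n < L:
        raise ValueError("need 0 < n < L")
    inputs, targets = [], []
    for start in range(len(series) - L + 1):
        window = series[start:start + L]
        if np.isnan(window).any():
            continue
        inputs.append(window[:L - n])
        targets.append(window[L - n:])
    return (np.array(inputs).reshape(-1, L - n),
            np.array(targets).reshape(-1, n))


def regularised_loss(y_true, y_pred, weights, beta=0.01):
    """Generic stand-in: mean squared error plus beta * sum of squared weights."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(err ** 2) + beta * np.sum(np.asarray(weights) ** 2))
```

With a shorter L at sample points that have many gaps, more NaN-free windows survive, which is one practical motivation for adapting the window length per sample point.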
Preferably, in the adaptive LSTM prediction module, an average error coefficient δ is defined to correct the actual LSTM prediction, reducing the iteration error of the optical fiber data; δ is computed, on the basis of the position independent variable, from the differences between the monitored values of sampling points at adjacent positions over the n steps before the time t to be predicted. Over n steps, the average error coefficient δ satisfies:
where $\beta_i$ is the proposed weight factor; n is the step distance of the prediction window; i is the number of shifts to a monitored value at an earlier time and j is the number of shifts to a monitored value at a later time; $y_{t-i}$ is the monitored value at the earlier time and $y_{t-j}$ is the monitored value at the later time. The actual predicted output value at time t is:
where $y_t$ is the theoretical LSTM prediction and the mean term is the average of the measured values at the n step sampling points; the higher the spatial resolution, the closer the monitored values of adjacent sampling points.
Preferably, 3σ criterion filtering is applied to all the monitoring data over both the time-sequence and position variables. If $y_x$ lies within the confidence interval, it is regarded as a normal value; otherwise it is regarded as a noise signal and the original datum is replaced with NaN:
$\mu - k\sigma \le y_x \le \mu + k\sigma$
where k sets the range of the confidence interval; in general, k ∈ [0, 3].
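The confidence-interval test above translates directly into a few lines of NumPy. This is a minimal sketch; the function name is hypothetical, and μ and σ are computed over whichever slice (one time step along the fiber, or one sample point over time) is passed in, ignoring values already marked NaN.

```python
import numpy as np


def three_sigma_filter(values, k=3.0):
    """Replace values outside [mu - k*sigma, mu + k*sigma] with NaN."""
    values = np.asarray(values, dtype=float)
    mu = np.nanmean(values)          # mean of the non-missing samples
    sigma = np.nanstd(values)        # standard deviation of the same samples
    out = values.copy()
    outside = (values < mu - k * sigma) | (values > mu + k * sigma)
    out[outside] = np.nan            # NaN inputs compare False and stay NaN
    return out
```

Note that with very few samples a single outlier inflates σ enough that it can never exceed 3σ, so the slice should be reasonably long or k should be lowered.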
An LSTM prediction module is then constructed, and the preprocessed data are grouped according to the spatial sampling rate, with each spatial sample point treated as an independent time series.
According to the characteristics of the monitoring data, a time-sequence variable set T and the corresponding measured strain set ε are defined:
$T = (t_k, t_{k+1\tau}, \ldots, t_{k+d\tau})$
$\varepsilon = (\varepsilon_k, \varepsilon_{k+1\tau}, \ldots, \varepsilon_{k+d\tau})$
where the position of the time-sequence independent variable $t_j$ at any time satisfies:
where $t_0$ is the initial acquisition time of the optical fiber and f is the sampling frequency of the optical fiber demodulator. Let $L_k$ be the number of windows of consecutive missing data at sampling point k, and $L_{sum}$ the sum of the missing windows over all sampling points; the initial input period window length $d_k$ is then:
where $d_{sum}$ is the number of sampling points with normal monitoring values, which can be expressed as:
$d_{sum} = m - H_k$
where $H_k$ is the number of abnormal values at sampling point k on the optical fiber and m is the total sampling time. Given the time interval between the current prediction point and the previous prediction point, the adaptive window length L satisfies:
A loss function Loss for the training process is defined according to the error calculation formula and the length of the segmentation window.
According to the error calculation formula, if L is the length of the current segmentation window, n is the number of sample point steps in the prediction set, and n < L, the loss function Loss of the training process can be defined as:
where R(ω) is a regularization term that limits interference noise in model learning, and β is its controllable weight factor.
Finally, on the basis of the position independent variable, an average error coefficient δ is set to correct the fit of the estimated value; it is computed from the differences between the monitored values of sampling points within H steps adjacent to the moment to be predicted.
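The per-sample-point quantities that drive the adaptive window can be counted directly from a filtered series. The sketch below computes $H_k$, $d_{sum} = m - H_k$, and $L_k$ for one sample point. It assumes that m is the number of samples in the series and that $L_k$ counts runs of consecutive NaN values; both readings, and the function name, are assumptions rather than definitions confirmed by the text.

```python
import numpy as np


def missing_stats(series):
    """Count missing-data statistics for one spatial sample point."""
    series = np.asarray(series, dtype=float)
    missing = np.isnan(series)
    m = series.size                        # assumed: total number of samples
    H_k = int(missing.sum())               # abnormal or missing values
    d_sum = m - H_k                        # samples with normal values
    # A run starts where a sample is missing and the previous one is not.
    previous = np.concatenate(([False], missing[:-1]))
    L_k = int((missing & ~previous).sum()) # assumed: number of missing runs
    return {"m": m, "H_k": H_k, "d_sum": d_sum, "L_k": L_k}
```

For instance, a series with gaps at positions 1 to 2 and at position 4 has three missing values in two runs.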
The principle of the invention is as follows:
The invention relates to a distributed optical fiber abnormal data restoration model based on adaptive long short-term memory, and works as follows:
First, in view of the discrete outlier points that appear in the monitoring data, the invention introduces weighting into the differential filtering of the time-sequence variable, and introduces a threshold method based on the mechanical characteristics of the optical fiber into the position-variable 3σ noise reduction algorithm. Noise preprocessing is completed by fusing the two filters.
Then an adaptive LSTM prediction model is proposed. In a traditional LSTM model the window length is set empirically, which causes redundant prediction at sample points with few abnormal values; an adaptive window length removes this limitation.
The loss function Loss and the regularization parameter values in the LSTM layers are adjusted continuously during model training to improve the learning efficiency of the model.
Because the predicted output at one moment enters the training set for the next moment as the input window moves, errors accumulate. Therefore the differences between the monitored values of sampling points at adjacent positions over the H steps before that moment are computed with weighting, an average iteration error coefficient δ is derived, and the predicted value is updated, further improving the prediction accuracy of the LSTM.
Compared with the prior art, the invention has the following obvious and prominent substantive characteristics and remarkable advantages:
1. To address the abnormal values and null values generated while monitoring a structure with distributed optical fibers, the invention provides a noise reduction method that fuses sliding-window weighted differencing with the 3σ threshold criterion and removes noise (discrete points) from the monitoring data; the sample values rejected by denoising and the system null values are then taken as the restoration targets of the LSTM model;
2. Because the lengths of the missing segments differ between sampling points in the optical fiber monitoring data, the invention uses an adaptive sliding window as the input cycle length of the LSTM time-sequence variable, and redefines the loss function of the training process according to the length of the segmentation window. To further reduce the prediction error, an average error coefficient δ is set; null values in the predicted sequence are replaced with the corresponding corrected predictions, while existing measured values in the sequence are kept. Finally, the accuracy of the LSTM-predicted output sequence against the actual monitoring values is evaluated by cross validation, showing high efficiency and high accuracy.
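The repair rule in item 2 (fill null values with predictions, keep measured values, and let each prediction feed the next input window) can be sketched as a simple loop. The `predict` argument is a placeholder for a trained model; the usage note uses a plain mean as a stand-in, which is not the LSTM of the invention, and the δ correction is omitted. All names are hypothetical.

```python
import numpy as np


def repair_series(series, window, predict):
    """Fill NaN entries of one sample point's series in time order.

    Measured values are kept; each filled value becomes part of the
    history used for later predictions, so errors can accumulate.
    """
    repaired = np.asarray(series, dtype=float).copy()
    for t in range(len(repaired)):
        if not np.isnan(repaired[t]):
            continue                       # keep existing measured values
        history = repaired[max(0, t - window):t]
        history = history[~np.isnan(history)]
        if history.size == 0:
            continue                       # nothing to predict from yet
        repaired[t] = predict(history)     # placeholder for the model output
    return repaired
```

Calling `repair_series([1.0, 2.0, np.nan, np.nan, 5.0], window=2, predict=np.mean)` fills the first gap from two measured values and the second gap from one measured value and one earlier prediction, which is exactly the feedback path that motivates an error correction term.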
Drawings
Fig. 1 is a flow chart of an improved algorithm of the preferred embodiment of the present invention.
Fig. 2 is a time series variation of sampling points for a preferred embodiment of the present invention.
Fig. 3 is a schematic diagram of distributed fiber monitoring data according to a preferred embodiment of the present invention.
Detailed Description
The invention will now be further described and illustrated with reference to the following examples and drawings. The following description is illustrative and not intended to limit the scope of the invention.
Example one:
In this embodiment, a distributed optical fiber abnormal data restoration model based on adaptive long short-term memory combines an adaptive LSTM algorithm with distributed optical fiber time series data, and completes abnormal data restoration by adaptively changing the length of the LSTM's optical fiber time series input period and the control parameters of the training process. The model mainly comprises a noise reduction preprocessing module and an adaptive LSTM data prediction module, and executes the following steps:
1) First, the noise reduction preprocessing module fuses a weighted difference algorithm based on the optical fiber time-sequence variable with a 3σ threshold noise reduction algorithm based on the position variable, improving the suppression of abnormal noise signals in the optical fiber data;
2) Then, an adaptive LSTM prediction module is constructed in which two adaptive parameters, the LSTM input cycle length L and the loss function Loss, are continuously updated, and an error correction coefficient δ is set, improving the repair precision of the model.
The method addresses the local missing values and noise signals in the continuously autocorrelated time series data collected by distributed optical fibers in structural monitoring. It optimizes the optical fiber time series monitoring data restoration model by combining the LSTM algorithm with the characteristics of optical fiber monitoring data, constructing an optical fiber abnormal data restoration model composed of a noise reduction preprocessing module and an LSTM prediction module, and improving monitoring precision.
Example two:
This embodiment is substantially the same as the first embodiment, with the following particular features:
In this embodiment, referring to Figs. 1 to 3, step 1) is carried out as follows:
A weighted disturbance factor $\alpha_i$ is defined in the noise reduction preprocessing module; on the basis of the 3σ threshold criterion for the position variable, a strain threshold derived from the mechanical properties of the optical fiber is defined to further constrain the effective range of the monitored values; and the time-sequence-variable weighted difference algorithm is combined with the position-variable 3σ threshold method to filter noise signals in preprocessing.
1-1) In the time-sequence-variable weighted difference method, the independent variable is the time sequence and the dependent variable is the measured strain value; the weighted first-order difference is calculated as follows:
where $y_x$ and $y_{x-1}$ are the values at sample points x and x−1, respectively; m is the length of the window used in computing $y_x$; $y_{mean}$ is the mean of the m samples within the window, that is, $y_{mean} = \frac{1}{m}\sum_{i} y_i$ summed over the window; i is the index of a sampling point; and $y_i$ is the monitored value of that sample point.
$\alpha_i$ is a weighted disturbance factor, which satisfies:
where the influence factors grow linearly across the window: the first is 1/m, the second is 2/m, and so on, up to (m−1)/m;
1-2) In the position-variable 3σ threshold criterion, the threshold is set on the basis of the standard 3σ rule, mainly by deriving the effective strain monitoring range $[\varepsilon_{min}, \varepsilon_{max}]$ of the optical fiber from the ultimate stress $\sigma_u$ given by the fiber's mechanical strength. The engineering allowable stress $[\sigma]$ and the usable limit strain ε of the fiber then satisfy:
E(1+cε)ε≤[σ]
where c is a nonlinear balance constant with a value of 3.0 to 6.0, and n is the allowable stress balance constant with a value of 3.0 to 5.0. Combining the two relations gives the limit strain range of the optical fiber:
where E is the elastic modulus and $\sigma_u$ is the ultimate breaking stress of the optical fiber.
When the position-variable data are filtered, a limit threshold method is used to screen abnormal values. Based on the mean and standard deviation of the data sampled along the optical fiber at the same time, constraint coefficients $K_1$ and $K_2$ are defined to bound the width of the range, and a normal measurement value then satisfies:
where $K_1\varepsilon_{min}$ should approximate μ − kσ and $K_2\varepsilon_{max}$ should approximate μ + kσ; σ is the standard deviation of all sampled values in the value window, μ is the mean of the sample points, and k is the confidence range coefficient, k ∈ [0, 3].
In this embodiment, the acquired raw data are denoised in preprocessing on the basis of weighted differencing and the 3σ threshold criterion, and each filtered-out spatial position is set to NaN to facilitate subsequent data recovery. For filtering based on the time-sequence variable, a sliding-window weighted difference algorithm compares values at adjacent moments, which requires a reasonable window length m and window weighted disturbance factor $\alpha_i$. For filtering of the position variable, a limit threshold method screens abnormal values: first, the limit strain value of the optical fiber is obtained from its elastic modulus E and ultimate stress $\sigma_u$ under the experimental conditions, together with the allowable stress balance coefficient n and the nonlinear balance constant c. In the noise reduction preprocessing module of this embodiment, the concept of a weighted disturbance factor $\alpha_i$ is introduced on top of the first-order difference algorithm for the time-sequence variable; the idea of a strain threshold based on the mechanical characteristics of the optical fiber is introduced on top of the 3σ criterion for the position variable, to further constrain the effective range of the monitored values; and the two methods are combined to filter noise signals in preprocessing.
In this embodiment, step 2) is carried out as follows: in the adaptive LSTM prediction module, the training efficiency of the model is improved by using the adaptive input time period length L and the improved custom loss function Loss.
2-1) In time series prediction, if the time interval between the current prediction point and the previous prediction point is L, then after each window movement the improved adaptive window length L is:
where $d_k$ is the initial input period window for sampling point k, which satisfies:
where $L_k$ is the number of windows of consecutive missing data at sampling point k; $L_{sum}$ is the sum of the missing windows over all sampling points; $H_k$ is the number of abnormal values at sampling point k on the optical fiber; and T is the total sampling time;
2-2) on the basis of a standard regularization term function, integrating the segmented input cycle length L and the window step length distance n by utilizing the thought of a two-dimensional data group, and defining a training Loss function Loss of the LSTM so as to improve the learning efficiency; l is the length of the current segmentation window, n is less than L, n is the number of the step lengths of the sample points of the prediction set, and then the input and the output of the hidden layer meet a two-dimensional array with the dimension of (L-n, n); according to the error calculation formula, the Loss function Loss in the training process is defined as:
in the formula, i represents the number of sampling points;is the average of the split windows; y isiRepresenting a monitored value; l represents the input cycle length;n represents the step distance of the window; r (omega) is a regularization term used for limiting interference noise in model learning; β is its controllable weight factor constant.
This embodiment uses a two-dimensional array to combine the segmented input period length L and the window step distance n, and redefines the training loss function Loss of the LSTM accordingly to improve learning efficiency.
In this embodiment, in the adaptive LSTM prediction module, an average error coefficient δ is defined to correct the raw LSTM prediction and thereby reduce the iteration error of the optical fiber data; based on the position variable, δ is calculated from the differences between the monitored values of sampling points at adjacent positions over the n steps preceding the time t to be predicted; over n steps, the average error coefficient δ satisfies:
where $\beta_i$ is the proposed weight factor; n is the step distance; i is the number of shifts of the monitored value at the previous time; j is the number of shifts of the monitored value at the subsequent time; $y_{t-i}$ is the monitored value at the previous time; and $y_{t-j}$ is the monitored value at the subsequent time. The actual predicted value output at time t is:
where $y_t$ is the value predicted by the LSTM, and the mean term is the average of the measured values of the n step sampling points; the higher the spatial resolution, the closer the monitored values of adjacent sampling points are.
This embodiment provides a repair model for abnormal distributed optical fiber sensing data, based on adaptive Long Short-Term Memory (LSTM), for repairing abnormal values in monitoring data. The model mainly comprises a noise reduction preprocessing module and an adaptive LSTM prediction module. First, to address the noise characteristics of the monitoring data, a data filtering method that fuses a weighted difference with the 3σ threshold rule is adopted; time-sequence variable filtering and position variable filtering are combined, and the theoretically derived limit strain of the optical fiber serves as the defining threshold, so that abnormal values are eliminated. Next, the null values in the optical fiber monitoring data and the filtered-out abnormal values are taken as the prediction targets of the LSTM model for data restoration. Because the number of lost data points differs between sample points, this embodiment uses an adaptive input window as the input period length of the LSTM time-sequence variable, and adapts the loss function Loss during iterative training to improve the learning efficiency of the model. To further reduce the accumulated error, for each output prediction time the average difference with respect to the n adjacent spatial sampling points of the preceding steps is calculated, and an average error coefficient δ is set to correct the LSTM prediction at the current time. Finally, all monitoring sample points of the distributed optical fiber are traversed according to the spatial resolution to complete the repair.
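The traversal described in this embodiment (repairing the time series of one sample point step by step, then moving along the fiber position by position) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented model: `mean_of_recent` is a hypothetical stand-in for the trained adaptive LSTM, and the function names, the fixed `window` length, and the list-of-lists data layout are all invented for the example.

```python
import math
from typing import Callable, List, Sequence


def mean_of_recent(history: Sequence[float]) -> float:
    """Hypothetical stand-in predictor (NOT the adaptive LSTM): mean of the history."""
    return sum(history) / len(history)


def repair_series(
    series: Sequence[float],
    window: int,
    predict: Callable[[Sequence[float]], float] = mean_of_recent,
) -> List[float]:
    """Fill NaN entries of one sample point's time series, moving forward in time.

    Earlier repaired values are reused as inputs for later predictions, so
    prediction errors can accumulate along the series.
    """
    repaired = list(series)
    for t in range(len(repaired)):
        if math.isnan(repaired[t]):
            # Use up to `window` preceding values, skipping any still missing.
            history = [v for v in repaired[max(0, t - window):t] if not math.isnan(v)]
            if history:  # leave the gap if there is nothing to predict from
                repaired[t] = predict(history)
    return repaired


def repair_all(data: Sequence[Sequence[float]], window: int) -> List[List[float]]:
    """Traverse every spatial sample point; data[k] is the time series at point k."""
    return [repair_series(series, window) for series in data]


nan = float("nan")
fiber = [
    [1.0, 2.0, nan, 4.0],  # sample point 0
    [nan, 5.0, 5.0, nan],  # sample point 1
]
repaired = repair_all(fiber, window=2)
```

Because each repaired value feeds later predictions, the sketch also shows why an accumulated-error correction such as δ is useful; a gap at the very start of a series is left as NaN because no history exists for it.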
Example three:
this embodiment is substantially the same as the previous embodiment, and is characterized in that:
in this embodiment, as shown in Fig. 2, 3σ criterion filtering is applied to the time-sequence and position variables of all monitoring data. If $y_x$ lies within the confidence interval, it is regarded as a normal value; otherwise it is regarded as a noise signal and the original data is replaced with NaN:
$$\mu - k\sigma \le y_x \le \mu + k\sigma$$
where k sets the width of the confidence interval; in general, k ∈ [0, 3];
an LSTM prediction module is then constructed, and the preprocessed data are grouped according to the spatial sampling rate, with each spatial sample point treated as an independent time series;
a time-sequence variable set T and the corresponding measured strain set ε are defined according to the characteristics of the monitoring data (see Fig. 2):
$$T = (t_k, t_{k+1\tau}, \ldots, t_{k+d\tau})$$
$$\varepsilon = (\varepsilon_k, \varepsilon_{k+1\tau}, \ldots, \varepsilon_{k+d\tau})$$
where the time-sequence independent variable $t_j$ at any time satisfies:
where $t_0$ is the initial acquisition time of the optical fiber and f is the sampling frequency of the optical fiber demodulator. Let $L_k$ be the number of windows of consecutive missing data at sampling point k, and let $L_{sum}$ be the sum of the missing windows over all sampling points; the initial input period window length $d_k$ is then:
where $d_{sum}$, the number of sampling points with normal monitoring values, can be expressed as:
$$d_{sum} = m - H_k$$
where $H_k$ is the number of abnormal values at sampling point k on the optical fiber and m is the total sampling time. Given the time interval l between the current prediction point and the previous prediction point, the adaptive window length L satisfies:
defining a Loss function Loss in the training process according to an error calculation formula and the length of a segmentation window;
according to the error calculation formula, if L is the length of the current segmentation window (n < L), and n is the number of sample point steps in the prediction set, the Loss function Loss during the training process can be defined as:
where R(ω) is a regularization term that limits interference noise in model learning, and β is its controllable weight factor;
finally, based on the position variable information, an average error coefficient δ is set to correct the fit of the estimated value; it is calculated from the differences between the monitored values of the sampling points over the H steps adjacent to the time to be predicted. In this embodiment, the loss function Loss and the regularization parameter values in the LSTM layers are adjusted continuously during model training to improve the learning efficiency of the model.
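The 3σ screening used in this embodiment, μ − kσ ≤ y_x ≤ μ + kσ with out-of-range values replaced by NaN, can be sketched as follows. This is only an illustration: the function name, the use of the population standard deviation, and the choice to ignore already-missing values when computing μ and σ are assumptions, since the text does not specify them.

```python
import math
from statistics import fmean, pstdev
from typing import List, Sequence


def three_sigma_filter(values: Sequence[float], k: float = 3.0) -> List[float]:
    """Replace values outside [mu - k*sigma, mu + k*sigma] with NaN.

    Assumptions (not stated in the text): mu and sigma are computed over the
    non-NaN values only, and sigma is the population standard deviation.
    """
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return list(values)
    mu = fmean(valid)
    sigma = pstdev(valid)
    lower, upper = mu - k * sigma, mu + k * sigma
    # Comparisons with NaN are False, so existing NaN entries stay NaN.
    return [v if lower <= v <= upper else float("nan") for v in values]


readings = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 50.0]
filtered = three_sigma_filter(readings, k=2.0)
```

The example uses k = 2, inside the stated range k ∈ [0, 3], because a single large spike in a short window inflates σ and can hide itself at larger k.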
Example four:
this embodiment is substantially the same as the previous embodiment, and is characterized in that:
in this embodiment, as shown in fig. 1, a distributed optical fiber abnormal data repair model based on adaptive LSTM is implemented by the following steps:
1) for the acquired original data, carrying out data noise reduction preprocessing based on a weighted difference and 3 sigma threshold criterion, and setting the spatial position after filtering as NaN so as to facilitate subsequent data restoration work;
for filtering on the time-sequence variable, a sliding-window weighted difference algorithm compares the values at adjacent moments, which requires a reasonable window length m and a window weighted disturbance factor $\alpha_i$ to be set; the specific operation is as follows:
where $y_x$ and $y_{x-1}$ are the values at sampling points x and x−1, respectively, and $y_{mean}$ is the mean of the m sampling points within the window;
for filtering on the position variable, an extreme threshold method screens abnormal values; first, from the elastic modulus E and the ultimate stress $\sigma_u$ of the optical fiber under the experimental conditions, the allowable stress balance coefficient n permitted in engineering, and the nonlinear balance constant c of the optical fiber, the resulting ultimate strain value of the optical fiber satisfies:
the specific operation of threshold screening is as follows:
when filtering on both the time-sequence variable and the position variable by the normal distribution criterion, the specific operation is as follows:
$$\mu - k\sigma \le y_x \le \mu + k\sigma$$
where μ and σ are the mean and standard deviation, respectively, of the adjacent sampling data at that moment, and k sets the width of the confidence interval; when $y_x$ lies within the confidence interval it is a normal value; otherwise it is judged to be an abnormal value and replaced with NaN;
2) constructing a self-adaptive LSTM prediction module; grouping the preprocessed data according to spatial resolution, namely each sampling point is an independent prediction array, and adaptively defining the input period length L and the time step length H of window movement in each sampling point;
Fig. 3 is a schematic diagram of the data collected by the distributed optical fiber sensing system, in which NaN denotes a null value and a signal that passes the noise reduction algorithm is a normal monitoring value;
counting the time interval l between the current adjacent prediction point and the previous adjacent prediction point, and calculating the window length of the self-adaptive input:
where $d_k$, the initial window period length, satisfies:
where $L_k$ is the number of windows of consecutive missing data at sampling point k; $L_{sum}$ is the sum of the missing windows over all sampling points; T is the total sampling time; and $H_k$ is the number of abnormal values at sampling point k on the optical fiber;
then, according to a self-defined Loss function Loss in the training process, calculating a specific Loss parameter value Loss as follows:
where R (ω) is the default regularization term; β is an initial set weight factor constant;
3) according to the formula set for the error correction coefficient, the average correction coefficient δ of the first n sampling points in the output step window of each point to be predicted can be calculated as:
where $\beta_i$ is the proposed weight factor. By comparing the average of the measured values of the n step sampling points with the estimated value, the actual predicted value output at time i can be expressed as:
in this way, along the time variable, the training-set window is moved continuously until all missing time-sequence data at the sampling point have been repaired; finally, as shown in Fig. 3, the missing data of all sampling points are predicted by traversing the position variable.
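The per-point quantities that feed the adaptive window, namely $H_k$, $L_k$, and the relation $d_{sum} = m - H_k$ from Example three, can be counted from a filtered series as sketched below. The interpretation is an assumption: here $H_k$ is taken as the number of NaN entries after filtering, $L_k$ as the number of runs of consecutive NaN entries, and m as the series length; the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def missing_stats(series: Sequence[float]) -> Tuple[int, int, int]:
    """Return (H_k, L_k, d_sum) for one sample point's filtered time series.

    Assumed meanings: H_k = count of NaN entries, L_k = count of runs of
    consecutive NaN entries, d_sum = m - H_k with m = len(series).
    """
    h_k = 0         # NaN entries (null values plus filtered-out outliers)
    l_k = 0         # runs of consecutive NaN entries
    in_gap = False
    for value in series:
        if math.isnan(value):
            h_k += 1
            if not in_gap:
                l_k += 1  # a new gap starts here
            in_gap = True
        else:
            in_gap = False
    d_sum = len(series) - h_k
    return h_k, l_k, d_sum


nan = float("nan")
stats = missing_stats([0.1, nan, nan, 0.2, 0.3, nan, 0.4])
```

Under the same assumptions, $L_{sum}$ would be the sum of the second element over all sample points.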
The above examples are given for the purpose of illustrating the present invention clearly and not for the purpose of limiting the same, and it will be apparent to those skilled in the art that many modifications and variations can be made in the above examples without departing from the spirit and scope of the invention.
Claims (4)
1. A distributed optical fiber abnormal data restoration model based on self-adaptive long-term and short-term memory is characterized in that: combining the self-adaptive LSTM algorithm with distributed optical fiber time sequence data, and adaptively changing the length of the optical fiber time sequence input period of the LSTM and the control parameters in the training process so as to finish abnormal data repair work; the model mainly comprises a noise reduction preprocessing module and a self-adaptive LSTM data prediction module, and specifically comprises the following execution steps:
1) firstly, fusing a weighted difference algorithm based on an optical fiber time sequence variable and a 3 sigma threshold noise reduction algorithm based on a position variable through a noise reduction preprocessing module, so as to improve the noise reduction effect of an optical fiber abnormal noise signal;
2) by constructing a self-adaptive LSTM prediction module, two self-adaptive parameters of the input cycle length L and the Loss function Loss of the LSTM are continuously updated, and an error correction coefficient delta is set, so that the repair precision of the model is improved.
2. The distributed optical fiber abnormal data restoration model based on the adaptive long-short term memory as claimed in claim 1, wherein the specific method of the step 1) is as follows:
a weighted disturbance factor $\alpha_i$ is defined in the noise reduction preprocessing module; on the basis of the 3σ threshold criterion of the position variable, a strain threshold based on the mechanical properties of the optical fiber is defined to further constrain the effective range of the monitored values; the time-sequence variable weighted difference algorithm is combined with the position-variable 3σ threshold method to filter noise signals during preprocessing;
1-1) in the time sequence variable weighted difference method, the independent variable is used as a time sequence, the dependent variable is used as a measured strain value, and the weighted first-order difference is calculated as follows:
where m is the window length used in computing $y_x$; $y_{mean}$ is the mean of the m samples within the window, i.e. $y_{mean} = \frac{1}{m}\sum_{i} y_i$ taken over the m samples in the window; i is the index of the sampling point; $y_i$ is the monitored value of the sampling point; and $\alpha_i$ is a weighted disturbance factor, which satisfies:
where the successive influence factors are 1/m, 2/m, and so on by analogy up to (m−1)/m;
1-2) in the position-variable 3σ threshold criterion, building on the standard 3σ criterion, the threshold is set mainly by deriving the effective strain monitoring range $[\varepsilon_{min}, \varepsilon_{max}]$ of the optical fiber from the ultimate stress $\sigma_u$ of its mechanical strength; with [σ] the engineering allowable stress, the limit strain ε of the optical fiber satisfies:
$$E(1 + c\varepsilon)\varepsilon \le [\sigma]$$
where c is a nonlinear balance constant with a value of 3.0 to 6.0, and n is an allowable stress balance constant with a value of 3.0 to 5.0; combining the two formulas, the limit strain range of the optical fiber is calculated as:
where E is the elastic modulus and $\sigma_u$ is the ultimate breaking stress of the optical fiber;
when filtering on the position variable, an extreme threshold method screens abnormal values; based on the mean and standard deviation of the data sampled along the optical fiber at the same time, constraint coefficients $K_1$ and $K_2$ are defined to set the width of the range, and a normal measured value satisfies:
where $K_1\varepsilon_{min}$ should approximate μ − kσ and $K_2\varepsilon_{max}$ should approximate μ + kσ; σ is the standard deviation of all sampled values in the window, μ is the mean of the sample points, and k is the confidence range coefficient, k ∈ [0, 3].
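As a worked illustration of the inequality E(1 + cε)ε ≤ [σ], the largest admissible strain is the positive root of cEε² + Eε − [σ] = 0, that is ε = (−1 + √(1 + 4c[σ]/E)) / (2c). The sketch below evaluates it. It assumes the allowable stress is [σ] = σ_u / n, which is the usual safety-factor form but is not legible in the text above; the function name and the numeric inputs are hypothetical and chosen only for demonstration.

```python
import math


def limit_strain(E: float, sigma_u: float, n: float, c: float) -> float:
    """Upper strain bound from E*(1 + c*eps)*eps <= [sigma].

    ASSUMPTION: the allowable stress is [sigma] = sigma_u / n; this relation
    is not legible in the source text and is used here only for illustration.
    """
    allowable = sigma_u / n
    # Positive root of c*E*eps**2 + E*eps - [sigma] = 0.
    return (-1.0 + math.sqrt(1.0 + 4.0 * c * allowable / E)) / (2.0 * c)


# Illustrative inputs only (not taken from the patent); c and n are chosen
# inside the stated ranges 3.0 to 6.0 and 3.0 to 5.0.
E_pa = 72e9        # elastic modulus in Pa (hypothetical value)
sigma_u_pa = 5e9   # ultimate stress in Pa (hypothetical value)
eps_max = limit_strain(E_pa, sigma_u_pa, n=4.0, c=4.5)
```

Substituting `eps_max` back into E(1 + cε)ε reproduces the assumed allowable stress, which is a quick way to check the root.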
3. The distributed optical fiber abnormal data restoration model based on the adaptive long-short term memory as claimed in claim 1, wherein the specific method of the step 2) is as follows:
in the self-adaptive LSTM prediction module, the training efficiency of the model is improved by utilizing the self-adaptive input time period length L and the improved custom Loss function Loss;
2-1) in the time-sequence prediction, if the time interval between the current prediction point and the previous prediction point is l, then after each window movement the improved adaptive window length L is as follows:
where $d_k$ is the initial input period window at sampling point k, which satisfies:
where $L_k$ is the number of windows of consecutive missing data at sampling point k; $L_{sum}$ is the sum of the missing windows over all sampling points; $H_k$ is the number of abnormal values at sampling point k on the optical fiber; and T is the total sampling time;
2-2) building on a standard regularization term, the segmented input period length L and the window step distance n are combined by means of a two-dimensional array, and the training loss function Loss of the LSTM is defined so as to improve learning efficiency; with L the length of the current segmentation window, n the number of sample-point steps in the prediction set, and n < L, the input and output of the hidden layer form a two-dimensional array of dimension (L−n, n); according to the error calculation formula, the loss function Loss in the training process is defined as:
where i is the index of the sampling point; the mean term is the average over the segmentation window; $y_i$ is the monitored value; L is the input period length; n is the window step distance; R(ω) is a regularization term that limits interference noise in model learning; and β is its controllable weight factor constant.
4. The distributed optical fiber abnormal data restoration model based on the adaptive long-term and short-term memory as claimed in claim 1, wherein in the adaptive LSTM prediction module, an average error coefficient δ is defined to correct the actual LSTM predicted value and thereby reduce the iteration error of the optical fiber data, δ being calculated, based on the position variable, from the differences between the monitored values of sampling points at adjacent positions over the n steps preceding the time t to be predicted; over n steps, the average error coefficient δ satisfies:
where $\beta_i$ is the proposed weight factor; n is the step distance; i is the number of shifts of the monitored value at the previous time; j is the number of shifts of the monitored value at the subsequent time; $y_{t-i}$ is the monitored value at the previous time; and $y_{t-j}$ is the monitored value at the subsequent time; the actual predicted value output at time t is:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010959309.9A CN112235043B (en) | 2020-09-14 | 2020-09-14 | Distributed optical fiber abnormal data restoration device based on self-adaptive long-term and short-term memory |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112235043A (en) | 2021-01-15 |
CN112235043B (en) | 2022-12-23 |
Family
ID=74116351
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010959309.9A Active CN112235043B (en) | 2020-09-14 | 2020-09-14 | Distributed optical fiber abnormal data restoration device based on self-adaptive long-term and short-term memory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112235043B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016101690A1 (en) * | 2014-12-22 | 2016-06-30 | 国家电网公司 | Time sequence analysis-based state monitoring data cleaning method for power transmission and transformation device |
CN109343505A (en) * | 2018-09-19 | 2019-02-15 | 太原科技大学 | Gear method for predicting residual useful life based on shot and long term memory network |
CN109816008A (en) * | 2019-01-20 | 2019-05-28 | 北京工业大学 | A kind of astronomical big data light curve predicting abnormality method based on shot and long term memory network |
CN110334726A (en) * | 2019-04-24 | 2019-10-15 | 华北电力大学 | A kind of identification of the electric load abnormal data based on Density Clustering and LSTM and restorative procedure |
CN110689075A (en) * | 2019-09-26 | 2020-01-14 | 北京工业大学 | Fault prediction method of self-adaptive threshold of refrigeration equipment based on multi-algorithm fusion |
CN110879253A (en) * | 2018-09-05 | 2020-03-13 | 哈尔滨工业大学 | Steel rail crack acoustic emission signal detection method based on improved long-time and short-time memory network |
CN110995339A (en) * | 2019-11-26 | 2020-04-10 | 电子科技大学 | Method for extracting and identifying time-space information of distributed optical fiber sensing signal |
CN111563706A (en) * | 2020-03-05 | 2020-08-21 | 河海大学 | Multivariable logistics freight volume prediction method based on LSTM network |
Non-Patent Citations (1)
Title |
---|
Xu Ning et al.: "Research on an Improved LSTM Deformation Prediction Model", Journal of Jiangxi University of Science and Technology * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113377744A (en) * | 2021-06-16 | 2021-09-10 | 北京建筑大学 | Method and device for reconstructing structural anomaly monitoring data with environment temperature correlation |
CN113791275A (en) * | 2021-08-30 | 2021-12-14 | 国网福建省电力有限公司 | Method and system for repairing single-phase harmonic data loss |
CN113791275B (en) * | 2021-08-30 | 2022-12-06 | 国网福建省电力有限公司 | Method and system for repairing single-phase harmonic data loss |
CN116760466A (en) * | 2023-08-23 | 2023-09-15 | 青岛诺克通信技术有限公司 | Optical cable positioning method and system |
CN116760466B (en) * | 2023-08-23 | 2023-11-28 | 青岛诺克通信技术有限公司 | Optical cable positioning method and system |
CN117451113A (en) * | 2023-12-22 | 2024-01-26 | 中国电建集团华东勘测设计研究院有限公司 | Self-elevating platform spud leg structure health monitoring system based on optical fiber sensing |
CN117451113B (en) * | 2023-12-22 | 2024-03-26 | 中国电建集团华东勘测设计研究院有限公司 | Self-elevating platform spud leg structure health monitoring system based on optical fiber sensing |
Also Published As
Publication number | Publication date |
---|---|
CN112235043B (en) | 2022-12-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||