CN113988177A - Water quality sensor abnormal data detection and fault diagnosis method - Google Patents

Water quality sensor abnormal data detection and fault diagnosis method

Info

Publication number
CN113988177A
Authority
CN
China
Prior art keywords
data
hyperplane
network
action
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111255726.6A
Other languages
Chinese (zh)
Inventor
蔡倩倩
朱雅璐
孟伟
麦达明
鲁仁全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202111255726.6A
Publication of CN113988177A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a water quality sensor abnormal data detection and fault diagnosis method, which comprises the following steps: collecting N unlabeled sample water quality sensor data sets and preprocessing them; constructing a deep reinforcement learning network comprising an Alex convolutional neural network and an Actor_Critic network; establishing the environment with which the agent in reinforcement learning interacts, and setting the action and the obtained return of each interaction between the agent and the environment; inputting sample data into the deep reinforcement learning network and performing iterative training until the total return value converges stably, then extracting the network model parameters and saving the optimal model; inputting the unlabeled sample sensor data set to be detected into the model to generate a plurality of hyperplanes; dividing the data set into positive and negative regions of different degrees; detecting data points appearing in the negative region of a hyperplane of lower accuracy and regarding them as abnormal data; and recording the corresponding sensor when data points appear in the negative region of a hyperplane of lower accuracy multiple times, and judging that the sensor may have failed.

Description

Water quality sensor abnormal data detection and fault diagnosis method
Technical Field
The invention relates to the field of abnormal data detection of water quality sensors, in particular to a water quality sensor abnormal data detection and fault diagnosis method based on reinforcement learning.
Background
Environmental protection depends on water resource protection, and water resource protection depends on water pollution control. In water pollution prevention and treatment, water quality sensors are mainly used to detect the important indices that reflect the degree of water pollution, so that the pollution level can be monitored. The key to water pollution detection is to ensure the accuracy and validity of the sensor data; therefore, detecting abnormal data in the raw data collected by the sensors is particularly important. Sensor abnormal data refers to data that are inconsistent with the majority of the data in a data set or deviate from normal behavior patterns. A common detection approach uses probability statistical models: probability gives the distribution of the population from which sample properties are deduced, and statistics verifies the hypothesis about the population distribution from the samples. Machine learning methods, among which clustering and support vector machine methods are common, are characterized by strong generalization capability of the model, but they usually require data samples with historical abnormal data labels. However, most abnormal data in actual production are unlabeled. Therefore, there is a need for an abnormal data detection method for unlabeled data samples.
The convolutional neural network is a feedforward neural network mostly applied in the field of image processing; its convolutional layers mainly extract the characteristics of the input data and abstract the implicit relations in the raw data through convolution kernels, while the pooling layers mainly perform down-sampling, i.e. reduce the dimensionality of the feature data and reduce overfitting. Reinforcement learning is a field of machine learning in which an agent performs actions in an environment, obtains corresponding rewards and punishments, and gradually iterates and optimizes under this stimulus to achieve the maximum benefit, i.e. it learns habitual behaviors that pursue the maximum expected benefit; the approach has broad applicability.
Disclosure of Invention
The invention aims to provide a water quality sensor abnormal data detection and fault diagnosis method, which utilizes a convolutional neural network to extract data difference characteristics of unlabeled sample data, constructs an environment based on probability density as a standard, and classifies the sample data by reinforcement learning so as to achieve the purposes of water quality sensor abnormal data detection and fault diagnosis.
In order to realize the task, the invention adopts the following technical scheme:
a water quality sensor abnormal data detection and fault diagnosis method comprises the following steps:
step 1, collecting N unlabeled sample water quality sensor data sets {D_1, D_2, ..., D_N} and preprocessing them, each data set D_i comprising detection data of m time segments, D_i = [V_1, V_2, ..., V_m]; wherein V_i is a multidimensional data point and represents the water quality condition detected by the water quality sensor in a certain time period;
the data preprocessing process comprises the following steps:
for a single data set D_i, randomly extracting r² vectors and synthesizing n r×r-dimensional tensors, one for each of the n single-index dimensions contained, which are used as the initial state S of a single sampling episode of deep reinforcement learning;
step 2, constructing a deep reinforcement learning network
Step 2.1, the deep reinforcement learning network comprises an Alex convolutional neural network and an Actor _ Critic network, the Alex convolutional neural network is used for extracting data difference characteristics between data points, and the Actor _ Critic network is used as a decision and evaluation network;
the Alex convolutional neural network comprises five convolutional layers from front to back, each convolutional layer is provided with a convolution kernel, an activation function ReLU and a pooling layer are arranged between adjacent convolutional layers, and a smoothing layer Flatten is connected behind the last pooling layer to realize the transition from the convolutional layers to the fully-connected layers; an adaptive average pooling layer is connected between the convolutional neural network and the Actor_Critic network;
the Actor_Critic network comprises a decision network Actor and an evaluation network Critic; the decision network Actor comprises two output layers which respectively output the parameter mean value μ and the parameter standard deviation σ, forming the Gaussian probability distribution used to generate the action and increasing the exploration capability of the action; wherein a is the output action, expressed as the weight W and the deviation b of the generated hyperplane; the evaluation network Critic comprises two hidden layers and an output layer and outputs the evaluation value function of state S; the larger the evaluation value function, the more optimal the state S;
step 2.2, the decision network Actor outputs n+1 mean values μ and standard deviations σ, where n is the dimension of a multidimensional data point V_i, i.e. the number of measurement indices of the sensor data; n of the Gaussian probability distributions generate the weight W = [W_1, W_2, ..., W_n] of the hyperplane U, and one Gaussian probability distribution generates the deviation b of the hyperplane U, with U = [W_1, W_2, ..., W_n]·V_i + b; the hyperplane U divides the data set into a positive region, a negative region and a hyperplane region, wherein the positive region contains normal data points, the negative region contains abnormal data points, and the hyperplane region contains the points on the hyperplane;
step 3, establishing an environment for interaction of the agent in reinforcement learning according to the probability density characteristics of the multidimensional data points, and setting the action and the obtained return of each interaction of the agent and the environment; the action is a hyperplane for classifying the multi-dimensional data points each time, and the obtained return is a value for measuring the quality of the classification effect of the hyperplane generated at this time;
step 4, inputting sample data into the deep reinforcement learning network and performing iterative training until the total return value converges stably, then extracting the network model parameters and saving the optimal model;
step 5, inputting a data set of the unlabeled sample sensor to be detected into a model to generate a plurality of hyperplanes;
according to different hyperplanes, dividing a label-free sample sensor data set into positive and negative areas with different degrees: the positive area is normal data, and the negative area is abnormal data; detecting data points appearing in a negative area of the hyperplane with low accuracy, and regarding the data points as abnormal data; and recording the corresponding sensor when the data point appears in the negative area of the hyperplane with lower accuracy for multiple times, and judging that the sensor possibly fails.
Furthermore, in the Alex convolutional neural network, a convolution kernel of dimension 5×n is arranged in each convolutional layer; the convolutional layer output dimension formula is
o = (w - k + 2p) / s + 1,
where w is the input size r (the input being n tensors of r×r dimension), k is the convolution kernel dimension 5, p is the padding 0, and s is the step size 1; the pooling layer output dimension formula is
o = (w - k) / s + 1,
where w is the convolutional layer output dimension, k is the pooling window dimension 3, and s is the step size 1.
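For illustration only, these two output-size formulas can be checked with a short Python sketch; the function names and the example value r = 40 are assumptions of the sketch, not part of the invention:

def conv_out(w: int, k: int = 5, p: int = 0, s: int = 1) -> int:
    """Convolution output size: o = (w - k + 2p) / s + 1."""
    return (w - k + 2 * p) // s + 1

def pool_out(w: int, k: int = 3, s: int = 1) -> int:
    """Pooling output size: o = (w - k) / s + 1."""
    return (w - k) // s + 1

# Five conv+pool stages as described: each stage shrinks the r x r map by 6,
# so after five stages the spatial size is r - 30 and the flattened
# dimension is (r - 30) ** 2.
r = 40  # assumed example window size
w = r
for _ in range(5):
    w = pool_out(conv_out(w))
print(w, (r - 30) ** 2 == w * w)  # prints: 10 True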
Further, the establishing an environment for interaction of the agent in reinforcement learning according to the probability density characteristic of the multidimensional data points, and setting actions and obtained rewards of each interaction of the agent and the environment, includes:
step 3.1, a Gaussian function is adopted as the window function of the kernel density estimation method, so that sample points closer to the center of the sample region receive a larger counting weight; the probability density estimate is therefore
p_i = (1/m) Σ_{j=1}^{m} (1/(h√(2π))^n) exp(-||V_i - V_j||² / (2h²)),   (Equation 1)
where h is the window width and V̄ = (1/m) Σ_{j=1}^{m} V_j represents the mean value;
the Mahalanobis distance between multidimensional data points is
d_ij = √((V_i - V_j)^T S^(-1) (V_i - V_j)),
where S is the covariance matrix
S = (1/m) Σ_{j=1}^{m} (V_j - V̄)(V_j - V̄)^T;
the probability density estimate of Equation 1 can then be written as
p_i = (1/m) Σ_{j=1}^{m} (1/(h√(2π))^n) exp(-d_ij² / (2h²)),   (Equation 2)
where d_ij is the Mahalanobis distance between the multidimensional data points V_i and V_j;
step 3.2, introducing a segmentation function:
f(a) = 1 if a > 0, f(a) = 0 if a = 0, f(a) = -1 if a < 0;
that is, when the function input a is larger than 0 the output is 1; when the input equals 0 the output is 0; when the input is less than 0 the output is -1; for each data point vector V_i (i = 1, 2, 3, ..., m) of the selected data set D_i = [V_1, V_2, ..., V_m], calculate f([W_1, W_2, ..., W_n]·V_i + b); when its value is 1, the multidimensional data point V_i is stored in the positive region F+ and its label is set to 1; when the value is -1, it is stored in the negative region F- and the label is set to -1; when the value is 0, the data point lies on the hyperplane and is stored in the hyperplane region F; these calculations are performed by the environment when the agent, i.e. the network, interacts with the environment;
step 3.3, setting the reward of the positive region
||R_1|| = Σ_{i=1}^{m} (p_i - ζ),
i.e. the sum of the relative probability densities of m multidimensional data points randomly drawn from the positive region, where p_i is given by Equation 2 and ζ is a set density constant; when the probability density of a data point is greater than ζ a reward is obtained, otherwise a penalty;
step 3.4, setting the penalty of the negative region
||R_2|| = K_p Σ_{i=1}^{k} p_i,
i.e. the amplified sum of the probability densities of k data points randomly drawn from the negative region, where p_i is obtained from Equation 2 and K_p is a magnification factor;
step 3.5, the distance between two successive hyperplanes is
D = |W·x_last + b| / ||W||,
where x_last is a point on the previous hyperplane, i.e. it satisfies W_last·x_last + b_last = 0, W is the weight of the current hyperplane, W_last is the weight of the previous hyperplane, b_last is the deviation of the previous hyperplane, and f denotes the segmentation function;
step 3.6, the number of data points lying on the hyperplane is d, i.e. d is the length of the hyperplane region;
step 3.7, setting the penalty of the hyperplane ||R_3|| = D + d; the smaller the transition of the hyperplane (i.e. the smaller D and d), the smaller the penalty;
step 3.8, setting the return of a single action
Reward = ||R_1|| + ||R_2|| + ||R_3||   (Equation 3)
Further, the iterative training on the input sample data until the total return value converges stably, with extraction and saving of the network model parameters, includes:
step 4.1, after preprocessing, the N collected unlabeled sample sensor data sets {D_1, D_2, ..., D_N} randomly and cyclically enter network training, wherein each cycle is one episode;
step 4.2, during a single loop iteration, the data preprocessing of the data set generates the initial state S of a single round; the initial state S is input into the network, which generates an action, i.e. a hyperplane U, dividing the data set into the positive region F+, the negative region F- and the hyperplane region; C data points are randomly selected to calculate the Reward; from the data points of the positive region F+, r² data points are randomly picked and preprocessed to generate the next state S', which is input into the whole network, and the Actor network finally outputs a new hyperplane U', new positive and negative regions and a new Reward; actions are obtained from states and new states from actions continuously until the maximum number of steps max_ep_step of the single round is reached;
step 4.3, storing the state, the action and the return of each step, and training the network after the sampling of one round is finished;
step 4.4, obtaining the actual value function of each step by the Monte Carlo method, G_t = R_t + γR_{t+1} + γ²R_{t+2} + ... + γ^(max_ep_step-t) R_{max_ep_step}, where γ is the discount factor and R_t is the return obtained after the t-th action, given by Equation 3;
step 4.5, updating the weights of the whole network according to the optimization of the action strategy and the fitting of the evaluation network to the actual value function;
step 4.6, calculating the total return R_all of the new round generated by the updated network and continuing training according to step 4.2;
step 4.7, cyclically sampling and training at random over the N unlabeled sample sensor data sets {D_1, D_2, ..., D_N};
step 4.8, when the iterative training reaches stable convergence of the total return R_all, saving the network model; according to the gradient descent principle the total return is then maximal, i.e. the model is optimal.
Further, in step 4.5, the loss function of the action strategy optimization is L_a = E(log π_θ(a|s) · V(s, a)), where E denotes expectation, π_θ(a|s) is the action strategy probability distribution generated at each step and V(s, a) is the output evaluation value of the evaluation network Critic; the loss function of the value function fitting is L_c = (G_t - V(s, a))².
Further, in step 5 the single-round maximum of max_ep_step hyperplanes {U_1, U_2, ..., U_max_ep_step} is generated; owing to the exploratory nature of reinforcement learning, the accuracies of the hyperplanes satisfy U_1 < U_2 < ... < U_max_ep_step; the hyperplanes of lower accuracy are determined by setting an accuracy threshold.
Compared with the prior art, the invention has the following technical characteristics:
1. The method takes the probability characteristic of the sample data as the standard, utilizes the Alex convolutional neural network to extract the data difference characteristics between data points, and uses the Actor_Critic network as the decision and evaluation network; by iterating the network weights and optimizing the action strategy it generates an optimal group of classification hyperplanes and classifies the data into normal data, abnormal data, secondary abnormal data and the like. This solves the problem of label-free training on data collected in actual engineering, realizes a more accurate division of the data, and achieves the purposes of monitoring abnormal data of the water quality sensor and diagnosing whether the sensor fails.
2. The method is used for training a label-free sample, and optimizing the model based on probability density distribution of data points, so that detection data can be effectively classified, and further, detection and fault diagnosis of abnormal data are realized; the invention can realize universal applicability and generalization under the condition of sufficient training samples.
Drawings
FIG. 1 is a schematic illustration of pre-processing a water quality sensor data set;
FIG. 2 is a network architecture diagram of the method of the present invention;
fig. 3 is a schematic flow chart of network training in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, specific technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings.
Referring to the attached drawings, the invention provides a water quality sensor abnormal data detection and fault diagnosis method, which comprises the following steps:
Step 1, collecting N unlabeled sample water quality sensor data sets {D_1, D_2, ..., D_N} and preprocessing them, as shown in FIG. 1; each data set D_i (i = 1, 2, ..., N) contains detection data of m time segments, i.e. D_i = [V_1, V_2, ..., V_m], and V_i (i = 1, 2, ..., m) is a multidimensional data point representing the water quality condition detected by the water quality sensor in a certain time period, with V_i = [x_1, x_2, ..., x_n]^T, where x_i (i = 1, 2, ..., n) is the detection data of a single index in that time period; the detection data of the single indices are unlabeled samples.
The data preprocessing process comprises the following steps:
For a single data set D_i, r² vectors V_j = [x_1, x_2, ..., x_n]^T are randomly extracted and synthesized into n r×r-dimensional tensors, which contain the n single-index dimensions and serve as the initial state S of a single sampling episode of deep reinforcement learning, see FIG. 1; the data of the same index in different time periods are thereby synthesized into an r×r matrix, which facilitates the extraction of data difference characteristics.
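A minimal numpy sketch of this preprocessing step is given below; the function name build_initial_state and the shapes assumed for D_i are illustrative assumptions, not a prescribed implementation:

import numpy as np

def build_initial_state(D: np.ndarray, r: int, rng=None) -> np.ndarray:
    """D has shape (m, n): m time segments, n measurement indices.
    Randomly draw r*r data points and stack them, per index, into n r x r
    matrices, giving the initial state S of shape (n, r, r)."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = D.shape
    idx = rng.choice(m, size=r * r, replace=(r * r > m))
    V = D[idx]                   # (r*r, n) randomly drawn multidimensional points
    return V.T.reshape(n, r, r)  # one r x r matrix per index dimension

# usage: D_i = np.random.rand(500, 6); S = build_initial_state(D_i, r=8)  -> shape (6, 8, 8)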
Step 2, constructing a deep reinforcement learning network
Step 2.1, the deep reinforcement learning network comprises an Alex convolutional neural network and an Actor _ Critic network, the Alex convolutional neural network is used for extracting data difference characteristics between each data point, and the Actor _ Critic network is used as a decision and evaluation network, wherein:
The Alex convolutional neural network designed in this scheme comprises five convolutional layers from front to back, as shown in FIG. 2; each convolutional layer is provided with a convolution kernel of dimension 5×n; an activation function ReLU and a pooling layer are arranged between adjacent convolutional layers, and a smoothing layer Flatten is connected behind the last pooling layer to turn the multidimensional data into one dimension and realize the transition from the convolutional layers to the fully-connected layer. The convolutional layer output dimension formula is
o = (w - k + 2p) / s + 1,
where w is the input size r (the input being n tensors of r×r dimension), k is the convolution kernel dimension 5, p is the padding 0, and s is the step size 1; the pooling layer output dimension formula is
o = (w - k) / s + 1,
where w is the convolutional layer output dimension, k is the pooling window dimension 3, and s is the step size 1. The n r×r-dimensional tensors input to the Alex convolutional network finally yield the output dimension of the smoothing layer of the Alex convolutional neural network, O_dim = (r - 30)². An AdaptiveAvgPool layer, i.e. an adaptive average pooling layer, is connected between the convolutional neural network and the Actor_Critic network, preserving the integrity of the information.
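A hedged PyTorch sketch of such a five-stage feature extractor is given below; the single output channel per convolutional layer and the module name are assumptions of the sketch (the patent fixes only the 5-wide kernels, 3-wide pooling windows, stride 1 and zero padding), and the adaptive average pooling layer placed before the Actor_Critic network is omitted here:

import torch
import torch.nn as nn

class AlexLikeExtractor(nn.Module):
    """Five conv(k=5, s=1, p=0) + ReLU + pool(k=3, s=1) stages, then Flatten.
    Each stage shrinks an r x r map to (r-6) x (r-6); after five stages (r-30) x (r-30)."""
    def __init__(self, n_indices: int, channels: int = 1):
        super().__init__()
        layers, in_ch = [], n_indices
        for _ in range(5):
            layers += [nn.Conv2d(in_ch, channels, kernel_size=5, stride=1, padding=0),
                       nn.ReLU(),
                       nn.MaxPool2d(kernel_size=3, stride=1)]
            in_ch = channels
        layers.append(nn.Flatten())
        self.net = nn.Sequential(*layers)

    def forward(self, x):          # x: (batch, n_indices, r, r)
        return self.net(x)         # (batch, (r-30)**2 * channels)

# usage: AlexLikeExtractor(n_indices=6)(torch.rand(1, 6, 40, 40)).shape -> (1, 100)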
The Actor_Critic network comprises a decision network Actor and an evaluation network Critic. The decision network Actor comprises two output layers which output the parameters μ and σ respectively, forming the Gaussian probability distribution used to generate the action:
π(a|s) = (1/(σ√(2π))) exp(-(a - μ)² / (2σ²)),
which increases the exploration capability of the action; here a represents the output action, in the present invention the weight W and the deviation b of the generated hyperplane. The evaluation network Critic comprises two hidden layers and an output layer and outputs the evaluation value function V_π(S) of the state S; the larger V_π(S), the more optimal this state S.
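A minimal PyTorch sketch of the Actor and Critic heads described above; the hidden-layer size and the Softplus used to keep σ positive are assumptions not fixed by the patent:

import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, feat_dim: int, n_indices: int, hidden: int = 128):
        super().__init__()
        out_dim = n_indices + 1                      # n weights W plus one deviation b
        self.mu = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                nn.Linear(hidden, out_dim))
        self.sigma = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, out_dim), nn.Softplus())
        self.critic = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))  # evaluation value of state S

    def forward(self, feat):
        mu, sigma = self.mu(feat), self.sigma(feat) + 1e-5
        dist = torch.distributions.Normal(mu, sigma)   # Gaussian policy over [W, b]
        action = dist.sample()                         # hyperplane parameters
        return action, dist.log_prob(action).sum(-1), self.critic(feat)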
Step 2.2, the data set of step 1 is preprocessed to generate the initial state S of a single episode, where the dimension of S is n×r×r.
The decision network Actor generates a hyperplane U (U = [W_1, W_2, ..., W_n]·V_i + b; a hyperplane is an (n-1)-dimensional subspace of an n-dimensional linear space, which divides the linear space into two disjoint parts, and is used in the present invention to separate abnormal from non-abnormal data points in the multidimensional space). The hyperplane generated by the Actor_Critic network partitions the data set into a positive region, a negative region and a hyperplane region, wherein the positive region contains the normal data points, the negative region contains the abnormal data points, and the hyperplane region contains the points on the hyperplane.
The decision network Actor outputs n+1 Gaussian probability distributions, i.e. n+1 mean values μ and standard deviations σ, where n is the dimension of a multidimensional data point V_i, i.e. the number of measurement indices of the sensor data; n of the Gaussian probability distributions generate the weight W = [W_1, W_2, ..., W_n] of the hyperplane U, and one Gaussian probability distribution generates the deviation b of the hyperplane U, i.e. U = [W_1, W_2, ..., W_n]·V_i + b, where V_i denotes the water quality condition detected by the water quality sensor in a certain time period, and the hyperplane is given by [W_1, W_2, ..., W_n]·V_i + b = 0.
Step 3, according to the probability density characteristic of the multidimensional data points, i.e. V_i in the invention, the environment with which the agent in reinforcement learning interacts is established, and the action and the obtained return of each interaction between the agent and the environment are set. In the invention, the action refers to the hyperplane used to classify the data points at each step, and the obtained return refers to a value measuring the quality of the classification effect of the hyperplane generated at this step. The environment is relatively independent of the network: the network outputs actions to interact with the environment and outputs the evaluation of the action interaction.
Step 3.1, since the data points V_i are unlabeled and the distribution form of the data is unknown, a nonparametric estimation method is adopted, namely the kernel density estimation method, i.e. the Parzen window density estimation method. A Gaussian function is adopted as the window function of the kernel density estimation, i.e. sample points closer to the center of the sample region receive a larger counting weight, so the probability density estimation formula is
p_i = (1/m) Σ_{j=1}^{m} (1/(h√(2π))^n) exp(-||V_i - V_j||² / (2h²)),   (Equation 1)
where h is the window width and V̄ = (1/m) Σ_{j=1}^{m} V_j indicates the mean value.
The dimension of the sensor detection data increases with the number of indices, presenting a high-dimensional state, and very large correlations exist between the data of different dimensions. The Mahalanobis distance rotates the variables according to the principal components — each principal direction is the direction of an eigenvector and the variance in each direction corresponds to the eigenvalue — so that the dimensions become independent, after which the Euclidean distance is used for the calculation. The Mahalanobis distance therefore retains more of the relations between the dimensions than the Euclidean distance.
Therefore, as used in the present invention, the Mahalanobis distance between the multidimensional sample data points is
d_ij = √((V_i - V_j)^T S^(-1) (V_i - V_j)),
where S is the covariance matrix
S = (1/m) Σ_{j=1}^{m} (V_j - V̄)(V_j - V̄)^T.
The probability density estimate of Equation 1 can then be written as
p_i = (1/m) Σ_{j=1}^{m} (1/(h√(2π))^n) exp(-d_ij² / (2h²)),   (Equation 2)
where d_ij is the Mahalanobis distance between the multidimensional data points V_i and V_j and V̄ is the mean value.
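A numpy sketch of the Mahalanobis-distance kernel density estimate of Equations 1 and 2; the window width h, the regularisation added to the covariance matrix and the exact normalisation constant are assumptions, since only densities relative to the threshold ζ are used later:

import numpy as np

def relative_density(V: np.ndarray, h: float = 1.0) -> np.ndarray:
    """V: (m, n) data points. Returns p_i for every point, a Gaussian-window
    Parzen estimate in which the distances are Mahalanobis distances (Equation 2)."""
    m, n = V.shape
    S = np.cov(V, rowvar=False) + 1e-6 * np.eye(n)       # covariance matrix (centered on the mean)
    S_inv = np.linalg.inv(S)
    diff = V[:, None, :] - V[None, :, :]                 # (m, m, n) pairwise differences
    d2 = np.einsum('ijk,kl,ijl->ij', diff, S_inv, diff)  # squared Mahalanobis distances
    kernel = np.exp(-d2 / (2.0 * h * h))
    norm = (h * np.sqrt(2.0 * np.pi)) ** n
    return kernel.sum(axis=1) / (m * norm)               # p_i for each data point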
Step 3.2, a segmentation function is introduced:
f(a) = 1 if a > 0, f(a) = 0 if a = 0, f(a) = -1 if a < 0;
when the function input is greater than 0 the output is 1; when it equals 0 the output is 0; when it is less than 0 the output is -1. For each data point vector V_i (i = 1, 2, 3, ..., m) of the selected data set D_i = [V_1, V_2, ..., V_m], f([W_1, W_2, ..., W_n]·V_i + b) is calculated; when its value is 1, the multidimensional data point V_i is stored in the positive region F+ and the label is set to 1; when the value is -1, it is stored in the negative region F- and the label is set to -1; when the value is 0, the data point lies on the hyperplane and is stored in the hyperplane region F. These calculations are performed by the environment when the agent, i.e. the network, interacts with the environment.
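A small numpy sketch of this partitioning step, using the sign function as the segmentation function f; the helper name partition is illustrative:

import numpy as np

def partition(V: np.ndarray, W: np.ndarray, b: float):
    """Split data points V (m, n) by the hyperplane W . V_i + b using f = sign."""
    side = np.sign(V @ W + b)        # 1 -> positive region, -1 -> negative, 0 -> on the hyperplane
    return V[side > 0], V[side < 0], V[side == 0]   # F+, F-, F (hyperplane region)

# usage: F_pos, F_neg, F_on = partition(V, W, b); labels are +1, -1 and 0 respectively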
Step 3.3, the reward of the positive region is set as
||R_1|| = Σ_{i=1}^{m} (p_i - ζ),
i.e. the sum of the relative probability densities of m multidimensional data points randomly drawn from the positive region, where p_i is given by Equation 2 and ζ is a set density constant; when the probability density of a data point is greater than ζ a reward is obtained, otherwise a penalty. The denser the data points placed in the correct (positive) region, the larger ||R_1|| and the larger the reward.
Step 3.4, the penalty of the negative region is set as
||R_2|| = K_p Σ_{i=1}^{k} p_i,
i.e. the amplified sum of the probability densities of k data points randomly drawn from the negative region, where p_i is obtained from Equation 2 and the magnification factor K_p constrains the degree of density of the negative region. The sparser the data points in the negative region, the smaller ||R_2|| and the smaller the penalty.
Step 3.5, the distance between two successive hyperplanes is
D = |W·x_last + b| / ||W||,
where x_last is a point on the previous hyperplane, i.e. it satisfies W_last·x_last + b_last = 0, W is the weight of the current hyperplane, W_last is the weight of the previous hyperplane, b_last is the deviation of the previous hyperplane, and f denotes the segmentation function. This term prevents model instability caused by jumps of the hyperplane; the smaller D, the better the hyperplane.
Step 3.6, within the same region, the sparser the data points, the smaller the density between them; thus the fewer the data points in the hyperplane region F, the better. Let the number of data points on the hyperplane be d, i.e. d is the length of the hyperplane region.
Step 3.7, the penalty of the hyperplane is set as ||R_3|| = D + d; the smaller the transition of the hyperplane (i.e. the smaller D and d), the smaller the penalty.
Step 3.8, the return of a single action is set as
Reward = ||R_1|| + ||R_2|| + ||R_3||   (Equation 3)
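A hedged numpy sketch of the single-step return: R_1 rewards dense points in the positive region, R_2 penalises density in the negative region (amplified by K_p), and R_3 penalises hyperplane jumps. Because Equation 3 is written with magnitudes, the sign convention used below (reward minus penalties), as well as the constants ζ and K_p, are assumptions of the sketch:

import numpy as np

def step_reward(p_pos, p_neg, W, b, W_last, b_last, n_on_plane,
                zeta: float = 0.1, K_p: float = 2.0) -> float:
    """p_pos / p_neg: relative densities (Equation 2) of points sampled from the
    positive / negative regions; W, b and W_last, b_last are the current and
    previous hyperplane parameters; n_on_plane is d, the hyperplane-region length."""
    R1 = np.sum(p_pos - zeta)                            # positive-region reward
    R2 = K_p * np.sum(p_neg)                             # negative-region penalty
    x_last = -b_last * W_last / np.dot(W_last, W_last)   # a point on the previous hyperplane
    D = abs(np.dot(W, x_last) + b) / np.linalg.norm(W)   # distance between successive hyperplanes
    R3 = D + n_on_plane                                  # hyperplane penalty
    return float(R1 - R2 - R3)                           # assumed sign convention for Equation 3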
Step 4, for the input sample data, iterative training is performed until the total return value converges stably, and the network model parameters are then extracted and the model saved.
Step 4.1, after preprocessing, the N collected unlabeled sample sensor data sets {D_1, D_2, ..., D_N} randomly and cyclically enter network training, so that the network has better robustness. Each cycle is one episode, i.e. one round, which is a Markov chain.
Step 4.2, during a single loop iteration, the data preprocessing of the data set generates the initial state S of a single round. The initial state S is input into the network, which generates an action, i.e. a hyperplane U, dividing the data set into the positive region F+, the negative region F- and the hyperplane region. C data points are randomly selected to calculate the Reward. From the data points of the positive region F+, r² data points are randomly picked and preprocessed to generate the next state S', which is input into the whole network, and the Actor network finally outputs a new hyperplane U', new positive and negative regions and a new Reward. Actions are continuously obtained from states and new states from actions until the maximum number of steps max_ep_step of the single round is reached.
Step 4.3, the state, action and return of each step are stored, and the network is trained after the sampling of one round is finished.
Step 4.4, the actual value function of each step is obtained by the Monte Carlo method: G_t = R_t + γR_{t+1} + γ²R_{t+2} + ... + γ^(max_ep_step-t) R_{max_ep_step}, where γ is the discount factor and R_t is the return obtained after the t-th action, given by Equation 3.
Step 4.5, the weights of the whole network are updated according to the optimization of the action strategy and the fitting of the evaluation network to the actual value function. The loss function of the action strategy optimization is L_a = E(log π_θ(a|s) · V(s, a)), where π_θ(a|s) is the action strategy probability distribution generated at each step and V(s, a) is the output evaluation value of the evaluation network Critic. The loss function of the value function fitting is L_c = (G_t - V(s, a))².
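A short PyTorch sketch of the Monte Carlo returns G_t and the two loss functions of steps 4.4 and 4.5; the minus sign on the actor loss, needed so that gradient descent maximises the expected return, and the detaching of V(s, a) in that term are assumptions of the sketch:

import torch

def monte_carlo_returns(rewards, gamma: float = 0.9):
    """G_t = R_t + gamma*R_{t+1} + ... computed backwards over one episode."""
    G, out = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        out.append(G)
    return torch.tensor(list(reversed(out)))

def actor_critic_losses(log_probs, values, returns):
    """L_a = E[log pi_theta(a|s) * V(s,a)]  (negated here for gradient descent),
       L_c = (G_t - V(s,a))^2."""
    critic_loss = ((returns - values) ** 2).mean()
    actor_loss = -(log_probs * values.detach()).mean()
    return actor_loss, critic_loss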
Step 4.6, the total return R_all of the new round generated by the updated network is calculated, and training continues according to step 4.2.
Step 4.7, the N unlabeled sample sensor data sets {D_1, D_2, ..., D_N} are cyclically sampled and trained at random.
Step 4.8, when the iterative training reaches stable convergence of the total return R_all, the model is saved; according to the gradient descent principle the total return is then maximal, i.e. the model is optimal.
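The outer loop of steps 4.2-4.6 can be summarised by the following hedged Python sketch; the env object with reset() and step() methods (wrapping the partitioning and reward sketches above), and the assumption that states are already delivered as (1, n, r, r) float tensors, are illustrative, not part of the claimed method:

import torch

def run_episode(env, extractor, agent, optimizer, gamma=0.9, max_ep_step=20):
    """One episode of steps 4.2-4.6. `extractor` and `agent` are the
    AlexLikeExtractor and ActorCritic sketches; monte_carlo_returns and
    actor_critic_losses are the sketches from step 4.4-4.5."""
    S = env.reset()                                          # initial state of the episode
    log_probs, values, rewards = [], [], []
    for _ in range(max_ep_step):                             # single-round maximum steps
        action, log_prob, value = agent(extractor(S))
        S, reward = env.step(action)                         # new hyperplane -> new regions, reward
        log_probs.append(log_prob); values.append(value); rewards.append(reward)
    returns = monte_carlo_returns(rewards, gamma)            # step 4.4
    actor_loss, critic_loss = actor_critic_losses(
        torch.cat(log_probs), torch.cat(values).squeeze(-1), returns)
    optimizer.zero_grad()
    (actor_loss + critic_loss).backward()                    # step 4.5: update the whole network
    optimizer.step()
    return float(sum(rewards))                               # total return R_all of this episode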
Step 5, the unlabeled sample sensor data set to be detected is input into the model, and a single round generates max_ep_step hyperplanes {U_1, U_2, ..., U_max_ep_step}; owing to the exploratory nature of reinforcement learning, the accuracies of the hyperplanes satisfy U_1 < U_2 < ... < U_max_ep_step.
According to the different hyperplanes, the unlabeled data samples are divided into positive and negative regions of different degrees: the positive region is normal data and the negative region is abnormal data; data points appearing in the negative region of a hyperplane of lower accuracy are detected and regarded as abnormal data, where the hyperplanes of lower accuracy can be determined by setting an accuracy threshold.
When data points appear in the negative region of a hyperplane of lower accuracy multiple times, the corresponding sensor is recorded and judged to be possibly faulty.
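A numpy sketch of this final detection and diagnosis step; the accuracy threshold, the vote count min_votes and the heuristic mapping from repeatedly flagged data points to a suspect sensor index are assumptions used only for illustration:

import numpy as np

def detect_and_diagnose(V, hyperplanes, accuracies, acc_threshold=0.8, min_votes=3):
    """V: (m, n) points to check; hyperplanes: list of (W, b); accuracies: accuracy of
    each hyperplane. A point falling into the negative region of a lower-accuracy
    hyperplane is flagged as abnormal; points flagged many times point to a possibly
    faulty sensor (here: the index with the largest deviation from the mean)."""
    votes = np.zeros(len(V), dtype=int)
    for (W, b), acc in zip(hyperplanes, accuracies):
        if acc < acc_threshold:                      # hyperplane of lower accuracy
            votes += (V @ W + b < 0).astype(int)     # negative-region membership
    abnormal = votes > 0
    suspect_sensor = None
    repeated = votes >= min_votes
    if repeated.any():
        dev = np.abs(V[repeated] - V.mean(axis=0)).mean(axis=0)
        suspect_sensor = int(np.argmax(dev))         # index of the possibly faulty sensor
    return abnormal, suspect_sensor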
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (6)

1. A water quality sensor abnormal data detection and fault diagnosis method is characterized by comprising the following steps:
step 1, collecting N unlabeled sample water quality sensor data sets {D_1, D_2, ..., D_N} and preprocessing them, each data set D_i comprising detection data of m time segments, D_i = [V_1, V_2, ..., V_m], wherein V_i is a multidimensional data point and represents the water quality condition detected by the water quality sensor in a certain time period;
the data preprocessing process comprises the following steps:
for a single data set D_i, randomly extracting r² vectors and synthesizing n r×r-dimensional tensors, one for each of the n single-index dimensions contained, which are used as the initial state S of a single sampling episode of deep reinforcement learning;
step 2, constructing a deep reinforcement learning network
Step 2.1, the deep reinforcement learning network comprises an Alex convolutional neural network and an Actor _ Critic network, the Alex convolutional neural network is used for extracting data difference characteristics between data points, and the Actor _ Critic network is used as a decision and evaluation network;
the Alex convolutional neural network comprises five convolutional layers from front to back, each convolutional layer is provided with a convolution kernel, an activation function ReLU and a pooling layer are arranged between adjacent convolutional layers, and a smoothing layer Flatten is connected behind the last pooling layer to realize the transition from the convolutional layers to the fully-connected layers; an adaptive average pooling layer is connected between the convolutional neural network and the Actor_Critic network;
the Actor_Critic network comprises a decision network Actor and an evaluation network Critic; the decision network Actor comprises two output layers which respectively output the parameter mean value μ and the parameter standard deviation σ, forming the Gaussian probability distribution used to generate the action and increasing the exploration capability of the action; wherein a is the output action, expressed as the weight W and the deviation b of the generated hyperplane; the evaluation network Critic comprises two hidden layers and an output layer and outputs the evaluation value function of state S; the larger the evaluation value function, the more optimal the state S;
step 2.2, the decision network Actor outputs n+1 mean values μ and standard deviations σ, where n is the dimension of a multidimensional data point V_i, i.e. the number of measurement indices of the sensor data; n of the Gaussian probability distributions generate the weight W = [W_1, W_2, ..., W_n] of the hyperplane U, and one Gaussian probability distribution generates the deviation b of the hyperplane U, with U = [W_1, W_2, ..., W_n]·V_i + b; the hyperplane U divides the data set into a positive region, a negative region and a hyperplane region, wherein the positive region contains normal data points, the negative region contains abnormal data points, and the hyperplane region contains the points on the hyperplane;
step 3, establishing an environment for interaction of the agent in reinforcement learning according to the probability density characteristics of the multidimensional data points, and setting the action and the obtained return of each interaction of the agent and the environment; the action is a hyperplane for classifying the multi-dimensional data points each time, and the obtained return is a value for measuring the quality of the classification effect of the hyperplane generated at this time;
step 4, inputting sample data into the deep reinforcement learning network and performing iterative training until the total return value converges stably, then extracting the network model parameters and saving the optimal model;
step 5, inputting a data set of the unlabeled sample sensor to be detected into a model to generate a plurality of hyperplanes;
according to different hyperplanes, dividing a label-free sample sensor data set into positive and negative areas with different degrees: the positive area is normal data, and the negative area is abnormal data; detecting data points appearing in a negative area of the hyperplane with low accuracy, and regarding the data points as abnormal data; and recording the corresponding sensor when the data point appears in the negative area of the hyperplane with lower accuracy for multiple times, and judging that the sensor possibly fails.
2. The method for detecting the abnormal data and diagnosing the faults of the water quality sensor according to claim 1, wherein in the Alex convolutional neural network a convolution kernel of dimension 5×n is arranged in each convolutional layer; the convolutional layer output dimension formula is
o = (w - k + 2p) / s + 1,
where w is the input size r (the input being n tensors of r×r dimension), k is the convolution kernel dimension 5, p is the padding 0, and s is the step size 1; the pooling layer output dimension formula is
o = (w - k) / s + 1,
where w is the convolutional layer output dimension, k is the pooling window dimension 3, and s is the step size 1.
3. The method for detecting abnormal data and diagnosing faults of a water quality sensor according to claim 1, wherein the establishing of an environment for interaction of an agent in reinforcement learning according to probability density characteristics of multidimensional data points, and the setting of actions and obtained returns of each interaction of the agent and the environment comprises:
step 3.1, a Gaussian function is adopted as the window function of the kernel density estimation method, so that sample points closer to the center of the sample region receive a larger counting weight; the probability density estimate is therefore
p_i = (1/m) Σ_{j=1}^{m} (1/(h√(2π))^n) exp(-||V_i - V_j||² / (2h²)),   (Equation 1)
where h is the window width and V̄ = (1/m) Σ_{j=1}^{m} V_j represents the mean value;
the Mahalanobis distance between multidimensional data points is
d_ij = √((V_i - V_j)^T S^(-1) (V_i - V_j)),
where S is the covariance matrix
S = (1/m) Σ_{j=1}^{m} (V_j - V̄)(V_j - V̄)^T;
the probability density estimate of Equation 1 can then be written as
p_i = (1/m) Σ_{j=1}^{m} (1/(h√(2π))^n) exp(-d_ij² / (2h²)),   (Equation 2)
where d_ij is the Mahalanobis distance between the multidimensional data points V_i and V_j;
step 3.2, introducing a segmentation function:
f(a) = 1 if a > 0, f(a) = 0 if a = 0, f(a) = -1 if a < 0;
that is, when the function input a is larger than 0 the output is 1; when the input equals 0 the output is 0; when the input is less than 0 the output is -1; for each data point vector V_i (i = 1, 2, 3, ..., m) of the selected data set D_i = [V_1, V_2, ..., V_m], calculate f([W_1, W_2, ..., W_n]·V_i + b); when its value is 1, the multidimensional data point V_i is stored in the positive region F+ and its label is set to 1; when the value is -1, it is stored in the negative region F- and the label is set to -1; when the value is 0, the data point lies on the hyperplane and is stored in the hyperplane region F; these calculations are performed by the environment when the agent, i.e. the network, interacts with the environment;
step 3.3, setting the reward of the positive region
||R_1|| = Σ_{i=1}^{m} (p_i - ζ),
i.e. the sum of the relative probability densities of m multidimensional data points randomly drawn from the positive region, where p_i is given by Equation 2 and ζ is a set density constant; when the probability density of a data point is greater than ζ a reward is obtained, otherwise a penalty;
step 3.4, setting the penalty of the negative region
||R_2|| = K_p Σ_{i=1}^{k} p_i,
i.e. the amplified sum of the probability densities of k data points randomly drawn from the negative region, where p_i is obtained from Equation 2 and K_p is a magnification factor;
step 3.5, the distance between two successive hyperplanes is
D = |W·x_last + b| / ||W||,
where x_last is a point on the previous hyperplane, i.e. it satisfies W_last·x_last + b_last = 0, W is the weight of the current hyperplane, W_last is the weight of the previous hyperplane, b_last is the deviation of the previous hyperplane, and f denotes the segmentation function;
step 3.6, the number of data points lying on the hyperplane is d, i.e. d is the length of the hyperplane region;
step 3.7, setting the penalty of the hyperplane ||R_3|| = D + d; the smaller the transition of the hyperplane (i.e. the smaller D and d), the smaller the penalty;
step 3.8, setting the return of a single action
Reward = ||R_1|| + ||R_2|| + ||R_3||   (Equation 3).
4. The method for detecting abnormal data and diagnosing faults of a water quality sensor according to claim 1, wherein the iterative training on the input sample data until the total return value converges stably, with extraction and saving of the network model parameters, comprises the following steps:
step 4.1, after preprocessing, the N collected unlabeled sample sensor data sets {D_1, D_2, ..., D_N} randomly and cyclically enter network training, wherein each cycle is one episode;
step 4.2, during a single loop iteration, the data preprocessing of the data set generates the initial state S of a single round; the initial state S is input into the network, which generates an action, i.e. a hyperplane U, dividing the data set into the positive region F+, the negative region F- and the hyperplane region; C data points are randomly selected to calculate the Reward; from the data points of the positive region F+, r² data points are randomly picked and preprocessed to generate the next state S', which is input into the whole network, and the Actor network finally outputs a new hyperplane U', new positive and negative regions and a new Reward; actions are obtained from states and new states from actions continuously until the maximum number of steps max_ep_step of the single round is reached;
step 4.3, storing the state, the action and the return of each step, and training the network after the sampling of one round is finished;
step 4.4, obtaining the actual value function of each step by the Monte Carlo method, G_t = R_t + γR_{t+1} + γ²R_{t+2} + ... + γ^(max_ep_step-t) R_{max_ep_step}, where γ is the discount factor and R_t is the return obtained after the t-th action, given by Equation 3;
step 4.5, according to the optimization of the action strategy, the fitting of the evaluation network to the actual value function is carried out, and the weight of the whole network is updated;
step 4.6, calculating the total return R_all of the new round generated by the updated network and continuing training according to step 4.2;
step 4.7, cyclically sampling and training at random over the N unlabeled sample sensor data sets {D_1, D_2, ..., D_N};
step 4.8, when the iterative training reaches stable convergence of the total return R_all, saving the network model; according to the gradient descent principle the total return is then maximal, i.e. the model is optimal.
5. The method for detecting abnormal data and diagnosing faults of a water quality sensor according to claim 4, wherein in step 4.5 the loss function of the action strategy optimization is L_a = E(log π_θ(a|s) · V(s, a)), where E denotes expectation, π_θ(a|s) is the action strategy probability distribution generated at each step and V(s, a) is the output evaluation value of the evaluation network Critic; the loss function of the value function fitting is L_c = (G_t - V(s, a))².
6. The method as claimed in claim 1, wherein the single-round maximum of max_ep_step hyperplanes {U_1, U_2, ..., U_max_ep_step} is generated in step 5; owing to the exploratory nature of reinforcement learning, the accuracies of the hyperplanes satisfy U_1 < U_2 < ... < U_max_ep_step; the hyperplanes of lower accuracy are determined by setting an accuracy threshold.
CN202111255726.6A 2021-10-27 2021-10-27 Water quality sensor abnormal data detection and fault diagnosis method Pending CN113988177A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111255726.6A CN113988177A (en) 2021-10-27 2021-10-27 Water quality sensor abnormal data detection and fault diagnosis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111255726.6A CN113988177A (en) 2021-10-27 2021-10-27 Water quality sensor abnormal data detection and fault diagnosis method

Publications (1)

Publication Number Publication Date
CN113988177A true CN113988177A (en) 2022-01-28

Family

ID=79742547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111255726.6A Pending CN113988177A (en) 2021-10-27 2021-10-27 Water quality sensor abnormal data detection and fault diagnosis method

Country Status (1)

Country Link
CN (1) CN113988177A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116166966A (en) * 2023-04-18 2023-05-26 南京哈卢信息科技有限公司 Water quality degradation event detection method based on multi-mode data fusion
CN116336400A (en) * 2023-05-30 2023-06-27 克拉玛依市百事达技术开发有限公司 Baseline detection method for oil and gas gathering and transportation pipeline
CN116336400B (en) * 2023-05-30 2023-08-04 克拉玛依市百事达技术开发有限公司 Baseline detection method for oil and gas gathering and transportation pipeline
CN116451142A (en) * 2023-06-09 2023-07-18 山东云泷水务环境科技有限公司 Water quality sensor fault detection method based on machine learning algorithm

Similar Documents

Publication Publication Date Title
KR102005628B1 (en) Method and system for pre-processing machine learning data
CN113988177A (en) Water quality sensor abnormal data detection and fault diagnosis method
Isa et al. Using the self organizing map for clustering of text documents
CN111553127B (en) Multi-label text data feature selection method and device
CN110009030B (en) Sewage treatment fault diagnosis method based on stacking meta-learning strategy
CN110363230B (en) Stacking integrated sewage treatment fault diagnosis method based on weighted base classifier
CN110940523B (en) Unsupervised domain adaptive fault diagnosis method
CN111353373A (en) Correlation alignment domain adaptive fault diagnosis method
CN111859010B (en) Semi-supervised audio event identification method based on depth mutual information maximization
CN113486578A (en) Method for predicting residual life of equipment in industrial process
CN110766060B (en) Time series similarity calculation method, system and medium based on deep learning
CN110956309A (en) Flow activity prediction method based on CRF and LSTM
CN110826611A (en) Stacking sewage treatment fault diagnosis method based on weighted integration of multiple meta-classifiers
CN112560596A (en) Radar interference category identification method and system
CN115051864B (en) PCA-MF-WNN-based network security situation element extraction method and system
CN114565021A (en) Financial asset pricing method, system and storage medium based on quantum circulation neural network
CN113179276B (en) Intelligent intrusion detection method and system based on explicit and implicit feature learning
CN114003900A (en) Network intrusion detection method, device and system for secondary system of transformer substation
Zhou et al. Credit card fraud identification based on principal component analysis and improved AdaBoost algorithm
Aljundi et al. Continual novelty detection
Isa et al. Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model.
CN111107082A (en) Immune intrusion detection method based on deep belief network
CN116383747A (en) Anomaly detection method for generating countermeasure network based on multi-time scale depth convolution
CN116304941A (en) Ocean data quality control method and device based on multi-model combination
CN115564155A (en) Distributed wind turbine generator power prediction method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination