CN115147645A

CN115147645A - Membrane module membrane pollution detection method based on multi-feature information fusion

Info

Publication number: CN115147645A
Application number: CN202210729844.4A
Authority: CN
Inventors: 石耀科; 王志文; 巩彬; 杜先君; 卢延荣; 李龙; 令国壁
Original assignee: Lanzhou University of Technology
Current assignee: Lanzhou University of Technology
Priority date: 2022-05-25
Filing date: 2022-06-24
Publication date: 2022-10-04

Abstract

The invention discloses a membrane module membrane pollution detection method based on multi-feature information fusion, comprising the following steps: collecting membrane pollution data; classifying and coding the membrane pollution data; expanding the data set by image processing to obtain a time domain picture set, and obtaining a frequency domain data set by image Fourier transform; dividing the time domain picture set and the frequency domain data set into a training set, a test set, and a verification set according to a certain proportion; constructing a CBAM-MIL-CNN network model, and adjusting and optimizing the CBAM-MIL-CNN network model; and inputting the test set data into the optimized CBAM-MIL-CNN network model to identify the membrane pollution state of the membrane module. Applied to membrane module membrane pollution detection, the network model has excellent comprehensive performance and can effectively realize efficient classification and localization of all membrane pollution, so that membrane water treatment improves effluent quality while reducing energy consumption, providing a theoretical basis for actual production.

Description

Membrane module membrane pollution detection method based on multi-feature information fusion

Technical Field

The invention belongs to the technical field of membrane component membrane pollution detection.

Background

The membrane bioreactor (MBR) is a novel sewage and wastewater treatment system that combines membrane separation technology with biological treatment technology. Compared with the traditional activated sludge process, the MBR offers high effluent quality, high organic loading, low sludge production, and ease of automatic control, but its ultrafiltration and microfiltration membranes clog easily and suffer from serious membrane pollution. Membrane pollution is a main cause of membrane module failure. It damages the membrane module to varying degrees, for example by reducing water output efficiency and degrading water quality, and may even require replacement of the membrane module, which increases operating cost. Membrane module membrane fouling diagnosis is therefore gradually becoming a research focus in the water treatment field. The membrane module and its internal parts are related by complex, strongly coupled correlations and also involve uncertain factors and uncertain information, so faults that are random, secondary, concurrent, and propagating occur frequently. Traditional fault diagnosis methods aimed at a single piece of equipment, subsystem, or subunit have difficulty finding the correlations among the constituent units, and the probability of misdiagnosis and missed diagnosis is extremely high. Unlike shallow learning algorithms, deep learning algorithms have a better ability to approximate complex functions. Such algorithms generally comprise multiple hidden layers, which transform the data features layer by layer so as to ensure effective information extraction and feature expression.

In view of the dynamic and nonlinear characteristics of membrane water treatment systems, conventional diagnostic models are inefficient: they ignore potentially valuable features during the offline modeling phase, which results in false alarms and inaccurate interpolation. Methods based on probabilistic principal component analysis (PPCA) have been widely used in the field of process monitoring. However, the conventional PPCA method is limited to linear dimension reduction, and although a nonlinear projection model of PPCA can be obtained by mapping through a Gaussian process, the model still lacks robustness and is easily affected by process noise. Wang et al. therefore propose a nonlinear process monitoring and fault diagnosis method based on a Bayesian Gaussian process latent variable model (Bay-GPLVM), which is more robust because the Bay-GPLVM obtains a posterior distribution rather than point estimates of the latent variables. Baklouti et al. propose a maximum double adaptive exponentially weighted moving average (EWMA) based on particle filtering for fault detection in wastewater treatment processes. The method enhances fault detection by monitoring the state variables of the model and applies the developed strategies to detect mean faults and/or drifts in the system, with the particle filter used to estimate the nonlinear unknown states of the process. However, particle filters can introduce uncertainty into the model when estimating the states and parameters of a time-varying nonlinear system. In addition, most practical systems are multivariate and uncertain, and process models are also unavailable. Therefore, to extend to multivariable systems, it is also necessary to use data-driven models, including latent variable models, to account for uncertainties in the data.
Che Mid et al. provide a fault detection method based on parameter estimation using multiparametric programming. A nonlinear ordinary differential equation model is converted into algebraic equations by the Euler method; a quadratic system of parametric nonlinear algebraic equations is then obtained by formulating the Karush-Kuhn-Tucker (KKT) optimality conditions; the equations representing the KKT conditions are solved symbolically, so that the model parameters are obtained as an explicit function of the measurements; and the estimated model parameters are compared with those of normal operation for fault detection. If the residuals of the model parameters exceed a certain threshold, a fault is detected. On this basis, Che Mid et al. treat the substrate concentration in the influent, the inhibition coefficient, and the specific growth rate as model parameters, obtain them as explicit functions of the measured values using multiparametric programming, and monitor them for fault detection and diagnosis. Ba-Alawi et al. propose an inclusive framework for missing data imputation and sensor self-validation based on the integration of a variational autoencoder (VAE) and a deep residual network architecture (ResNet-VAE). The framework automatically extracts complex features by learning the latent probability distribution of the input data, reduces the risk of vanishing gradients, and improves the reliability of faulty sensors by imputing missing data, detecting anomalies, identifying fault sources, and reconstructing the faulty data to a normal state. Qiao et al. propose a data-knowledge-driven (DKD) diagnostic method for detecting fault points and root cause variables. The DKD model combines the advantages of data-driven and knowledge-based methods, and can extract causal relationships and probabilities between process variables and identify root cause variables from latent fault variables, thereby improving diagnostic performance.
In order to ensure process safety and effluent quality, Han et al. propose an intelligent fault detection (IFD) method based on a self-organizing type-2 fuzzy neural network (SOT2FNN) and an intelligent identification method, which is used to detect and identify different types of faults. Based on a data-driven model and an intelligent recognition algorithm, together with an information transmission intensity algorithm and an adaptive second-order algorithm, the sludge volume index (SVI) is predicted with high precision, and relevant information is extracted by a target relevance recognition algorithm (TRIA) to recognize fault types. However, these approaches focus on the correlation between variables rather than on causal relationships: they indicate only that a set of variables is a possible cause of a fault, and the true root cause variable of the fault cannot be found.

In recent years, deep learning has been a breakthrough in modern artificial intelligence. It can automatically learn valuable features from original feature sets and even from raw data, which largely removes the dependence on advanced signal processing, manual feature extraction, and tedious feature selection; with its strong learning and feature extraction capabilities, deep learning is therefore widely applied in fault diagnosis. Zhao et al. propose a fault diagnosis method based on deep belief networks (DBNs) that adaptively extracts features from the original time series signal, which increases flexibility; simulation results show the effectiveness of the method in fault diagnosis. Because the structural parameters of a typical DBN model are determined by the learning rate, Zhang, Liu, and others improve fault diagnosis precision by applying an optimized DBN. Deep learning models need a large amount of data to optimize their parameters and are prone to overfitting. More and more researchers optimize the convolutional neural network (CNN) to simplify the diagnosis process and improve diagnosis efficiency and performance, and have verified the superiority of the CNN on these problems. Zhang et al. process the data using a backward difference strategy and introduce a convolutional neural network with global average pooling (CNN-GAP) for feature extraction and fault classification; experimental results show that the method has advantages in diagnosis accuracy and reliability.
Wang et al. propose a fault detection model for unbalanced data based on wavelet packet decomposition (WPD) and a double convolutional neural network (biCNN), where WPD obtains richer information at various time and frequency scales from the collected fault samples, which helps to address data imbalance. The improved biCNN combines partial and full convolution stages for feature extraction and fault detection. Furthermore, because CNNs cannot autonomously select important channels, attention-based models have been used to ameliorate this problem. CNNs have been optimized to different degrees with Squeeze-and-Excitation Networks (SENet), Selective Kernel Networks (SKNet), Efficient Channel Attention (ECA) modules, and the like. Zhang Gongbin and others realize feature fusion of heterogeneous layers by using the well-complementary heterogeneous layer features in SENet, improving accuracy. Fu et al. add an ECA module to YOLOv4 and verify the effectiveness of the improved algorithm.

In the above documents, when the problem of analog circuit fault is studied, the problem is only from the perspective of time domain or frequency domain, and the fault diagnosis in deep learning has the problems of complex model, difficult extraction of essential features, and the like.

Disclosure of Invention

The invention aims to provide a membrane assembly membrane pollution detection method based on multi-feature information fusion.

Based on the purpose, the invention adopts the following technical scheme:

the membrane component membrane pollution detection method based on multi-feature information fusion comprises the following steps:

1) Collecting membrane pollution data;

2) Classifying and coding the membrane pollution data;

3) Expanding the data set by using an image processing technology to obtain a time domain picture set, and obtaining a frequency domain data set by using image Fourier transform;

4) Dividing the time domain picture set and the frequency domain data set in the step 3) into a training set, a test set and a verification set respectively according to a certain proportion;

5) Building a CBAM-MIL-CNN network model, inputting training set data information into the CBAM-MIL-CNN network model, verifying the error of the CBAM-MIL-CNN network model, inputting the verification set data information into the CBAM-MIL-CNN network model, and adjusting and optimizing the CBAM-MIL-CNN network model;

6) And inputting the data information of the test set into the optimized CBAM-MIL-CNN network model, and identifying the membrane pollution state of the membrane module.

In step 5), the CBAM-MIL-CNN model is composed of a network model unit 1, a network model unit 2, and a pattern recognition unit 3. The network model unit 1 and the network model unit 2 have the same network structure, each comprising an input layer, a convolution layer a, a convolution layer b, a convolution layer c, a convolution layer d, a convolution layer e, and a CBAM module. Activation functions are added to the convolution layer a, the convolution layer b, and the convolution layer e respectively, and batch normalization layers and pooling layers are added in sequence to the convolution layer a, the convolution layer b, and the convolution layer e respectively. The CBAM module is connected to the output ends of the convolution layer a and the convolution layer b, or to the output ends of the batch normalization layers or the pooling layers on the convolution layer a and the convolution layer b. The pattern recognition unit 3 uses fully connected layers to splice and output the feature information of the network model unit 1 and the network model unit 2, and sends the feature information to a softmax classifier for classification and recognition.

A batch normalization layer, a CBAM module, and a pooling layer are added in sequence at the output ends of the convolution layer a and the convolution layer b.

In step 5), time domain picture set information is input into the network model unit 1, and frequency domain data set information is input into the network model unit 2.

In step 5), the activation function is a ReLU activation function, the CBAM-MIL-CNN network model optimization strategy is an AdamW optimizer, and the AdamW optimizer consists of an Adam optimizer and weight decay.

In step 5), the convolution kernel of convolution layer a has a size of 3 × 3, a stride of 4 × 4, and 64 channels, and an output of size 30 × 30 × 64 is obtained after convolution, batch normalization, CBAM module feature extraction, and pooling; the convolution kernel of convolution layer b is 5 × 5 with a stride of 1 × 1 and 128 channels, and an output of size 12 × 12 × 128 is obtained after convolution, batch normalization, CBAM module feature extraction, and pooling; the convolution kernels of convolution layers c and d are 3 × 3 with a stride of 1 × 1 and 256 channels, and an output of size 12 × 12 × 256 is obtained after convolution; the convolution kernel of convolution layer e is 3 × 3 with a stride of 1 × 1 and 64 channels, and an output of size 5 × 5 × 64 is obtained after convolution, batch normalization, and pooling; the pattern recognition unit 3 has 2 fully connected layers, whose numbers of neurons are set to 2048 and 512 respectively.

In step 5), the CBAM module includes a channel attention module and a space attention module.

The membrane component is a serial tubular membrane component or a parallel hollow fiber membrane component.

Compared with the prior art, the invention has the following beneficial effects:

The invention constructs a model (CBAM-MIL-CNN) that combines a multi-input convolutional neural network with a self-attention mechanism and applies it to the detection of membrane module membrane pollution. A convolutional block attention module (CBAM) is added after the batch normalization layer, which effectively reduces the complexity of the model and improves network performance. The network model shows excellent comprehensive performance in membrane pollution diagnosis experiments on a serial tubular membrane device and a parallel hollow fiber membrane module, and can effectively realize efficient classification and localization of all membrane pollution, so that membrane water treatment improves effluent quality while reducing energy consumption, providing a theoretical basis for actual production.

Drawings

FIG. 1, (a) is the MIL-CNN model structure; (b) is the CBAM structure; (c) is a typical structure of the channel attention module CAM; (d) is a typical structure of the spatial attention module SAM;

FIG. 2 is a CBAM-MIL-CNN model of the present invention;

FIG. 3, (a) training set, test set, and validation set test results for the model of the present invention; (b) A loss function of different deep learning optimization algorithms in training parameters;

FIG. 4, (a) is a membrane fouling signal characteristic diagram of the tandem tubular membrane modules; (b) membrane pollution characteristic distribution of the tandem tubular membrane module;

FIG. 5, (a) is an independent membrane pollution diagnosis experiment diagram of the serial tubular membrane; (b) is the relationship between the membrane pollution characteristic loss function and the number of iterations for the serial tubular membrane device;

FIG. 6, (e) is the feature maps of the 9 types of fault data decomposed and extracted by wavelet transform (WTF); (f) is an energy change diagram formed by the features after dimensionality reduction with the LargeVis algorithm;

FIG. 7 is a graph showing the comparison of the diagnostic results of the inputs of the BP, SVM and ELM networks and the deep network model;

FIG. 8, (a) is a membrane fouling signal characteristic diagram of the parallel hollow fiber membrane module; (b) is the membrane pollution characteristic distribution of the parallel hollow fiber membrane module;

FIG. 9, (c) a diagnostic test chart of the contamination of the hollow fiber membrane independent membrane; (d) The relationship between the membrane pollution characteristic loss function and the iteration times of the parallel hollow fiber membrane component is obtained;

FIG. 10, (e) is a wavelet transform energy spectrum of each membrane pollution mode of the hollow fiber membrane; (f) is a feature dimension reduction diagram of the hollow fiber membrane obtained with the LargeVis algorithm;

FIG. 11 is a graph of the accuracy and average operating time of a diagnostic experiment of a parallel hollow fiber membrane module;

fig. 12 shows comparison results of ablation experiment performance.

Detailed Description

The invention is further described below with reference to the following detailed description and the accompanying drawings.

Legacy CNN and BN layers

CNN is a kind of feedforward neural network containing convolution calculation and having a deep structure, and the basic structure is formed by cascading a convolution layer, a pooling layer, an activation layer and a full-connection layer.

1) Convolution is the core of the CNN. In essence it is a mathematical operation, and it is calculated as:

$$x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * K_{ij}^{l} + b_j^{l}\right)$$

where f is the activation function, l is the layer index in the network, K is the convolution kernel, $M_j$ is the index vector of the feature maps in the layer, and $b_j^{l}$ is the bias of the j-th unit of the l-th layer.

The convolutional neural network can effectively extract features. In addition, the local connection and weight sharing mode adopted by the CNN reduces the complexity of the deep network, and reduces the risk of overfitting.
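As an illustration of the convolution formula, the following minimal sketch computes one output feature map from a single input channel, so the sum over the input feature maps reduces to one term. The function name, the toy image, the kernel, and the bias value are hypothetical and are not taken from the invention; ReLU is used for f, and, as in most CNN libraries, the kernel is applied without flipping (cross-correlation).

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (no padding, stride 1), add a bias, apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the window whose top-left corner is (r, c)
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s + bias))  # f = ReLU
        out.append(row)
    return out


# Hypothetical 3 x 3 input and 2 x 2 kernel
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d_valid(image, kernel, bias=5.0)
print(feature_map)
```

Because the same kernel weights are reused at every window position, the number of parameters does not grow with the image size, which is the weight sharing referred to above.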

As a deep neural network becomes deeper, it becomes harder to train and converges more slowly. Adding a batch normalization (BN) layer after the convolutional layer is an effective remedy. BN standardizes the input value of every neuron in each layer to a distribution with a mean of 0 and a variance of 1, which yields larger gradients, effectively prevents vanishing gradients, accelerates learning convergence, and effectively improves training speed.

The forward propagation of the BN layer is as follows:

1) Calculate the sample mean.

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i$$

where m is the number of samples and $x_i$ is the i-th sample.

2) Calculate the sample variance.

$$\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_B\right)^2$$

3) Standardize the sample data.

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}$$

where ε is a small positive constant that ensures the denominator is not 0.

4) Perform translation and scaling.

$$y_i = \gamma \hat{x}_i + \beta$$

where γ and β are learnable parameters.
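The four forward steps of the BN layer can be sketched for a single feature as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the function name, the default values of gamma, beta, and eps, and the toy batch are hypothetical.

```python
import math


def batch_norm_forward(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply the four BN steps to a list of scalar activations."""
    m = len(batch)
    mean = sum(batch) / m                                  # 1) sample mean
    var = sum((x - mean) ** 2 for x in batch) / m          # 2) sample variance
    x_hat = [(x - mean) / math.sqrt(var + eps) for x in batch]  # 3) standardize
    return [gamma * v + beta for v in x_hat]               # 4) scale and shift


# Hypothetical batch of four activations
normalized = batch_norm_forward([1.0, 2.0, 3.0, 4.0])
out_mean = sum(normalized) / len(normalized)
out_var = sum((v - out_mean) ** 2 for v in normalized) / len(normalized)
print(normalized, out_mean, out_var)
```

With gamma = 1 and beta = 0 the output has a mean of approximately 0 and a variance slightly below 1, the small gap being caused by eps in the denominator.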

MIL-CNN model

To make full use of the powerful feature extraction capability of the CNN, a multi-input layer convolutional neural network (MIL-CNN) model is adopted, whose structure is shown in fig. 1 (a). Compared with the traditional CNN, the multi-input layer of the MIL-CNN can combine the time domain information graph and the frequency domain information graph of the fault data, so that feature extraction is more comprehensive and the accuracy of fault diagnosis is effectively improved. The specific steps of the MIL-CNN are as follows:

1) Send the images to model unit 1 (Net 1) and model unit 2 (Net 2) respectively for operations such as convolution and pooling, and propagate the information forward with a forward function;

2) Aggregate the information of Net1 and Net2, and process it with the fully connected layers of model unit 3 (Net 3);

3) Calculate the cross-entropy loss from the labels and the output of the soft-max layer;

4) Back-propagate the loss and update the weights and biases in Net3;

5) Update the parameters in Net 1;

6) Update the parameters in Net 2.
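Steps 2) and 3) can be sketched as follows: the feature vectors of the two branches are concatenated, passed through one fully connected layer, converted to class probabilities by softmax, and scored with the cross-entropy loss. This is a simplified sketch under stated assumptions. The feature values, the weights, and the use of 3 classes with a single fully connected layer are hypothetical (the invention uses two fully connected layers and 9 classes), and the backward pass of steps 4) to 6) is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(feat_time, feat_freq, weights, biases):
    """Concatenate both branch features and apply one fully connected layer."""
    fused = feat_time + feat_freq  # list concatenation = feature splicing
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


# Hypothetical outputs of Net1 (time domain) and Net2 (frequency domain)
feat_time = [0.5, 1.0]
feat_freq = [2.0, 0.0]
weights = [[1.0, 0.0, 0.0, 0.0],   # one row of weights per class
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]
biases = [0.0, 0.0, 0.0]

probs = fuse_and_classify(feat_time, feat_freq, weights, biases)
loss = cross_entropy(probs, label=1)
print(probs, loss)
```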

CBAM module

Accurate extraction of fault features is a prerequisite for improving fault diagnosis accuracy. By default, the convolution and pooling operations of a convolutional neural network treat every channel of the feature map (FM) as equally important, but because the information carried by different channels differs in importance, this assumption is not reasonable. The convolutional block attention module CBAM, modeled on the processing mechanism of the human visual system, ignores unimportant factors and focuses attention on important regions so as to improve classification accuracy. Specifically, each piece of information is assigned a different weight, and the larger the weight, the more important the information. The CBAM structure is shown in fig. 1 (b) and comprises two independent sub-modules, a channel attention module (CAM) and a spatial attention module (SAM). Compared with SE-Net, CBAM adds a parallel global max pooling (GMP) branch to the channel attention module CAM and adds the spatial attention module SAM, so the information obtained is more comprehensive and the importance of the information is distributed more reasonably, which greatly helps to improve subsequent diagnosis precision.

Fig. 1 (c) shows a typical structure of the channel attention module CAM. The input feature map F (H × W × C) is subjected to global max pooling and global average pooling over width and height respectively, giving two feature maps of size 1 × 1 × C, which are then fed into a shared two-layer neural network (MLP). The first layer has C/r neurons (r is the reduction rate) with a ReLU activation function, and the second layer has C neurons. The two MLP outputs are then summed element-wise and passed through a sigmoid activation to generate the final channel attention feature M_c. Finally, M_c is multiplied element-wise with the input feature map F to generate the input feature F' required by the spatial attention module.

CAM:

$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$

where σ denotes the sigmoid function.
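The channel attention computation described above can be sketched in plain Python as follows. The sketch assumes a feature map stored as C channels of H × W nested lists and a bias-free shared MLP; the weights w0 and w1 and the toy feature map (C = 2, r = 2) are hypothetical values chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def shared_mlp(vec, w0, w1):
    """Two-layer MLP shared by both pooled descriptors: C -> C/r (ReLU) -> C."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w0]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w1]


def channel_attention(fmap, w0, w1):
    """Return the channel weights M_c and the reweighted feature map F'."""
    flat = [[v for row in ch for v in row] for ch in fmap]
    avg_desc = [sum(ch) / len(ch) for ch in flat]   # global average pooling
    max_desc = [max(ch) for ch in flat]             # global max pooling
    summed = [a + b for a, b in zip(shared_mlp(avg_desc, w0, w1),
                                    shared_mlp(max_desc, w0, w1))]
    m_c = [sigmoid(s) for s in summed]
    refined = [[[v * w for v in row] for row in ch]
               for ch, w in zip(fmap, m_c)]
    return m_c, refined


# Hypothetical feature map with C = 2 channels of size 2 x 2, reduction rate r = 2
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 2.0]]]
w0 = [[1.0, -1.0]]        # first layer: C/r = 1 neuron
w1 = [[1.0], [-1.0]]      # second layer: C = 2 neurons

m_c, refined = channel_attention(fmap, w0, w1)
print(m_c)
```

In this toy case the first channel receives a weight close to 1 and the second a weight close to 0, so the second channel is almost suppressed in the refined feature map.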

Fig. 1 (d) shows a typical structure of the spatial attention module SAM. The output feature map F' of the channel attention module is used as the input feature map of this module. First, channel-wise global max pooling and global average pooling are performed to obtain two H × W × 1 feature maps, which are then concatenated along the channel dimension. A 7 × 7 convolution (7 × 7 works better than 3 × 3) then reduces the result to a single channel, namely H × W × 1, and a sigmoid generates the spatial attention feature M_s. Finally, M_s is multiplied with the input feature map of the module to obtain the final required feature.

SAM:

$$M_s(F') = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big)$$

where σ denotes the sigmoid function and $f^{7\times 7}$ denotes a convolution with a 7 × 7 kernel.
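The spatial attention computation can be sketched in the same style. The sketch assumes zero padding so that the 7 × 7 convolution keeps the H × W size, and it stacks the max-pooled map before the average-pooled map, following the order in the description; the uniform kernel weights and the toy feature map are hypothetical.

```python
import math


def spatial_attention(fmap, kernel, bias=0.0):
    """fmap: C x H x W nested lists; kernel: 2 x k x k weights with k odd."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    # Pool across channels at every spatial position
    max_map = [[max(fmap[c][y][x] for c in range(C)) for x in range(W)]
               for y in range(H)]
    avg_map = [[sum(fmap[c][y][x] for c in range(C)) / C for x in range(W)]
               for y in range(H)]
    stacked = [max_map, avg_map]  # channel splicing: 2 x H x W
    k = len(kernel[0])
    pad = k // 2
    m_s = []
    for y in range(H):
        row = []
        for x in range(W):
            s = bias
            for ch in range(2):
                for i in range(k):
                    for j in range(k):
                        yy, xx = y + i - pad, x + j - pad
                        if 0 <= yy < H and 0 <= xx < W:  # zero padding
                            s += stacked[ch][yy][xx] * kernel[ch][i][j]
            row.append(1.0 / (1.0 + math.exp(-s)))       # sigmoid
        m_s.append(row)
    refined = [[[fmap[c][y][x] * m_s[y][x] for x in range(W)]
                for y in range(H)] for c in range(C)]
    return m_s, refined


# Hypothetical 2-channel 3 x 3 feature map and a uniform 7 x 7 kernel
fmap = [[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
        [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]]
kernel = [[[0.1] * 7 for _ in range(7)] for _ in range(2)]

m_s, refined = spatial_attention(fmap, kernel)
print(m_s[0][0], refined[0][0][0])
```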

Fourier transform of image

For an image f(x, y) of size M × N pixels, the discrete Fourier transform F(u, v) is given by equation (6):

$$F(u,v) = \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)} \quad (6)$$

where u = 0, 1, 2, 3, …, M-1 and v = 0, 1, 2, 3, …, N-1.

f(x, y) can be recovered from F(u, v) by the inverse Fourier transform, as shown in equation (7):

$$f(x,y) = \frac{1}{MN}\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} F(u,v)\, e^{j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)} \quad (7)$$

where x = 0, 1, 2, 3, …, M-1 and y = 0, 1, 2, 3, …, N-1.

Equations (6) and (7) constitute the two-dimensional discrete Fourier transform pair of an image. Here the variables u and v are the transform (frequency) variables, and x and y are the spatial (image) variables. According to equation (6), F(u, v) is the frequency domain spectrum of the image, and the intensity of the image signal f(x, y) at each frequency point (u, v) can be obtained by extracting the amplitude of this spectrum. The amplitude spectrum, the phase spectrum, and the energy spectrum of the Fourier transform are respectively:

$$|F(u,v)| = \sqrt{R^2(u,v) + I^2(u,v)} \quad (8)$$

$$\varphi(u,v) = \arctan\frac{I(u,v)}{R(u,v)} \quad (9)$$

$$E(u,v) = R^2(u,v) + I^2(u,v) \quad (10)$$

where R(u, v) and I(u, v) are the real and imaginary parts of F(u, v), respectively.
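The transform pair and the three spectra can be checked on a small image with the following direct (non-fast) sketch. It assumes the common convention in which the 1/(MN) factor is placed on the inverse transform; the function names and the 2 × 2 test image are hypothetical. The phase is computed with atan2 rather than a plain arctangent of I/R so that R(u, v) = 0 does not cause a division by zero.

```python
import cmath
import math


def dft2(f):
    """Direct 2-D discrete Fourier transform of an M x N image."""
    M, N = len(f), len(f[0])
    return [[sum(f[x][y] * cmath.exp(-2j * math.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)] for u in range(M)]


def idft2(F):
    """Inverse transform, with the 1/(MN) factor applied here."""
    M, N = len(F), len(F[0])
    return [[sum(F[u][v] * cmath.exp(2j * math.pi * (u * x / M + v * y / N))
                 for u in range(M) for v in range(N)) / (M * N)
             for y in range(N)] for x in range(M)]


def spectra(F):
    """Amplitude, phase, and energy spectra from the real and imaginary parts."""
    amplitude = [[math.sqrt(c.real ** 2 + c.imag ** 2) for c in row] for row in F]
    phase = [[math.atan2(c.imag, c.real) for c in row] for row in F]
    energy = [[c.real ** 2 + c.imag ** 2 for c in row] for row in F]
    return amplitude, phase, energy


# Hypothetical 2 x 2 grayscale image
image = [[1.0, 2.0],
         [3.0, 4.0]]
F = dft2(image)
amplitude, phase, energy = spectra(F)
restored = idft2(F)
print(amplitude)
```

For 256 × 256 pictures a fast Fourier transform routine would normally be used instead, since the direct double sum costs on the order of (MN)² operations.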

Examples

1) Collecting membrane pollution data;

2) Classifying and coding the membrane pollution data;

3) Enlarging the data set by using an image processing technology to obtain a time domain picture set, and obtaining a frequency domain data set by using image Fourier transform;

4) Dividing the time domain picture set and the frequency domain data set in the step 3) into a training set, a test set and a verification set according to a certain proportion;

5) Constructing a CBAM-MIL-CNN network model, inputting training set data information into the CBAM-MIL-CNN network model, verifying errors of the CBAM-MIL-CNN network model, inputting verification set data information into the CBAM-MIL-CNN network model, and adjusting and optimizing the CBAM-MIL-CNN network model;

In step 5), the CBAM-MIL-CNN model is composed of a network model unit 1, a network model unit 2, and a pattern recognition unit 3, as shown in fig. 2. The network model unit 1 (Net 1) and the network model unit 2 (Net 2) have the same network structure. Each comprises an input layer and five convolution layers to which a ReLU activation function is added: convolution layer a (in the figure, convolution layer a + ReLU activation function 1), convolution layer b (in the figure, convolution layer b + ReLU activation function 7), convolution layer c (in the figure, convolution layer c + ReLU activation function 8), convolution layer d (convolution layer d + ReLU activation function 9), and convolution layer e (convolution layer e + ReLU activation function 10). The 256 × 256 time domain picture set information is input into the network model unit 1, and the 256 × 256 frequency domain data set information is input into the network model unit 2. A BN layer 3, a CBAM module 6, and a pooling layer 2 are added in sequence after each of convolution layer a + ReLU activation function 1 and convolution layer b + ReLU activation function 7; the CBAM module 6 comprises a channel attention module and a spatial attention module. A BN layer 3 and a pooling layer 2 are added after convolution layer e + ReLU activation function 10. The pattern recognition unit 3 uses the fully connected layer 4 to splice and output the feature information of the network model unit 1 and the network model unit 2, sends the feature information to a softmax classifier (soft-max layer 5) for classification and recognition, and uses a cross-entropy loss function.

In step 5), the CBAM-MIL-CNN network model optimization strategy is an AdamW optimizer, and the AdamW optimizer consists of an Adam optimizer and weight decay.
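How AdamW combines the Adam update with weight decay can be sketched for a single scalar weight as follows. This is a minimal sketch of the commonly used decoupled form, in which the decay is applied directly to the weight rather than added to the gradient. The function name and all hyperparameter values (lr, beta1, beta2, eps, weight_decay) are assumptions, since the source does not state the values used.

```python
import math


def adamw_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update of a scalar weight; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * weight_decay * w                # decoupled weight decay
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)  # Adam step
    return w, m, v


# First step from w = 1.0 with gradient 2.0: the Adam part moves w by about lr
w_new, m_new, v_new = adamw_step(1.0, 2.0, m=0.0, v=0.0, t=1)

# With a zero gradient only the weight decay acts: w shrinks by lr * weight_decay
w_decay_only, _, _ = adamw_step(1.0, 0.0, m=0.0, v=0.0, t=1)
print(w_new, w_decay_only)
```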

In step 5), the 256 × 256 fault picture sets generated by overlapping sampling are input into the convolution layers for feature extraction, with the parameters set as follows. The convolution kernel of convolution layer a is 3 × 3 with a stride of 4 × 4 and 64 channels, and convolution gives an output of size 62 × 62 × 64; a ReLU activation function follows convolution layer a to retain the effect of convolution layer a and improve the nonlinear expression capability, giving an output of size 62 × 62 × 64, and a BN layer is added to accelerate network convergence, again giving an output of size 62 × 62 × 64. The result is input to a CBAM module for feature extraction and splicing, followed by a pooling layer of size 3 × 3 with a stride of 2 × 2, which reduces the number of parameters and accelerates network learning and gives an output of 30 × 30 × 64. The convolution kernel of convolution layer b is 5 × 5 with a stride of 1 × 1 and 128 channels; convolution gives an output of 26 × 26 × 128, and adding the ReLU activation function and the BN layer gives an output of 26 × 26 × 128. The result is input to a CBAM module for feature extraction and splicing, followed by a pooling layer of size 3 × 3 with a stride of 2 × 2, giving an output of 12 × 12 × 128. The convolution kernels of convolution layer c and convolution layer d are 3 × 3 with a stride of 1 × 1 and 256 channels, and convolution gives an output of size 12 × 12 × 256. The convolution kernel of convolution layer e is 3 × 3 with a stride of 1 × 1 and 64 channels, and convolution gives an output of size 12 × 12 × 64; the following ReLU activation function and BN layer (batch normalization layer) do not change the output size, and a pooling layer of size 3 × 3 with a stride of 2 × 2 is added to give an output of 5 × 5 × 64. The pattern recognition unit 3 has 2 fully connected layers, whose numbers of neurons are set to 2048 and 512 respectively, and finally the softmax layer outputs the probability judgments for the 9 fault types.

Wherein: for the input feature map F (C × H × W), the CBAM module performs global max pooling and global average pooling on each channel respectively, obtaining C values from each. These C values are taken as the input of a fully connected neural network whose intermediate hidden layer is compressed to C/r neurons (r is the compression factor) and whose output layer has C neurons, and a result is obtained for each pooling branch (with a ReLU activation function in the hidden layer and a Sigmoid activation function in the output layer). Global max pooling thus yields a weight of size 1 × C and global average pooling yields a weight of size 1 × C; the corresponding positions of the two 1 × C weight maps are then added, and after a Sigmoid activation function is applied to the result, the channel attention output M_c has a dimension of 1 × C.
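The output sizes listed for convolution layers a to e can be traced with the usual formula floor((n + 2p - k) / s) + 1 for one spatial axis. The sketch below is only an arithmetic check under stated assumptions. A padding of 1 for convolution layers c, d, and e is assumed because it is what keeps the size at 12. For convolution layer a, a 3 × 3 kernel with a stride of 4 and no padding would give 64 under this formula, whereas the stated size of 62 is reproduced by an 11 × 11 kernel; the 11 × 11 value is an assumption used here to match the stated sizes and is not given in the source.

```python
def out_size(n, kernel, stride, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding); kernel 11 and the paddings are assumptions
layers = [
    ("conv a", 11, 4, 0),
    ("pool a", 3, 2, 0),
    ("conv b", 5, 1, 0),
    ("pool b", 3, 2, 0),
    ("conv c", 3, 1, 1),
    ("conv d", 3, 1, 1),
    ("conv e", 3, 1, 1),
    ("pool e", 3, 2, 0),
]

size = 256
sizes = {}
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    sizes[name] = size
    print(name, size)

# Features per branch handed to the fully connected layers: 5 * 5 * 64
flattened = sizes["pool e"] ** 2 * 64

# Size that a 3 x 3 kernel with stride 4 and no padding would give for layer a
size_with_3x3 = out_size(256, 3, 4)
print(flattened, size_with_3x3)
```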

CBAM-MIL-CNN model structure parameter

When the model is built, stacking several small convolution kernels works far better than using one large convolution kernel alone, and it greatly reduces the parameter count and computational complexity while leaving the connectivity unchanged. The kernels should not, of course, be made as small as possible without limit; the invention selects a number of relatively small convolution kernels to perform the convolution. Deep learning models are typically trained by stochastic gradient descent algorithms, of which there are many variants, such as Adam, RMSProp and Adadelta. These algorithms require a learning rate to be set in advance. The learning rate determines how far the weights move along the gradient direction in each mini-batch. A low learning rate makes it less likely that a local minimum is skipped over, but the training process is lengthy and prone to overfitting. A high learning rate shortens the training time but easily causes gradient explosion; although the BN layer can effectively relieve this problem, the choice of learning rate still has a non-negligible influence on the quality of the model. Therefore, training should start with a relatively large learning rate, because the initial random weights are far from the optimal values, and the learning rate should be decreased during training to allow fine-grained weight updates.

Unlike the conventional fixed learning rate, the present invention sets a learning rate attenuation factor α and a learning period t; after each period, the learning rate τ is multiplied by α, which is expressed as:

$$\tau_{t+1} = \tau_t \cdot \alpha \tag{11}$$

where α =0.1.
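As a minimal sketch of Equation (11), the learning rate after a given number of iterations can be computed as below. The initial learning rate of 0.01 and the period of 10 iterations are illustrative assumptions; only α = 0.1 comes from the text.

```python
def decayed_learning_rate(initial_lr, alpha, period, step):
    """Learning rate after `step` iterations when it is multiplied by alpha once per period."""
    return initial_lr * alpha ** (step // period)


# Illustrative values: initial rate 0.01, period of 10 iterations, alpha = 0.1.
schedule = [decayed_learning_rate(1e-2, 0.1, 10, s) for s in (0, 9, 10, 25)]
print(schedule)
```

The rate stays constant within a period and drops by a factor of α at each period boundary.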

The model was tested with a randomly selected training set, test set and validation set. As shown in FIG. 3 (a), when the model was trained with the decaying learning rate, the loss value settled at about 10^-6 and gradually became stable. The loss value obtained with the fixed learning rate is far from that obtained with the decaying learning rate and cannot stabilize within the same number of iterations, which shows that the dynamic learning rate adopted by the method is of non-negligible importance to the stability of the model.

In the experimental process, the AdamW optimizer is adopted to continuously update the network training parameters and the dynamic learning rate is used to train the network; the loss functions of different deep learning optimization algorithms during parameter training are compared in FIG. 3 (b). Although the Adadelta optimization algorithm does not depend on a global learning rate and accelerates well in the early and middle stages of training, it accumulates only a fixed-size window of terms, and does not store them directly but only approximates the corresponding averages, so it oscillates repeatedly around a local minimum in the later stage of training. The RMSprop optimization algorithm still relies on the global learning rate. The Adam optimization algorithm dynamically adjusts the learning rate of each parameter using the first moment estimate and the second moment estimate of the gradient, so that the step length of each update stays within a fixed range and the parameters remain stable during updating. The Adam algorithm combines the advantages of AdaGrad, which is good at processing sparse gradients, and RMSprop, which is good at processing non-stationary targets, and calculates different adaptive learning rates for different parameters. In the invention, the AdamW optimizer, namely the Adam optimizer plus weight decay, is adopted. Its effect is similar to that of Adam plus L2 regularization, but its calculation efficiency is higher: L2 regularization requires adding the regularization term to the loss function, computing the gradient and then back-propagating, whereas AdamW applies the regularization term directly in the parameter update formula, which avoids manually adding the regularization term to the loss function.
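To illustrate how weight decay enters the update directly rather than through the loss function, the following sketch performs AdamW updates on a single scalar parameter. It follows the widely used decoupled form (shrink the parameter first, then apply the Adam step); the hyperparameter values and the toy loss w² are assumptions chosen for illustration, not values taken from the invention.

```python
import math


def adamw_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a single scalar parameter."""
    state["t"] += 1
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    param *= 1.0 - lr * weight_decay
    # The Adam moment estimates use the raw loss gradient only.
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * grad * grad
    m_hat = state["m"] / (1.0 - beta1 ** state["t"])    # bias-corrected first moment
    v_hat = state["v"] / (1.0 - beta2 ** state["t"])    # bias-corrected second moment
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


state = {"t": 0, "m": 0.0, "v": 0.0}
w = 1.0
for _ in range(3):
    w = adamw_step(w, grad=2.0 * w, state=state)        # gradient of the toy loss w**2
print(w)
```

With a zero gradient the parameter is still multiplied by (1 - lr · weight_decay), which is the part that a plain Adam update on an unregularized loss would not perform.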

The CBAM-MIL-CNN model is constructed on a convolutional neural network model, and related parameters are shown in the following table 1.

TABLE 1 CBAM-MIL-CNN model structural parameters

Experimental objects and membrane fouling data acquisition process

Membrane flux is easily affected by inflow, temperature and other factors, which leads to membrane pollution. To address this problem, the experiment uses Computational Fluid Dynamics (CFD) software, takes a serial tubular membrane device and a parallel hollow fiber membrane component as research objects, and accurately classifies the factors causing membrane pollution in both.

The invention uses overlapping sampling for data enhancement, which yields more training samples and enhances the generalization capability of the machine learning model. With overlapping sampling, each segment of fault features taken for training partially overlaps the next segment. Computational Fluid Dynamics (CFD) software is used to simulate the water yield of the MBR system and acquire fault data: 168000 points are sampled for each fault type over the simulation time, the length of each fault sample is 65536 and the offset is 1024, so that 100 samples can be produced per fault type by overlapping sampling. After normalization, the fault samples are converted into a set of grayscale images of size 256 × 256.
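The overlapping sampling described above can be sketched as follows. The window arithmetic uses the stated values (168000 points, sample length 65536 = 256 × 256, offset 1024); with these values 101 windows fit inside the signal, so the sketch caps the count at the stated 100. The min-max scaling to 0..255 is an assumption, since the text does not specify the normalization, and the function names are illustrative.

```python
def window_bounds(total_points, window, offset, max_samples=None):
    """Start/end indices of the overlapping windows that fit inside the signal."""
    starts = range(0, total_points - window + 1, offset)
    bounds = [(s, s + window) for s in starts]
    return bounds if max_samples is None else bounds[:max_samples]


def to_gray_image(segment, side):
    """Min-max normalise a segment to 0..255 and fold it into side x side rows."""
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0                      # guard against a constant segment
    pixels = [round(255 * (v - lo) / span) for v in segment]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


bounds = window_bounds(168000, 65536, 1024, max_samples=100)
print(len(bounds), bounds[0], bounds[-1])

# Tiny demonstration of the folding step on a 16-point segment (4 x 4 image).
image = to_gray_image(list(range(16)), side=4)
```

For a real fault signal, each `(start, end)` pair would select one 65536-point segment that is folded with `side=256`.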

For each experimental object, the invention selects 9 types of membrane pollution and samples each type 100 times, with 256 × 100 points per sampling, and the data are divided into images of size 256 × 256. With 100 images per fault type and 9 fault types, there are 900 images in total. A label (code) is added to each fault type, and the training set, verification set and test set are selected in the proportion 7:2:1.
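A 7:2:1 division of the 900 labelled images can be sketched as below. The shuffling, the fixed seed and the sample names are illustrative assumptions; the text only specifies the proportion.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into training, verification and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# 9 fouling classes (f1..f9) with 100 images each; the names are placeholders.
samples = [(f"f{c}_{i:03d}", f"f{c}") for c in range(1, 10) for i in range(100)]
train_set, val_set, test_set = split_dataset(samples)
print(len(train_set), len(val_set), len(test_set))
```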

Table 2 shows the membrane fouling modes of the membrane device. With the transmembrane pressure difference held constant, membrane fouling is mainly affected by the difference in COD concentration between inlet and outlet water (C), the difference in BOD concentration between inlet and outlet water (B), the solid concentration of the mixed suspension (X), and the hydraulic retention time (H).

TABLE 2 Membrane fouling modes of the membrane device

Membrane fouling mode	Type	Tolerance ¹	Tolerance ²
f1	No fault	-	-
f2	C is greater	5%	5%
f3	C is smaller	5%	5%
f4	B is greater	5%	5%
f5	B is smaller	5%	5%
f6	X is greater	7%	5%
f7	X is smaller	7%	5%
f8	H is greater	7%	5%
f9	H is smaller	7%	5%

In order to better accelerate the training speed of the network model and enable the data to be convenient for calculation and obtain more generalized results, the input data is subjected to standardization processing, and the mathematical expression is as follows:

Experimental procedure

The experimental process of the invention consists of fault data acquisition, fault classification coding, data preprocessing, data analysis and division, CBAM-MIL-CNN model building, predictive coding, result analysis and so on. The specific steps are as follows:

1) Collecting membrane fouling data;

2) Classifying and coding the membrane pollution data;

3) Expanding the data set by utilizing a resampling image processing technology to obtain a time domain picture set, and obtaining a frequency domain data set by utilizing image Fourier transform;

4) Respectively dividing the time domain picture set and the frequency domain data set in the step 3) into a training set, a testing set and a verification set according to the proportion of 7;

6) The actual codes of the test set are compared with the prediction codes generated by the model; if a prediction code is consistent with the actual code, the classification is correct, and if it is inconsistent, the classification is wrong;

7) Further analyzing the CBAM-MIL-CNN model in terms of average accuracy, average precision, average recall, running time and coefficient of determination R² to judge the performance of the model.
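The code comparison of step 6) and part of the evaluation of step 7) can be sketched as follows. The macro-averaging over classes is an assumption about how the average precision and average recall are formed, the six example codes are made up for illustration, and running time and R² are not covered here.

```python
def evaluate(actual, predicted):
    """Accuracy plus macro-averaged precision and recall over the actual codes."""
    classes = sorted(set(actual))
    correct = sum(a == p for a, p in zip(actual, predicted))
    precisions, recalls = [], []
    for c in classes:
        tp = sum(a == c and p == c for a, p in zip(actual, predicted))
        n_pred = sum(p == c for p in predicted)
        n_true = sum(a == c for a in actual)
        precisions.append(tp / n_pred if n_pred else 0.0)
        recalls.append(tp / n_true)
    return {
        "accuracy": correct / len(actual),
        "precision": sum(precisions) / len(classes),
        "recall": sum(recalls) / len(classes),
    }


# Made-up example: one f2 sample is misclassified as f3.
actual = ["f1", "f1", "f2", "f2", "f3", "f3"]
predicted = ["f1", "f1", "f2", "f3", "f3", "f3"]
metrics = evaluate(actual, predicted)
print(metrics)
```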

1. Membrane pollution diagnosis result and analysis of tandem tubular membrane module

The membrane pollution modes of the tandem tubular membrane modules are set to too large, too small and normal, respectively. The tolerance for the difference in COD concentration of the inlet and outlet water and for the difference in BOD concentration of the inlet and outlet water is set to 5%, and the tolerance for the solid concentration of the mixed suspension and for the hydraulic retention time is set to 7%, as shown under Tolerance ¹ in Table 2 above. When the value of a membrane fouling influence factor in the tandem tubular membrane device exceeds the set tolerance, the factor is too large; when it falls below the set tolerance, the factor is too small; and when it lies within the set tolerance, the factor is normal. According to the importance analysis of the membrane pollution factors, the COD concentration difference of inlet and outlet water, the BOD concentration difference of inlet and outlet water, the solid concentration of the mixed suspension and the hydraulic retention time have an obvious influence on membrane pollution, so these four influencing factors are selected as research objects for analysis.
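The tolerance rule above can be sketched as a small lookup. This is a sketch under assumptions: the tolerance is read as a relative deviation from a reference (standard) value, the reference value of 100 in the usage lines is arbitrary, and the mode codes follow Table 2 with f6 assumed for the "X is greater" row.

```python
# Tolerance 1 of Table 2 (tandem tubular membrane module).
TOLERANCE = {"C": 0.05, "B": 0.05, "X": 0.07, "H": 0.07}
CODES = {("C", "greater"): "f2", ("C", "smaller"): "f3",
         ("B", "greater"): "f4", ("B", "smaller"): "f5",
         ("X", "greater"): "f6", ("X", "smaller"): "f7",
         ("H", "greater"): "f8", ("H", "smaller"): "f9"}


def fouling_code(factor, value, reference):
    """Return f1 when the value is within tolerance, else the matching fault code."""
    deviation = (value - reference) / reference
    if deviation > TOLERANCE[factor]:
        return CODES[(factor, "greater")]
    if deviation < -TOLERANCE[factor]:
        return CODES[(factor, "smaller")]
    return "f1"


# A 6% rise exceeds the 5% tolerance for C but not the 7% tolerance for X.
print(fouling_code("C", 106, 100), fouling_code("X", 106, 100))
```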

FIGS. 4-7 show the results of the membrane fouling diagnosis experiments for the tandem tubular membrane module. FIG. 4 (a) is the membrane fouling signal characteristic diagram of the tandem tubular membrane module, showing that the membrane fouling characteristics can be accurately extracted. FIG. 4 (b) shows the membrane fouling characteristic distribution of the tandem tubular membrane module: only a small amount of the f3 and f5 membrane fouling types overlap, the remaining membrane fouling types are very well separated, and data of the same type are tightly clustered, which helps to improve the correct diagnosis rate of the fault diagnosis model of the invention. In 10 independent membrane fouling diagnostic experiments, 9 faults were accurately identified without error, as shown in FIG. 5 (a). FIG. 5 (b) shows the relationship between the membrane fouling characteristic loss function and the number of iterations for the tandem tubular membrane module; the final loss function value is close to 10^-8 and stable, which shows that the model has excellent robustness. The diagnosis method adopted by the invention is also compared with several traditional fault diagnosis methods. Because the data set extracts a large amount of data in a short time, the differences between the data are slight and the contrast is weak, so the three models, namely the Back Propagation (BP) neural network, the Support Vector Machine (SVM) and the Extreme Learning Machine (ELM), cannot accurately extract the characteristics and therefore cannot effectively classify the membrane pollution data. When the conventional methods are adopted, the data therefore need to be preprocessed first, as follows: the 9 types of fault data are decomposed by Wavelet Transform (WTF) to extract features, as shown in FIG. 6 (e), and feature dimension reduction is then performed on the 9 types of fault data using the LargeVis algorithm. FIG. 6 (f) shows the energy change diagram of the feature components after LargeVis dimension reduction, in which the differences between the feature data are more obvious than before. These features are used as inputs to the BP, SVM and ELM networks for diagnostic testing, and the results of the comparison with the deep network models are shown in FIG. 7.

As can be seen from FIG. 7, shallow learning can also realize membrane fouling diagnosis after data processing, but its accuracy is low: in the 10 experimental tests on the tandem tubular membrane device, the diagnosis accuracy of the shallow models stays below 75%. The average accuracy of the WTF-LargeVis-ELM diagnostic model is only 57.39%, the worst diagnostic effect of all the diagnostic models. After the deep network is adopted, the diagnosis accuracy improves markedly, and the membrane pollution diagnosis accuracy of the fault diagnosis models based on deep learning is maintained at a high level, which proves the superiority of the deep network in the membrane pollution diagnosis of the tandem tubular membrane module. The MIL-CNN network is superior to the traditional CNN network, and the membrane pollution diagnosis rate improves to different degrees after the SENet module and the SKNet attention module are added to the MIL-CNN network. The average accuracy of the four methods of MIL-CNN, SENet-CNN and SKNet-CNN is 96.16%, 95.48%, 96.25% and 95.76%. The CBAM-MIL-CNN network has the best membrane pollution diagnosis effect, with 3 error-free classifications, an average accuracy of 98.47%, and a diagnosis accuracy in each of the 10 independent experiments higher than that of the other models, indicating that adding the attention mechanism module lets the model extract features more reasonably. In membrane pollution diagnosis the method reduces the parameter quantity and over-fitting risk of the model, improves its generalization capability, extracts the important characteristics of membrane pollution accurately and quickly, and maintains a high diagnosis accuracy. Therefore, the diagnostic method of the invention has great advantages over other methods when diagnosing the membrane pollution of the tandem tubular membrane module.

2. Membrane pollution diagnosis result and analysis of parallel hollow fiber membrane component

Simulation verification is carried out with the parallel hollow fiber membrane module as the object. The CFD software ANSYS is selected to build a parallel hollow fiber membrane module and obtain membrane pollution data. The tolerances of the COD concentration difference of inlet and outlet water, the BOD concentration difference of inlet and outlet water, the solid concentration of the mixed suspension and the hydraulic retention time are all set to 5% and used as the basis for membrane pollution diagnosis: a membrane pollution factor value that floats within 5% above or below the standard value is normal, and one that is more than 5% above or below the standard value is too large or too small, giving 9 failure modes (see Tolerance ² in Table 2 above). The membrane fouling diagnosis experiment of the parallel hollow fiber membrane module is performed under the same experimental conditions. FIGS. 8 to 11 show the results of the membrane fouling diagnosis experiments of the parallel hollow fiber membrane module. FIG. 8 (a) is the membrane fouling signal characteristic diagram of the parallel hollow fiber membrane module, and FIG. 8 (b) is the membrane fouling characteristic distribution of the parallel hollow fiber membrane module. Likewise, 10 independent membrane pollution diagnosis experiments are performed on the parallel hollow fiber membrane module, and the 9 faults can be accurately identified, as shown in FIG. 9 (c). FIG. 9 (d) shows the relationship between the membrane fouling characteristic loss function and the number of iterations for the parallel hollow fiber membrane module. Using the same data preprocessing method as in the membrane fouling diagnosis of the tandem tubular membrane module, the wavelet transform energy spectrum and the LargeVis feature dimension reduction graph for each membrane fouling mode are shown in FIG. 10. For the 10 diagnostic accuracy experiments on the parallel hollow fiber membrane module, the accuracy of each experiment, together with the average accuracy and average running time over the 10 experiments, is shown in FIG. 11.

As can be seen from FIGS. 8 to 11, although the shallow neural network and support vector machine diagnostic models can diagnose membrane contamination after data processing, their misclassification and misjudgment are serious and they cannot diagnose membrane contamination accurately; in particular, the misclassification rate of the WTF + LargeVis + ELM model is close to 50%, so it cannot be applied to actual production. Although the diagnosis accuracy of the WTF + LargeVis + BP model and the WTF + LargeVis + SVM model is improved to a certain extent compared with the WTF + LargeVis + ELM model, defects in the structure and performance of these models mean that the membrane pollution characteristics are not fully extracted and the membrane pollution cannot be accurately diagnosed. The MIL-CNN network optimized with SENet or SKNet diagnoses membrane pollution better than the plain MIL-CNN network, but still worse than the MIL-CNN network optimized with the CBAM module. The CBAM can avoid reducing dimensionality, adaptively select the kernel size, simplify the complexity of the model and improve the diagnostic performance of the model. In the parallel hollow fiber membrane module membrane pollution diagnosis experiment, the CBAM-MIL-CNN model achieves a highest diagnosis accuracy of 99.08% and a lowest of 97.21% over the 10 diagnostic experiments, higher than the other diagnostic models, with an average accuracy of 98.19%, so it can diagnose membrane pollution accurately and rapidly.

The membrane pollution tests of the tandem tubular membrane device and the parallel hollow fiber membrane component both verify the rationality and superiority of the network model of the invention. CBAM-MIL-CNN does not need complex data preprocessing, which greatly reduces the time the model requires; splicing the time domain and frequency domain information makes the obtained characteristics more comprehensive, so that the membrane pollution characteristics of membrane devices with different structures are effectively extracted and classified. Compared with other methods, the diagnostic method has obvious advantages in membrane pollution diagnosis.

3. Diagnosis result and analysis of membrane fouling of different models under different noises

In the actual operation of a membrane bioreactor, environmental noise is present when the membrane component treats sewage, and the membrane component itself also generates noise. These noises introduce unwanted randomness when the membrane fouling data are collected, so it is important to add a variable-noise experiment to the membrane fouling diagnosis experiments. The membrane pollution data of the parallel hollow fiber membrane module are used as training samples, Gaussian white noise with a signal-to-noise ratio of -2 to 6 dB is added to the test samples, and the CBAM-MIL-CNN model is used for membrane pollution diagnosis. To verify the superiority of the network model in fault diagnosis, it is compared with MIL-CNN, Squeeze-and-Excitation Networks-MIL-CNN (SENet-CNN) and Selective Kernel Networks-MIL-CNN (SKNet-CNN), and also with the fault diagnosis methods proposed by Wu et al. and Li et al. Wu et al. propose a chemical process fault diagnosis method (DCNN for short) based on a DCNN model composed of convolution layers, pooling layers, dropout layers and a fully connected layer. Li et al. propose a three-step intelligent fault diagnosis method (CNN-BGM for short) based on CNN and a Bayesian-Gaussian mixture (BGM). The membrane fouling diagnosis results obtained by the method herein are compared with those of the other networks and analyzed, and the experimental results are shown in Table 3.
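Adding Gaussian white noise at a prescribed signal-to-noise ratio can be sketched as below. The sine test signal, the seed and the particular SNR steps within the stated -2 to 6 dB range are assumptions for illustration; the noise variance follows from the definition SNR(dB) = 10·log10(P_signal / P_noise).

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Add zero-mean Gaussian noise so that signal power / noise power matches snr_db."""
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean version."""
    p_signal = sum(v * v for v in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)


clean = [math.sin(2 * math.pi * i / 50) for i in range(20000)]   # placeholder signal
for snr in (-2, 0, 2, 4, 6):
    noisy = add_white_noise(clean, snr)
    print(snr, round(measured_snr_db(clean, noisy), 2))
```

At -2 dB the noise power exceeds the signal power, which is the hardest condition in the stated range.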

TABLE 3 Diagnostic accuracy of different methods under different noise levels

As can be seen from the comparative data in Table 3, the membrane module membrane fouling diagnosis accuracy based on CBAM-MIL-CNN is higher than that of the other methods at every signal-to-noise ratio tested. Although the MIL-CNN-based diagnostic method can share convolution kernels and extract features automatically, its gradient descent algorithm easily lets the training result converge to a local minimum rather than the global minimum; meanwhile, the pooling layer loses a large amount of valuable information and ignores the correlation between the local part and the whole. The methods based on SENet-MIL-CNN and SKNet-MIL-CNN start from the relation between the characteristic channels, model that relation explicitly, and enhance useful characteristics while inhibiting useless ones according to their importance. They improve the accuracy of membrane pollution diagnosis over the CNN network and keep it within a relatively stable range, but they cannot avoid dimension reduction, so their diagnosis accuracy is lower than that of the network model of the invention. The convolutional and pooling layers of the fault diagnosis method proposed by Wu et al. are locally connected through filters, which helps to extract local patterns or features, and overfitting can be avoided by using dropout layers and pooling layers; however, since the model still relies on historical fault data samples, it is not suitable for fault diagnosis with little or no historical data. Li et al. combine CNN with BGM to provide an end-to-end intelligent fault diagnosis method. The method can use the original signal directly for end-to-end fault diagnosis without preprocessing the signal, but as a result the causal relationship between the condition characteristics and the corresponding fault types cannot be accurately described. In the method of the invention, the time domain information and the frequency domain information of the fault data are used as inputs of the CNN, the characteristics are extracted through the convolution layers, and the time domain and frequency domain characteristics are then spliced by the full connection layer and input into a classifier for classification. The batch normalization layer in the model effectively prevents gradient disappearance, the ReLU layer improves the nonlinear expression capability of the model, the CBAM module simplifies the complexity of the model and improves the characteristic expression capability of the network, and the pooling layer improves the fault tolerance of the model. Compared with other membrane pollution diagnosis methods, the method has higher diagnosis precision, better generalization capability and stronger anti-noise performance.

4. Model performance comparison experiment

An ablation experiment is carried out on the tandem tubular membrane module membrane pollution simulation data set. The average accuracy, the average recall rate, the average time and the average coefficient of determination R², 5 performance indexes in total, are taken as the basis for judging the models, and the performances of five models, namely CNN, MIL-CNN, CNN + BN + CBAM and MIL-CNN + BN + CBAM, are verified respectively. The results are shown in FIG. 12.

From the analysis of FIG. 12, it can be seen that after the BN layer and the CBAM module are added to the CNN, the model performance improves to different degrees: the accuracy increases while the running time decreases. All 5 performance indexes of the MIL-CNN + BN + CBAM model are superior to those of the other four network models, which verifies the effectiveness and superiority of the CBAM-MIL-CNN model.

Claims

1. The membrane component membrane pollution detection method based on multi-feature information fusion is characterized by comprising the following steps:

1) Collecting membrane fouling data;

2) Classifying and coding the membrane pollution data;

2. The membrane module membrane contamination detection method based on multi-feature information fusion as claimed in claim 1, wherein the CBAM-MIL-CNN model of step 5) is composed of a network model unit 1, a network model unit 2 and a pattern recognition unit 3; the network model unit 1 and the network model unit 2 have the same network structure, each comprising an input layer, a convolution layer a, a convolution layer b, a convolution layer c, a convolution layer d, a convolution layer e and a CBAM module; an activation function is added to each of the convolution layers; a batch normalization layer and a pooling layer are further added to each of the convolution layer a, the convolution layer b and the convolution layer e; the CBAM module is connected to the output end of the convolution layer a and the convolution layer b, or to the output end of the batch normalization layer or the pooling layer on the convolution layer a and the convolution layer b; and the pattern recognition unit 3 uses the full connection layer to splice the feature information output by the network model unit 1 and the network model unit 2 and sends it to a softmax classifier for classification and recognition.

3. The membrane module membrane fouling detection method based on multi-feature information fusion as claimed in claim 2, characterized in that a batch normalization layer, a CBAM module and a pooling layer are sequentially added to the output ends of the convolution layer a and the convolution layer b.

4. The membrane module membrane fouling detection method based on multi-feature information fusion of claim 3, characterized in that in step 5), time domain picture set information is input in the network model unit 1, and frequency domain data set information is input in the network model unit 2.

5. The membrane module membrane pollution detection method based on multi-feature information fusion as claimed in claim 4, wherein in the step 5), the activation function is the ReLU activation function, the optimization strategy of the CBAM-MIL-CNN network model is the AdamW optimizer, and the AdamW optimizer consists of the Adam optimizer and weight decay.

6. The membrane module membrane contamination detection method based on multi-feature information fusion of claim 5, wherein in step 5), the convolution kernel of convolution layer a has a size of 3*3, the step size is 4*4, the number of channels is 64, and an output with a size of 30 × 30 × 64 is obtained after convolution, batch normalization, CBAM module feature extraction and pooling; the convolution kernel of the convolution layer b is 5*5, the step length is 1*1, the number of channels is 128, and an output with a size of 12 × 12 × 128 is obtained after convolution, batch normalization, CBAM module feature extraction and pooling; the convolution kernels of the convolution layers c and d are 3*3, the step length is 1*1, the number of channels is 256, and an output with a size of 12 × 12 × 256 is obtained after convolution; the convolution kernel of the convolution layer e is 3*3, the step length is 1*1, the number of channels is 64, and an output with a size of 5 × 5 × 64 is obtained after convolution, batch normalization and pooling; the number of the fully-connected layers of the pattern recognition unit 3 is 2, and the numbers of neurons of the 2 fully-connected layers are set to 2048 and 512 respectively.

7. The membrane module membrane fouling detection method based on multi-feature information fusion of claim 6, wherein in step 5), the CBAM module comprises a channel attention module and a space attention module.

8. The membrane module membrane fouling detection method based on multi-feature information fusion of claim 7, wherein the membrane module is a serial tubular membrane module or a parallel hollow fiber membrane module.