CN117422695A - CR-deep-based anomaly detection method - Google Patents

CR-deep-based anomaly detection method

Info

Publication number
CN117422695A
Authority
CN
China
Prior art keywords
network
image
deep
abnormal
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311477998.XA
Other languages
Chinese (zh)
Inventor
王子巍
彭道刚
吴立斌
李杰
潘俊臻
王丹豪
陈晨
周威仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Launchbyte Intelligent Technology Co ltd
Shanghai Electric Power University
Original Assignee
Shanghai Launchbyte Intelligent Technology Co ltd
Shanghai Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Launchbyte Intelligent Technology Co ltd, Shanghai Electric Power University filed Critical Shanghai Launchbyte Intelligent Technology Co ltd
Priority to CN202311477998.XA priority Critical patent/CN117422695A/en
Publication of CN117422695A publication Critical patent/CN117422695A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of image processing, in particular to an anomaly detection method based on CR-deep. The method comprises the following steps: S1, acquiring original abnormal images and constructing a data training set; S2, training on the CR-deep network; S3, obtaining a prediction feature map; S4, comparing the prediction feature map with the real segmented image set labeled in step S1, calculating the distance between the true values and the predicted values, constructing a loss function, and optimizing the network; S5, updating the network parameters; S6, repeating steps S3 to S5 for e rounds of network training; S7, analyzing the anomaly detection performance of the model; S8, deploying the trained model on a server, acquiring images in real time, and then identifying and segmenting abnormal conditions with the CR-deep image segmentation algorithm. Compared with the prior art, residual convolution modules of different scales are fused with the convolutional attention module CBAM, and a CBAM convolutional attention mechanism is added after the ASPP module, improving the completeness and reliability of model identification.

Description

CR-deep-based anomaly detection method
Technical Field
The invention relates to the technical field of image processing, and in particular to an anomaly detection method based on CR-deep.
Background
The production environment of a power plant is complex, with high temperature, high pressure, high noise and densely packed pipelines and valves. Abnormal conditions such as valve leakage pose major safety hazards: they seriously threaten the health and life of operators, can trigger a chain of faults and accidents that endanger the safe and stable operation of the generating units, and also reduce the efficiency and economic benefit of the plant's electric energy production.
Typical abnormal conditions in a power plant include oil leakage from valves in the steam turbine operating floor, the chemical-treatment area and similar locations. Given these problems, a method that uses fixed-point cameras to accurately detect abnormal valve conditions is of great practical significance for the sustainable power supply of the plant.
Current mainstream image segmentation methods are based on deep learning and fall into two main families: feature-encoder-based methods and region-proposal-based methods. 1) Feature-encoder-based: Kaiming He et al. proposed the convolutional residual network (ResNet) in 2015, an efficient feature extraction architecture. Convolutional residual networks are the most popular neural networks in the semantic segmentation field and are widely accepted and used in academia. They allow the original input information to propagate directly to later layers of the network, so a very deep network can be assembled by stacking shallower sub-networks with residual connections. Their advantages and disadvantages are as follows: the new network structure allows the number of layers to keep growing; the feed-forward and back-propagation computations remain well-behaved, which simplifies construction; adding an identity mapping does not materially harm network performance; and the residual design effectively mitigates the growing training error and vanishing gradients that otherwise appear as networks deepen. On the other hand, because convolutional residual networks are deeper, they require longer training time. 2) Region-proposal-based: region proposal is another common approach to target detection in computer image processing. Its main flow is: first detect a color space and a similarity matrix to obtain candidate regions, then perform image classification prediction on the region detection results. In semantic segmentation, most region-proposal-based algorithms extend earlier ideas and methods from object detection. R-CNN (Region-based Convolutional Neural Network) was proposed by Ross Girshick et al. at UC Berkeley and was the first to apply region proposals on a deep learning model. Its basic pipeline is: first, extract target candidate boxes on the image with a selective search algorithm (candidate-box regression is one of the effective ways to localize targets accurately); next, extract features from the candidate boxes one by one with a convolutional neural network; then, classify the candidate boxes from the extracted image features, for example with an SVM classifier; finally, refine the region boxes with a regression step. Fast R-CNN improves on the R-CNN model in two ways: first, the neural network extracts features from the whole image directly, which saves computation; then a RoIPooling layer extracts the image features corresponding to each region of interest from the whole feature map, and classification and rectangular-box refinement are performed through the fully connected layers.
Faster R-CNN then achieved new breakthroughs in detection accuracy and speed, because it replaced the most time-consuming component of earlier networks, the selective search algorithm. After Faster R-CNN was proposed, He Kaiming's team presented a further model structure, Mask R-CNN, which plays an important role in three fields: target detection, target classification and pixel-level image segmentation. The Mask R-CNN paper won the ICCV 2017 best paper award; its authors added a mask prediction branch to the Faster R-CNN model and, improving on RoIPooling, proposed RoIAlign. The CR-deep network of the present invention improves on Deeplabv3plus, and the network is applied to anomaly detection of valve leakage in power plants.
At present, convolutional residual networks combined with semantic segmentation algorithms are applied to segmentation of remote sensing images. The power plant production process, however, is complex: the process flow is intricate, pipelines, valves and other devices are numerous, and background information strongly interferes with the segmentation targets. Moreover, leaked oil is a tiny target with transparent texture, characteristics that remote sensing images do not have. If the semantic segmentation methods developed for remote sensing images were used to identify abnormal valve states in a power plant, false detections, missed detections and incompletely segmented abnormal-region edges would be very likely. The invention therefore proposes the CR-deep network, which integrates a residual convolutional network and a CBAM attention mechanism into the Deeplab network and was developed through extensive comparative tests, to address these problems and difficulties of power plant valve anomaly identification. Experiments show that the invention has great advantages for identifying and segmenting abnormal valve states in a power plant, greatly reduces false detections, missed detections and incomplete segmentation of abnormal-region edges, and better meets the practical requirements of power plants.
Disclosure of Invention
The invention provides an anomaly detection method based on CR-deep which quickly identifies abnormal conditions in a plant by processing images captured by fixed-point cameras, facilitating timely maintenance and meeting actual production requirements in the plant.
In order to achieve the above purpose, a CR-deep based abnormality detection method is designed, which is characterized in that: the method comprises the following steps:
s1, acquiring an original abnormal image, performing image cleaning on the acquired original abnormal image, removing a low-quality abnormal image, and constructing a data training set;
s2, training on a CR-deep network by using the constructed data training set, and processing an original fault image by the CR-deep network;
S3, processing the data training set constructed in step S1 according to the batch size parameter set by the network to obtain a prediction feature map Ŷ_g;
S4, comparing the prediction feature map Ŷ_g with the real segmented image set {Y_g(i,j)} labeled in step S1, calculating the distance between the true values and the predicted values, constructing a loss function L_DF, and optimizing the network by continuously minimizing the loss function;
s5, updating the network parameter theta through an Adam strategy, wherein the learning rate of the updated parameter theta' is set to be 0.00001;
s6, repeating the steps S3 to S5, performing e times of network training until e > epochs or the training performance of the network is not improved any more, and completing the training by the network;
s7, analyzing the abnormal detection performance of the model by using two evaluation indexes of Precision and Mean Intersection over Union;
s8, deploying the trained model on a server, acquiring images in real time through a plurality of key point cameras, and then identifying and segmenting abnormal conditions through a CR-deep image segmentation algorithm.
In step S1, the screened images are labeled with the labelme tool: the original abnormal images are first imported into the labelme software, the software's polygon tool is used to outline the abnormal region and the abnormal category is entered, generating a json file of the leakage information; the json files are then converted with a python script into the label image corresponding to each abnormal image, and all label files are packaged to obtain the real segmented image set {Y}, forming the data training set. The g-th original abnormal image in the data training set is denoted {I_g(i,j)}, and the true semantic segmentation image corresponding to {I_g(i,j)} in the training set is denoted {Y_g(i,j)}, where I_g(i,j) and Y_g(i,j) denote the pixel values of the pixel whose coordinate position is (i,j) in {I_g(i,j)} and {Y_g(i,j)} respectively.
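As a minimal illustration of this json-to-label conversion, the following python sketch rasterizes the labelme polygons into a single-channel label image; the file names and class map are hypothetical, and labelme's own conversion script can be used instead:

```python
import json
import numpy as np
from PIL import Image, ImageDraw

# Hypothetical class map; the experiments use background = 0 and oil leakage = 1.
CLASS_IDS = {"leak": 1}

def json_to_label(json_path, out_path):
    """Rasterize the labelme polygons of one json file into a label image."""
    with open(json_path, "r", encoding="utf-8") as f:
        ann = json.load(f)
    mask = Image.new("L", (ann["imageWidth"], ann["imageHeight"]), 0)  # background = 0
    draw = ImageDraw.Draw(mask)
    for shape in ann["shapes"]:                      # one entry per drawn polygon
        cls = CLASS_IDS.get(shape["label"], 0)
        draw.polygon([tuple(p) for p in shape["points"]], fill=cls)
    Image.fromarray(np.asarray(mask, dtype=np.uint8)).save(out_path)

json_to_label("leak_0001.json", "leak_0001_label.png")  # illustrative file names
```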
In step S2, the CR-deep network adopts an encoder-decoder structure and processes the original abnormal image as follows. After preprocessing (rotation and cropping), the image is input to the encoder for feature extraction: it is first fed into the Res-backbone network at the front of the encoder, which has 4 layer modules, each extracting the fault feature map Q1 of the abnormal region through residual convolution modules of different scales and the convolutional attention module CBAM. The output feature map Q1 of the Res-backbone network is then sent both to the decoder network and to the encoder's ASPP network. The atrous spatial pyramid pooling (ASPP) module processes the Q1 feature map in parallel with 3×3 convolutions at dilation rates 6, 12 and 18, an ordinary 1×1 convolution, and a max pooling layer; after the parts are concatenated, the ASPP outputs are fused with the convolutional attention module CBAM and a 1×1 convolution to obtain the encoder output feature map Q2. The decoder applies 1×1 convolution and upsampling to the Q1 and Q2 feature maps output by the encoder, fuses the two feature maps, and applies a 3×3 convolution and 8× upsampling to obtain the network's prediction.
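The encoder-decoder just described can be sketched in PyTorch as follows. This is a simplified reading of the structure, not the patented implementation: the CBAM modules are stand-in placeholders (a full CBAM sketch appears in the embodiment below), and the strides and dilations are assumptions chosen so that a 640×640 input yields the 80×80 encoder maps and 8× upsampled output described in step S3:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class ASPP(nn.Module):
    """ASPP: parallel 3x3 convolutions at dilation rates 6/12/18, a 1x1 branch
    and an image-pooling branch, concatenated and fused by a 1x1 convolution."""
    def __init__(self, in_ch=2048, out_ch=512):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1, bias=False)]
            + [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
               for r in (6, 12, 18)])
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.fuse = nn.Conv2d(5 * out_ch, out_ch, 1, bias=False)  # CBAM would precede this

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [b(x) for b in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode="bilinear", align_corners=False))
        return self.fuse(torch.cat(feats, 1))         # [B,2560,h,w] -> [B,512,h,w]

class CRDeep(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # dilate the last two stages so a 640x640 input gives 80x80 encoder maps
        net = resnet50(weights=None, replace_stride_with_dilation=[False, True, True])
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.layers = nn.ModuleList([net.layer1, net.layer2, net.layer3, net.layer4])
        self.cbam = nn.ModuleList(nn.Identity() for _ in range(4))  # CBAM placeholders
        self.aspp = ASPP()
        self.q1_proj = nn.Conv2d(2048, 512, 1, bias=False)  # 1x1 on Q1 before fusion
        self.head = nn.Conv2d(1024, num_classes, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[2:]
        y = self.stem(x)
        for layer, cbam in zip(self.layers, self.cbam):
            y = cbam(layer(y))                        # residual stage + CBAM
        q1 = y                                        # backbone output Q1
        q2 = self.aspp(q1)                            # encoder output Q2
        fused = torch.cat([self.q1_proj(q1), q2], 1)  # decoder fusion features
        return F.interpolate(self.head(fused), size=(h, w),
                             mode="bilinear", align_corners=False)  # 8x upsampling

pred = CRDeep()(torch.randn(1, 3, 640, 640))          # -> [1, 2, 640, 640]
```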
The specific method of step S3 is: B original abnormal images are randomly selected from the data training set constructed in S1 according to the batch size parameter set by the CR-deep network and batched together; the original abnormal data are preprocessed by the network of step S2, which changes the image size, and then pass in turn through the layer-1, layer-2, layer-3 and layer-4 modules of the Res-backbone network to obtain its output feature map Q1. The Q1 feature map passes through the ASPP network to give the combined features, and 1×1 convolution gives the encoder Q2 and Q1 features; the Q1 and Q2 features are concatenated to give the decoder fusion features, and after a 3×3 convolution and 8× upsampling the final prediction feature map, consistent with the original input feature size, is obtained and denoted Ŷ_g.
In step S3, B = batch size.
The loss function L_DF in step S4 is the DF_Loss function, a linear combination of the Dice Loss function L_d and the Focal Loss function L_f:

$$L_{DF} = \alpha L_d + (1 - \alpha) L_f$$

$$L_d = 1 - \frac{1}{C}\sum_{i=1}^{C} \frac{2\,TP_p(i)}{2\,TP_p(i) + FN_p(i) + FP_p(i)}, \qquad L_f = -\frac{1}{k}\sum_{n=1}^{k}\sum_{i=1}^{C} g_n(i)\,(1 - p_n(i))^{\gamma}\log p_n(i)$$

where TP_p(i) denotes class-i pixels correctly classified as positive samples, FN_p(i) denotes class-i pixels incorrectly classified as negative samples, FP_p(i) denotes pixels incorrectly classified as positive samples of class i, p_n(i) is the predicted value for class i at pixel n, g_n(i) is the true value for class i at pixel n, C is the number of classes including the background, k is the number of pixels in the image, γ is the focusing parameter of the Focal Loss, and α is the weight between L_d and L_f used to balance the overall loss.
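A minimal PyTorch sketch of this DF_Loss combination follows; the α and γ values shown are illustrative defaults, not values fixed by the patent:

```python
import torch
import torch.nn.functional as F

def df_loss(logits, target, alpha=0.5, gamma=2.0, eps=1e-6):
    """DF_Loss sketch: alpha * L_d + (1 - alpha) * L_f.
    logits: [B, C, H, W] network output; target: [B, H, W] integer class map."""
    num_classes = logits.shape[1]
    p = logits.softmax(1)                                            # p_n(i)
    g = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()   # g_n(i)

    # Dice part: soft TP/FN/FP accumulated over all k pixels, per class
    dims = (0, 2, 3)
    tp = (p * g).sum(dims)
    fn = ((1 - p) * g).sum(dims)
    fp = (p * (1 - g)).sum(dims)
    l_d = 1 - (2 * tp / (2 * tp + fn + fp + eps)).mean()

    # Focal part: down-weights easy pixels, focusing on ambiguous leak boundaries
    l_f = (-g * (1 - p).pow(gamma) * (p + eps).log()).sum(1).mean()

    return alpha * l_d + (1 - alpha) * l_f

loss = df_loss(torch.randn(2, 2, 64, 64), torch.randint(0, 2, (2, 64, 64)))
```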
In the step S6, e represents the number of network training, and epochs is the set total number of network training.
In step S7, the Precision index is calculated as:

$$Precision = \frac{TP}{TP + FP}$$

where TP is true positive (classified as a positive sample, correctly), TN is true negative (classified as a negative sample, correctly), FP is false positive (classified as a positive sample, incorrectly), and FN is false negative (classified as a negative sample, incorrectly).
In step S7, the Mean Intersection over Union index is calculated as:

$$MIoU = \frac{1}{n+1}\sum_{i=0}^{n} \frac{P_{ii}}{\sum_{j=0}^{n} P_{ij} + \sum_{j=0}^{n} P_{ji} - P_{ii}}$$

where n is the number of classes and n+1 includes the background class, i denotes the true value, j denotes the predicted value, P_ii denotes i predicted as i, P_ij denotes i predicted as j, and P_ji denotes j predicted as i.
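Both evaluation indexes can be computed from a confusion matrix, as in this small numpy sketch (class layout assumed: background = 0, leakage = 1):

```python
import numpy as np

def precision_and_miou(pred, gt, num_classes=2):
    """Precision per class and MIoU from the confusion matrix P (truth i, prediction j)."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for i in range(num_classes):
        for j in range(num_classes):
            cm[i, j] = np.sum((gt == i) & (pred == j))     # P_ij
    tp = np.diag(cm)                                        # P_ii
    precision = tp / np.maximum(cm.sum(axis=0), 1)          # TP / (TP + FP)
    iou = tp / np.maximum(cm.sum(axis=1) + cm.sum(axis=0) - tp, 1)
    return precision, iou.mean()                            # mean over the n+1 classes

prec, miou = precision_and_miou(np.random.randint(0, 2, (64, 64)),
                                np.random.randint(0, 2, (64, 64)))
```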
In step S8, the model confidence threshold is set to 0.8 and each frame of the acquired image or video is detected. When the confidence is less than or equal to 0.8, the original image is pushed to the front end for display through the Flask framework; when the confidence is greater than 0.8, the abnormal image is pushed to the front end for display, an alarm is raised through a pop-up window, and the detected abnormality information is stored in the database.
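A sketch of the per-frame confidence test in step S8 is given below; the frame-level confidence definition and class id are assumptions, and the Flask push, pop-up alarm and database write are omitted:

```python
import cv2
import torch

CONF_THRESHOLD = 0.8   # per step S8

def detect_frame(model, frame_bgr, leak_class=1):
    """One step of the deployment loop: returns the frame confidence and,
    above the threshold, the segmented leak mask to push to the front end."""
    x = torch.from_numpy(cv2.resize(frame_bgr, (640, 640))).permute(2, 0, 1)
    x = x.float().div(255).unsqueeze(0)
    with torch.no_grad():
        prob = model(x).softmax(1)                 # [1, C, 640, 640]
    leak_prob = prob[0, leak_class]
    conf = leak_prob.max().item()                  # frame-level confidence (assumed)
    if conf > CONF_THRESHOLD:                      # abnormal: push mask + pop-up alarm
        return conf, (leak_prob > 0.5).byte().mul(255).cpu().numpy()
    return conf, None                              # normal: push the original frame
```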
Compared with the prior art, the invention has the following beneficial effects:
1. In the CR-deep algorithm, residual convolution modules of different scales are fused with the convolutional attention module CBAM in every layer module of the Res-backbone network, and a CBAM convolutional attention mechanism is added after the ASPP module. Facing the extremely complex background environment of a power plant, the model can therefore better adjust its weights to the tiny, transparent characteristics of a leaked-oil target, strengthen the learning of key regions and key features, take account of the differences among pixel classes, channel features and contextual information, and weaken the interference of the complex plant background on the leakage features. This greatly reduces false detections, missed detections and incomplete segmentation of abnormal-region edges, improves the completeness and reliability of model identification, and segments abnormal valve oil leakage in a power plant more completely.
2. The model uses DF_Loss as its loss function. Through the Dice Loss the model weights each pixel and better handles class imbalance; through the Focal Loss it better learns ambiguously classified pixels in the leakage image and avoids the gradient saturation and instability that the Dice Loss may suffer on small-target positive samples. This further improves the model's detection and segmentation precision for leaked oil against a complex background.
3. The invention uses video detection, which is less prone to false and missed detections; detection results are displayed on the front-end page in real time, and an early warning is issued as soon as an abnormal condition occurs.
Drawings
FIG. 1 is a diagram of a CR-deep network according to the present invention.
Fig. 2 is a schematic diagram of detection and early warning of abnormal oil leakage of a valve according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a power plant valve oil leakage according to an embodiment of the present invention.
FIG. 4 is a graph of an image data enhancement variation according to an embodiment of the present invention.
Fig. 5 is a block diagram of a convolutional residual network Res50 and CBAM according to an embodiment of the present invention.
FIG. 6 is a graph of index results of a model training according to an embodiment of the present invention.
FIG. 7 is a graph showing a comparison of model segmentation results according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Embodiment one:
To improve the precision of semantic segmentation in the power-plant valve-leakage scene, this embodiment provides an improved Deeplabv3plus neural network segmentation algorithm. The main body of the CR-deep model uses the convolutional residual network Res50 for feature extraction, realizes different training weight proportions for the various targets through dense connections, and adjusts the feature weights of the leakage target with a CBAM convolutional attention mechanism to weaken the influence of the complex background. The model improves the atrous spatial pyramid pooling module, expanding the receptive field and improving feature reusability without sacrificing feature spatial resolution; integrating a CBAM mechanism after the ASPP module strengthens the model's feature fusion capability. An efficient decoding end fuses 3 low-level semantic features of different scales from the encoding end to recover lost spatial information. The improved CR-deep network model is shown in FIG. 1.
As shown in fig. 2, the flow of the detection and early warning of the abnormal oil leakage of the valve in this embodiment is as follows:
First step: the fixed-point visible-light cameras on the power plant steam turbine operating floor are called for real-time image monitoring.
Second step: the cameras monitor 5 points of the turbine operating floor and transmit image data in real time to the power plant oil-leakage segmentation image server through a wireless AP and a wired network.
Third step: the algorithm server calls the CR-deep image segmentation algorithm to identify and segment abnormal valve oil leakage.
Fourth step: the detection result is pushed to the front end for display, and the original image, the detected image and the detection result are stored in the system database. If a valve leaks, the front end displays the abnormal condition in real time on a large screen, and a pop-up early-warning window promptly alerts operation and maintenance personnel.
The specific method for the third step of the algorithm server to call the CR-deep image segmentation algorithm to identify and segment the abnormal condition of the valve oil leakage comprises the following steps:
S1, collecting original leakage images, cleaning them and removing low-quality leakage images; labeling the screened images with the labelme tool: the original abnormal images are imported into the labelme software, the polygon tool is used to outline the abnormal region and the abnormal category is entered, generating a json file of the leakage information; the json files are converted with a python script into the label image corresponding to each abnormal image, and all label files are packaged to obtain the real segmented image set {Y}, forming the data training set. The g-th original abnormal image in the data training set is denoted {I_g(i,j)}, and the true semantic segmentation image corresponding to {I_g(i,j)} in the training set is denoted {Y_g(i,j)}, where I_g(i,j) and Y_g(i,j) denote the pixel values of the pixel whose coordinate position is (i,j) in {I_g(i,j)} and {Y_g(i,j)} respectively.
S2, training on the CR-deep network with the constructed data training set. The CR-deep network adopts an encoder-decoder structure and processes the original abnormal image as follows. After preprocessing (rotation and cropping), the image is input to the encoder for feature extraction: it is first fed into the Res-backbone network at the front of the encoder, which has 4 layer modules, each extracting the fault feature map Q1 of the abnormal region through residual convolution modules of different scales and the convolutional attention module CBAM. The output feature map Q1 of the Res-backbone network is then sent both to the decoder network and to the encoder's ASPP network. The atrous spatial pyramid pooling (ASPP) module processes the Q1 feature map in parallel with 3×3 convolutions at dilation rates 6, 12 and 18, an ordinary 1×1 convolution, and a max pooling layer; after the parts are concatenated, the ASPP outputs are fused with the convolutional attention module CBAM and a 1×1 convolution to obtain the encoder output feature map Q2. The decoder applies 1×1 convolution and upsampling to the Q1 and Q2 feature maps output by the encoder, fuses the two feature maps, and applies a 3×3 convolution and 8× upsampling to obtain the network's prediction.
S3, B original leakage images are randomly selected from the data training set constructed in S1 according to the batch size parameter set by the CR-deep network and batched together. After preprocessing by the network, the original leakage data have size [B, 3, 640, 640]. The layer-1 module of the Res-backbone network compresses the leakage fault feature map to [B, 256, 320, 320]; the layer-2 module compresses it to [B, 512, 160, 160]; the layer-3 module compresses it to [B, 1024, 80, 80]; and the layer-4 module yields the output feature map Q1 of the Res-backbone network with dimensions [B, 2048, 80, 80]. The Q1 feature map passes through the ASPP network to give combined features of dimension [B, 2560, 80, 80], which a 1×1 convolution reduces to the encoder features Q2 of dimension [B, 512, 80, 80]; Q1 is likewise reduced by a 1×1 convolution, and the Q1 and Q2 features are concatenated to give decoder fusion features of dimension [B, 1024, 80, 80]. After a 3×3 convolution and 8× upsampling, the final prediction feature map, consistent with the original input feature size, is obtained and denoted Ŷ_g. In step S3, B = batch size.
S4, comparing the prediction feature map Ŷ_g with the real segmented image set {Y_g(i,j)} labeled in step S1, calculating the distance between the true values and the predicted values, constructing the loss function L_DF, and optimizing the network by continuously minimizing the loss function.
The loss function L_DF is the DF_Loss function, a linear combination of the Dice Loss function L_d and the Focal Loss function L_f:

$$L_{DF} = \alpha L_d + (1 - \alpha) L_f$$

$$L_d = 1 - \frac{1}{C}\sum_{i=1}^{C} \frac{2\,TP_p(i)}{2\,TP_p(i) + FN_p(i) + FP_p(i)}, \qquad L_f = -\frac{1}{k}\sum_{n=1}^{k}\sum_{i=1}^{C} g_n(i)\,(1 - p_n(i))^{\gamma}\log p_n(i)$$

where TP_p(i) denotes class-i pixels correctly classified as positive samples, FN_p(i) denotes class-i pixels incorrectly classified as negative samples, FP_p(i) denotes pixels incorrectly classified as positive samples of class i, p_n(i) is the predicted value for class i at pixel n, g_n(i) is the true value for class i at pixel n, C is the number of classes including the background, k is the number of pixels in the image, γ is the focusing parameter of the Focal Loss, and α is the weight between L_d and L_f used to balance the overall loss. In this embodiment C = 2, the classes being background and oil leakage.
S5, updating the network parameter theta through an Adam strategy, wherein the learning rate of the updated parameter theta' is set to be 0.00001;
S6, repeating steps S3 to S5 for e rounds of network training until e > epochs or the training performance of the network no longer improves, at which point training is complete. Here e is the number of training rounds and epochs is the set total number of training rounds, initialized in this embodiment to the constant epochs = 20000.
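Steps S3 to S6 amount to the following training-loop sketch; the patience-based stopping rule is an illustrative stand-in for "training performance no longer improves":

```python
import torch

def train(model, loader, criterion, epochs=20000, patience=50, lr=1e-5):
    """S3-S6 as a loop: Adam at learning rate 0.00001, stopping when e > epochs
    or when the epoch loss has not improved for `patience` rounds."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    best, stale = float("inf"), 0
    for e in range(1, epochs + 1):
        epoch_loss = 0.0
        for images, labels in loader:               # S3: one batch of B images
            opt.zero_grad()
            loss = criterion(model(images), labels) # S4: distance to the ground truth
            loss.backward()
            opt.step()                              # S5: Adam update of theta
            epoch_loss += loss.item()
        if epoch_loss < best:
            best, stale = epoch_loss, 0
        else:
            stale += 1
            if stale >= patience:                   # S6: performance no longer improves
                break
```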
S7, analyzing the abnormality detection performance of the model by using two evaluation indexes of Precision and Mean Intersection over Union.
The Precision index is calculated as:

$$Precision = \frac{TP}{TP + FP}$$

where TP is true positive (classified as a positive sample, correctly), TN is true negative (classified as a negative sample, correctly), FP is false positive (classified as a positive sample, incorrectly), and FN is false negative (classified as a negative sample, incorrectly).
In step S7, the Mean Intersection over Union index is calculated as:

$$MIoU = \frac{1}{n+1}\sum_{i=0}^{n} \frac{P_{ii}}{\sum_{j=0}^{n} P_{ij} + \sum_{j=0}^{n} P_{ji} - P_{ii}}$$

where n is the number of classes and n+1 includes the background class, i denotes the true value, j denotes the predicted value, P_ii denotes i predicted as i, P_ij denotes i predicted as j, and P_ji denotes j predicted as i.
S8, the trained model is deployed on a server, images are acquired in real time through several key-point cameras, and abnormal conditions are identified and segmented by the CR-deep image segmentation algorithm. The model confidence threshold is set to 0.8 and each frame of the acquired image or video is detected: when the confidence is less than or equal to 0.8, the original image is pushed to the front end for display through the Flask framework; when the confidence is greater than 0.8, the abnormal image is pushed to the front end for display, an alarm is raised through a pop-up window, and the detected abnormality information is stored in the database.
In step S1, since no open-source oil-leakage image segmentation dataset is currently available, an oil-leakage dataset was obtained by running oil-leakage simulation experiments at a power plant in order to study abnormal valve oil leakage. Visible-light oil-leakage images were captured with mobile phones, an inspection robot and fixed-point cameras at a size of 1920×1080, then uniformly resized to 640×640 and fed to the network for training; the valve oil leakage of the power plant is illustrated in FIG. 3. For detection and verification of valve oil leakage at a particular plant, 2000 simulated oil-leakage images of the plant were collected. Before training the valve oil-leakage model with different backbone networks, the leakage image data and label maps are randomly divided into training, validation and test data in the ratio 6:3:1. To counter under-fitting and poor generalization, the number of original images is increased by flipping about the X axis, horizontal translation, image rotation and similar transformations; as shown in FIG. 4, data enhancement doubles the number of experimental images.
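The 6:3:1 split and the enhancement transforms can be sketched as follows; file names, rotation angle and translation fraction are illustrative, and in practice each transform must be applied jointly to the image and its label map:

```python
import random
from torchvision import transforms

random.seed(0)
paths = [f"leak_{i:04d}.png" for i in range(2000)]     # illustrative file names
random.shuffle(paths)
n = len(paths)
train_set = paths[:int(0.6 * n)]                       # 6 : 3 : 1 split
val_set   = paths[int(0.6 * n):int(0.9 * n)]
test_set  = paths[int(0.9 * n):]

# Enhancement used to double the training images; for segmentation, apply the
# same random transform to the image and its label map together.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),            # X-axis flip
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.0)),  # rotation + shift
])
```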
In step S2, the improved network uses the convolutional residual network Res50 as its backbone for efficient feature extraction and merges a CBAM mechanism after each Layer module of the backbone to improve the model's feature extraction capability. Res50 is composed of Conv Blocks and Identity Blocks. The input and output dimensions of a Conv Block differ, so it cannot be chained directly, but it can change the dimensions of the network; an Identity Block has identical input and output dimensions and can be chained, its function being mainly to deepen the network. A Conv Block divides into a trunk and a residual part: the trunk comprises convolution, normalization and an activation function, followed by another convolution and normalization; the residual part comprises one convolution and normalization. Since the residual edge contains a convolution, the Conv Block can change the width, height and channel number of the output feature layer. The structural parameters of the convolutional residual network Res50 are shown in Table 1; the residual part connects the image input to the convolutional layer, so later layers of the network can learn the residual directly and complete oil-leakage image information is retained. When Res50 serves as the backbone of Deeplabv3plus, visualizing the intermediate feature maps through a CNN interpreter shows that feature maps from the shallow part of the network are rich in detail, with prominent features revealing local details such as the position and contour of the target object; as the network deepens, the picture size keeps shrinking, the number of channels keeps growing, and the receptive field of the convolution kernels keeps enlarging, which is more convenient for extracting global feature information of the image.
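The Conv Block / Identity Block distinction described above corresponds to the following bottleneck sketch; the channel sizes passed in are the caller's choice, as in the stages of Res50's Table 1:

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Res50 building block. The trunk is 1x1 -> 3x3 -> 1x1 convolutions with
    normalization; the shortcut is the identity (Identity Block) or a 1x1
    convolution + normalization (Conv Block) whenever dimensions change."""
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch))
        if stride != 1 or in_ch != out_ch:   # Conv Block: reshapes width/height/channels
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch))
        else:                                # Identity Block: input passes straight through
            self.shortcut = nn.Identity()

    def forward(self, x):                    # later layers learn the residual directly
        return torch.relu(self.trunk(x) + self.shortcut(x))

y = Bottleneck(256, 64, 256)(torch.randn(1, 256, 160, 160))
```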
The CBAM mechanism after each Layer module comprises two independent sub-modules, a channel attention module and a spatial attention module, which perform channel-wise and spatial attention respectively. The channel attention module strengthens attention to the oil-leakage region by redistributing the weights among feature maps, weakening the influence of background features on the detection result: the input feature map undergoes global max pooling and global average pooling in parallel, the two pooled results are fed to a shared multi-layer perceptron (MLP), and the resulting features are passed on to the spatial attention module. The spatial attention module takes the channel attention module's output as its input feature map: it first applies global max pooling and global average pooling, concatenates (concat) the two pooled feature maps along the channel direction, reduces them to 1 channel, and generates a spatial attention map through a sigmoid activation function; finally the spatial attention output is multiplied with the input features to obtain the feature map finally output by the CBAM attention mechanism. The CBAM mechanism and the Res50 network architecture are shown in FIG. 5.
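A self-contained sketch of the CBAM module as described, with the shared MLP implemented by 1×1 convolutions; the reduction ratio and 7×7 spatial kernel are the conventional CBAM defaults, assumed rather than taken from the patent:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention (shared MLP over parallel global max/avg pooling),
    then spatial attention (7x7 convolution over channel-wise max/avg maps)."""
    def __init__(self, channels, reduction=16, kernel=7):
        super().__init__()
        self.mlp = nn.Sequential(                 # shared multi-layer perceptron
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))
        self.spatial = nn.Conv2d(2, 1, kernel, padding=kernel // 2, bias=False)

    def forward(self, x):
        # channel attention: reweight feature maps toward the leakage region
        ca = torch.sigmoid(self.mlp(x.amax((2, 3), keepdim=True))
                           + self.mlp(x.mean((2, 3), keepdim=True)))
        x = x * ca
        # spatial attention: concat channel-wise max/avg, reduce to 1 channel, sigmoid
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.amax(1, keepdim=True), x.mean(1, keepdim=True)], dim=1)))
        return x * sa                              # final CBAM output feature map

y = CBAM(256)(torch.randn(1, 256, 80, 80))
```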
S3, DF_Loss is adopted as the model's loss function. S_dice is a set-similarity metric usually used to measure how similar two samples are: twice the intersection of the predicted result and the real result, divided by the sum of the two, with a value range of 0 to 1:

$$S_{dice} = \frac{2\,|X \cap Y|}{|X| + |Y|}, \qquad L_{dice} = 1 - S_{dice}$$

The larger the overlap between the predicted result and the real result, the larger S_dice and the smaller the image segmentation loss L_dice. When Dice Loss is used in an image segmentation network whose positive samples are small targets, training can become unstable and gradients can saturate in extreme cases: with only foreground and background, a mispredicted pixel of a small target causes a large change in the loss. The Focal Loss function, in contrast, better learns ambiguously classified pixels on the leakage image, so the two are combined as the segmentation loss function in the power-plant valve oil-leakage training experiment, in the specific combined form given above.
S4, training and verification analysis are carried out on the image algorithm server. To verify the applicability and real-time performance of valve oil-leakage detection with the proposed CR-deep algorithm in the actual environment of a power plant, visible-light images simulating oil leakage on the plant's turbine operating floor are selected for detection and segmentation. At the same time, the completeness with which the improved backbone feature-extraction network and other backbone networks segment the oil-leakage region is compared, and the comparative experimental data demonstrate the superiority of the proposed CR-deep segmentation algorithm. The hardware configuration of the image server used for the verification experiment is: Ubuntu 20.04 system; CPU: i7-12700; graphics card: NVIDIA GeForce RTX 3080 12G; RAM: 32GB, as shown in Table 2. All experiments use the current mainstream deep-learning object detection and image segmentation framework PyTorch; the Python programming language is used during model training and the C++ programming language after model quantization, so that the network can quickly identify abnormal conditions during actual plant operation.
Table 2 Training and testing environment
Operating system: Ubuntu 20.04
CPU: Intel i7-12700
GPU: NVIDIA GeForce RTX 3080 12G
RAM: 32GB
Framework: PyTorch
S5, the algorithm is compared with other backbone-network algorithms. The proposed CR-deep network improves on Deeplabv3plus: the feature extraction network is replaced with Res50 and a CBAM mechanism is merged in at several positions, forming the new network structure CR-deep. The experiment uses the Adam first-order optimization algorithm with the learning rate set to 0.00001, and the batch size is set to 4 according to the video memory of the image algorithm server. The model training indexes after 20000 iterations are shown in FIG. 6, from which it can be seen that the training model has essentially converged. To verify the accuracy and efficiency of the CR-deep network for detecting valve oil leakage in the power plant, the experiment compares the segmentation effect of different feature-extraction networks under the same loss function and evaluation criteria, as shown in FIG. 7. Since the invention adopts a deep-learning image segmentation algorithm, the approximate leakage area, i.e. the label map of the leakage image, must be labeled manually before training; it is shown in the second row of the figure, and the segmentation results of the 3 network models of this embodiment's experiment are shown in rows 3-5. Because the shape of valve oil leakage in a power plant is uncertain and problems such as valve occlusion exist, the MobileNet-Deeplabv3plus network model generalizes insufficiently, cannot adapt well to the many variations of plant valve oil leakage, and its final detection result does not segment the oil-leakage region well. With Xception as the backbone, the Xception-Deeplabv3plus model detects plant oil leakage better than MobileNet-Deeplabv3plus, but parts of the image edge still cannot be completely segmented. The CR-deep model's segmentation result is better than Xception-Deeplabv3plus and segments the oil-leakage region more completely. Meanwhile, according to Table 3, the Mean Intersection over Union (MIoU) index of the CR-deep model is 2% higher than Xception-Deeplabv3plus and the average precision (AP) is 1% higher, so the CR-deep model has better detection performance for plant valve oil-leakage detection; after quantization it is applied to the fixed-point cameras of the power plant to detect in real time whether the valves of key plant facilities leak.
TABLE 3 model segmentation index comparison

Claims (10)

1. An anomaly detection method based on CR-deep is characterized in that: the method comprises the following steps:
s1, acquiring an original abnormal image, performing image cleaning on the acquired original abnormal image, removing a low-quality abnormal image, and constructing a data training set;
s2, training on a CR-deep network by using the constructed data training set, and processing an original fault image by the CR-deep network;
S3, processing the data training set constructed in step S1 according to the batch size parameter set by the network to obtain a prediction feature map Ŷ_g;
S4, comparing the prediction feature map Ŷ_g with the real segmented image set {Y_g(i,j)} labeled in step S1, calculating the distance between the true values and the predicted values, constructing a loss function L_DF, and optimizing the network by continuously minimizing the loss function;
s5, updating the network parameter theta through an Adam strategy, wherein the learning rate of the updated parameter theta' is set to be 0.00001;
s6, repeating the steps S3 to S5, performing e times of network training until e > epochs or the training performance of the network is not improved any more, and completing the training by the network;
s7, analyzing the abnormal detection performance of the model by using two evaluation indexes of Precision and Mean Intersection over Union;
s8, deploying the trained model on a server, acquiring images in real time through a plurality of key point cameras, and then identifying and segmenting abnormal conditions through a CR-deep image segmentation algorithm.
2. The CR-deep based abnormality detection method according to claim 1, wherein: in step S1, the screened images are labeled with the labelme tool: the original abnormal images are first imported into the labelme software, the software's polygon tool is used to outline the abnormal region and the abnormal category is entered, generating a json file of the leakage information; the json files are converted with a python script into the label image corresponding to each abnormal image, and all label files are packaged to obtain the real segmented image set {Y}, forming the data training set; the g-th original abnormal image in the data training set is denoted {I_g(i,j)}, and the true semantic segmentation image corresponding to {I_g(i,j)} in the training set is denoted {Y_g(i,j)}, where I_g(i,j) and Y_g(i,j) denote the pixel values of the pixel whose coordinate position is (i,j) in {I_g(i,j)} and {Y_g(i,j)} respectively.
3. The CR-deep based abnormality detection method according to claim 1, wherein: in step S2, the CR-deep network adopts an encoder-decoder structure and processes the original abnormal image as follows: after preprocessing (rotation and cropping), the image is input to the encoder for feature extraction; it is first fed into the Res-backbone network at the front of the encoder, which has 4 layer modules, each extracting the fault feature map Q1 of the abnormal region through residual convolution modules of different scales and the convolutional attention module CBAM; the output feature map Q1 of the Res-backbone network is then sent both to the decoder network and to the encoder's ASPP network; the atrous spatial pyramid pooling (ASPP) module processes the Q1 feature map in parallel with 3×3 convolutions at dilation rates 6, 12 and 18, an ordinary 1×1 convolution, and a max pooling layer, and after the parts are concatenated, the ASPP outputs are fused with the convolutional attention module CBAM and a 1×1 convolution to obtain the encoder output feature map Q2; the decoder applies 1×1 convolution and upsampling to the Q1 and Q2 feature maps output by the encoder, fuses the two feature maps, and applies a 3×3 convolution and 8× upsampling to obtain the network's prediction.
4. The CR-deep based abnormality detection method according to claim 1, wherein: the specific method of step S3 is: B original abnormal images are randomly selected from the data training set constructed in S1 according to the batch size parameter set by the CR-deep network and batched together; the original abnormal data are preprocessed by the network of step S2, which changes the image size, and then pass in turn through the layer-1, layer-2, layer-3 and layer-4 modules of the Res-backbone network to obtain its output feature map Q1; the Q1 feature map passes through the ASPP network to give the combined features, and 1×1 convolution gives the encoder Q2 and Q1 features; the Q1 and Q2 features are concatenated to give the decoder fusion features, and after a 3×3 convolution and 8× upsampling the final prediction feature map, consistent with the original input feature size, is obtained and denoted Ŷ_g.
5. The CR-deep based abnormality detection method according to claim 1, wherein: in step S3, B = batch size.
6. The CR-deep based abnormality detection method according to claim 1, wherein: the loss function L_DF in step S4 is the DF_Loss function, a linear combination of the Dice Loss function L_d and the Focal Loss function L_f:

$$L_{DF} = \alpha L_d + (1 - \alpha) L_f$$

$$L_d = 1 - \frac{1}{C}\sum_{i=1}^{C} \frac{2\,TP_p(i)}{2\,TP_p(i) + FN_p(i) + FP_p(i)}, \qquad L_f = -\frac{1}{k}\sum_{n=1}^{k}\sum_{i=1}^{C} g_n(i)\,(1 - p_n(i))^{\gamma}\log p_n(i)$$

where TP_p(i) denotes class-i pixels correctly classified as positive samples, FN_p(i) denotes class-i pixels incorrectly classified as negative samples, FP_p(i) denotes pixels incorrectly classified as positive samples of class i, p_n(i) is the predicted value for class i at pixel n, g_n(i) is the true value for class i at pixel n, C is the number of classes including the background, k is the number of pixels in the image, γ is the focusing parameter of the Focal Loss, and α is the weight between L_d and L_f used to balance the overall loss.
7. The CR-deep based abnormality detection method according to claim 1, wherein: in the step S6, e represents the number of network training, and epochs is the set total number of network training.
8. The CR-deep based abnormality detection method according to claim 1, wherein: in step S7, the Precision index is calculated as:

$$Precision = \frac{TP}{TP + FP}$$

where TP is true positive (classified as a positive sample, correctly), TN is true negative (classified as a negative sample, correctly), FP is false positive (classified as a positive sample, incorrectly), and FN is false negative (classified as a negative sample, incorrectly).
9. The CR-deep based abnormality detection method according to claim 1, wherein: in step S7, the Mean Intersection over Union index is calculated as:

$$MIoU = \frac{1}{n+1}\sum_{i=0}^{n} \frac{P_{ii}}{\sum_{j=0}^{n} P_{ij} + \sum_{j=0}^{n} P_{ji} - P_{ii}}$$

where n is the number of classes and n+1 includes the background class, i denotes the true value, j denotes the predicted value, P_ii denotes i predicted as i, P_ij denotes i predicted as j, and P_ji denotes j predicted as i.
10. The CR-deep based abnormality detection method according to claim 1, wherein: in step S8, the model confidence threshold is set to 0.8 and each frame of the acquired image or video is detected; when the confidence is less than or equal to 0.8, the original image is pushed to the front end for display through the Flask framework; when the confidence is greater than 0.8, the abnormal image is pushed to the front end for display, an alarm is raised through a pop-up window, and the detected abnormality information is stored in the database.
CN202311477998.XA 2023-11-08 2023-11-08 CR-deep-based anomaly detection method Pending CN117422695A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311477998.XA CN117422695A (en) 2023-11-08 2023-11-08 CR-deep-based anomaly detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311477998.XA CN117422695A (en) 2023-11-08 2023-11-08 CR-deep-based anomaly detection method

Publications (1)

Publication Number Publication Date
CN117422695A true CN117422695A (en) 2024-01-19

Family

ID=89522733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311477998.XA Pending CN117422695A (en) 2023-11-08 2023-11-08 CR-deep-based anomaly detection method

Country Status (1)

Country Link
CN (1) CN117422695A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117725543A (en) * 2024-02-18 2024-03-19 中国民航大学 Multi-element time sequence anomaly prediction method, electronic equipment and storage medium
CN117725543B (en) * 2024-02-18 2024-05-03 中国民航大学 Multi-element time sequence anomaly prediction method, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110705457B (en) Remote sensing image building change detection method
CN111754498A (en) Conveyor belt carrier roller detection method based on YOLOv3
CN107247952B (en) Deep supervision-based visual saliency detection method for cyclic convolution neural network
CN115546565A (en) YOLOCBF-based power plant key area pipeline oil leakage detection method
CN117422695A (en) CR-deep-based anomaly detection method
CN116229052B (en) Method for detecting state change of substation equipment based on twin network
CN112257741B (en) Method for detecting generative anti-false picture based on complex neural network
CN111598854B (en) Segmentation method for small defects of complex textures based on rich robust convolution feature model
Shajihan et al. CNN based data anomaly detection using multi-channel imagery for structural health monitoring
CN116206112A (en) Remote sensing image semantic segmentation method based on multi-scale feature fusion and SAM
CN112084860A (en) Target object detection method and device and thermal power plant detection method and device
Lin et al. Optimal CNN-based semantic segmentation model of cutting slope images
CN113516652A (en) Battery surface defect and adhesive detection method, device, medium and electronic equipment
CN116778346B (en) Pipeline identification method and system based on improved self-attention mechanism
CN113052103A (en) Electrical equipment defect detection method and device based on neural network
CN116363075A (en) Photovoltaic module hot spot detection method and system and electronic equipment
CN116030050A (en) On-line detection and segmentation method for surface defects of fan based on unmanned aerial vehicle and deep learning
CN115273009A (en) Road crack detection method and system based on deep learning
Wang et al. Identify Solar Panels in Low Resolution Satellite Imagery with Siamese Architecture and Cross-Correlation
CN114298909A (en) Super-resolution network model and application thereof
CN113989632A (en) Bridge detection method and device for remote sensing image, electronic equipment and storage medium
CN113313678A (en) Automatic sperm morphology analysis method based on multi-scale feature fusion
CN113112599A (en) Hydrogenation station remote diagnosis method and system based on VR technology and electronic equipment
CN116503398B (en) Insulator pollution flashover detection method and device, electronic equipment and storage medium
CN112329743B (en) Abnormal body temperature monitoring method, device and medium in epidemic situation environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination