CN116824352A - Water surface floater identification method based on semantic segmentation and image anomaly detection - Google Patents


Info

Publication number
CN116824352A
Authority
CN
China
Prior art keywords
water surface
feature
image
module
floater
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310894099.3A
Other languages
Chinese (zh)
Inventor
唐俊
李运生
郑向宏
王国庆
葛新科
张艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ANHUI BOWEI GUANGCHENG INFORMATION TECHNOLOGY CO LTD
Anhui University
Original Assignee
ANHUI BOWEI GUANGCHENG INFORMATION TECHNOLOGY CO LTD
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ANHUI BOWEI GUANGCHENG INFORMATION TECHNOLOGY CO LTD, Anhui University filed Critical ANHUI BOWEI GUANGCHENG INFORMATION TECHNOLOGY CO LTD
Priority to CN202310894099.3A
Publication of CN116824352A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: Computing arrangements based on specific computational models
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods
    • G06V: Image or video recognition or understanding
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Using classification, e.g. of video objects
    • G06V 10/82: Using neural networks
    • G06V 20/00: Scenes; scene-specific elements
    • G06V 20/70: Labelling scene content, e.g. deriving syntactic or semantic representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a water surface floater identification method based on semantic segmentation and image anomaly detection, which comprises the following steps: 1. collecting and preprocessing water surface floater image data; 2. constructing a water surface segmentation network to segment the water surface from the background; 3. constructing a water surface abnormality detection network to detect floater areas in the water surface; 4. constructing an image classification network to identify the specific categories of floaters; 5. identifying water surface floater images using the trained models. According to the invention, the water surface part of the water surface floater image is segmented by the semantic segmentation model, which eliminates the interference of the background part on floater identification; the floater areas in the water surface are then detected by the anomaly detection model, and the specific categories of the floaters are identified by the image classification model, so that the comprehensiveness and accuracy of floater identification can be greatly improved.

Description

Water surface floater identification method based on semantic segmentation and image anomaly detection
Technical Field
The invention relates to the technical field of water surface floater image processing, in particular to a water surface floater identification method based on semantic segmentation and image anomaly detection.
Background
The problem of water surface floater pollution seriously affects human production and life, and real-time detection of water quality is an important link in water quality management and floater pollution control. Because of the complexity of the water surface environment, water surface images are characterized by uneven illumination and susceptibility to noise pollution and weather, so the detection of water surface floaters is a specialized task, and the feature extraction capability of traditional target detection algorithms is limited under interference from the external environment or noise.
Monitoring of the water surface environment is currently realized mainly by assigning dedicated personnel to watch monitoring pictures manually. This approach is simple but consumes a great deal of manpower and material resources; research shows that after watching a monitoring screen continuously for 22 minutes, human eyes miss more than 95% of the activity in the picture, and facing multiple monitoring screens for a long time easily fatigues monitoring personnel, making it difficult to respond to problems on the water surface in time, especially in severe weather.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a water surface floater identification method based on semantic segmentation and image anomaly detection, so as to solve the problem of water surface floater identification in video monitoring scenes.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention relates to a water surface floater identification method based on semantic segmentation and image anomaly detection, which is characterized by comprising the following steps of:
Step 1, acquiring a water surface floater image data set and sequentially performing screening, normalization and resizing preprocessing to obtain a preprocessed water surface floater image sample data set X = {X_1, ..., X_n, ..., X_N}, where X_n represents the n-th water surface floater image and N represents the total number of samples; the water surface region segmented from X_n is taken as the label of X_n and denoted Y_n, with Y_n ∈ C, where C represents the category set; labeling the water surface floater image sample data set X to obtain the mask set S = {S_1, ..., S_n, ..., S_N} corresponding to the data set X, where S_n represents the true category information of each pixel of X_n;
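The screening/normalization/resizing preprocessing described in step 1 can be sketched in plain Python (a minimal illustration with hypothetical helper names; a real pipeline would use an image library):

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list (single-channel image)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]

def normalize(img, mean, std):
    """Z-score standardization, as in typical image preprocessing."""
    return [[(v - mean) / std for v in row] for row in img]
```

For example, a 2x2 map resized to 4x4 repeats each pixel in a 2x2 block, and normalization with mean 3 and std 1 shifts values into a zero-centred range.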
Step 2, building a water surface segmentation network based on a feature pyramid FPN, processing the n-th water surface floater image X_n to obtain the water surface image P_n of the water surface floater image X_n;
Constructing the loss function L_α of the water surface segmentation network using formula (4):

L_α = L_ce + λ·L_ls   (4)

In formula (4), λ is a weight parameter, L_ls denotes the Lovász-SoftMax loss, and L_ce denotes the cross entropy loss;
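Formula (4) is a weighted sum of two scalar losses. A minimal numeric sketch (plain Python; the Lovász term is passed in as a precomputed value, which is an assumption made for illustration):

```python
import math

def cross_entropy(true_vals, pred_vals):
    """Pixel-wise cross entropy: -(1/N) * sum_i p_i * log(p_hat_i)."""
    n = len(true_vals)
    return -sum(p * math.log(q) for p, q in zip(true_vals, pred_vals)) / n

def segmentation_loss(l_ce, l_ls, lam):
    """Formula (4): L_alpha = L_ce + lambda * L_ls."""
    return l_ce + lam * l_ls
```

With two pixels, true values [1, 0] and predictions [0.5, 0.5], the cross entropy is log(2)/2.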
Step 3, constructing a water surface abnormality detection network based on a single-classification algorithm, processing the n-th water surface image P_n to obtain the floater area image Z_n of X_n;
Constructing the loss function L_β of the water surface abnormality detection network using formula (7):

L_β = L_θ + μ·L_c   (7)

In formula (7), μ is a weight parameter, L_θ denotes the encoding loss, and L_c denotes the classification loss of the classifier module;
Step 4, building an image classification network based on ResNet, which sequentially comprises a backbone network module and a classification module:
Step 4.1, the backbone network module adds an average pooling layer after the ResNet101 network;

The floater area image Z_n of X_n is input into the backbone network module, the feature maps output by three convolution layers are extracted and feature-fused, and after dimension reduction by the average pooling layer, the feature vector T_n of Z_n is obtained;
Step 4.2, the classification module is sequentially composed of a fully connected layer and a Softmax activation function; the feature vector T_n is input into the classification module to obtain the predicted classification label Ĵ_{n,m} of the n-th water surface floater image X_n for category m, m = 1, 2, ..., M, where M is the total number of floater categories; based on the correspondence between predicted classification labels and floater categories, the floater category result is obtained;
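The classification module's Softmax-plus-argmax step of step 4.2 can be sketched as follows (plain Python; the category names in the usage example are hypothetical illustrations, not taken from the patent):

```python
import math

def softmax(logits):
    """Numerically stable Softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict_category(logits, categories):
    """Return the category whose Softmax probability is highest."""
    probs = softmax(logits)
    return categories[probs.index(max(probs))]
```

For instance, `predict_category([0.1, 2.0, -1.0], ["bottle", "foam", "branch"])` selects the class with the largest logit.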
Step 4.3, constructing the loss function L_γ of the image classification network using formula (8):

L_γ = −∑_{n=1}^{N} ∑_{m=1}^{M} J_{n,m} log(Ĵ_{n,m})   (8)

In formula (8), J_{n,m} represents the true category label of X_n for the m-th category;
Step 5, based on the water surface floater image sample data set X = {X_1, ..., X_n, ..., X_N}, training the water surface segmentation network, the water surface abnormality detection network and the image classification network by gradient descent, computing the loss functions L_α, L_β and L_γ, and updating the network parameters until the loss functions converge, thereby obtaining the trained water surface segmentation model, water surface abnormality detection model and image classification model; the water surface segmentation model segments the water surface floater images acquired in real time to obtain water surface images; the abnormality detection model performs anomaly detection on the water surface images to obtain floater areas, which are then input into the trained image classification model to obtain the predicted floater categories; finally, the set of areas where the floaters are located and the floater categories are output.
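The "update parameters until the loss converges" loop of step 5 reduces to generic gradient descent. A one-dimensional toy sketch (illustrative only; the patent trains deep networks, not a scalar function):

```python
def gradient_descent(grad, x0, lr=0.1, tol=1e-6, max_iter=10000):
    """Iterate x <- x - lr * grad(x) until the update is below tol."""
    x = x0
    for _ in range(max_iter):
        step = lr * grad(x)
        if abs(step) < tol:
            break
        x -= step
    return x
```

Minimizing (x - 3)^2, whose gradient is 2(x - 3), converges to x = 3.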
The water surface floater identification method based on semantic segmentation and image anomaly detection is characterized in that the water surface segmentation network sequentially comprises: a backbone network module, a pyramid pooling module PPM, a flow alignment module FAM and a refinement residual module RRB, and the water surface image P_n is obtained according to the following steps:
Step 2.1, the backbone network is based on a ResNet101 network, and sequentially comprises: a first convolution layer, a second convolution layer, a third convolution layer, a fourth convolution layer, and a fifth convolution layer;
The n-th water surface floater image X_n is input into the backbone network and processed sequentially by the five convolution layers, each convolution layer generating a corresponding feature map, where F_n^(i) denotes the water surface convolution feature map generated by the i-th convolution layer, i = 1, ..., 5;
Step 2.2, the pyramid pooling module PPM consists of k pyramids of different scales; pooling operations at k different scales are applied to the water surface convolution feature map F_n^(5) generated by the 5th convolution layer to obtain k water surface feature maps of different sizes; the k water surface feature maps of different sizes are then up-sampled respectively to restore them to the size of F_n^(5); finally, the k up-sampled water surface feature maps are concatenated in the channel dimension to obtain the n-th low-resolution water surface pooling feature map U_n;
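The PPM's pool, up-sample and concatenate pattern of step 2.2 can be illustrated on a single-channel 2-D map (plain Python; nearest-neighbour up-sampling is used here as a stand-in for the real interpolation):

```python
def avg_pool_to(img, s):
    """Average-pool a 2-D feature map down to an s x s grid."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(s):
        row = []
        for j in range(s):
            r0, r1 = i * h // s, (i + 1) * h // s
            c0, c1 = j * w // s, (j + 1) * w // s
            vals = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def upsample_nearest(img, out_h, out_w):
    """Nearest-neighbour up-sampling back to the original spatial size."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]

def ppm(feature, scales=(1, 2)):
    """Pool at each scale, up-sample back, and stack as extra channels."""
    h, w = len(feature), len(feature[0])
    return [upsample_nearest(avg_pool_to(feature, s), h, w) for s in scales]
```

On `[[1, 3], [5, 7]]`, the 1x1 scale yields a channel filled with the global mean 4.0, while the 2x2 scale reproduces the map itself.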
Step 2.3, the flow alignment module FAM up-samples the water surface pooling feature map U_n by bilinear interpolation to the same size as the water surface convolution feature map F_n^(j+1) generated by the (j+1)-th convolution layer; after the two are concatenated in the channel dimension at the same size, the concatenated feature map is warped onto the water surface convolution feature map F_n^(j+1), generating the j-th flow alignment feature map A_n^(j); the flow alignment feature map set {A_n^(j)} of the n-th water surface pooling feature map U_n is thereby obtained; the three flow alignment feature maps A_n^(j) are reduced in dimension by pointwise convolution to obtain the aggregated alignment feature map V_n;
Step 2.4, constructing the cross entropy loss L_ce of the flow alignment module FAM using formula (1):

L_ce = −(1/N) ∑_{i=1}^{N} p_{n,i} log(p̂_{n,i})   (1)

In formula (1), p̂_{n,i} represents the predicted value of the i-th pixel in the aggregated alignment feature map V_n, p_{n,i} represents the true value of the i-th pixel in the mask S_n, and N represents the number of pixels;
Step 2.5, the refinement residual module RRB consists of e_1 residual units; each residual unit uses a 1×1 ordinary convolution layer to reduce the dimension of the input feature; the reduced feature is split into two branches, one branch feeding directly into an adder, and the other branch passing through two serial 3×3 ordinary convolution layers before entering the adder, thereby obtaining the integrated feature;
The four water surface convolution feature maps are respectively input into the refinement residual module RRB, and after e_1 residual units the integrated feature maps are obtained;

The four integrated feature maps are input into the integration adder and reduced in dimension by a 1×1 dimension reduction convolution layer to obtain the refined residual feature map R_n; the aggregated alignment feature map V_n and the refined residual feature map R_n then undergo pointwise convolution to obtain the water surface image P_n of the water surface floater image X_n;
Step 2.6, constructing Lovasz-SoftMax loss L of the refinement residual module RRB by using the formula (2) ls
In the formula (2), C represents the number of categories of the category set C, C represents any one category of the category set C,is DeltaJ c Lovasz expansion, deltaJ c Is the Jaccard index for category c; m is m i (c) Representation pair S n The vector of the i-th pixel class c prediction error is:
in the formula (3), the amino acid sequence of the compound,representation of a refined residual feature map->Predicted value of the i-th pixel in (b).
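The per-pixel error vector of formula (3), and the Jaccard index underlying formula (2), can be sketched directly on flattened pixel lists (plain Python; the Lovász extension itself is omitted):

```python
def error_vector(preds, truths, c):
    """Formula (3): m_i(c) = 1 - p_hat_i(c) if pixel i's true class is c,
    else p_hat_i(c). preds are class-c probabilities, truths are labels."""
    return [1 - p if t == c else p for p, t in zip(preds, truths)]

def jaccard(pred_labels, true_labels, c):
    """Jaccard index (intersection over union) for class c."""
    inter = sum(1 for p, t in zip(pred_labels, true_labels) if p == c and t == c)
    union = sum(1 for p, t in zip(pred_labels, true_labels) if p == c or t == c)
    return inter / union if union else 1.0
```

A confident correct prediction (0.9 for a true class-c pixel) yields a small error 0.1, while a prediction of 0.2 on a non-c pixel contributes 0.2.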
The water surface abnormality detection network sequentially comprises: an encoder module f_θ, a texture enhancement module TEM, a pyramid texture feature extraction module PTFEM and a classifier module, and the floater area image Z_n of X_n is obtained as follows:
Step 3.1, the encoder module f_θ consists of r basic units, each basic unit composed of a two-dimensional convolution Conv2D followed by a LeakyReLU activation function layer;
With h_1 as the step length, the n-th water surface image P_n is divided into H blocks of size h_2 × h_2, obtaining the block set P_n = {w_{n,1}, ..., w_{n,h}, ..., w_{n,H}}, where H is the total number of blocks of P_n and w_{n,h} is the h-th block of P_n;
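The division of P_n into overlapping h_2 × h_2 blocks with stride h_1 can be sketched as index generation (plain Python; the 64x64 image and the embodiment values 32 and 4 in the usage example are illustrative):

```python
def split_blocks(h, w, size, stride):
    """Return (row, col) top-left corners of all size x size blocks
    taken with the given stride over an h x w image."""
    return [(r, c)
            for r in range(0, h - size + 1, stride)
            for c in range(0, w - size + 1, stride)]
```

For a 64x64 water surface image with 32x32 blocks and stride 4, each axis yields 9 positions (0, 4, ..., 32), so H = 81 blocks.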
w_{n,h} is input into the encoder f_θ to extract the h-th coding feature A_{n,h};
Step 3.2, constructing the loss L_θ of the encoder f_θ using formula (5), in which w_{n,h'} is an adjacent block of w_{n,h};
step 3.3, the texture enhancement module TEM consists of a global average pooling layer, a one-dimensional quantized coding QCO layer and an MLP multi-layer perceptron;
For the h-th coding feature A_{n,h}, after processing by the global average pooling layer, the h-th texture pooling feature g_{n,h} is obtained; the cosine similarity between the coding feature A_{n,h} and the texture pooling feature g_{n,h} is then computed, generating the feature similarity matrix G_{n,h}; the one-dimensional quantization coding QCO layer performs one-dimensional quantization coding on the feature similarity matrix G_{n,h} to obtain the h-th quantization coding matrix E_{n,h}; the h-th quantization coding matrix E_{n,h} passes through the MLP multi-layer perceptron to generate the h-th statistical feature D_{n,h}; the h-th quantization coding matrix E_{n,h} is then multiplied with the h-th statistical feature D_{n,h} to obtain the h-th high-quality texture feature O_{n,h};
Step 3.4, the pyramid texture feature extraction module PTFEM consists of b_1 parallel branches of different scales, each parallel branch containing a texture feature extraction unit; the texture feature extraction unit is composed of b_2 two-dimensional quantization coding 2d-QCO layers and b_3 multi-layer perceptrons MLP in sequence;
The h-th high-quality texture feature O_{n,h} and the h-th coding feature A_{n,h} are feature-fused to generate the h-th fusion feature map K_{n,h}, which is input into the pyramid texture feature extraction module PTFEM; the h-th fusion feature map K_{n,h} is first divided into b_1 feature maps of different scales, which are respectively fed into the b_1 parallel branches of different scales for processing; each branch extracts a corresponding texture representation feature map through its own texture feature extraction unit; the b_1 texture representation feature maps obtained by the branches are respectively up-sampled to restore them to the size of the fusion feature map K_{n,h}; the b_1 up-sampled feature maps are then fused to generate the h-th multi-scale texture feature F_{n,h}; finally, F_{n,h} and A_{n,h} are fused to generate the h-th texture feature B_{n,h};
Step 3.5, the classifier module consists of x_1 linear units, each linear unit composed of a linear layer followed by a LeakyReLU activation function layer;
The h-th texture feature B_{n,h} is input into the classifier to obtain the anomaly score of the h-th block w_{n,h}; the anomaly scores of all blocks are obtained in this way, where the anomaly score of pixel i in the n-th water surface image P_n is the average of the anomaly scores of all the blocks containing pixel i;

An anomaly score threshold is set, and among the anomaly scores of the pixels of the n-th water surface image P_n, the anomalous regions whose scores are higher than the threshold are extracted, thereby obtaining the floater area image Z_n of X_n;
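The per-pixel anomaly score (mean over covering blocks) and thresholding of step 3.5 can be sketched as follows (plain Python, single-channel; block positions are top-left corners as produced by a sliding window):

```python
def pixel_scores(h, w, blocks, size, block_scores):
    """For each pixel, average the anomaly scores of every block covering it."""
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for (r0, c0), s in zip(blocks, block_scores):
        for r in range(r0, r0 + size):
            for c in range(c0, c0 + size):
                total[r][c] += s
                count[r][c] += 1
    return [[total[r][c] / count[r][c] if count[r][c] else 0.0
             for c in range(w)] for r in range(h)]

def anomaly_mask(scores, thresh):
    """Binary mask of pixels whose score exceeds the threshold."""
    return [[1 if v > thresh else 0 for v in row] for row in scores]
```

With one 2x2 block of score 0.8 covering a 2x2 image, every pixel scores 0.8 and a threshold of 0.5 marks the whole region anomalous.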
Step 3.6, constructing the loss L_c of the classifier using formula (6):

L_c = Cross_Entropy(ŷ, y)   (6)

In formula (6), Cross_Entropy represents the cross entropy; w_{n,p1} denotes a random block of the n-th water surface image P_n, and w_{n,p2} denotes any block whose center lies in one of the 8 directions around w_{n,p1}, with 1 ≤ p_1, p_2 ≤ H; y is the true relative position of w_{n,p2} with respect to w_{n,p1}, y = 1, 2, ..., 8, and ŷ is the relative position predicted by the classifier.
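The 8-direction relative-position label y used in formula (6) can be sketched as a lookup on block-grid offsets (the numbering of the directions below is an assumption; the patent only fixes y = 1, ..., 8):

```python
# Map the grid offset of a neighbouring block to a direction label in 1..8:
# up, down, left, right, upper-left, upper-right, lower-left, lower-right.
DIRECTIONS = {(-1, 0): 1, (1, 0): 2, (0, -1): 3, (0, 1): 4,
              (-1, -1): 5, (-1, 1): 6, (1, -1): 7, (1, 1): 8}

def relative_position(center, neighbor):
    """Label of neighbor's position relative to center, both (row, col)."""
    dr = neighbor[0] - center[0]
    dc = neighbor[1] - center[1]
    return DIRECTIONS[(dr, dc)]
```

The classifier is then trained with cross entropy to predict this label from the pair of block features.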
The electronic device of the invention comprises a memory and a processor, wherein the memory stores a program supporting the processor in executing the above water surface floater identification method, and the processor is configured to execute the program stored in the memory.
The computer readable storage medium of the invention stores a computer program which, when run by a processor, performs the steps of the above water surface floater identification method.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention improves the accuracy of floater identification by adopting a deep learning algorithm, reducing manpower, material resources and labor cost. Floating garbage comes in many types, with varied colors, textures and other characteristics; rather than identifying each type of water surface floating garbage one by one, all floating garbage can be regarded as an anomaly: whatever floats on the water surface is anomalous as long as it differs from the normal water surface, so an anomaly can be judged whenever floaters appear, realizing automatic detection of abnormal water surface conditions. The semantic segmentation and image anomaly detection algorithms adopted by the invention can automatically identify floating garbage on the water surface and, combined with cameras, can monitor the water surface conditions of multiple areas simultaneously, realizing more efficient management.
2. Compared with the traditional target detection method, the method for identifying the water surface floaters has better floaters, and is more comprehensive and accurate in identifying the floaters. If anomaly detection is used directly to identify water surface floats, the monitored area must be free of background interference, and the whole monitored area is water surface, which is not practical; therefore, the method of the invention adopts a semantic segmentation network to perform front water surface segmentation treatment, and separates the foreground part and the background of the water surface, so that the whole detection system is suitable for more scenes, and the applicability and the practicability of the method are greatly improved. The method has the advantages that the floater garbage is identified in an abnormal detection mode, various floater garbage can be identified through image classification, and the missing detection condition of floater garbage identification is effectively reduced.
Drawings
FIG. 1 is an overall flow chart of the present invention for surface float identification;
FIG. 2 is a flow chart of reasoning for identifying the surface floats according to the present invention.
Detailed Description
In the embodiment, a water surface floater identification method based on semantic segmentation and image anomaly detection mainly comprises the steps of firstly segmenting a water surface and a background part in a water surface floater image by using a semantic segmentation network, acquiring the water surface image, simultaneously eliminating interference of the background part on floater identification, then detecting a floater area in the water surface image by using an anomaly detection network, and then identifying specific categories of floaters by using an image classification network. As shown in fig. 1, the whole process can be specifically divided into the following steps:
Step 1, acquiring a water surface floater image data set and sequentially performing screening, normalization and resizing preprocessing to obtain a preprocessed water surface floater image sample data set X = {X_1, ..., X_n, ..., X_N}, where X_n represents the n-th water surface floater image and N represents the total number of samples; the water surface region segmented from X_n is taken as the label of X_n and denoted Y_n, with Y_n ∈ C, where C represents the category set; labeling the water surface floater image sample data set X to obtain the mask set S = {S_1, ..., S_n, ..., S_N} corresponding to the data set X, where S_n represents the true category information of each pixel of X_n;
Step 2, building a water surface segmentation network based on a feature pyramid FPN, which sequentially comprises: a backbone network module, a pyramid pooling module PPM, a flow alignment module FAM and a refinement residual module RRB:
step 2.1, the backbone network is based on a ResNet101 network, and sequentially comprises: a first convolution layer, a second convolution layer, a third convolution layer, a fourth convolution layer, and a fifth convolution layer;
In this embodiment, the first convolution layer is composed of one 7×7 convolution layer, one normalization layer BN, one ReLU activation function layer and one maximum pooling layer MaxPool; the second convolution layer is composed of three residual blocks, the third convolution layer of four residual blocks, the fourth convolution layer of twenty-three residual blocks, and the fifth convolution layer of three residual blocks; each residual block contains two 1×1 convolution layers and one 3×3 convolution layer;
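The spatial sizes produced by the first convolution layer follow the standard convolution output-size formula. A sketch (the 224x224 input and the stride/padding values are conventional ResNet assumptions, not stated in the patent):

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

def stem_out(size):
    """ResNet-style stem: 7x7 conv stride 2 pad 3, then 3x3 max-pool
    stride 2 pad 1 (assumed hyperparameters)."""
    return conv_out(conv_out(size, 7, 2, 3), 3, 2, 1)
```

Under these assumptions a 224x224 input leaves the stem at 56x56.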
The n-th water surface floater image X_n is input into the backbone network and processed sequentially by the five convolution layers, each convolution layer generating a corresponding feature map, where F_n^(i) denotes the water surface convolution feature map generated by the i-th convolution layer, i = 1, ..., 5;
Step 2.2, the pyramid pooling module PPM consists of k pyramids of different scales; pooling operations at k different scales are applied to the water surface convolution feature map F_n^(5) generated by the 5th convolution layer to obtain k water surface feature maps of different sizes; the k water surface feature maps of different sizes are then up-sampled respectively to restore them to the size of F_n^(5); finally, the k up-sampled water surface feature maps are concatenated in the channel dimension to obtain the n-th low-resolution water surface pooling feature map U_n; in this embodiment, k is 4, and the 4 different scales are 1×1, 2×2, 3×3 and 6×6 respectively;
Step 2.3, the flow alignment module FAM up-samples the water surface pooling feature map U_n by bilinear interpolation to the same size as the water surface convolution feature map F_n^(j+1) generated by the (j+1)-th convolution layer; after the two are concatenated in the channel dimension at the same size, the concatenated feature map is warped onto the water surface convolution feature map F_n^(j+1); in this embodiment, the warping operation is composed of two 3×3 convolution layers; the j-th flow alignment feature map A_n^(j) is thereby generated, yielding the flow alignment feature map set {A_n^(j)} of the n-th water surface pooling feature map U_n; the three flow alignment feature maps A_n^(j) are reduced in dimension by pointwise convolution to obtain the aggregated alignment feature map V_n;
Step 2.4, constructing Cross entropy loss L of stream alignment Module FAM Using (1) ce
In the formula (1), the components are as follows,representation->Predicted value of ith pixel, p n,i Representing mask S n The true value of the i-th pixel in (a), N representing the number of pixels;
Step 2.5, the refinement residual module RRB consists of e_1 residual units; each residual unit uses a 1×1 ordinary convolution layer to reduce the dimension of the input feature; the reduced feature is split into two branches, one branch feeding directly into an adder, and the other branch passing through two serial 3×3 ordinary convolution layers before entering the adder, thereby obtaining the integrated feature; in this embodiment, e_1 is 4;
The four water surface convolution feature maps are respectively input into the refinement residual module RRB, and after e_1 residual units the integrated feature maps are obtained;

The four integrated feature maps are input into the integration adder and reduced in dimension through a 1×1 dimension reduction convolution layer to obtain the refined residual feature map R_n; the aggregated alignment feature map V_n and the refined residual feature map R_n then undergo pointwise convolution to obtain a binary image of the water surface segmentation result, in which the gray values of the pixels are set to 0 and 255, i.e. pure black and pure white; using image processing, all pixels with gray value 255 in the binary image are set to 1, and the binary image is then multiplied pixel by pixel with the water surface floater image X_n to obtain the water surface image P_n of the water surface floater image X_n;
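The binary-mask multiplication that extracts the water surface can be sketched as follows (plain Python, single-channel grayscale):

```python
def apply_mask(image, binary_mask):
    """Set mask pixels of value 255 to 1 (others to 0), then multiply
    pixel by pixel with the image, keeping only the water surface region."""
    return [[px * (1 if m == 255 else 0) for px, m in zip(irow, mrow)]
            for irow, mrow in zip(image, binary_mask)]
```

Pixels where the mask is 255 keep their original gray value; background pixels become 0.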
Step 2.6 constructing Lovasz-SoftMax loss L of RRB using equation (2) ls
In the formula (2), C represents the number of categories of the category set C, C represents any one category of the category set C,is DeltaJ c Lovasz expansion, deltaJ c Is the Jaccard index for category c; m is m i (c) Representation pair S n The vector of the i-th pixel class c prediction error is:
in the formula (3), the amino acid sequence of the compound,representation of a refined residual feature map->A predicted value of the i-th pixel in (a);
Step 2.7, constructing the loss function L_α of the water surface segmentation network using formula (4):

L_α = L_ce + λ·L_ls   (4)

In formula (4), λ is a weight parameter;
Step 3, constructing a water surface abnormality detection network based on a single-classification algorithm, which sequentially comprises: an encoder module f_θ, a texture enhancement module TEM, a pyramid texture feature extraction module PTFEM and a classifier module;
Step 3.1 encoder Module f θ Each basic unit is formed by two-dimensional convolution Conv2D and an activation function layer LeakyReLU in sequence; in this embodiment, r is 8;
With h_1 as the step length, the n-th water surface image P_n is divided into H blocks of size h_2 × h_2, obtaining the block set P_n = {w_{n,1}, ..., w_{n,h}, ..., w_{n,H}}, where H is the total number of blocks of P_n and w_{n,h} is the h-th block of P_n; in this embodiment, h_1 is 4 and the block size h_2 × h_2 is 32 × 32;
w_{n,h} is input into the encoder f_θ to extract the h-th coding feature A_{n,h};
Step 3.2, constructing an encoder f using (5) θ Loss L of (2) θ
In formula (5), w n,h′ Is w n,h Is adjacent to (a) a block; adjacent blocks are cut blocks in the directions of up, down, left, right, left upper, right upper, left lower and right lower respectively;
Step 3.3, the texture enhancement module TEM consists of a global average pooling layer, a one-dimensional quantized coding QCO layer and an MLP multi-layer perceptron;
The h-th coding feature A_{n,h} is processed by the global average pooling layer to obtain the h-th texture pooling feature g_{n,h}; the cosine similarity between the coding feature A_{n,h} and the texture pooling feature g_{n,h} is then computed to generate a feature similarity matrix G_{n,h}; the one-dimensional quantized coding QCO layer performs one-dimensional quantized coding on G_{n,h} to obtain the h-th quantized coding matrix E_{n,h}; E_{n,h} passes through the MLP multi-layer perceptron to generate the h-th statistical feature D_{n,h}; E_{n,h} is then multiplied with D_{n,h} to obtain the h-th high-quality texture feature O_{n,h};
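The similarity step of the TEM can be sketched as follows: cosine similarity between each spatial position of a feature map and its global-average-pooled vector. Shapes and names here are illustrative assumptions; the patent does not fix them numerically.

```python
import numpy as np

# Sketch of the TEM similarity step: cosine similarity between every spatial
# position of a C x H x W feature map A and its global-average-pooled vector g.
def texture_similarity(A: np.ndarray) -> np.ndarray:
    C, H, W = A.shape
    g = A.mean(axis=(1, 2))                       # global average pooling -> (C,)
    flat = A.reshape(C, H * W)                    # one column per spatial position
    num = g @ flat                                # dot product with each position
    denom = np.linalg.norm(g) * np.linalg.norm(flat, axis=0) + 1e-8
    return (num / denom).reshape(H, W)            # similarity matrix G

A = np.ones((4, 2, 2))
G = texture_similarity(A)
print(np.round(G, 3))  # identical vectors -> cosine similarity ~1 everywhere
```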
Step 3.4, the pyramid texture feature extraction module PTFEM consists of b_1 parallel branches of different scales, each parallel branch containing a texture feature extraction unit; the texture feature extraction unit is formed by b_2 two-dimensional quantized coding 2d-QCO layers followed by b_3 multi-layer perceptrons MLP; in this embodiment, b_1 is 4, b_2 is 1, b_3 is 1, and the four scales in PTFEM are 1, 2, 4 and 8, dividing the input feature into 1, 2, 4 and 8 small blocks respectively;
The h-th high-quality texture feature O_{n,h} and the h-th coding feature A_{n,h} are fused to generate the h-th fusion feature map K_{n,h}, which is input into the pyramid texture feature extraction module PTFEM. K_{n,h} is first divided into b_1 feature maps of different scales, which are input into the b_1 parallel branches of different scales for processing; each branch extracts a corresponding texture representation feature map through its own texture feature extraction unit, the q-th branch generating the q-th texture representation feature map. The b_1 texture representation feature maps are each upsampled to restore the size of the fusion feature map K_{n,h}, and the b_1 upsampled feature maps are fused to generate the h-th multi-scale texture feature F_{n,h}; finally, F_{n,h} and A_{n,h} are fused to generate the h-th texture feature B_{n,h};
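A hypothetical sketch of the PTFEM scale division. The patent states that the four scales divide the input feature into 1, 2, 4 and 8 small blocks respectively, but does not specify the partition geometry, so this sketch splits along the row axis; treat it as one possible reading.

```python
import numpy as np

# Divide a feature map into s blocks for each pyramid scale s in {1, 2, 4, 8}
# (scales from the embodiment; the split axis is an assumption).
def pyramid_regions(F: np.ndarray, scales=(1, 2, 4, 8)) -> dict:
    H, _ = F.shape
    return {s: [F[r * (H // s):(r + 1) * (H // s), :] for r in range(s)]
            for s in scales}

F = np.arange(64.0).reshape(8, 8)
regions = pyramid_regions(F)
print([len(regions[s]) for s in (1, 2, 4, 8)])  # one list of blocks per scale
```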
Step 3.5, the classifier module consists of x_1 linear units; each linear unit is formed by a linear layer followed by a LeakyReLU activation function layer; in this embodiment, x_1 is 4;
The h-th texture feature B_{n,h} is input into the classifier to obtain the anomaly score of the h-th block w_{n,h}, and the anomaly scores of all blocks are obtained in the same way; the anomaly score of pixel point i in the nth water surface image P_n is the mean of the anomaly scores of all cut blocks containing pixel point i;
An anomaly score threshold is set (0.22 in this embodiment). Among the anomaly scores of the pixel points in the nth water surface image P_n, the abnormal region above the threshold is extracted: pixel points above the threshold are set to 255 and those below are set to 0, and the contour of the abnormal region is acquired. Since this contour is not necessarily a regular rectangular region, the minimum circumscribed rectangle of the contour is cropped using image processing, thereby obtaining the floater area image Z_n of X_n.
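The score-averaging and cropping logic can be sketched as below: per-pixel scores are averaged over all blocks covering the pixel, thresholded (0.22 in this embodiment), and the minimum bounding rectangle of the abnormal region is taken. Block positions and scores here are toy assumptions.

```python
import numpy as np

# Per-pixel anomaly score = mean of the scores of all blocks covering the pixel.
def pixel_scores(block_scores, block_boxes, shape):
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for s, (t, l, b, r) in zip(block_scores, block_boxes):
        acc[t:b, l:r] += s
        cnt[t:b, l:r] += 1
    return acc / np.maximum(cnt, 1)

# Minimum axis-aligned rectangle enclosing the above-threshold region.
def min_bounding_rect(mask: np.ndarray):
    ys, xs = np.nonzero(mask)
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1

scores = pixel_scores([0.9, 0.1], [(0, 0, 4, 4), (2, 2, 6, 6)], (8, 8))
mask = scores > 0.22          # anomaly score threshold from the embodiment
rect = min_bounding_rect(mask)
print(rect)
```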
Step 3.6, constructing the loss of the classifier using equation (6);
In equation (6), Cross_Entropy denotes the cross entropy, w_{n,p1,p2} denotes a random small block of the nth water surface image P_n, and w′_{n,p1,p2} denotes any one of the small blocks in the eight directions centered on w_{n,p1,p2}, with 1 ≤ p_1, p_2 ≤ H; y is the direction label of w′_{n,p1,p2} relative to w_{n,p1,p2}, and y = 1, 2, ..., 8;
Step 3.7, constructing the loss function L_β of the water surface anomaly detection network using equation (7);
In equation (7), μ is the weight parameter balancing the classifier loss of equation (6) against the encoder loss L_θ;
Step 4, building an image classification network based on ResNet, which sequentially comprises a backbone network module and a classification module:
Step 4.1, the backbone network module adds an average pooling layer after the ResNet101 network;
the floater area image Z_n of X_n is input into the backbone network module; the feature maps output by three of the convolution layers are extracted and fused, and after dimension reduction by the average pooling layer, the feature vector T_n of Z_n is obtained;
Step 4.2, the classification module consists of a fully connected layer followed by a Softmax activation function; the feature vector T_n is input into the classification module to obtain the predictive classification label of the nth water surface floater image X_n for class m, m = 1, 2, ..., M, where M is the total number of floater categories; the floater category result is then obtained from the correspondence between predictive classification labels and floater categories. In this embodiment, classification label 1 corresponds to the floater category bottle, 2 corresponds to duckweed, and M corresponds to other.
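A minimal sketch of step 4.2: a fully connected layer followed by Softmax, with this embodiment's label mapping (1 → bottle, 2 → duckweed, M → other). The weights are random placeholders; the real network learns them through the loss of equation (8).

```python
import numpy as np

rng = np.random.default_rng(0)
M, feat_dim = 3, 8                      # M = 3 categories in this sketch (assumed)
W = rng.normal(size=(feat_dim, M))      # placeholder fully connected weights
b = np.zeros(M)
labels = {1: "bottle", 2: "duckweed", 3: "other"}

def classify(T: np.ndarray) -> str:
    logits = T @ W + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                      # Softmax over the M categories
    return labels[int(np.argmax(probs)) + 1]  # label values start at 1

T_n = rng.normal(size=feat_dim)               # stands in for the feature vector T_n
print(classify(T_n))
```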
Step 4.3, constructing the loss function L_γ of the image classification network using equation (8);
In equation (8), J_{n,m} denotes the true category label of X_n for the m-th category;
Step 5, based on the water surface floater image sample data set X = {X_1, ..., X_n, ..., X_N}, the water surface segmentation network, the water surface anomaly detection network and the image classification network are trained by the gradient descent method; the loss functions L_α, L_β and L_γ are computed to update the network parameters until the loss functions converge, thereby obtaining a trained water surface segmentation model, water surface anomaly detection model and image classification model. The water surface segmentation model segments the water surface floater image acquired in real time to obtain a water surface image; the anomaly detection model performs anomaly detection on the water surface image to obtain a floater area; the floater area is then input into the trained image classification model to obtain the predicted floater category; finally the area set where the floater is located and the floater category are output. The specific identification process is shown in Fig. 2 and includes:
The water surface floater image passes through the water surface segmentation model to obtain a binary image of the water surface segmentation result; the water surface floater image is then multiplied by the binary image using image processing to obtain the water surface image, removing the background and keeping only the water surface and the floaters on it;
the water surface image is then sent to the water surface anomaly detection model; an anomaly score threshold is set, and whether floaters exist in the water surface image is judged from the threshold and the pixel-point anomaly scores of the water surface image. If no floater exists on the water surface, the result that no floater exists can be output directly; if floaters exist on the water surface, the contour of the abnormal region is acquired from the relation between the pixel anomaly scores and the threshold, the minimum circumscribed rectangle of the contour is obtained, and this rectangle is cropped from the water surface image to obtain the area image of the floaters on the water surface;
the area image of the floaters on the water surface is then sent to the image classification model to obtain the classification label value; the specific floater category is obtained from the correspondence between classification labels and floater categories, and finally the floater area and the floater category are output.
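The overall recognition flow of Fig. 2 can be sketched as a short pipeline. The three model callables below are hypothetical stand-ins for the trained segmentation, anomaly detection and classification models; only the control flow mirrors the description.

```python
import numpy as np

# High-level sketch of the recognition pipeline: segment -> mask -> detect -> crop
# -> classify. The model arguments are placeholder callables, not real models.
def recognize(image, seg_model, anomaly_model, cls_model, thresh=0.22):
    binary = seg_model(image)                  # 0/255 water segmentation result
    water = image * (binary == 255)            # keep only the water surface
    scores = anomaly_model(water)              # per-pixel anomaly scores
    if not (scores > thresh).any():
        return None                            # no floater on the water surface
    ys, xs = np.nonzero(scores > thresh)
    region = water[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return (int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1,
            cls_model(region))

# Toy stand-ins to exercise the control flow:
img = np.ones((8, 8))
out = recognize(img,
                seg_model=lambda x: np.full_like(x, 255),
                anomaly_model=lambda x: np.pad(np.ones((2, 2)), 3),
                cls_model=lambda r: "bottle")
print(out)
```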
In this embodiment, an electronic device includes a memory and a processor, wherein the memory is used to store a program supporting the processor in executing the above method, and the processor is configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the method described above.

Claims (5)

1. A water surface floater identification method based on semantic segmentation and image anomaly detection is characterized by comprising the following steps:
step 1, acquiring a water surface floater image data set and sequentially performing the preprocessing of screening, standardization and size adjustment to obtain a preprocessed water surface floater image sample data set X = {X_1, ..., X_n, ..., X_N}, where X_n denotes the nth water surface floater image and N denotes the total number of samples; the water surface area divided from X_n is taken as the label of X_n, denoted Y_n, with Y_n ∈ C, where C denotes the category set; the water surface floater image sample data set X is labeled to obtain the mask set S = {S_1, ..., S_n, ..., S_N} corresponding to the data set X, where S_n denotes the true category information of each pixel point of X_n;
step 2, building a water surface segmentation network based on the feature pyramid FPN, which processes the nth water surface floater image X_n to obtain the water surface image P_n of the water surface floater image X_n;
constructing the loss function L_α of the water surface segmentation network using equation (4):
L_α = L_ce + λ·L_ls (4)
In equation (4), λ is a weight parameter; L_ls denotes the Lovasz-SoftMax loss; L_ce denotes the cross entropy loss;
step 3, constructing a water surface anomaly detection network based on a single-classification algorithm, which processes the nth water surface image P_n to obtain the floater area image Z_n of X_n;
constructing the loss function L_β of the water surface anomaly detection network using equation (7);
In equation (7), μ is a weight parameter; L_θ denotes the encoding loss and the remaining term denotes the classification loss;
step 4, building an image classification network based on ResNet, which sequentially comprises a backbone network module and a classification module:
step 4.1, the backbone network module adds an average pooling layer after the ResNet101 network;
the floater area image Z_n of X_n is input into the backbone network module; the feature maps output by three of the convolution layers are extracted and fused, and after dimension reduction by the average pooling layer, the feature vector T_n of Z_n is obtained;
step 4.2, the classification module consists of a fully connected layer followed by a Softmax activation function; the feature vector T_n is input into the classification module to obtain the predictive classification label of the nth water surface floater image X_n for class m, where M is the total number of floater categories, so that the floater category result is obtained from the correspondence between predictive classification labels and floater categories;
step 4.3, constructing the loss function L_γ of the image classification network using equation (8);
In equation (8), J_{n,m} denotes the true category label of X_n for the m-th category;
step 5, based on the water surface floater image sample data set X = {X_1, ..., X_n, ..., X_N}, the water surface segmentation network, the water surface anomaly detection network and the image classification network are trained by the gradient descent method; the loss functions L_α, L_β and L_γ are computed to update the network parameters until the loss functions converge, thereby obtaining a trained water surface segmentation model, water surface anomaly detection model and image classification model; the water surface segmentation model segments the water surface floater image acquired in real time to obtain a water surface image; the anomaly detection model performs anomaly detection on the water surface image to obtain a floater area, the floater area is then input into the trained image classification model to obtain the predicted floater category, and finally the area set where the floater is located and the floater category are output.
2. The water surface floater identification method based on semantic segmentation and image anomaly detection of claim 1, wherein the water surface segmentation network sequentially comprises: a backbone network module, a pyramid pooling module PPM, a flow alignment module FAM and a refinement residual module RRB, and the water surface image P_n is obtained according to the following steps:
step 2.1, the backbone network is based on the ResNet101 network and sequentially comprises: a first convolution layer, a second convolution layer, a third convolution layer, a fourth convolution layer and a fifth convolution layer;
the nth water surface floater image X_n is input into the backbone network and processed sequentially by the five convolution layers; each convolution layer generates a corresponding feature map, the ith convolution layer generating the ith water surface convolution feature map;
step 2.2, the pyramid pooling module PPM consists of k pyramids of different scales; pooling operations of k different scales are performed on the water surface convolution feature map generated by the 5th convolution layer to obtain k water surface feature maps of different sizes; the k water surface feature maps of different sizes are then upsampled to restore the original size, and finally the k upsampled water surface feature maps are concatenated in the channel dimension to obtain the nth low-resolution water surface pooling feature map;
step 2.3, the flow alignment module FAM upsamples the water surface pooling feature map by bilinear interpolation to the same size as the water surface convolution feature map generated by the (j+1)-th convolution layer; after the two are concatenated in the channel dimension, the concatenated feature map is warped onto the water surface convolution feature map, generating the j-th flow alignment feature map, thereby obtaining the flow alignment feature map set of the nth water surface pooling feature map; the three flow alignment feature maps are reduced in dimension by point-by-point convolution to obtain the normalized flow alignment feature map;
step 2.4, constructing the cross entropy loss L_ce of the flow alignment module FAM using equation (1);
In equation (1), the predicted value of the i-th pixel is taken from the normalized flow alignment feature map, p_{n,i} denotes the true value of the i-th pixel of the mask S_n, and N denotes the number of pixels;
step 2.5, the refinement residual module RRB consists of e_1 residual units; each residual unit uses a 1×1 convolution layer to reduce the dimension of the input feature; the reduced feature is split into two branches, one input directly into the adder and the other passed through two serial 3×3 convolution layers before entering the adder, obtaining the integrated feature;
the four water surface convolution feature maps are respectively input into the refinement residual module RRB; after passing through the e_1 residual units, the corresponding integrated feature maps are obtained;
the four integrated feature maps are input into the integration adder and reduced in dimension by the 1×1 convolution layer to obtain the refined residual feature map; the normalized flow alignment feature map and the refined residual feature map then undergo point-by-point convolution to obtain the water surface image P_n of the water surface floater image X_n;
step 2.6, constructing the Lovasz-SoftMax loss L_ls of the refinement residual module RRB using equation (2):
L_ls = (1/C) Σ_{c∈C} ΔJ̄_c(m(c)) (2)
In equation (2), C denotes the number of categories in the category set C, c denotes any one category in C, ΔJ̄_c is the Lovasz extension of ΔJ_c, and ΔJ_c is the Jaccard index of category c; m_i(c) denotes the prediction error of class c at the i-th pixel of S_n:
m_i(c) = 1 − f_i(c) if c is the true class of pixel i, and m_i(c) = f_i(c) otherwise (3);
in equation (3), f_i(c) denotes the predicted value for class c at the i-th pixel of the refined residual feature map.
3. The water surface floater identification method based on semantic segmentation and image anomaly detection according to claim 2, wherein the water surface anomaly detection network sequentially comprises: an encoder module f_θ, a texture enhancement module TEM, a pyramid texture feature extraction module PTFEM and a classifier module, and the floater area image Z_n of X_n is obtained as follows:
step 3.1, the encoder module f_θ consists of r basic units; each basic unit is formed by a two-dimensional convolution Conv2D followed by a LeakyReLU activation function layer;
with h_1 as the stride, the nth water surface image P_n is divided into H blocks of size h_2, obtaining a block set P_n = {w_{n,1}, ..., w_{n,h}, ..., w_{n,H}}, where H is the total number of blocks of P_n and w_{n,h} is the h-th block;
w_{n,h} is input into the encoder f_θ to extract the h-th coding feature A_{n,h};
step 3.2, constructing the loss L_θ of the encoder f_θ using equation (5);
In equation (5), w_{n,h′} is a neighboring block of w_{n,h};
step 3.3, the texture enhancement module TEM consists of a global average pooling layer, a one-dimensional quantized coding QCO layer and an MLP multi-layer perceptron;
the h-th coding feature A_{n,h} is processed by the global average pooling layer to obtain the h-th texture pooling feature g_{n,h}; the cosine similarity between the coding feature A_{n,h} and the texture pooling feature g_{n,h} is then computed to generate a feature similarity matrix G_{n,h}; the one-dimensional quantized coding QCO layer performs one-dimensional quantized coding on G_{n,h} to obtain the h-th quantized coding matrix E_{n,h}; E_{n,h} passes through the MLP multi-layer perceptron to generate the h-th statistical feature D_{n,h}; E_{n,h} is then multiplied with D_{n,h} to obtain the h-th high-quality texture feature O_{n,h};
step 3.4, the pyramid texture feature extraction module PTFEM consists of b_1 parallel branches of different scales, each parallel branch containing a texture feature extraction unit; the texture feature extraction unit is formed by b_2 two-dimensional quantized coding 2d-QCO layers followed by b_3 multi-layer perceptrons MLP;
the h-th high-quality texture feature O_{n,h} and the h-th coding feature A_{n,h} are fused to generate the h-th fusion feature map K_{n,h}, which is input into the pyramid texture feature extraction module PTFEM; K_{n,h} is first divided into b_1 feature maps of different scales, which are input into the b_1 parallel branches of different scales for processing; each branch extracts a corresponding texture representation feature map through its own texture feature extraction unit, the q-th branch generating the q-th texture representation feature map; the b_1 texture representation feature maps are each upsampled to restore the size of the fusion feature map K_{n,h}, and the b_1 upsampled feature maps are fused to generate the h-th multi-scale texture feature F_{n,h}; finally, F_{n,h} and A_{n,h} are fused to generate the h-th texture feature B_{n,h};
step 3.5, the classifier module consists of x_1 linear units; each linear unit is formed by a linear layer followed by a LeakyReLU activation function layer;
the h-th texture feature B_{n,h} is input into the classifier to obtain the anomaly score of the h-th block w_{n,h}, and the anomaly scores of all blocks are obtained in the same way; the anomaly score of pixel point i in the nth water surface image P_n is the mean of the anomaly scores of all cut blocks containing pixel point i;
an anomaly score threshold is set; among the anomaly scores of the pixel points in the nth water surface image P_n, the abnormal region above the threshold is extracted, thereby obtaining the floater area image Z_n of X_n;
step 3.6, constructing the loss of the classifier using equation (6);
In equation (6), Cross_Entropy denotes the cross entropy, w_{n,p1,p2} denotes a random small block of the nth water surface image P_n, and w′_{n,p1,p2} denotes any one of the small blocks in the eight directions centered on w_{n,p1,p2}, with 1 ≤ p_1, p_2 ≤ H; y is the direction label of w′_{n,p1,p2} relative to w_{n,p1,p2}, and y = 1, 2, ..., 8.
4. An electronic device comprising a memory and a processor, wherein the memory is for storing a program for supporting the processor to perform the method of identifying a surface float as claimed in any one of claims 1 to 3, the processor being configured to execute the program stored in the memory.
5. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when run by a processor performs the steps of the method for identifying a surface float according to any one of claims 1-3.
CN202310894099.3A 2023-07-20 2023-07-20 Water surface floater identification method based on semantic segmentation and image anomaly detection Pending CN116824352A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310894099.3A CN116824352A (en) 2023-07-20 2023-07-20 Water surface floater identification method based on semantic segmentation and image anomaly detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310894099.3A CN116824352A (en) 2023-07-20 2023-07-20 Water surface floater identification method based on semantic segmentation and image anomaly detection

Publications (1)

Publication Number Publication Date
CN116824352A true CN116824352A (en) 2023-09-29

Family

ID=88139213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310894099.3A Pending CN116824352A (en) 2023-07-20 2023-07-20 Water surface floater identification method based on semantic segmentation and image anomaly detection

Country Status (1)

Country Link
CN (1) CN116824352A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117710379A (en) * 2024-02-06 2024-03-15 杭州灵西机器人智能科技有限公司 Nondestructive testing model construction method, nondestructive testing device and medium
CN117710379B (en) * 2024-02-06 2024-05-10 杭州灵西机器人智能科技有限公司 Nondestructive testing model construction method, nondestructive testing device and medium

Similar Documents

Publication Publication Date Title
CN111080620B (en) Road disease detection method based on deep learning
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network
CN111209810A (en) Bounding box segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time in visible light and infrared images
CN112465790A (en) Surface defect detection method based on multi-scale convolution and trilinear global attention
CN109886159B (en) Face detection method under non-limited condition
CN116824352A (en) Water surface floater identification method based on semantic segmentation and image anomaly detection
CN112396635A (en) Multi-target detection method based on multiple devices in complex environment
CN116311254B (en) Image target detection method, system and equipment under severe weather condition
Li et al. A review of deep learning methods for pixel-level crack detection
CN115035295A (en) Remote sensing image semantic segmentation method based on shared convolution kernel and boundary loss function
CN111461006B (en) Optical remote sensing image tower position detection method based on deep migration learning
CN112149612A (en) Marine organism recognition system and recognition method based on deep neural network
CN114155474A (en) Damage identification technology based on video semantic segmentation algorithm
CN114772208B (en) Non-contact belt tearing detection system and method based on image segmentation
CN114677377A (en) Display screen defect detection method, training method, device, equipment and medium
CN114596477A (en) Foggy day train fault detection method based on field self-adaption and attention mechanism
CN110991374B (en) Fingerprint singular point detection method based on RCNN
CN117372853A (en) Underwater target detection algorithm based on image enhancement and attention mechanism
CN116994161A (en) Insulator defect detection method based on improved YOLOv5
CN113192018B (en) Water-cooled wall surface defect video identification method based on fast segmentation convolutional neural network
CN116739991A (en) Liquid crystal display screen surface defect detection method based on deep learning and electronic device
CN114722928B (en) Blue algae image recognition method based on deep learning
CN115984133A (en) Image enhancement method, vehicle snapshot method, device and medium
CN113343977B (en) Multipath automatic identification method for container terminal truck collection license plate
Yang et al. Multi visual feature fusion based fog visibility estimation for expressway surveillance using deep learning network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination