CN115131557A - Lightweight segmentation model construction method and system based on activated sludge image - Google Patents

Lightweight segmentation model construction method and system based on activated sludge image

Info

Publication number
CN115131557A
CN115131557A
Authority
CN
China
Prior art keywords
segmentation model
activated sludge
lightweight
image
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210596703.XA
Other languages
Chinese (zh)
Inventor
赵立杰
田博仁
黄明忠
王国刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang University of Chemical Technology
Original Assignee
Shenyang University of Chemical Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang University of Chemical Technology filed Critical Shenyang University of Chemical Technology
Priority to CN202210596703.XA
Publication of CN115131557A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a lightweight segmentation model construction method and system based on activated sludge images, relating to intelligent detection model construction for sewage treatment. First, an activated sludge microscopic image data set is created. A segmentation model for segmenting activated sludge microscopic images is then constructed from the atrous spatial pyramid pooling module ASPP, the coordinate attention mechanism CA, and OCRNet. The segmentation model is trained on the activated sludge microscopic image data set, and its parameters are updated with a pre-constructed mixed loss function. Finally, the trained segmentation model is compressed to obtain a lightweight segmentation model. The lightweight ASPP-CA-OCRNet-based segmentation model constructed by the invention can be deployed on mobile terminals or edge devices with limited computing capability, effectively solving the problem of difficult model deployment on resource-limited edge devices, while improving the segmentation precision of activated sludge flocs and filamentous bacteria so that sludge bulking can be effectively monitored and prevented.

Description

Lightweight segmentation model construction method and system based on activated sludge image
Technical Field
The invention relates to a method and system for constructing an intelligent detection model for sewage treatment, and in particular to a method and system for constructing a lightweight segmentation model based on activated sludge images.
Background
The microscopic morphology and characteristics of microorganisms in phase contrast microscopic images directly reflect the settling performance of activated sludge in a sewage treatment plant, and real-time, effective segmentation of flocs and filamentous bacteria is important for monitoring and preventing sludge bulking. However, phase contrast microscopy suffers from inherent halo and shade-off effects, and flocs, filamentous bacteria, and background exhibit low contrast and class imbalance, so traditional image processing methods face significant challenges.
With the rapid development of deep learning, image segmentation methods based on deep learning have achieved good results in the image segmentation field. Compared with traditional segmentation methods, deep learning models attain higher segmentation precision, but their large parameter counts, high memory requirements, and high computational cost make them difficult to deploy on mobile terminals or embedded edge devices with limited computing capability.
The present invention has been made in view of this situation.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a lightweight segmentation model construction method and system based on activated sludge images. Channel pruning is applied to the trained model to make it lightweight, effectively solving the problem of difficult model deployment on resource-limited edge devices.
The technical scheme of the invention is as follows:
the invention provides a construction method of an activated sludge image segmentation model, which comprises the following steps:
s1, creating an activated sludge microscopic image data set;
s2, constructing a segmentation model for segmenting the activated sludge microscopic image based on the atrous spatial pyramid pooling module ASPP, the coordinate attention mechanism CA, and OCRNet;
s3, training the segmentation model by using the activated sludge micro-image data set, and updating parameters of the segmentation model by adopting a pre-constructed mixed loss function;
and S4, compressing the trained segmentation model by using a preset algorithm to obtain a lightweight segmentation model.
Further optionally, step S1 includes the steps of:
s11, acquiring a plurality of activated sludge microscopic images by using a phase contrast microscope;
and S12, labeling each activated sludge microscopic image to obtain a label image of the same size as the original image, in which background, flocs, and filamentous bacteria are annotated.
Further optionally, step S2 includes the steps of:
s21, selecting HRNet as OCRNet backbone network;
s22, taking the rough semantic segmentation result obtained by the calculation of the intermediate layer of the backbone network as a soft object area;
s23, computing K vectors from the soft object regions and the pixel representations output by the deepest layer of the backbone network to form the object region representations, where K is a positive integer and each vector corresponds to the feature representation of one object region;
s24, calculating a relation matrix between pixel representation and object region representation output from the deepest layer of the backbone network, and performing weighted summation according to each pixel value and the numerical value of the object region feature representation in the relation matrix to obtain object context representation;
and S25, channel-concatenating the object context representation with a parallel structure to obtain an enhanced representation, where the parallel structure is formed by passing the pixel representations through an ASPP module and a CA module in parallel.
Further optionally, step S3 includes the steps of:
s31, dividing the activated sludge micro-image data set into a training set, a verification set and a test set;
s32, combining the MIoU-oriented (Mean Intersection over Union) loss and the cross entropy loss into a mixed loss function, and using the mixed loss function to calculate the loss at both the soft object region and the final output of the segmentation model;
s33, solving the optimal solution of the mixed loss function by using a gradient descent method, and updating parameters of the segmentation model according to the optimal solution of the mixed loss function;
s34, the segmentation model is tested using the test set.
Further optionally, the mixed loss function is formulated as follows:

$$L_{mix}=\lambda_{1}L(y_{aux})+\lambda_{2}L(y_{out}),\qquad L(\cdot)=\alpha L_{CE}(\cdot)+\beta L_{LS}(\cdot)$$

where $L(y_{aux})$ denotes the loss of the soft object region and $L(y_{out})$ the loss of the final output; $\lambda_{1}$ and $\lambda_{2}$ are the weights of these two losses; $L_{CE}$ denotes the cross-entropy loss and $L_{LS}$ the MIoU-oriented (Lovász-Softmax) loss; $\alpha$ and $\beta$ are the weights of $L_{CE}$ and $L_{LS}$; $y_{aux}$ denotes the output of the soft object region and $y_{out}$ the final output.
Specifically, the MIoU-oriented loss is the Lovász-Softmax loss $L_{LS}$, constructed from the Jaccard loss function $\Delta_{J_c}$, its Lovász extension $\overline{\Delta_{J_c}}$, and the error vector $m(c)$, and defined as follows:

$$L_{LS}=\frac{1}{|C|}\sum_{c\in C}\overline{\Delta_{J_c}}\big(m(c)\big)$$

where $m(c)$ is the pixel error vector, $c$ is the class, $|C|$ is the total number of classes, $\Delta_{J_c}$ is the Jaccard loss function, and $\overline{\Delta_{J_c}}$ is a surrogate loss function for the Jaccard loss.

The Jaccard loss function $\Delta_{J_c}$ is defined as:

$$\Delta_{J_c}(M_c)=\frac{|M_c|}{|\{y^{*}=c\}\cup M_c|},\qquad M_c=\{i:\,y_i^{*}=c,\ \hat{y}_i\neq c\}\cup\{i:\,y_i^{*}\neq c,\ \hat{y}_i=c\}$$

where $M_c$ denotes the set of mispredicted pixels for class $c$, $y^{*}$ is the ground-truth label, and $\hat{y}$ is the prediction.

$\overline{\Delta_{J_c}}$ is defined as:

$$\overline{\Delta_{J_c}}\big(m(c)\big)=\sum_{i=1}^{p}m_i(c)\,g_i\big(m(c)\big),\qquad m_i(c)=\begin{cases}1-f_i(c), & c=y_i^{*}\\ f_i(c), & \text{otherwise}\end{cases}$$

where $m_i(c)$ is the error of pixel $i$, $f_i(c)$ is the probability, computed with Softmax, that pixel $i$ belongs to class $c$, and $g_i(m(c))$ is the derivative of the Lovász extension $\overline{\Delta_{J_c}}$ with respect to its $i$-th component when the components of $m(c)$ are arranged in descending order.
Further optionally, step S4 includes the steps of:
s41, introducing a scale factor γ into each channel of the BN layers of the trained segmentation model as an evaluation factor, evaluating the importance of each channel, and sparsifying the evaluation factors γ of the channels;
s42, after sparsifying the evaluation factors γ of the channels of the BN layers of the trained segmentation model, sorting the channels by the absolute value of γ, retaining the channels with the largest absolute values according to the importance threshold corresponding to the required pruning ratio, and recording the retained channels of each BN layer to generate a pruning configuration file;
and S43, performing model compression according to the pruning configuration file to obtain a lightweight segmentation model.
Further optionally, step S43 specifically includes: extracting the number of channels required for the current BN layer from the pruning configuration file, generating a BN layer instance from that channel count, and modifying the number of output channels of the convolutional layer preceding the BN layer and the number of input channels of the convolutional layer following it.
Further optionally, step S4 further includes the following steps:
and S44, retraining the lightweight model obtained in step S43 with the original training set to recover the model accuracy.
The invention also proposes a lightweight segmentation model construction system based on activated sludge images, comprising one or more processors and a non-transitory computer-readable storage medium storing program instructions which, when executed by the one or more processors, implement the method according to any one of the above technical schemes.
The invention also provides an activated sludge sample collection system comprising a server and an edge device connected to the server, where the server applies a method of any one of the above technical schemes or carries the lightweight segmentation model construction system of the above technical scheme, and the lightweight segmentation model constructed by the server is deployed on the edge device.
After adopting the technical scheme, compared with the prior art, the invention has the following beneficial effects:
the input of the segmentation model is an activated sludge microscopic image, each pixel of flocs and filamentous bacteria in the activated sludge microscopic image is mapped to the position and the category of the flocs and the filamentous bacteria, and a segmentation result image of the flocs and the filamentous bacteria with the same size as the original image is output. Based on an ASPP-CA-OCRNet segmentation model, the pixel representation is connected with the object context representation in parallel through an ASPP module and a CoordAttention module to serve as context enhancement representation, and the feature expression capability of the network is further improved. Because the floc and the filamentous fungus sample have the problem of class imbalance, the model parameters are updated by adopting a mixed loss function (Mixloss) which is formed by MIoU-like loss and cross entropy loss, the completeness and the accuracy of the class are improved, and the problem of low segmentation precision caused by class imbalance is solved. Because the model parameters are large and are not easy to be deployed to the edge equipment, the model compression is carried out by using a channel pruning method, and the problem of difficult model deployment on the resource-limited edge equipment is effectively solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification, illustrate embodiments of the invention, and together with the description serve to explain the invention without improperly limiting it. It is obvious that the drawings in the following description are only some embodiments and that a person skilled in the art can derive other drawings from them without inventive effort. In the drawings:
FIG. 1: the method is a flow schematic diagram of a lightweight segmentation model construction method based on an activated sludge image in one embodiment of the invention;
FIG. 2: the structure of the segmentation model based on the ASPP-CA-OCRNet is shown as a schematic diagram;
FIG. 3: a flow chart of a channel pruning operation according to an embodiment of the present invention;
FIG. 4: is a system framework diagram of an activated sludge sample collection system according to an embodiment of the invention.
It should be noted that the drawings and the description are not intended to limit the scope of the inventive concept in any way, but to illustrate it for a person skilled in the art by reference to specific embodiments.
Detailed Description
The following describes embodiments of the present invention in further detail with reference to the accompanying drawings.
In the description of the present invention, it should be noted that the terms "inside", "outside", etc. indicate the orientation or positional relationship based on the orientation or positional relationship shown in the drawings only for the convenience of describing the present invention and simplifying the description, but do not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," "contacting," and "communicating" are to be construed broadly, e.g., as meaning fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; may be directly connected or indirectly connected through an intermediate. The specific meanings of the above terms in the present invention can be understood in a specific case to those of ordinary skill in the art.
The invention provides a lightweight segmentation model construction method based on an activated sludge image, and aims to solve the problem that the deep learning model is difficult to deploy on a mobile terminal or embedded edge equipment with limited computing capacity due to large parameter quantity, high memory requirement and high computing overhead.
Fig. 1 shows a flow chart of a lightweight segmentation model construction method based on an activated sludge image according to an embodiment of the present invention. As shown in fig. 1, the lightweight segmentation model construction method includes:
s1, creating an activated sludge microimage dataset. Wherein, S1 includes the following steps:
s11, taking 500 mL samples from the outlet of the aeration tank of a municipal sewage treatment plant at a depth of 50 cm below the liquid surface, with no more than 3 hours between sample collection, image acquisition, and water quality index measurement. A 10 μL activated sludge sample is drawn with a pipette and dripped onto a glass slide, covered with a 24 mm × 24 mm square cover glass, and placed on a phase contrast microscope fitted with a color CCD camera; the sample on the slide is scanned from upper left to lower right using the image acquisition software ToupView, obtaining phase contrast microscopic images at three different focal planes (upper, middle, and lower). A total of 108 RGB color images (12 × 3 × 3) are obtained per sample to improve the representativeness of the sludge information;
and S12, selecting 420 typical image samples and labeling the original images with the data labeling tool LabelMe to obtain label images of the same size as the originals, in which the background is black, flocs are red, and filamentous bacteria are green.
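For illustration, a minimal sketch of converting such color-coded label images into class-index masks for training is given below; the exact palette values (pure black, red, and green) and the function name are assumptions for this sketch, not part of the original disclosure.

```python
import numpy as np
from PIL import Image

# Assumed palette: background = black, flocs = red, filamentous bacteria = green.
PALETTE = {(0, 0, 0): 0, (255, 0, 0): 1, (0, 255, 0): 2}

def label_image_to_mask(path: str) -> np.ndarray:
    """Convert a LabelMe-style RGB label image into an HxW class-index mask."""
    rgb = np.asarray(Image.open(path).convert("RGB"))
    mask = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for color, class_id in PALETTE.items():
        mask[np.all(rgb == color, axis=-1)] = class_id
    return mask
```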
And S2, constructing a segmentation model for segmenting the activated sludge microscopic image (hereinafter the ASPP-CA-OCRNet-based segmentation model) from the atrous spatial pyramid pooling module ASPP, the coordinate attention mechanism CA, and OCRNet. Specifically, the ASPP-CA-OCRNet-based segmentation model is used for, but not limited to, segmenting flocs and filamentous bacteria in activated sludge microscopic images. As shown in fig. 2, the segmentation model comprises a backbone network, soft object regions, object region representations, object context representations, and an enhanced representation. S2 includes the following steps:
and S21, constructing the Backbone part of the model. HRNet is selected as the OCRNet backbone. The key feature of HRNet is its parallel connection structure: subnetworks are gradually added from high resolution to low resolution, forming a parallel network that fully integrates high- and low-resolution information. HRNet performs feature fusion between the high-resolution and low-resolution subnetworks multiple times; this multi-scale fusion lets each representation, from high to low resolution, repeatedly receive information from the other parallel representations, yielding rich high-resolution representations.
And S22, constructing a model soft object region. The soft object area is output after 1 × 1 convolution by the Backbone network backhaul middle layer
Figure 882152DEST_PATH_IMAGE035
The tensor of (a) as a soft object region;
s23, constructing the object region representations of the model. K vectors are computed from the coarse semantic segmentation result and the pixel representations output by the deepest layer of the backbone network; these vectors are the object region representations, where each vector corresponds to the feature representation of one object region (also called the feature representation of a semantic category);
s24, constructing the object context representations of the model. A relation matrix between the pixel representations output by the deepest layer of the backbone network and the computed object region feature representations is calculated, and the object region features are weighted and summed according to each pixel's values in the relation matrix to obtain the final object context representations;
s25, constructing the enhanced representation of the model. The enhanced representation is obtained by concatenating the object context representation with a parallel structure, where the parallel structure is formed by passing the pixel representations through the atrous spatial pyramid pooling module ASPP and the coordinate attention mechanism CA module in parallel.
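For illustration, steps S22 to S25 can be summarized in tensor form; the following minimal PyTorch sketch assumes shapes, softmax normalizations, and module handles (aspp, ca, proj) that are not specified in the original disclosure.

```python
import torch
import torch.nn.functional as F

def ocr_enhanced_representation(pixels, soft_regions, aspp, ca, proj):
    """pixels: [B, C, H, W] deepest-layer pixel representations;
    soft_regions: [B, K, H, W] coarse segmentation (soft object regions)."""
    B, C, H, W = pixels.shape
    x = pixels.flatten(2)                          # [B, C, HW]
    m = F.softmax(soft_regions.flatten(2), dim=2)  # spatial softmax per region
    # S23: K object-region representations, one vector per semantic category.
    regions = torch.bmm(m, x.transpose(1, 2))      # [B, K, C]
    # S24: relation matrix between pixels and regions, then weighted sum.
    rel = F.softmax(torch.bmm(x.transpose(1, 2), regions.transpose(1, 2)), dim=2)
    context = torch.bmm(rel, regions).transpose(1, 2).reshape(B, C, H, W)
    # S25: parallel ASPP + CoordAttention branch, concatenated with the context.
    parallel = torch.cat([aspp(pixels), ca(pixels)], dim=1)
    return proj(torch.cat([context, parallel], dim=1))  # enhanced representation
```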
The ASPP module captures image context at multiple scales through parallel atrous convolutions with dilation rates of 6, 12, and 18, and finally introduces global average pooling to obtain global image information. CoordAttention is a lightweight attention mechanism consisting of coordinate information embedding and coordinate attention generation. Coordinate information embedding takes the pixel representations as input and uses adaptive average pooling to extract features along the horizontal and vertical directions of each channel, producing a pair of direction-aware feature maps. Coordinate attention generation first concatenates the horizontal and vertical feature maps output by the coordinate information embedding and applies a 1 × 1 convolution and a ReLU activation; it then splits the features and applies a 1 × 1 convolution and a Sigmoid activation to each part. The CoordAttention formula is:
$$y_{c}(i,j)=x_{c}(i,j)\times g_{c}^{h}(i)\times g_{c}^{w}(j)$$

where $c$ denotes the $c$-th channel; $g_{c}^{h}$ and $g_{c}^{w}$ are the attention weights in the horizontal and vertical directions; and $y_{c}$ is the final output feature map.
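A minimal PyTorch sketch of a CoordAttention block consistent with this description follows; the reduction ratio and layer layout are assumptions, and this illustrates the mechanism rather than reproducing the authors' implementation.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Coordinate information embedding: average-pool along each direction.
        x_h = x.mean(dim=3, keepdim=True)                       # [B, C, H, 1]
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # [B, C, W, 1]
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        # Coordinate attention generation: split, 1x1 conv, sigmoid.
        y_h, y_w = torch.split(y, [h, w], dim=2)
        g_h = torch.sigmoid(self.conv_h(y_h))                         # [B, C, H, 1]
        g_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))     # [B, C, 1, W]
        return x * g_h * g_w  # y_c(i, j) = x_c(i, j) * g_h(i) * g_w(j)
```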
And S3, training the segmentation model by using the activated sludge micro-image data set, and updating parameters of the segmentation model by adopting a pre-constructed mixed loss function. Wherein, S3 includes the following steps:
and S31, dividing a training set, a verification set and a test set.
Specifically, the 420 activated sludge images obtained in step S1 are divided into a training set, a verification set, and a test set in a ratio of …:1:2.
S32, constructing a model mixing loss function. Combining the loss aiming at the MIoU-like and the cross entropy loss into a mixed loss function, and calculating the loss in a soft object region and a final output region of the segmentation model respectively by adopting the mixed loss function.
Specifically, the MIoU-oriented loss (Lovász-Softmax loss) is adopted; it directly optimizes IoU (the Jaccard index), meaning that the loss value of a pixel depends both on its own prediction and on the predictions of the other pixels in the region. Lovász-Softmax loss improves the network's ability to learn from small samples, but on its own the model may not converge well. This embodiment therefore combines cross-entropy loss and Lovász-Softmax loss into a mixed loss function, which addresses the class imbalance between flocs and filamentous bacteria and accelerates model convergence. The mixed loss function is formulated as:
$$L_{mix}=\lambda_{1}L(y_{aux})+\lambda_{2}L(y_{out}),\qquad L(\cdot)=\alpha L_{CE}(\cdot)+\beta L_{LS}(\cdot)$$

where $L(y_{aux})$ denotes the loss of the soft object region and $L(y_{out})$ the loss of the final output; $\lambda_{1}$ and $\lambda_{2}$ are the weights of these two losses, set to 0.3 and 0.7; $L_{CE}$ denotes the cross-entropy loss and $L_{LS}$ the MIoU-oriented loss (Lovász-Softmax loss); $\alpha$ and $\beta$ are the weights of $L_{CE}$ and $L_{LS}$, set to 0.2 and 0.8; $y_{aux}$ denotes the output of the soft object region and $y_{out}$ the final output.
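For illustration, a mixed loss with these weights could be sketched as follows, assuming a lovasz_softmax function such as the one sketched after the derivation below; the function names are placeholders, not the authors' code.

```python
import torch.nn.functional as F

def mixed_loss(aux_logits, out_logits, target,
               lam1=0.3, lam2=0.7, alpha=0.2, beta=0.8):
    """Mixed loss over the soft-object-region (auxiliary) and final outputs."""
    def single(logits):
        ce = F.cross_entropy(logits, target)
        ls = lovasz_softmax(F.softmax(logits, dim=1), target)  # sketched below
        return alpha * ce + beta * ls
    return lam1 * single(aux_logits) + lam2 * single(out_logits)
```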
The Lovász-Softmax loss $L_{LS}$ is an MIoU-oriented loss constructed from the Jaccard loss function $\Delta_{J_c}$, its Lovász extension $\overline{\Delta_{J_c}}$, and the error vector $m(c)$, defined as follows:

$$L_{LS}=\frac{1}{|C|}\sum_{c\in C}\overline{\Delta_{J_c}}\big(m(c)\big)$$

where $m(c)$ is the pixel error vector, $c$ is the class, $|C|$ is the total number of classes, $\Delta_{J_c}$ is the Jaccard loss function, and $\overline{\Delta_{J_c}}$ is a surrogate loss function for the Jaccard loss. Because the Jaccard loss function $\Delta_{J_c}$ is discrete, it cannot be used directly for continuous optimization; the Lovász extension is therefore used to smoothly extend the discrete $\Delta_{J_c}$ into a continuous, piecewise-linear function, and the new loss function $\overline{\Delta_{J_c}}$, which can be differentiated directly with respect to the error vector $m(c)$, replaces the discrete Jaccard loss function. The Jaccard loss function $\Delta_{J_c}$ is defined as:

$$\Delta_{J_c}(M_c)=\frac{|M_c|}{|\{y^{*}=c\}\cup M_c|},\qquad M_c=\{i:\,y_i^{*}=c,\ \hat{y}_i\neq c\}\cup\{i:\,y_i^{*}\neq c,\ \hat{y}_i=c\}$$

where $M_c$ denotes the set of mispredicted pixels for class $c$, $y^{*}$ is the ground-truth label, and $\hat{y}$ is the prediction. Although the discreteness of $\Delta_{J_c}$ prevents direct differentiation, the Jaccard loss satisfies the definition of a submodular set function. To optimize the Jaccard loss in a continuous optimization framework, Softmax is fused into the Lovász extension, which smoothly extends $\Delta_{J_c}$ by mapping its discrete domain $\{0,1\}^{p}$ to the continuous real space $[0,1]^{p}$. The extended loss function $\overline{\Delta_{J_c}}$ is the tight, differentiable convex closure of the Jaccard loss, which converts the multi-class semantic segmentation problem into a Lovász-Softmax loss optimization problem. $\overline{\Delta_{J_c}}$ is defined as:

$$\overline{\Delta_{J_c}}\big(m(c)\big)=\sum_{i=1}^{p}m_i(c)\,g_i\big(m(c)\big),\qquad m_i(c)=\begin{cases}1-f_i(c), & c=y_i^{*}\\ f_i(c), & \text{otherwise}\end{cases}$$

where $m_i(c)$ is the error of pixel $i$, $f_i(c)$ is the probability, computed with Softmax, that pixel $i$ belongs to class $c$, and $g_i(m(c))$ is the derivative of the Lovász extension $\overline{\Delta_{J_c}}$ with respect to its $i$-th component when the components of $m(c)$ are arranged in descending order;
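For reference, the Lovász-Softmax computation described above can be sketched in PyTorch as follows; this is the standard formulation, with batching flattened and the usual ignore-index handling omitted for brevity.

```python
import torch

def lovasz_grad(gt_sorted):
    """g_i(m): gradient of the Lovász extension of the Jaccard loss,
    given ground-truth indicators sorted by descending error."""
    p = len(gt_sorted)
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.cumsum(0)
    union = gts + (1.0 - gt_sorted).cumsum(0)
    jaccard = 1.0 - intersection / union
    if p > 1:
        jaccard[1:p] = jaccard[1:p] - jaccard[0:-1]
    return jaccard

def lovasz_softmax(probs, labels):
    """probs: [B, C, H, W] Softmax probabilities f_i(c); labels: [B, H, W]."""
    C = probs.shape[1]
    probs = probs.permute(1, 0, 2, 3).reshape(C, -1)  # [C, P]
    labels = labels.reshape(-1)                       # [P]
    losses = []
    for c in range(C):
        fg = (labels == c).float()                    # indicator {y* = c}
        errors = (fg - probs[c]).abs()                # error vector m(c)
        errors_sorted, perm = torch.sort(errors, descending=True)
        losses.append(torch.dot(errors_sorted, lovasz_grad(fg[perm])))
    return torch.stack(losses).mean()                 # mean over the |C| classes
```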
and S33, using the training set and the verification set, solving the optimal solution of the mixed loss function by gradient descent, and updating the parameters of the ASPP-CA-OCRNet-based segmentation model according to that optimal solution.
Specifically, with the training set and the verification set, the optimal solution of the mixed loss function is found by gradient descent and the parameters of the ASPP-CA-OCRNet model are updated accordingly; the parameters that achieve the highest accuracy on the verification set during the update iterations are saved as the optimized parameters obtained by training.
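A minimal sketch of this training loop follows; the optimizer, learning rate, two-headed model output, and the evaluate_miou helper are assumptions for illustration, and mixed_loss is the function sketched above.

```python
import copy
import torch

def train(model, train_loader, val_loader, epochs=100, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    best_score, best_state = -1.0, None
    for _ in range(epochs):
        model.train()
        for images, target in train_loader:
            aux_out, final_out = model(images)     # assumed: soft-region and final heads
            loss = mixed_loss(aux_out, final_out, target)
            opt.zero_grad()
            loss.backward()
            opt.step()
        score = evaluate_miou(model, val_loader)   # assumed helper returning MIoU
        if score > best_score:                     # keep best-on-verification params
            best_score = score
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model
```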
And S34, testing the segmentation model by using the test set.
Specifically, to verify the segmentation performance of the ASPP-CA-OCRNet-based segmentation model on flocs (flocs) and filamentous bacteria (Fila), precision (Precision), intersection over union (IoU), and parameter quantity (parameter) are selected as the evaluation indexes of the model, as shown in Table 1. Precision is the ratio of the number of correctly predicted pixels of a class to the total number of pixels predicted as that class; IoU is the overlap between the prediction and the ground truth, i.e., the ratio of their intersection to their union;
TABLE 1 Segmentation indexes of different modules on OCRNet (the table is reproduced as an image in the original publication)
Here the training of the ASPP-CA-OCRNet segmentation model of this embodiment is complete, and the test results are as follows. The original OCRNet network achieves an MIoU of 0.6552 with a model size of 300 MB. After the mixed loss function is introduced into the original OCRNet, the MIoU rises to 0.6679, an improvement of 1.27%, with no increase in model parameters, showing that the mixed loss function improves the representation precision of the target regions and optimizes the segmentation result. After both the mixed loss function and the ASPP module are added to the OCRNet network, the MIoU reaches 0.6689, a further improvement of 0.1% with no increase in parameters, indicating that some targets obtain a larger receptive field through ASPP. With the mixed loss function and the CoordAttention (CA) module, the MIoU rises to 0.6699, an improvement of 0.2%, again without added parameters, indicating that coordinate attention improves the segmentation performance of the network to a certain extent. Compared with OCRNet, the deep network built from the ASPP module, the CA module, and the OCRNet network improves the MIoU index by 2% without increasing the parameter quantity, showing that the model of this embodiment better completes the task of segmenting flocs and filamentous bacteria.
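For illustration, the evaluation indexes used here can be computed from a confusion matrix; the class count of 3 (background, floc, filamentous bacteria) follows the labeling above, while the function names are assumptions.

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes=3):
    """Rows index the ground-truth class, columns the predicted class."""
    idx = num_classes * gt.reshape(-1) + pred.reshape(-1)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def metrics(cm):
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)                     # correct / predicted as class
    iou = tp / (cm.sum(axis=0) + cm.sum(axis=1) - tp)   # intersection / union
    return precision, iou, iou.mean()                   # per-class Precision, IoU, MIoU
```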
S4, compressing the trained segmentation model to obtain a lightweight segmentation model, as shown in fig. 3, specifically:
and S41, introducing a scale factor γ into each channel of the BN layers of the trained segmentation model as an evaluation factor, evaluating the importance of each channel, and sparsifying the evaluation factors γ of the channels.
The sparsification can use L1 regularization or L2 regularization; both achieve the purpose of sparsifying the evaluation factors, and the invention does not limit the specific form of sparsification.
Wherein, for the n-th channel, the importance score is:

$$s_{n}=\lvert\gamma_{n}\rvert,\qquad n=1,\dots,N$$

where $\gamma_{n}$ is the scale factor of the $n$-th channel, i.e., the $\gamma$ parameter of the corresponding channel in its BN layer, and $N$ is the total number of channels in the model to be compressed.
Specifically, the ASPP-CA-OCRNet model in this embodiment has a total of 320 BN layers and a total of 46896 channels.
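For illustration, with L1 regularization the sparsification of the BN scale factors can be implemented as a subgradient term added after each backward pass, in the style of network slimming; the penalty coefficient is an assumption.

```python
import torch
import torch.nn as nn

def add_bn_sparsity_grad(model: nn.Module, penalty: float = 1e-4):
    """Add the L1 subgradient penalty * sign(gamma) to every BN scale factor."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            m.weight.grad.data.add_(penalty * torch.sign(m.weight.data))

# Usage inside the training loop, between loss.backward() and opt.step():
#   loss.backward(); add_bn_sparsity_grad(model); opt.step()
```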
And S42, after sparsifying the evaluation factors γ of the channels of the BN layers of the trained segmentation model, sorting the channels by the absolute value of γ, determining the channels to be retained according to the importance threshold corresponding to the required pruning ratio, recording the retained channels of each BN layer, and generating a pruning configuration file.
Wherein, an importance threshold is set from the importance scores and a preset pruning rate $p$:

$$t=\mathrm{Top}_{p}(I)$$

where $t$ is the importance threshold, $I$ is the set of importance scores of all channels, and $\mathrm{Top}_{p}(\cdot)$ sorts its argument in descending order and outputs the value at (or nearest to) the position corresponding to the fraction $p$. The channels in the set $I$ whose scores are greater than the importance threshold $t$ are the channels to be retained; the position indices of the retained channels within their BN layers, and the positions of those BN layers in the network, are recorded and saved as the pruning configuration file.
In this embodiment, the pruning rate is set to 0.5, i.e., 50% of the channels are pruned; after this step, 23448 channels remain across the 320 BN layers.
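For illustration, steps S41 and S42 can be sketched as follows: collect all BN scale factors, set the threshold at the position given by the pruning rate, and record the retained channel indices per BN layer; the configuration format is an assumption.

```python
import torch
import torch.nn as nn

def build_prune_config(model: nn.Module, prune_rate: float = 0.5):
    scores = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    keep_n = int(len(scores) * (1.0 - prune_rate))          # channels to retain
    threshold = torch.sort(scores, descending=True).values[keep_n - 1]
    config = []                                             # kept indices per BN layer
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            keep = (m.weight.data.abs() >= threshold).nonzero().flatten().tolist()
            config.append(keep)
    return threshold, config
```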
And S43, performing model compression according to the pruning configuration file to obtain the lightweight segmentation model. The model is pruned and the parameters of the original model are imported: the configurations of all BN layers in the configuration file are iterated over, and for each BN layer a new BN layer is created according to the number of retained channels, with the number of output channels of the preceding layer and the number of input channels of the following layer adjusted to match, the preceding and following layers being convolutional layers. The corresponding parameters of the original model are then imported according to the channel position indices recorded in the pruning configuration file, completing the pruning operation and yielding the lightweight model.
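For illustration, the rebuilding step for a single Conv-BN-Conv triple can be sketched as follows; real networks also require index bookkeeping across branches and skip connections, which is omitted here.

```python
import torch.nn as nn

def rebuild_conv_bn_conv(conv_prev, bn, conv_next, keep):
    """Rebuild a Conv-BN-Conv triple, retaining only the BN channels in `keep`."""
    new_prev = nn.Conv2d(conv_prev.in_channels, len(keep), conv_prev.kernel_size,
                         conv_prev.stride, conv_prev.padding,
                         bias=conv_prev.bias is not None)
    new_prev.weight.data = conv_prev.weight.data[keep].clone()     # prune outputs
    if conv_prev.bias is not None:
        new_prev.bias.data = conv_prev.bias.data[keep].clone()
    new_bn = nn.BatchNorm2d(len(keep))
    for name in ("weight", "bias", "running_mean", "running_var"):
        getattr(new_bn, name).data = getattr(bn, name).data[keep].clone()
    new_next = nn.Conv2d(len(keep), conv_next.out_channels, conv_next.kernel_size,
                         conv_next.stride, conv_next.padding,
                         bias=conv_next.bias is not None)
    new_next.weight.data = conv_next.weight.data[:, keep].clone()  # prune inputs
    if conv_next.bias is not None:
        new_next.bias.data = conv_next.bias.data.clone()
    return new_prev, new_bn, new_next
```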
It should be further noted that the channel pruning algorithm is only an exemplary model compression algorithm of the present invention, and when the trained ASPP-CA-OCRNet model is compressed, other model compression algorithms such as a compact network, scalar quantization, vector quantization, and the like may be selected.
And S44, fine-tuning the lightweight model with the original training set, i.e., retraining on the original training set, to recover the model accuracy to a certain extent.
Before and after channel pruning of the ASPP-CA-OCRNet segmentation model of this embodiment, the indexes are compared in Table 2. As shown in Table 2, the model size is reduced from 300 MB to 25.3 MB, while the segmentation precision drops from 0.7888 to 0.7712 and IoU drops from 0.6752 to 0.6485. Compared with the original model, the lightweight model is 91.6% smaller, while precision and IoU fall by only 1.76% and 2.67%, which is within an acceptable range.
TABLE 2 Comparison of indexes before and after ASPP-CA-OCRNet pruning (the table is reproduced as an image in the original publication)
There is also provided, in accordance with an embodiment of the present invention, a lightweight segmentation model construction system based on activated sludge images, comprising one or more processors and a non-transitory computer-readable storage medium storing program instructions which, when executed by the one or more processors, implement the method provided by any one of the preceding embodiments.
FIG. 4 illustrates an activated sludge sample collection system according to one embodiment of the present invention. As shown in fig. 4, the activated sludge sample collection system includes a server and an edge device connected to the server; the server applies the method provided by any of the preceding embodiments or carries the lightweight segmentation model construction system provided above, and the lightweight segmentation model constructed by the server is deployed on the edge device.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A lightweight segmentation model construction method based on an activated sludge image is characterized by comprising the following steps:
s1, creating an activated sludge microscopic image data set;
s2, constructing a segmentation model for segmenting the activated sludge microscopic image based on the atrous spatial pyramid pooling module ASPP, the coordinate attention mechanism CA, and OCRNet;
s3, training the segmentation model by using the activated sludge micro-image data set, and updating parameters of the segmentation model by adopting a pre-constructed mixed loss function;
and S4, compressing the trained segmentation model to obtain a lightweight segmentation model.
2. The method for constructing a lightweight segmentation model based on an activated sludge image as claimed in claim 1, wherein the step S1 comprises the steps of:
s11, acquiring a plurality of activated sludge microscopic images by using a phase contrast microscope;
and S12, labeling each activated sludge microscopic image to obtain a label image with the same size as the original image, wherein the label image comprises a background, flocs and filamentous bacteria.
3. The method for constructing a lightweight segmentation model based on an activated sludge image as claimed in claim 1, wherein the step S2 comprises the steps of:
s21, selecting HRNet as OCRNet backbone network;
s22, taking the rough semantic segmentation result obtained by the calculation of the intermediate layer of the backbone network as a soft object area;
s23, computing K vectors from the soft object regions and the pixel representations output by the deepest layer of the backbone network to form the object region representations, where K is a positive integer and each vector corresponds to the feature representation of one object region;
s24, calculating a relation matrix between pixel representation output by the deepest layer of the backbone network and the object region representation, and performing weighted summation according to each pixel value and a numerical value of the object region feature representation in the relation matrix to obtain the object context representation;
s25, channel-concatenating the object context representation with a parallel structure to obtain the enhanced representation, where the parallel structure is formed by passing the pixel representations through the atrous spatial pyramid pooling module ASPP and the coordinate attention mechanism CA in parallel.
4. The method for constructing a lightweight segmentation model based on an activated sludge image as claimed in claim 1, wherein the step S3 comprises the steps of:
s31, dividing the activated sludge micro-image data set into a training set, a verification set and a test set;
s32, combining the MIoU-oriented loss and the cross entropy loss into a mixed loss function, and using the mixed loss function to calculate the loss at the soft object region and the final output of the segmentation model respectively;
s33, solving the optimal solution of the mixed loss function by using the training set and the verification set and using a gradient descent method, and updating the parameters of the segmentation model according to the optimal solution of the mixed loss function;
and S34, testing the segmentation model by using the test set.
5. The method for constructing a lightweight segmentation model based on an activated sludge image according to claim 4,
the mixed loss function is formulated as follows:

$$L_{mix}=\lambda_{1}L(y_{aux})+\lambda_{2}L(y_{out}),\qquad L(\cdot)=\alpha L_{CE}(\cdot)+\beta L_{LS}(\cdot)$$

where $L(y_{aux})$ denotes the loss of the soft object region and $L(y_{out})$ the loss of the final output; $\lambda_{1}$ and $\lambda_{2}$ are the weights of these two losses; $L_{CE}$ denotes the cross-entropy loss and $L_{LS}$ the MIoU-oriented loss; $\alpha$ and $\beta$ are the weights of $L_{CE}$ and $L_{LS}$; $y_{aux}$ denotes the output of the soft object region and $y_{out}$ the final output.
6. The method for constructing a lightweight segmentation model based on an activated sludge image as claimed in claim 1, wherein the step S4 comprises the steps of:
s41, introducing a scale factor γ into each channel of the BN layers of the trained segmentation model as an evaluation factor, evaluating the importance of each channel, and sparsifying the evaluation factors γ of the channels;
s42, after sparsifying the evaluation factors γ of the channels of the BN layers of the trained segmentation model, sorting the channels by the absolute value of γ, determining the channels to be retained according to the importance threshold corresponding to the required pruning ratio, recording the retained channels of each BN layer, and generating a pruning configuration file;
and S43, performing model compression according to the pruning configuration file to obtain the lightweight segmentation model.
7. The method for constructing a lightweight segmentation model based on an activated sludge image as claimed in claim 6, wherein the step S43 includes:
and extracting the number of channels required by the current BN layer from the pruning configuration file, generating an BN layer example according to the number of channels, and simultaneously modifying the number of output channels of a previous convolutional layer of the BN layer and the number of input channels of a next convolutional layer of the BN layer.
8. The method for constructing a lightweight segmentation model based on an activated sludge image as claimed in claim 6, wherein the step S4 further comprises the steps of:
and S44, retraining the lightweight model by using an original training set.
9. An activated sludge image-based lightweight segmentation model construction system, characterized in that the system comprises one or more processors and a non-transitory computer-readable storage medium having program instructions stored thereon which, when executed by the one or more processors, implement the method according to any one of claims 1-8.
10. An activated sludge sample collection system, comprising a server and an edge device, wherein the server is connected with the edge device, and the server adopts the method of any one of the preceding claims 1 to 8 or has the system of claim 9; and the edge equipment is provided with a lightweight segmentation model constructed by the server.
CN202210596703.XA 2022-05-30 2022-05-30 Lightweight segmentation model construction method and system based on activated sludge image Pending CN115131557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210596703.XA CN115131557A (en) 2022-05-30 2022-05-30 Lightweight segmentation model construction method and system based on activated sludge image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210596703.XA CN115131557A (en) 2022-05-30 2022-05-30 Lightweight segmentation model construction method and system based on activated sludge image

Publications (1)

Publication Number Publication Date
CN115131557A 2022-09-30

Family

ID=83377943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210596703.XA Pending CN115131557A (en) 2022-05-30 2022-05-30 Lightweight segmentation model construction method and system based on activated sludge image

Country Status (1)

Country Link
CN (1) CN115131557A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863465A (en) * 2023-09-01 2023-10-10 四川省每文环保科技有限公司 Sewage intelligent operation monitoring system
CN116863465B (en) * 2023-09-01 2023-12-12 四川省每文环保科技有限公司 Sewage intelligent operation monitoring system
CN116912351A (en) * 2023-09-12 2023-10-20 四川大学 Correction method and system for intracranial structure imaging based on artificial intelligence
CN116912351B (en) * 2023-09-12 2023-11-17 四川大学 Correction method and system for intracranial structure imaging based on artificial intelligence
CN117409331A (en) * 2023-12-15 2024-01-16 四川泓宝润业工程技术有限公司 Method and device for detecting hidden danger of surrounding environment of oil and gas pipeline and storage medium
CN117409331B (en) * 2023-12-15 2024-03-15 四川泓宝润业工程技术有限公司 Method and device for detecting hidden danger of surrounding environment of oil and gas pipeline and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination