CN114549536A - Microbial colony segmentation method based on attention mechanism - Google Patents

Microbial colony segmentation method based on attention mechanism Download PDF

Info

Publication number
CN114549536A
Authority
CN
China
Prior art keywords
colony
attention mechanism
convolution
model
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210146742.XA
Other languages
Chinese (zh)
Inventor
查玄根
徐建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202210146742.XA
Publication of CN114549536A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10056Microscopic image
    • G06T2207/10061Microscopic image from scanning electron microscope
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a microbial colony segmentation method based on an attention mechanism, comprising: step 1, shooting colony images after colony culture to obtain a colony data set D and dividing it into a training set, a validation set and a test set; step 2, processing the pre-input pictures with a data enhancement method; step 3, designing an attention mechanism module CSA and embedding it into a high-resolution convolutional neural network to form a deep convolutional neural network model; step 4, initializing the convolutional neural network by loading pre-trained network model parameters, with the neurons of each layer updating the parameters in the network structure according to the error; step 5, after each round of training, validating the network model on the validation set and saving the model parameters with the highest accuracy as the optimal model; and step 6, testing the optimal model on the test set and outputting a colony semantic segmentation result graph. The colony boundaries produced by the method are closer to the real situation and the accuracy is higher.

Description

Microbial colony segmentation method based on attention mechanism
Technical Field
The invention belongs to the field of deep learning of an Attention Mechanism (Attention Mechanism) and a Convolutional Neural Network (Convolutional Neural Network), and particularly relates to a microbial colony segmentation method based on the Attention Mechanism.
Background
In recent years, food safety problems have occurred frequently, and quite a few food safety incidents have been related to bacterial colonies. Determining the colony species in each particular situation has therefore become a common concern. To a certain extent, the number of colonies directly or indirectly reflects the quality of food safety and environmental sanitation. With the strengthening of supervision in fields such as food safety and environmental protection in China in recent years, the demand for work such as colony observation and analysis has increased; however, the traditional colony processing methods that rely on manual work are not only tedious and time-consuming but also prone to subjective errors.
A convolutional neural network is composed of several convolutional layers and fully-connected layers at the top (corresponding to a classical neural network), and also includes activation layers and pooling layers. Compared with other deep learning structures, convolutional neural networks give better results in image segmentation and recognition.
The attention mechanism is based on human visual attention: when observing external things or an image, humans are selective, focusing on the regions of interest and ignoring other irrelevant details, which is commonly called the focus of attention. This mechanism helps people quickly and accurately screen out interesting and valuable information from a large amount of redundant information, greatly improving the efficiency of information acquisition. Applied to a neural network, the attention mechanism can capture long-range dependencies in the feature map and locate the position and shape of a given kind of colony more accurately.
Existing colony processing methods either achieve a good colony classification effect or segment colony boundary shapes and details well, but none of them achieves high-precision segmentation of complex colony boundaries while also obtaining good colony classification results.
Disclosure of Invention
The invention aims to provide a microbial colony segmentation method based on an attention mechanism, which solves the technical problem that existing methods cannot achieve high-precision segmentation of complex colony boundaries while also obtaining good colony classification results.
In order to solve the technical problems, the specific technical scheme of the microorganism colony segmentation method based on the attention mechanism is as follows:
a microbial colony segmentation method based on an attention mechanism comprises the following steps:
step 1, after colony culture, shooting colony images with an industrial camera to obtain a colony data set D, and then dividing all n colony images in D into a training set, a validation set and a test set in a ratio of 40:7:7;
step 2, processing the pre-input picture by using a data enhancement method;
step 3, designing an attention mechanism module CSA, and embedding the attention mechanism module CSA into a high-resolution convolution neural network to form a deep convolution neural network model;
step 4, initializing the parameters of the convolutional neural network by loading pre-trained network model parameters; the input data obtains a predicted output through forward propagation in the convolutional neural network, and if this output differs from the actual class label of the data, the error is back-propagated layer by layer to the input layer, with the neurons of each layer updating the parameters in the network structure according to the error;
step 5, after each round of training, validating the network model on the validation set and saving the model parameters with the highest accuracy as the optimal model;
and step 6, testing the optimal model on the test set and outputting a colony semantic segmentation result graph, in which areas with different gray levels represent different types of colonies.
Further, step 1 also covers the data input of the deep convolutional neural network: each input comprises an original picture and a labeled picture, where the labeled picture marks different microbial colony types with different colors.
Further, the data enhancement methods in step 2 include random horizontal flipping, random vertical flipping, 360-degree random rotation and random cropping.
Further, in step 3 the attention mechanism module CSA first performs horizontal average pooling and vertical average pooling on the input feature map of dimension C × H × W, obtaining intermediate feature maps of dimensions C × H × 1 and C × 1 × W; c denotes the c-th channel, h the h-th row and w the w-th column of the feature map, and the calculation formulas are as follows:
horizontal average pooling: [formula shown as an image in the original publication]
vertical average pooling: [formula shown as an image in the original publication]
then, the obtained intermediate feature maps are each processed by three different 1 × 1 convolution kernels; for the row pooling result, the three 1 × 1 convolutions yield feature maps Q_h, K_h and V_h; correspondingly, the column pooling result yields Q_w, K_w and V_w; the first two 1 × 1 convolution kernels change the dimension of the intermediate feature map to C' × H × 1 or C' × 1 × W, where C' is the reduced number of channels, while the third 1 × 1 convolution kernel does not change the dimension, which remains C × H × 1 or C × 1 × W; next, matrix multiplication is performed on the feature maps produced by the first two 1 × 1 convolution kernels, after reshaping them so that the multiplication takes the form (1 × H × C')(1 × C' × H) or (1 × W × C')(1 × C' × W), and the resulting attention map therefore has dimension 1 × H × H or 1 × W × W; the attention map is then processed with a Softmax function and matrix-multiplied with the result of the third 1 × 1 convolution, which is likewise reshaped to 1 × C × H or 1 × C × W for the computation; the specific calculation can be expressed as follows:
horizontal direction: [formula shown as an image in the original publication]
vertical direction: [formula shown as an image in the original publication]
in the above formulas, Q_h^i and Q_w^i denote the i-th position of the feature maps Q_h and Q_w, and K_h^j and K_w^j denote the j-th position of the feature maps K_h and K_w; s_ij and t_ij denote the influence of the i-th position on the j-th position in the attention map, and the larger s_ij or t_ij is, the stronger the correlation between the two positions; the dimensions of the attention feature maps thus obtained are C × H × 1 and C × 1 × W; the results obtained in the horizontal and vertical directions are then expanded to C × H × W and finally added pixel by pixel to the original input feature map, and the CSA-processed feature map is output.
Further, in step 4, after the attention mechanism module CSA is embedded into each convolution module of the backbone network, forward propagation is performed to compute the network outputs; the output vector O_i is passed through a Softmax function to obtain a correlation weight vector W_i between a given pixel and all other pixels; then, according to the vector W_i, the weight values are multiplied with the pixels at the corresponding positions and accumulated to obtain the final result y_i predicted by the network for each pixel; the obtained classification result y_i and the current correct label value y'_i are used as the two inputs of a cross-entropy loss function to compute the loss value; the error signal is propagated to the output of each layer, and the gradient of each parameter is obtained from the derivative of each layer's function with respect to that parameter; a stochastic gradient descent (SGD) optimizer then updates the network parameters that influence model training and model output so that they approach or reach their optimal values, minimizing the loss function.
Further, in step 5, after each training round, the intersection over union (IoU), the most common evaluation criterion in image segmentation, is used to verify the effectiveness of the proposed model on the validation set; the specific formula is IoU = (y' ∩ y)/(y' ∪ y), where y' is the ground truth and y is the predicted value; meanwhile, the model parameters with the largest IoU are saved, and training stops when the number of training iterations reaches a set limit; finally, the model parameters with the largest IoU are loaded to obtain the final trained convolutional neural network model.
Further, in step 6, an unprocessed colony image is input into the convolutional neural network; the input image is first downsampled to 1/4 of its original size using two convolutional layers with a 3 × 3 kernel and a stride of 2, with batch normalization (BN) after each convolutional layer to reduce error; the convolutional-layer output X is then fed into the attention mechanism module CSA, so that the resulting feature map captures long-range dependencies or dense context information while retaining accurate position information of the colonies; the feature map obtained from the CSA is then added pixel by pixel to the input map X to obtain the final result X' of the attention mechanism module.
Further, after several passes through the attention mechanism module, the convolutional neural network also includes a Segmentation layer; first, a convolutional layer with a 1 × 1 kernel and a stride of 1 processes the previously obtained feature map while keeping its channel dimension unchanged; then batch normalization and a ReLU activation function are used to ensure the accuracy of the feature data; finally, another convolutional layer with a 1 × 1 kernel and a stride of 1 changes the channel dimension of the feature map to the number of colony categories; this yields the final colony segmentation map.
The microbial colony segmentation method based on the attention mechanism has the following advantages:
(1) The invention provides a novel way of segmenting microbial colonies and proposes an attention mechanism module CSA, combining a convolutional neural network for semantic segmentation with an attention mechanism for the first time to segment and classify microbial colonies.
(2) The colony boundaries obtained by the proposed segmentation method are closer to the real situation and more accurate, which facilitates subsequent colony analysis and is of great significance to industries such as food safety, medical health and environmental protection.
Drawings
FIG. 1 is an overall flow diagram of the present invention.
FIG. 2 is a detailed process diagram of the attention mechanism module of the present invention.
Fig. 3 is a structural diagram of the convolutional neural network main body of the present invention.
FIG. 4 is a schematic diagram of the overall process of the model of the present invention.
Detailed Description
In order to better understand the purpose, structure and function of the present invention, a microorganism colony segmentation method based on attention mechanism of the present invention is described in detail below with reference to the accompanying drawings.
As shown in FIG. 1, the microbial colony segmentation method based on the attention mechanism comprises the following specific steps:
Step 1: after colony culture, colony images are captured with an industrial camera to obtain a colony data set D, and all n colony images in D are then divided into a training set, a validation set and a test set in a ratio of 40:7:7. For example, out of 216 colony images, the training set, validation set and test set contain 160, 28 and 28 images respectively. Each input to the deep convolutional neural network comprises an original picture and a labeled picture, where the labeled picture marks different microbial colony types with different colors.
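As a minimal sketch of the 40:7:7 split described above (the directory layout, file extension and function name are illustrative assumptions, not taken from the patent):

import random
from pathlib import Path

def split_dataset(image_dir: str, seed: int = 0):
    """Split colony images into train/val/test in a 40:7:7 ratio (e.g. 160/28/28 out of 216)."""
    images = sorted(Path(image_dir).glob("*.png"))   # assumed image format
    random.Random(seed).shuffle(images)
    n = len(images)
    n_train = n * 40 // 54                           # 40 + 7 + 7 = 54 parts in total
    n_val = n * 7 // 54
    return images[:n_train], images[n_train:n_train + n_val], images[n_train + n_val:]

# train_files, val_files, test_files = split_dataset("colony_dataset")
# for 216 images this yields 160, 28 and 28 files respectively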
Step 2: when training the network model, the pre-input pictures of the training set are processed with data enhancement methods (random horizontal flipping, random vertical flipping, 360-degree random rotation and random cropping): 80 of the 160 colony images in the training set are randomly flipped horizontally, 80 are randomly flipped vertically, and all 160 images are randomly rotated and randomly cropped; finally, the picture data are converted into vector form;
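A minimal sketch of this augmentation pipeline, assuming torchvision transforms and an illustrative crop size of 512 (the patent does not specify the crop dimensions):

import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomHorizontalFlip(p=0.5),   # about half of the images are flipped horizontally
    T.RandomVerticalFlip(p=0.5),     # and about half vertically, matching the 80-of-160 description
    T.RandomRotation(degrees=360),   # 360-degree random rotation
    T.RandomCrop(512),               # random cropping (size assumed)
    T.ToTensor(),                    # convert the picture data into tensor (vector) form
])

For semantic segmentation the same geometric transform also has to be applied to the labeled picture, for example with paired image/mask transforms, so that the label masks stay aligned with the images.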
and step 3: as shown in fig. 2, a attention mechanism module csa (cross Strip attention) is designed and embedded into a high-resolution convolutional neural network to form a deep convolutional neural network model proposed by the present invention. Different from the global average pooling commonly used in the conventional neural network, the attention mechanism module CSA provided by the invention first performs horizontal average pooling and vertical average pooling on the feature image with the input dimension of C × H × W respectively to obtain intermediate feature maps with the dimensions of C × H × 1 and C × 1 × W respectively. C represents a c channel, h represents an h row in the feature diagram, w represents a w column in the feature diagram, and the calculation formulas of the process are as follows:
horizontal average pooling: [formula shown as an image in the original publication]
vertical average pooling: [formula shown as an image in the original publication]
we then processed the resulting intermediate feature maps with three different 1 x 1 convolution kernels, respectively. For row pooling, the three 1 × 1 convolutions yield a profile Qh,KhAnd Vh(ii) a Correspondingly, the column pooling results in Qw,KwAnd Vw. The first two 1 × 1 convolution kernels can change the dimension of the middle feature map into C ' × H × 01 or C ' × 11 × 2W, and C ' is the number of channels with reduced dimension; the third 1 × 31 convolution kernel does not change its dimension, and is still C × 4H × 51 or C × 61 × 7W, which is to reduce the complexity of the computation. Next, matrix multiplication is performed on the feature maps processed by the first two 1 × 81 convolution kernels, and the process performs dimension transformation on the feature maps, so that the matrix multiplication is in the form of: (1 × 9H × C ') (1 × 0C' × 1H) or (1 × 2W × 3C ') (1 × 4C' × 5W), so the attention map dimensions obtained are 1 × 6H × 7H or 1 × W. And then, after the processing is carried out by using a Softmax function, carrying out matrix multiplication on the processed result and a third 1 × 1 convolution result, and similarly, carrying out dimension conversion on the 1 × 1 convolution result into 1 × C × H or 1 × C × W to carry out calculation, wherein a specific calculation formula can be expressed as follows:
horizontal direction: [formula shown as an image in the original publication]
vertical direction: [formula shown as an image in the original publication]
In the above formulas, Q_h^i and Q_w^i denote the i-th position of the feature maps Q_h and Q_w, and similarly K_h^j and K_w^j denote the j-th position of the feature maps K_h and K_w; s_ij and t_ij denote the influence of the i-th position on the j-th position in the attention map, and the larger s_ij or t_ij is, the stronger the correlation between the two positions. The dimensions of the attention feature maps obtained in this way are C × H × 1 and C × 1 × W. A traditional convolutional neural network can only gather context information within the extent of its convolution kernel, so the amount of information available for classification or segmentation is small and the accuracy is low. The attention map acquired by the CSA module designed in the present invention captures the context of the original input feature map along the horizontal or vertical direction, increasing the amount of information available during processing, which benefits the segmentation of the overall shape and boundary details of colonies and the classification of different colonies. The results obtained in the horizontal and vertical directions are expanded to C × H × W and finally added pixel by pixel to the original input feature map, and the CSA-processed feature map is output; because horizontal and vertical attention are fused, this process further enhances the feature representation capability of the proposed network, which benefits the segmentation and classification of microbial colonies of the same type located at different positions.
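A minimal PyTorch sketch of the CSA module as described, assuming a channel-reduction ratio of 8, average pooling implemented with tensor means, and the matrix-factor ordering implied by the stated dimensions (the class and parameter names are illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossStripAttention(nn.Module):
    """Sketch of the CSA module: strip pooling, 1x1 convs, strip attention, residual add."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        reduced = max(channels // reduction, 1)          # C' in the text (assumed ratio)
        # Horizontal (row-pooled) branch: Q_h, K_h reduce channels, V_h keeps them.
        self.q_h = nn.Conv2d(channels, reduced, kernel_size=1)
        self.k_h = nn.Conv2d(channels, reduced, kernel_size=1)
        self.v_h = nn.Conv2d(channels, channels, kernel_size=1)
        # Vertical (column-pooled) branch: Q_w, K_w, V_w.
        self.q_w = nn.Conv2d(channels, reduced, kernel_size=1)
        self.k_w = nn.Conv2d(channels, reduced, kernel_size=1)
        self.v_w = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)             # horizontal average pooling: C x H x 1
        pool_w = x.mean(dim=2, keepdim=True)             # vertical average pooling:   C x 1 x W

        # Horizontal branch: attention over the H positions.
        q = self.q_h(pool_h).squeeze(3).permute(0, 2, 1)  # B x H x C'
        k = self.k_h(pool_h).squeeze(3)                   # B x C' x H
        v = self.v_h(pool_h).squeeze(3)                   # B x C x H
        attn_h = F.softmax(torch.bmm(q, k), dim=-1)       # B x H x H
        out_h = torch.bmm(v, attn_h).unsqueeze(3)         # B x C x H x 1

        # Vertical branch: attention over the W positions.
        q = self.q_w(pool_w).squeeze(2).permute(0, 2, 1)  # B x W x C'
        k = self.k_w(pool_w).squeeze(2)                   # B x C' x W
        v = self.v_w(pool_w).squeeze(2)                   # B x C x W
        attn_w = F.softmax(torch.bmm(q, k), dim=-1)       # B x W x W
        out_w = torch.bmm(v, attn_w).unsqueeze(2)         # B x C x 1 x W

        # Expand both strip results to C x H x W and add them to the input pixel by pixel.
        return x + out_h.expand(b, c, h, w) + out_w.expand(b, c, h, w)

For example, CrossStripAttention(64) applied to an input of shape 2 × 64 × 128 × 128 returns a tensor of the same shape, with the horizontal and vertical strip-attention results added to the input pixel by pixel.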
Step 4: the convolutional neural network designed by the present invention is initialized by loading pre-trained network model parameters; the input data obtains a predicted output through forward propagation in the convolutional neural network, and if this output differs from the actual class label of the data, the error is back-propagated layer by layer to the input layer, with the neurons of each layer updating the parameters in the network structure according to the error. As shown in FIG. 3, which depicts the main body of the convolutional neural network designed by the present invention, the network is initialized with the parameters of the pre-trained model. After the attention mechanism module CSA designed by the present invention is embedded into each convolution module of the backbone network, forward propagation is executed to compute the network outputs; the output vector O_i is passed through a Softmax function to obtain a correlation weight vector W_i between a given pixel and all other pixels; then, according to the vector W_i, the weight values are multiplied with the pixels at the corresponding positions and accumulated to obtain the final result y_i predicted by the network for each pixel. The obtained classification result y_i and the current correct label value y'_i are used as the two inputs of a cross-entropy loss function to compute the loss value. The error signal is propagated to the output of each layer, and the gradient of each parameter is obtained from the derivative of each layer's function with respect to that parameter. A stochastic gradient descent (SGD) optimizer then updates the network parameters that influence model training and model output so that they approach or reach their optimal values, minimizing the loss function.
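A minimal sketch of this training step, assuming a pixel-wise cross-entropy loss and an SGD optimizer whose learning rate and momentum are illustrative values:

import torch
import torch.nn as nn

def train_one_epoch(model, loader, optimizer, device):
    """One epoch: forward propagation, pixel-wise cross-entropy loss, SGD back-propagation."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    for images, labels in loader:                 # labels: B x H x W integer class maps
        images, labels = images.to(device), labels.to(device)
        logits = model(images)                    # B x num_classes x H x W
        loss = criterion(logits, labels)          # compare predictions with the correct label values
        optimizer.zero_grad()
        loss.backward()                           # propagate the error signal to every layer
        optimizer.step()                          # SGD update of the network parameters

# Illustrative optimizer setup (hyper-parameters assumed):
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)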
Step 5: after each training round, the intersection over union (IoU), the most common evaluation criterion in image segmentation, is used to verify the effectiveness of the proposed model on the validation set; the specific formula is IoU = (y' ∩ y)/(y' ∪ y), where y' is the ground truth and y is the predicted value. Meanwhile, the model parameters with the largest IoU are saved, and training stops when the number of training iterations reaches a set limit. Finally, the model parameters with the largest IoU are loaded to obtain the final trained convolutional neural network model.
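The IoU can be computed per class from the predicted and ground-truth label maps as in the sketch below; averaging over the classes that are present (including the background) is an assumption:

import torch

def mean_iou(pred: torch.Tensor, target: torch.Tensor, num_classes: int) -> float:
    """Mean intersection-over-union between predicted and ground-truth label maps."""
    ious = []
    for cls in range(num_classes):
        pred_mask = pred == cls
        target_mask = target == cls
        intersection = (pred_mask & target_mask).sum().item()   # |y' ∩ y|
        union = (pred_mask | target_mask).sum().item()          # |y' ∪ y|
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0

# pred = logits.argmax(dim=1) gives the predicted label map from the network output.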
Step 6: the deep convolutional neural network model proposed by the present invention, i.e. the optimal model, is tested on the test set and outputs a semantic segmentation result map of the microbial colonies, in which areas of different gray levels represent different types of colonies. This process is illustrated in FIG. 4. An unprocessed colony image is input into the convolutional neural network proposed by the present invention; the input image is first downsampled to 1/4 of its original size using two convolutional layers with a 3 × 3 kernel and a stride of 2, with batch normalization (BN) after each convolutional layer to reduce error. The convolutional-layer output X is then fed into the proposed attention mechanism module CSA, so that the further-processed feature map captures long-range dependencies or dense context information while retaining accurate position information of the colonies. The feature map obtained from the CSA is then added pixel by pixel to the input map X to obtain the final result X' of the attention mechanism module. After several passes through the attention mechanism module, the proposed network also contains a Segmentation layer: first, a convolutional layer with a 1 × 1 kernel and a stride of 1 processes the previously obtained feature map while keeping its channel dimension unchanged; then batch normalization and a ReLU activation function are used to ensure the accuracy of the feature data; finally, another convolutional layer with a 1 × 1 kernel and a stride of 1 changes the channel dimension of the feature map to the number of colony categories (4 in FIG. 4, including the background). This yields the final colony segmentation map, in which areas of different gray levels represent colonies of different types.
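Putting the pieces together, a minimal sketch of the described stem, attention block and Segmentation layer might look as follows; the intermediate width of 64, the ReLU placement in the stem, the single CSA block (the text mentions several passes), the final bilinear upsampling and the reuse of the CrossStripAttention sketch above are all assumptions:

import torch
import torch.nn as nn

class ColonySegmenter(nn.Module):
    """Sketch of the described pipeline: 3x3/stride-2 stem, CSA block, 1x1 segmentation head."""

    def __init__(self, num_classes: int = 4, width: int = 64):
        super().__init__()
        # Two 3x3 convolutions with stride 2 downsample the input to 1/4 size, each followed by BN.
        self.stem = nn.Sequential(
            nn.Conv2d(3, width, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
        )
        self.csa = CrossStripAttention(width)   # attention module sketched earlier (assumed in scope)
        # Segmentation layer: 1x1 conv (channels unchanged) + BN + ReLU, then 1x1 conv to num_classes.
        self.head = nn.Sequential(
            nn.Conv2d(width, width, kernel_size=1, stride=1),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True),
            nn.Conv2d(width, num_classes, kernel_size=1, stride=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.stem(x)        # downsample to 1/4 resolution
        x = self.csa(x)         # CSA result is already added to its input inside the module
        logits = self.head(x)   # B x num_classes x H/4 x W/4
        # Upsample back to the input resolution to produce the segmentation map (assumed step).
        return nn.functional.interpolate(logits, scale_factor=4, mode="bilinear", align_corners=False)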
It is to be understood that the present invention has been described with reference to certain embodiments, and that various changes in the features and embodiments, or equivalent substitutions may be made therein by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims (8)

1. A microbial colony segmentation method based on an attention mechanism is characterized by comprising the following steps:
step 1, after colony culture, shooting colony images with an industrial camera to obtain a colony data set D, and then dividing all n colony images in D into a training set, a validation set and a test set in a ratio of 40:7:7;
step 2, processing the pre-input picture by using a data enhancement method;
step 3, designing an attention mechanism module CSA, and embedding the attention mechanism module CSA into a high-resolution convolution neural network to form a deep convolution neural network model;
step 4, initializing the parameters of the convolutional neural network by loading pre-trained network model parameters; the input data obtains a predicted output through forward propagation in the convolutional neural network, and if this output differs from the actual class label of the data, the error is back-propagated layer by layer to the input layer, with the neurons of each layer updating the parameters in the network structure according to the error;
step 5, after each round of training, validating the network model on the validation set and saving the model parameters with the highest accuracy as the optimal model;
and step 6, testing the optimal model on the test set and outputting a colony semantic segmentation result graph, wherein areas with different gray levels in the result graph respectively represent colonies of different types.
2. The method for microbial colony segmentation based on attention mechanism as claimed in claim 1, wherein the step 1 comprises data input of a deep convolutional neural network, each input data comprises an original picture and a marked picture, wherein the marked picture marks different types of microbial colonies by different colors.
3. The attention-based microbial colony segmentation method according to claim 1, wherein the data enhancement methods of step 2 comprise random horizontal flipping, random vertical flipping, 360-degree random rotation and random cropping.
4. The attention-based microbial colony segmentation method according to claim 1, wherein the attention mechanism module CSA of step 3 first performs horizontal average pooling and vertical average pooling on the input feature map of dimension C × H × W to obtain intermediate feature maps of dimensions C × H × 1 and C × 1 × W; c denotes the c-th channel, h the h-th row and w the w-th column of the feature map, and the calculation formulas are as follows:
horizontal average pooling: [formula shown as an image in the original publication]
vertical average pooling: [formula shown as an image in the original publication]
then, the obtained intermediate feature maps are each processed by three different 1 × 1 convolution kernels; for the row pooling result, the three 1 × 1 convolutions yield feature maps Q_h, K_h and V_h; correspondingly, the column pooling result yields Q_w, K_w and V_w; the first two 1 × 1 convolution kernels change the dimension of the intermediate feature map to C' × H × 1 or C' × 1 × W, where C' is the reduced number of channels, while the third 1 × 1 convolution kernel does not change the dimension, which remains C × H × 1 or C × 1 × W; next, matrix multiplication is performed on the feature maps produced by the first two 1 × 1 convolution kernels, after reshaping them so that the multiplication takes the form (1 × H × C')(1 × C' × H) or (1 × W × C')(1 × C' × W), and the resulting attention map therefore has dimension 1 × H × H or 1 × W × W; the attention map is then processed with a Softmax function and matrix-multiplied with the result of the third 1 × 1 convolution, which is likewise reshaped to 1 × C × H or 1 × C × W for the computation; the specific calculation can be expressed as follows:
horizontal direction: [formula shown as an image in the original publication]
vertical direction: [formula shown as an image in the original publication]
in the above formulas, Q_h^i and Q_w^i denote the i-th position of the feature maps Q_h and Q_w, and K_h^j and K_w^j denote the j-th position of the feature maps K_h and K_w; s_ij and t_ij denote the influence of the i-th position on the j-th position in the attention map, and the larger s_ij or t_ij is, the stronger the correlation between the two positions; the dimensions of the attention feature maps thus obtained are C × H × 1 and C × 1 × W; the results obtained in the horizontal and vertical directions are then expanded to C × H × W and finally added pixel by pixel to the original input feature map, and the CSA-processed feature map is output.
5. The attention-based microbial colony segmentation method according to claim 1, wherein in step 4, after the attention mechanism module CSA is embedded into each convolution module of the backbone network, forward propagation is performed to compute the network outputs; the output vector O_i is passed through a Softmax function to obtain a correlation weight vector W_i between a given pixel and all other pixels; then, according to the vector W_i, the weight values are multiplied with the pixels at the corresponding positions and accumulated to obtain the final result y_i predicted by the network for each pixel; the obtained classification result y_i and the current correct label value y'_i are used as the two inputs of a cross-entropy loss function to compute the loss value; the error signal is propagated to the output of each layer, and the gradient of each parameter is obtained from the derivative of each layer's function with respect to that parameter; and a stochastic gradient descent (SGD) optimizer updates the network parameters that influence model training and model output so that they approach or reach their optimal values, minimizing the loss function.
6. The attention-based microbial colony segmentation method according to claim 1, wherein in step 5, after each training round, the intersection over union (IoU), the most common evaluation criterion in image segmentation, is used to verify the effectiveness of the proposed model on the validation set; the specific formula is IoU = (y' ∩ y)/(y' ∪ y), where y' is the ground truth and y is the predicted value; meanwhile, the model parameters with the largest IoU are saved, and training stops when the number of training iterations reaches a set limit; and finally, the model parameters with the largest IoU are loaded to obtain the final trained convolutional neural network model.
7. The attention-based microbial colony segmentation method according to claim 1, wherein in step 6 an unprocessed colony image is input into the convolutional neural network; the input image is first downsampled to 1/4 of its original size using two convolutional layers with a 3 × 3 kernel and a stride of 2, with batch normalization (BN) after each convolutional layer to reduce error; the convolutional-layer output X is then fed into the attention mechanism module CSA, so that the further-processed feature map captures long-range dependencies or dense context information while retaining accurate position information of the colonies; and the feature map obtained from the CSA is then added pixel by pixel to the input map X to obtain the final result X' of the attention mechanism module.
8. The attention-based microbial colony segmentation method according to claim 7, wherein after several passes through the attention mechanism module, the convolutional neural network further comprises a Segmentation layer; first, a convolutional layer with a 1 × 1 kernel and a stride of 1 processes the previously obtained feature map while keeping its channel dimension unchanged; then batch normalization and a ReLU activation function are used to ensure the accuracy of the feature data; finally, another convolutional layer with a 1 × 1 kernel and a stride of 1 changes the channel dimension of the feature map to the number of colony categories; this yields the final colony segmentation map.
CN202210146742.XA 2022-02-17 2022-02-17 Microbial colony segmentation method based on attention mechanism Pending CN114549536A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210146742.XA CN114549536A (en) 2022-02-17 2022-02-17 Microbial colony segmentation method based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210146742.XA CN114549536A (en) 2022-02-17 2022-02-17 Microbial colony segmentation method based on attention mechanism

Publications (1)

Publication Number Publication Date
CN114549536A true CN114549536A (en) 2022-05-27

Family

ID=81676308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210146742.XA Pending CN114549536A (en) 2022-02-17 2022-02-17 Microbial colony segmentation method based on attention mechanism

Country Status (1)

Country Link
CN (1) CN114549536A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116128845A (en) * 2023-02-10 2023-05-16 甘肃省科学院生物研究所 Automatic colony counting method based on random walk and colony counter
CN116128845B (en) * 2023-02-10 2024-05-07 甘肃省科学院生物研究所 Automatic colony counting method based on random walk and colony counter

Similar Documents

Publication Publication Date Title
CN111985370B (en) Crop pest and disease identification method based on mixed attention module
CN109345508B (en) Bone age evaluation method based on two-stage neural network
CN112364931B (en) Few-sample target detection method and network system based on meta-feature and weight adjustment
CN112598658A (en) Disease identification method based on lightweight twin convolutional neural network
CN111738344B (en) Rapid target detection method based on multi-scale fusion
CN111696101A (en) Light-weight solanaceae disease identification method based on SE-Inception
CN113221740B (en) Farmland boundary identification method and system
CN106874862B (en) Crowd counting method based on sub-model technology and semi-supervised learning
CN111400536A (en) Low-cost tomato leaf disease identification method based on lightweight deep neural network
CN110930378A (en) Emphysema image processing method and system based on low data demand
CN115810191A (en) Pathological cell classification method based on multi-attention fusion and high-precision segmentation network
CN113344077A (en) Anti-noise solanaceae disease identification method based on convolution capsule network structure
CN114387270B (en) Image processing method, image processing device, computer equipment and storage medium
CN116188509A (en) High-efficiency three-dimensional image segmentation method
CN111882000A (en) Network structure and method applied to small sample fine-grained learning
CN114549536A (en) Microbial colony segmentation method based on attention mechanism
Dubey et al. An efficient adaptive feature selection with deep learning model-based paddy plant leaf disease classification
CN113609941A (en) Crop disease and insect pest identification method based on deep learning
CN114119669A (en) Image matching target tracking method and system based on Shuffle attention
CN110136098B (en) Cable sequence detection method based on deep learning
Sarikabuta et al. Impacts of layer sizes in deep residual-learning convolutional neural network on flower image classification with different class sizes
Li et al. Deformable medical image registration based on unsupervised generative adversarial network integrating dual attention mechanisms
CN114973005A (en) Mung bean leaf spot identification method based on RePMMS-Net
CN111178174B (en) Urine formed component image identification method based on deep convolutional neural network
CN114998725A (en) Hyperspectral image classification method based on adaptive spatial spectrum attention kernel generation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination