CN116993756A - Method for dividing verticillium wilt disease spots of field cotton - Google Patents
- Publication number
- CN116993756A (application CN202310816117.6A)
- Authority
- CN
- China
- Prior art keywords
- layer
- verticillium wilt
- feature map
- inputting
- field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30181—Earth observation
- G06T2207/30188—Vegetation; Agriculture
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention provides a method for segmenting verticillium wilt lesions of field cotton. A trained field verticillium wilt lesion segmentation model is constructed from a CNN (convolutional neural network) with multiple residual blocks, a Transformer block, a feature pyramid module, and a fusion module that concatenates, along the channel dimension, the output feature map of the CNN network with the multi-scale features output by the feature pyramid module; an image to be detected is input into the trained field verticillium wilt lesion segmentation model to obtain an accurate segmentation result. The method addresses the difficulty of recognizing lesion morphology and the low segmentation accuracy of conventional approaches, improving both the accuracy and the efficiency of field verticillium wilt lesion segmentation. The invention also aims to provide an efficient and reliable disease detection tool for agricultural production, improving production and economic benefits while reducing adverse effects on the environment.
Description
Technical Field
The invention relates to the technical field of agricultural disease detection, and in particular to a method for segmenting verticillium wilt disease spots of field cotton.
Background
Related patents for lesion segmentation in the prior art are as follows:
"A multi-scale deconvolution network for plant leaf spot segmentation and identification" (application number: 202011047680.4, filing date: 2020.09.29) realizes end-to-end plant leaf spot segmentation and identification using a small number of pixel-level labels. First, a multi-scale feature extraction module is constructed from multi-scale residual blocks to extract multi-scale disease features. Then, a classification and bridging module is introduced to obtain a class-specific activation map containing the key information of the lesions of that class, and the activation map is upsampled to segment the lesions. Finally, a deconvolution module is designed that uses a small number of lesion labels to guide feature extraction toward the true lesion positions, further refining identification and segmentation. The method suits plant leaf disease identification and segmentation when pixel-level labeled samples are scarce, unifying identification and segmentation, and the model remains robust on disease images with insufficient light and noise interference.
"method and system for dividing cotton leaf adhesion disease spot image" (application number: 201811061115.6 application date: 2018.09.12), the method comprises: s1, acquiring a least square circle error value of a connected component in an image of a cotton disease spot area; s2, adjusting an H threshold value of an H-minimum method based on a least square circle error value, and comparing the transformed cotton disease spot area image with the H threshold value until the number of the minimum point is changed after the transformation of the H-minimum method, and then carrying out distance transformation and watershed segmentation; s3, judging whether the least square circle error value before dividing the watershed is larger than the least square circle error value after dividing the watershed; if not, finishing the segmentation to obtain a lesion segmentation area; s4, marking the disease spot segmentation area, and carrying out logic operation on the disease spot segmentation area and the cotton disease spot original image to obtain an adhesion disease spot image segmentation result. Can realize the extraction of the cotton disease spot area and the automatic segmentation of the adhesion disease spot, and has important significance for the diagnosis of cotton diseases.
From the above search and the patent documents listed, it is clear that although related patents on lesion segmentation based on CNNs and conventional machine learning algorithms already exist, they have problems both in design approach and in technical effect.
(1) Traditional lesion segmentation schemes rely on conventional machine learning algorithms or on methods based on convolutional neural networks (CNNs). The conventional machine learning methods are mainly manual labeling and threshold-based; they suffer from labor-intensive annotation, low efficiency, low accuracy, and sensitivity to illumination and other factors, and cannot meet the demands of large-scale data analysis. Conventional CNN models, in turn, are limited when processing long-range dependencies and sequence data: CNNs are mainly suited to local features and spatial relations, and their ability to model global information is weak.
(2) Existing lesion segmentation schemes also adopt non-end-to-end pipelines combining a CNN with a conventional machine learning algorithm, where the CNN extracts the low-level image features and the conventional algorithm processes them further. Such layering can lose information: the quality and expressiveness of the low-level features strongly influence the final segmentation result, and if those features are not accurate or rich enough, overall algorithm performance is limited. Moreover, conventional machine learning algorithms handle coupling and dependency between features poorly, so the information shared between features may not be fully exploited, further limiting segmentation performance.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a method for segmenting verticillium wilt lesions of field cotton.
In order to achieve the above object, the present invention provides the following solutions:
a method for dividing verticillium wilt disease spots of field cotton comprises the following steps:
acquiring an image to be detected;
inputting the image to be detected into a trained field verticillium wilt disease spot segmentation model to obtain a segmentation result;
the construction method of the field verticillium wilt disease spot segmentation model comprises the following steps:
performing image processing on the acquired multiple cotton verticillium wilt disease spot images to obtain a sample image;
inputting the sample image into a CNN network with a multi-layer residual block for feature extraction, and performing dimension reduction by using a maximum pooling layer to obtain a dimension-reduced local feature map;
inputting the dimension-reduced local feature map into a feature map embedding module to obtain a one-dimensional vector with position information;
inputting the one-dimensional vector into a Transformer block repeated 12 times to obtain global feature fusion data, and reshaping the global feature fusion data to obtain a two-dimensional global feature map;
inputting the two-dimensional global feature map to a feature pyramid module to extract and fuse multi-scale features of the two-dimensional global feature map so as to obtain feature maps fused with features of different scales;
performing channel concatenation of the local feature map extracted by the last residual block of the CNN network with the feature map fusing features of different scales, aggregating with a two-dimensional convolution, and then expanding the spatial size of the aggregated feature map to the size of the sample image by bilinear interpolation upsampling to obtain a predicted image segmentation mask;
and training a neural network model based on the CNN-Transformer and the feature pyramid module with minimization of a multi-class loss function as the objective, to obtain a trained field verticillium wilt disease spot segmentation model.
Preferably, the field verticillium wilt disease spot segmentation model comprises a CNN, a Transformer, a feature pyramid pooling module, a channel-wise fusion module that concatenates the output feature map of the CNN network with the multi-scale features output by the feature pyramid module, and an image segmentation network.
Preferably, performing image processing on the acquired plurality of cotton verticillium wilt disease spot images to obtain sample images comprises:
acquiring a cotton verticillium wilt disease spot data set; the cotton verticillium wilt disease spot data set comprises a plurality of cotton verticillium wilt disease spot images under a field background;
and carrying out illumination correction, image denoising, image labeling and image enhancement on each cotton verticillium wilt disease spot image to obtain a plurality of sample images and corresponding real image verticillium wilt disease spot segmentation masks.
Preferably, the number of residual blocks is 3.
Preferably, the sample image is input to a CNN network with a multi-layer residual block for feature extraction, and dimension reduction is performed by using a maximum pooling layer, so as to obtain a dimension-reduced local feature map, which comprises:
inputting the sample image into a first residual block for feature extraction to obtain a first layer local feature map;
inputting the first layer local feature map to a second residual block for feature extraction to obtain a second layer local feature map;
inputting the second layer local feature map to a third residual block for feature extraction to obtain a third layer local feature map;
and inputting the third layer local feature map to a maximum pooling layer, and performing dimension reduction on the third layer local feature map to obtain a dimension-reduced local feature map.
Preferably, the step of inputting the dimension-reduced local feature map into the feature map embedding module to obtain a one-dimensional vector with position information includes:
inputting the dimension-reduced local feature map into the Patch_Embeddings module, which cuts it into small patches of fixed size;
flattening each patch into a vector by a Flatten operation;
adding each vector to the position code generated by the Position_Embeddings module to obtain a one-dimensional vector with position information;
mapping the one-dimensional vector data with position information into a vector space of another dimension by linear projection.
Preferably, inputting the one-dimensional vector into a Transformer block repeated 12 times to obtain global feature fusion data, and reshaping the global feature fusion data to obtain a two-dimensional global feature map, comprises:
inputting the one-dimensional vector obtained by linear projection mapping into a first LayerNorm layer to obtain first-layer normalized data;
inputting the first-layer normalized data into a Multi-Head Self-Attention layer to obtain first-layer global feature data;
adding and fusing the first layer global feature data and the data input by the first LayerNorm layer to obtain first layer global feature fusion data;
inputting the first layer global feature fusion data to a second LayerNorm layer to obtain second layer normalized data;
inputting the second layer normalized data to the MLP layer to obtain second layer global feature fusion data;
adding the second-layer global feature fusion data with the first-layer global feature fusion data to obtain third-layer global feature fusion data;
inputting the third-layer global feature fusion data to the next Transformer Layer, repeating 12 times in total;
and reshaping the global feature fusion data output by the last Transformer Layer to obtain a two-dimensional global feature map.
Preferably, the feature pyramid pooling module comprises four MCBR layers, an upsampling layer, a 1×1 convolution layer, and a skip connection; each MCBR layer comprises a MaxPool layer, a convolution layer, a batch normalization layer, and a ReLU activation function.
Preferably, the construction method of the field verticillium wilt spot segmentation model further comprises the following steps:
and performing accuracy verification on the trained field verticillium wilt spot segmentation model to obtain a verified field verticillium wilt spot segmentation model.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the invention provides a method for dividing verticillium wilt spots of cotton in a field, which comprises the following steps: acquiring an image to be detected; inputting the image to be detected into a trained field verticillium wilt disease spot segmentation model to obtain a segmentation result; the construction method of the field verticillium wilt disease spot segmentation model comprises the following steps: performing image processing on the acquired multiple cotton verticillium wilt disease spot images to obtain a sample image; inputting the sample image into a CNN network with a multi-layer residual block for feature extraction, and performing dimension reduction by using a maximum pooling layer to obtain a dimension-reduced local feature map; inputting the reduced local feature map to a feature map embedding module to obtain a one-dimensional vector with position information; the one-dimensional vector is input into a transducer block and repeated for 12 times to obtain global feature fusion data, and the global feature fusion data is adjusted to obtain a two-dimensional global feature map; inputting the two-dimensional global feature map to a feature pyramid module to extract and fuse multi-scale features of the two-dimensional global feature map so as to obtain feature maps fused with features of different scales; the local feature images extracted by the CNN network with residual blocks of the last layer and feature images fused with features of different scales are subjected to channel splicing, two-dimensional convolution aggregation is carried out, and then the size space of the aggregated feature images is expanded to the same size of a sample image by using a bilinear interpolation up-sampling method so as to obtain a predictive image segmentation mask; and training a neural network model based on the CNN-transducer and the characteristic pyramid module by taking the minimum multi-classification loss function as a target to obtain a trained field verticillium wilt disease spot segmentation model. The method can more accurately divide the verticillium wilt spots of the field, has certain robustness, and can cope with lesion areas with different scales and shapes.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of steps performed in accordance with an embodiment of the present invention;
FIG. 3 is a diagram of a Residual unit network according to an embodiment of the present invention;
FIG. 4 is a diagram of a feature map embedding module according to an embodiment of the present invention;
FIG. 5 is a diagram of the Transformer structure according to an embodiment of the present invention;
FIG. 6 is a network diagram of a feature pyramid pooling module provided by an embodiment of the present invention;
fig. 7 is a block diagram of an MCBR network provided in an embodiment of the present invention;
fig. 8 is a diagram of a complete network model structure according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a method for dividing verticillium wilt disease spots of field cotton, which can divide the verticillium wilt disease spots of the field more accurately, has certain robustness and can cope with lesion areas with different scales and shapes.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
Fig. 1 is a flowchart of the method provided by an embodiment of the present invention. As shown in fig. 1, the present invention provides a method for segmenting verticillium wilt lesions of field cotton, including:
step 100: acquiring an image to be detected;
step 200: inputting the image to be detected into a trained field verticillium wilt disease spot segmentation model to obtain a segmentation result;
the construction method of the field verticillium wilt disease spot segmentation model comprises the following steps:
step 201: performing image processing on the acquired multiple cotton verticillium wilt disease spot images to obtain a sample image;
step 202: inputting the sample image into a CNN network with a multi-layer residual block for feature extraction, and performing dimension reduction by using a maximum pooling layer to obtain a dimension-reduced local feature map;
step 203: inputting the reduced local feature map to a feature map embedding module to obtain a one-dimensional vector with position information;
step 204: the one-dimensional vector is input into a transducer block and repeated for 12 times to obtain global feature fusion data, and the global feature fusion data is adjusted to obtain a two-dimensional global feature map;
step 205: inputting the two-dimensional global feature map to a feature pyramid module to extract and fuse multi-scale features of the two-dimensional global feature map so as to obtain feature maps fused with features of different scales;
step 206: the local feature images extracted by the CNN network with residual blocks of the last layer and feature images fused with features of different scales are subjected to channel splicing, two-dimensional convolution aggregation is carried out, and then the size space of the aggregated feature images is expanded to the same size of a sample image by using a bilinear interpolation up-sampling method so as to obtain a predictive image segmentation mask;
step 207: and training a neural network model based on the CNN-transducer and the characteristic pyramid module by taking the minimum multi-classification loss function as a target to obtain a trained field verticillium wilt disease spot segmentation model.
Fig. 2 is a schematic diagram of implementation steps provided in the embodiment of the present invention, as shown in fig. 2, and the steps in implementation of this embodiment are as follows:
step 1: a cotton verticillium wilt disease spot dataset is obtained. The dataset includes a plurality of cotton verticillium wilt spot images in a field setting. And carrying out illumination correction, image denoising, image labeling and image enhancement treatment on each field cotton verticillium wilt disease spot image in the blade data set to obtain a plurality of sample verticillium wilt disease spot images and corresponding real image verticillium wilt disease spot segmentation masks.
Step 2: input each cotton verticillium wilt lesion image into a CNN network of several Residual units for feature extraction to obtain a local feature map of each field cotton verticillium wilt lesion image. Input the local feature map to a Maxpool layer to obtain a dimension-reduced local feature map. Input the dimension-reduced local feature map to the feature map embedding module to obtain a one-dimensional vector with position information, input the one-dimensional vector into a Transformer block repeated 12 times to output a one-dimensional vector of the same size, and reshape the output one-dimensional vector into a two-dimensional global feature map.
Fig. 3 is a structural diagram of the Residual unit network provided by an embodiment of the present invention, fig. 4 is a structural diagram of the feature map embedding module provided by an embodiment of the present invention, and fig. 5 is a structural diagram of the Transformer provided by an embodiment of the present invention. As shown in figs. 3, 4 and 5, the Residual unit has three components: a convolution path consisting of two 1×1 convolutions and one 3×3 convolution, three Group Normalization (GN) layers with three ReLU activation functions, and a residual skip connection. The feature map embedding module comprises Patch_Embeddings, Position_Embeddings, Linear Projection, and a Flatten operation. The Transformer Layer network is stacked 12 times in total, and each layer comprises two residual modules: the first comprises a LayerNormalization, a Multi-Head Self-Attention (MSA) layer, and a residual skip connection; the second comprises a LayerNormalization, an MLP layer, and a residual skip connection.
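A minimal PyTorch sketch of the Residual unit as just described (two 1×1 convolutions around one 3×3 convolution, three GN layers, three ReLUs, and a residual skip connection). The bottleneck width, group count, and the 1×1 projection on the skip path are assumptions not specified in the embodiment.

```python
import math
import torch
import torch.nn as nn

def gn(channels: int) -> nn.GroupNorm:
    # GroupNorm with at most 32 groups; gcd keeps channels divisible by groups
    return nn.GroupNorm(math.gcd(32, channels), channels)

class ResidualUnit(nn.Module):
    """Two 1x1 convolutions around one 3x3 convolution, three GN layers,
    three ReLUs, and a residual skip connection (assumed bottleneck form)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        mid = out_ch // 4  # assumed bottleneck width
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False), gn(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False), gn(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, out_ch, 1, bias=False), gn(out_ch),
        )
        # 1x1 projection so the skip connection matches the output channels
        self.proj = (nn.Identity() if in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1, bias=False))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.body(x) + self.proj(x))
```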
Therefore, the field cotton verticillium wilt spot image is input into three Residual unit networks for feature extraction to obtain a local feature map, which specifically comprises the following steps:
step 2.1.1: and inputting the verticillium wilt spot image of the field cotton to a first Residual unit network module for feature extraction to obtain a first layer of local feature map.
Step 2.1.2: and inputting the first layer local feature map to a second Residual unit network module for feature extraction to obtain a second layer local feature map.
Step 2.1.3: and inputting the second layer local feature map to a third Residual unit network module for feature extraction to obtain a third layer local feature map.
Step 2.2.1: and inputting the third layer local feature map to a Maxpool layer, and reducing the dimension of the third layer local feature map to obtain a dimension-reduced local feature map.
Further, the dimension-reduced local feature map is input to a feature map embedding module to obtain a one-dimensional vector with position information, which specifically comprises:
step 2.3.1: inputting the dimension reduction local feature map to a Patch_empeddings module, and cutting the dimension reduction local feature map into small blocks (Patches) with fixed sizes.
Step 2.3.2: each fixed tile is converted to a vector by the flat operation.
Step 2.3.3: and adding each vector to the corresponding Position code generated by the position_emmbeddings module to obtain one-dimensional vector data with Position information.
Step 2.3.4: one-dimensional vector data with position information is mapped into a vector space of another dimension by linear projection.
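The following sketch illustrates steps 2.3.1-2.3.4 in PyTorch, assuming the usual ViT-style implementation in which a strided convolution performs the cut-flatten-project sequence in one operation. The patch size, embedding dimension, and patch count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureMapEmbedding(nn.Module):
    """Cut the reduced feature map into patches, flatten and linearly project
    each patch, and add a learned position embedding (steps 2.3.1-2.3.4)."""
    def __init__(self, in_ch: int, patch_size: int = 1,
                 embed_dim: int = 768, num_patches: int = 196):
        super().__init__()
        # strided convolution == cut into patches + Flatten + Linear Projection
        self.patch_embeddings = nn.Conv2d(in_ch, embed_dim,
                                          kernel_size=patch_size,
                                          stride=patch_size)
        # num_patches must equal (H // patch_size) * (W // patch_size)
        self.position_embeddings = nn.Parameter(
            torch.zeros(1, num_patches, embed_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, C, H, W)
        x = self.patch_embeddings(x)          # (B, D, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)      # (B, N, D), one vector per patch
        return x + self.position_embeddings   # add position information
```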
The one-dimensional vector obtained in step 2.3.4 is then input into a Transformer Layer, repeated 12 times in total, to obtain global feature fusion data, which is reshaped into a two-dimensional global feature map. Each Transformer Layer specifically comprises:
Step 2.4.1: inputting the one-dimensional vector obtained in step 2.3.4 (or the one-dimensional data output by the previous Transformer Layer) to a first LayerNorm layer to obtain first-layer normalized data.
Step 2.4.2: and inputting the first-layer normalized data into a Multi-Head Self-Attention (MSA) layer to obtain first-layer global feature data.
Step 2.4.3: and adding and fusing the first layer global feature data with the data input by the first LayerNorm layer to obtain first layer global feature fusion data.
Step 2.4.4: and inputting the first layer global feature fusion data into a second LayerNorm layer to obtain second layer normalized data.
Step 2.4.5: and inputting the normalized data of the second layer to the MLP layer to obtain the global feature fusion data of the second layer.
Step 2.4.6: and adding the second-layer global feature fusion data with the first-layer global feature fusion data to obtain third-layer global feature fusion data.
Step 2.4.7: input the third-layer global feature fusion data to the next Transformer Layer, repeating 12 times in total.
Step 2.4.8: reshape the global feature fusion data output by the last Transformer Layer into a two-dimensional global feature map.
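A sketch of one Transformer Layer implementing steps 2.4.1-2.4.7 (pre-LayerNorm, Multi-Head Self-Attention with a residual fusion, then pre-LayerNorm, MLP, and a second residual fusion), stacked 12 times. The head count and MLP width are assumptions.

```python
import torch
import torch.nn as nn

class TransformerLayer(nn.Module):
    def __init__(self, dim: int = 768, heads: int = 12, mlp_ratio: float = 4.0):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, int(dim * mlp_ratio)),
            nn.GELU(),
            nn.Linear(int(dim * mlp_ratio), dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, N, D)
        h = self.norm1(x)                 # step 2.4.1 / 2.4.4: LayerNorm
        h, _ = self.attn(h, h, h)         # step 2.4.2: Multi-Head Self-Attention
        x = x + h                         # step 2.4.3: first residual fusion
        x = x + self.mlp(self.norm2(x))   # steps 2.4.4-2.4.6: MLP + residual
        return x

# step 2.4.7: the encoder stacks 12 such layers
encoder = nn.Sequential(*[TransformerLayer() for _ in range(12)])
```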
Step 3: inputting the two-dimensional global feature map to a feature pyramid module, and carrying out multi-scale feature extraction and fusion on the two-dimensional global feature map to obtain the feature map fused with the features of different scales.
Fig. 6 is a network diagram of the feature pyramid pooling module provided by an embodiment of the present invention, and fig. 7 is a network structure diagram of the MCBR provided by an embodiment of the present invention. As shown in figs. 6 and 7, the feature pyramid pooling module comprises four MCBR layers, an upsampling layer, a 1×1 convolution layer, and a skip connection. Each MCBR layer comprises a MaxPool layer, a convolution layer (Conv), a batch normalization layer (BatchNormalization, BN), and a ReLU activation function. First, the two-dimensional global feature map is input into four MCBR layers of different scales to obtain four multi-scale receptive-field feature maps; the spatial sizes of these four feature maps are then expanded to that of the two-dimensional global feature map by bilinear interpolation upsampling; the four feature maps containing multi-scale information are concatenated with the two-dimensional global feature map along the channel dimension; and a 1×1 convolution performs aggregation to obtain a feature map fusing features of different scales and different channel information.
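A sketch of the MCBR layer and the feature pyramid pooling module as just described. The four pooling scales and the branch channel width are assumptions not given in the embodiment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCBR(nn.Module):
    """MaxPool -> Conv -> BatchNorm -> ReLU, as described for the MCBR layer."""
    def __init__(self, in_ch: int, out_ch: int, pool: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.MaxPool2d(kernel_size=pool, stride=pool),
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

class PyramidPooling(nn.Module):
    """Four MCBR branches, bilinear upsampling, channel concatenation with the
    input (the skip connection), and 1x1 convolution for aggregation."""
    def __init__(self, in_ch: int, branch_ch: int = 256,
                 pools: tuple = (2, 4, 8, 16)):  # assumed scales
        super().__init__()
        self.branches = nn.ModuleList([MCBR(in_ch, branch_ch, p) for p in pools])
        self.fuse = nn.Conv2d(in_ch + len(pools) * branch_ch, in_ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        size = x.shape[2:]
        feats = [F.interpolate(b(x), size=size, mode='bilinear',
                               align_corners=False) for b in self.branches]
        return self.fuse(torch.cat([x] + feats, dim=1))
```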
Step 4: channel-concatenate the third-layer local feature map obtained in step 2.1.3 with the feature map obtained in step 3, aggregate with a two-dimensional convolution, and expand the spatial size of the aggregated feature map to the size of the input image by bilinear interpolation upsampling to obtain a predicted image segmentation mask.
Fig. 8 is a diagram of the complete network model structure according to an embodiment of the present invention. As shown in fig. 8, the third-layer local feature map obtained in step 2.1.3 is input to a 1×1 convolution for channel fusion to obtain a feature map of fused channel information. The spatial size of the feature map obtained in step 3 is enlarged by bilinear interpolation upsampling to match that of the fused-channel feature map, and the two are concatenated. The resulting feature map is aggregated by a 3×3 convolution, batch-normalized, passed through a ReLU activation function and then a Dropout layer, and finally the spatial size of the Dropout output is enlarged to the input image size by bilinear interpolation upsampling to obtain the predicted image segmentation mask.
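A sketch of this step-4 fusion head. The intermediate channel width, dropout rate, and the final 1×1 classification convolution producing per-class mask logits are assumptions added to make the sketch self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionHead(nn.Module):
    def __init__(self, cnn_ch: int, pyr_ch: int, num_classes: int,
                 dropout: float = 0.1):
        super().__init__()
        self.reduce = nn.Conv2d(cnn_ch, cnn_ch, 1)   # 1x1 channel fusion
        self.agg = nn.Sequential(                    # 3x3 conv + BN + ReLU + Dropout
            nn.Conv2d(cnn_ch + pyr_ch, 256, 3, padding=1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Dropout2d(dropout),
        )
        self.classifier = nn.Conv2d(256, num_classes, 1)  # assumed logits head

    def forward(self, cnn_feat, pyr_feat, out_size):
        c = self.reduce(cnn_feat)
        # upsample the pyramid output to match the fused-channel feature map
        p = F.interpolate(pyr_feat, size=c.shape[2:], mode='bilinear',
                          align_corners=False)
        x = self.agg(torch.cat([c, p], dim=1))
        x = self.classifier(x)
        # expand to the input image size to obtain the segmentation mask logits
        return F.interpolate(x, size=out_size, mode='bilinear',
                             align_corners=False)
```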
Step 5: train the segmentation network with minimization of the multi-class loss function (CrossEntropyLoss) as the objective to obtain the field verticillium wilt lesion segmentation model. The field verticillium wilt lesion segmentation model comprises the CNN, the Transformer, the feature pyramid pooling module, and the image segmentation network.
The multi-class loss function (CrossEntropyLoss) is calculated as follows:

$$L = -\sum_{i=1}^{K} y_i \log(p_i)$$

where $K$ denotes the total number of classes, $y_i$ denotes the true label of the input sample, and $p_i$ denotes the model's predicted value for class $i$. The goal of the loss function is to minimize the difference between the predicted values and the true values so that the model's predictions come closer to the true labels. The gradient of the loss function with respect to the model parameters is computed by the backpropagation algorithm, and the model parameters are updated.
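A minimal sketch of the training objective: PyTorch's nn.CrossEntropyLoss applies the formula above to per-pixel logits, and backward() computes the gradients by backpropagation. The shapes and the class count are placeholders.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
# in training these logits would come from the segmentation network
logits = torch.randn(4, 2, 512, 512, requires_grad=True)  # (B, K, H, W)
target = torch.randint(0, 2, (4, 512, 512))               # per-pixel class labels

loss = criterion(logits, target)
loss.backward()  # gradients flow back for the parameter update
```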
Step 6: evaluate the model's predictions, with mIoU and mPA as the evaluation metrics.
The IoU for each class is calculated as follows:

$$\mathrm{IoU}_j = \frac{TP_j}{TP_j + FP_j + FN_j}$$

where $j$ denotes the class index, $TP_j$ denotes the number of correctly classified pixels predicted as class $j$, $FP_j$ denotes the number of pixels predicted as class $j$ but misclassified, and $FN_j$ denotes the number of pixels of class $j$ in the ground-truth label that were not so classified. mIoU is the average of IoU over all classes.
The PA for each class is calculated as follows:

$$\mathrm{PA}_j = \frac{TP_j}{TP_j + FP_j}$$

where $j$ denotes the class index, $TP_j$ denotes the number of correctly classified pixels predicted as class $j$, and $FP_j$ denotes the number of pixels predicted as class $j$ but misclassified. mPA is the average of PA over all classes.
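A sketch of computing mIoU and mPA from per-class pixel counts, following the formulas above (PA here uses the TP/(TP+FP) form given in this description); the inputs are placeholder label maps.

```python
import numpy as np

def miou_mpa(pred: np.ndarray, label: np.ndarray, num_classes: int):
    """pred and label are integer class maps of the same shape."""
    ious, pas = [], []
    for j in range(num_classes):
        tp = np.sum((pred == j) & (label == j))   # correctly predicted as j
        fp = np.sum((pred == j) & (label != j))   # predicted j but misclassified
        fn = np.sum((pred != j) & (label == j))   # class-j pixels missed
        ious.append(tp / (tp + fp + fn + 1e-10))  # IoU_j
        pas.append(tp / (tp + fp + 1e-10))        # PA_j, per the formula above
    return float(np.mean(ious)), float(np.mean(pas))
```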
The invention utilizes a CNN-Transformer fusion architecture; one of its key points is the combination of the CNN and Transformer approaches. The CNN serves as the backbone network for local feature extraction, while the Transformer module performs global context modeling on the CNN-extracted features. This fused architecture lets the model capture local and global information simultaneously, improving segmentation accuracy. Through the cooperation of the CNN and the Transformer, the model better understands the relation between objects and context in the image, further strengthening the feature representation. The feature pyramid pooling module adaptively pools feature information at different scales, increasing the richness and diversity of the features; by applying pyramid pooling to features of different levels, the model obtains multi-scale feature information and adapts better to lesions of different scales. In this embodiment, the low-level local features extracted by the CNN are channel-fused with the multi-scale global features processed by the Transformer and feature pyramid pooling, so that low-level semantic information and high-level multi-scale global semantic information are fused, improving segmentation accuracy.
The beneficial effects of the invention are as follows:
(1) Higher accuracy: on the field verticillium wilt lesion segmentation data, the method achieves an mIoU of 87.14 and an mPA of 92.62, compared with 84.63 and 91.0 for PSPNet and 86.1 and 91.7 for Unet; on both evaluation metrics the method outperforms PSPNet and Unet. By adopting the feature pyramid pooling module and the Transformer module, the method effectively extracts and exploits multi-scale and cross-scale feature information, improving segmentation accuracy.
(2) Capture of both global and local information: the method introduces a Transformer module that weights the input features through a self-attention mechanism and can therefore capture both global and local information. This property helps improve segmentation accuracy: the Transformer module models the relationships between features and achieves more accurate segmentation by adaptively adjusting their importance.
(3) The robustness is high: based on the feature pyramid pooling module, the method can fuse and screen the multi-scale features, so that the lesions with different scales and sizes can be accurately segmented. Such a mechanism enhances the robustness of the model, enabling it to handle lesion areas of different sizes and shapes. The method can effectively divide both a small range of lesions and a large range of lesions.
In the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, and identical and similar parts between the embodiments are all enough to refer to each other.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to assist in understanding the methods of the present invention and the core ideas thereof; also, it is within the scope of the present invention to be modified by those of ordinary skill in the art in light of the present teachings. In view of the foregoing, this description should not be construed as limiting the invention.
Claims (9)
1. A method for segmenting verticillium wilt disease spots of field cotton, characterized by comprising the following steps:
acquiring an image to be detected;
inputting the image to be detected into a trained field verticillium wilt disease spot segmentation model to obtain a segmentation result;
the construction method of the field verticillium wilt disease spot segmentation model comprises the following steps:
performing image processing on the acquired multiple cotton verticillium wilt disease spot images to obtain a sample image;
inputting the sample image into a CNN network with a multi-layer residual block for feature extraction, and performing dimension reduction by using a maximum pooling layer to obtain a dimension-reduced local feature map;
inputting the dimension-reduced local feature map into a feature map embedding module to obtain a one-dimensional vector with position information;
inputting the one-dimensional vector into a Transformer block repeated 12 times to obtain global feature fusion data, and reshaping the global feature fusion data to obtain a two-dimensional global feature map;
inputting the two-dimensional global feature map to a feature pyramid module to extract and fuse multi-scale features of the two-dimensional global feature map so as to obtain feature maps fused with features of different scales;
performing channel concatenation of the local feature map extracted by the last residual block of the CNN network with the feature map fusing features of different scales, aggregating with a two-dimensional convolution, and then expanding the spatial size of the aggregated feature map to the size of the sample image by bilinear interpolation upsampling to obtain a predicted image segmentation mask;
and training a neural network model based on the CNN-Transformer and the feature pyramid module with minimization of a multi-class loss function as the objective, to obtain a trained field verticillium wilt disease spot segmentation model.
2. The method for segmenting verticillium wilt disease spots of field cotton according to claim 1, wherein the field verticillium wilt disease spot segmentation model comprises a CNN, a Transformer, a feature pyramid pooling module, a channel-wise fusion module that concatenates the output feature map of the CNN network with the multi-scale features output by the feature pyramid module, and an image segmentation network.
3. The method for segmenting verticillium wilt spots of field cotton according to claim 1, wherein the image processing of the acquired plurality of verticillium wilt spot images of cotton to obtain a sample image comprises:
acquiring a cotton verticillium wilt disease spot data set; the cotton verticillium wilt disease spot data set comprises a plurality of cotton verticillium wilt disease spot images under a field background;
and carrying out illumination correction, image denoising, image labeling and image enhancement on each cotton verticillium wilt disease spot image to obtain a plurality of sample images and corresponding real image verticillium wilt disease spot segmentation masks.
4. The method for segmenting verticillium wilt spots of field cotton according to claim 1, wherein the number of residual blocks is 3.
5. The method for segmenting the verticillium wilt disease spots of field cotton according to claim 4, wherein the step of inputting the sample image into a CNN network with a multi-layer residual block for feature extraction and performing dimension reduction by using a maximum pooling layer to obtain a dimension-reduced local feature map comprises the steps of:
inputting the sample image into a first residual block for feature extraction to obtain a first layer local feature map;
inputting the first layer local feature map to a second residual block for feature extraction to obtain a second layer local feature map;
inputting the second layer local feature map to a third residual block for feature extraction to obtain a third layer local feature map;
and inputting the third layer local feature map to a maximum pooling layer, and performing dimension reduction on the third layer local feature map to obtain a dimension-reduced local feature map.
6. The method for segmenting the verticillium wilt disease spots of field cotton according to claim 1, wherein the step of inputting the dimension-reduced local feature map into the feature map embedding module to obtain a one-dimensional vector with position information comprises:
inputting the dimension-reduced local feature map into the Patch_Embeddings module, which cuts it into small patches of fixed size;
flattening each patch into a vector by a Flatten operation;
adding each vector to the position code generated by the Position_Embeddings module to obtain a one-dimensional vector with position information;
mapping the one-dimensional vector data with position information into a vector space of another dimension by linear projection.
7. The method for segmenting verticillium wilt disease spots of field cotton according to claim 6, wherein inputting the one-dimensional vector into a Transformer block repeated 12 times to obtain global feature fusion data, and reshaping the global feature fusion data to obtain a two-dimensional global feature map, comprises:
inputting the one-dimensional vector obtained by linear projection mapping into a first LayerNorm layer to obtain first-layer normalized data;
inputting the first-layer normalized data into a Multi-Head Self-Attention layer to obtain first-layer global feature data;
adding and fusing the first layer global feature data and the data input by the first LayerNorm layer to obtain first layer global feature fusion data;
inputting the first layer global feature fusion data to a second LayerNorm layer to obtain second layer normalized data;
inputting the second layer normalized data to the MLP layer to obtain second layer global feature fusion data;
adding the second-layer global feature fusion data with the first-layer global feature fusion data to obtain third-layer global feature fusion data;
inputting the third-layer global feature fusion data to the next Transformer Layer, repeating 12 times in total;
and reshaping the global feature fusion data output by the last Transformer Layer to obtain a two-dimensional global feature map.
8. The method of claim 1, wherein the feature pyramid pooling module comprises four MCBR layers, an upsampling layer, a 1×1 convolution layer, and a skip connection; the MCBR layer comprises a MaxPool layer, a convolution layer, a batch normalization layer, and a ReLU activation function.
9. The method for segmenting the verticillium wilt spots of the field cotton according to claim 1, wherein the method for constructing the segmentation model of the verticillium wilt spots of the field further comprises:
and performing accuracy verification on the trained field verticillium wilt spot segmentation model to obtain a verified field verticillium wilt spot segmentation model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310816117.6A CN116993756B (en) | 2023-07-05 | 2023-07-05 | Method for dividing verticillium wilt disease spots of field cotton |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310816117.6A CN116993756B (en) | 2023-07-05 | 2023-07-05 | Method for dividing verticillium wilt disease spots of field cotton |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116993756A true CN116993756A (en) | 2023-11-03 |
CN116993756B CN116993756B (en) | 2024-09-27 |
Family
ID=88520470
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310816117.6A Active CN116993756B (en) | 2023-07-05 | 2023-07-05 | Method for dividing verticillium wilt disease spots of field cotton |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116993756B (en) |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023116507A1 (en) * | 2021-12-22 | 2023-06-29 | 北京沃东天骏信息技术有限公司 | Target detection model training method and apparatus, and target detection method and apparatus |
CN114648535A (en) * | 2022-03-21 | 2022-06-21 | 北京工商大学 | Food image segmentation method and system based on dynamic transform |
CN114579794A (en) * | 2022-03-31 | 2022-06-03 | 西安建筑科技大学 | Multi-scale fusion landmark image retrieval method and system based on feature consistency suggestion |
CN115035131A (en) * | 2022-04-24 | 2022-09-09 | 南京农业大学 | Unmanned aerial vehicle remote sensing image segmentation method and system of U-shaped self-adaptive EST |
CN115035361A (en) * | 2022-05-11 | 2022-09-09 | 中国科学院声学研究所南海研究站 | Target detection method and system based on attention mechanism and feature cross fusion |
CN116071668A (en) * | 2022-09-01 | 2023-05-05 | 重庆理工大学 | Unmanned aerial vehicle aerial image target detection method based on multi-scale feature fusion |
CN115908241A (en) * | 2022-09-16 | 2023-04-04 | 重庆邮电大学 | Retinal vessel segmentation method based on fusion of UNet and Transformer |
CN116109920A (en) * | 2022-12-12 | 2023-05-12 | 浙江工业大学 | Remote sensing image building extraction method based on transducer |
CN116091770A (en) * | 2023-01-30 | 2023-05-09 | 中国农业大学 | Grape leaf lesion image segmentation method based on cross-resolution transducer model |
CN116310335A (en) * | 2023-03-11 | 2023-06-23 | 湖州师范学院 | Method for segmenting pterygium focus area based on Vision Transformer |
CN116258976A (en) * | 2023-03-24 | 2023-06-13 | 长沙理工大学 | Hierarchical transducer high-resolution remote sensing image semantic segmentation method and system |
Non-Patent Citations (4)
Title |
---|
QIANKUN WANG et al.: "Swin Transformer Based Pyramid Pooling Network for Food Segmentation", 2022 IEEE 2nd International Conference on Software Engineering and Artificial Intelligence (SEAI), 25 July 2022 (2022-07-25), pages 64-68 *
LIU PEILIN: "AI Embedded Systems: Algorithm Optimization and Implementation", Beijing: China Machine Press, 31 October 2021, page 205 *
WANG DONG: "Python Deep Learning: Based on TensorFlow, 2nd Edition" (Intelligent Systems and Technology Series), Beijing: China Machine Press, 31 October 2022, page 227 *
HU JINWEI: "An automated maceral group analysis model for coal and rock based on improved DeeplabV3+", Coal Geology & Exploration, vol. 51, no. 10, 22 May 2023 (2023-05-22), pages 27-36 *
Also Published As
Publication number | Publication date |
---|---|
CN116993756B (en) | 2024-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111739075B (en) | Deep network lung texture recognition method combining multi-scale attention | |
CN110647874B (en) | End-to-end blood cell identification model construction method and application | |
CN110175613A (en) | Street view image semantic segmentation method based on Analysis On Multi-scale Features and codec models | |
CN111738363B (en) | Alzheimer disease classification method based on improved 3D CNN network | |
CN114495029B (en) | Traffic target detection method and system based on improved YOLOv4 | |
CN112241762A (en) | Fine-grained identification method for pest and disease damage image classification | |
CN112819748B (en) | Training method and device for strip steel surface defect recognition model | |
CN111652273B (en) | Deep learning-based RGB-D image classification method | |
CN114758288A (en) | Power distribution network engineering safety control detection method and device | |
CN115439458A (en) | Industrial image defect target detection algorithm based on depth map attention | |
CN110059765B (en) | Intelligent mineral identification and classification system and method | |
CN111368637B (en) | Transfer robot target identification method based on multi-mask convolutional neural network | |
CN115881265B (en) | Intelligent medical record quality control method, system and equipment for electronic medical record and storage medium | |
Han et al. | An improved YOLOv5 algorithm for wood defect detection based on attention | |
CN115546466A (en) | Weak supervision image target positioning method based on multi-scale significant feature fusion | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN117593514B (en) | Image target detection method and system based on deep principal component analysis assistance | |
CN103268494B (en) | Parasite egg recognition methods based on rarefaction representation | |
CN118230354A (en) | Sign language recognition method based on improvement YOLOv under complex scene | |
CN117593244A (en) | Film product defect detection method based on improved attention mechanism | |
CN116934696A (en) | Industrial PCB defect detection method and device based on YOLOv7-Tiny model improvement | |
CN116993756B (en) | Method for dividing verticillium wilt disease spots of field cotton | |
CN114494703B (en) | Intelligent workshop scene target lightweight semantic segmentation method | |
CN118587733B (en) | Bridge structure identification and parameter extraction method for bridge PDF design drawing | |
CN114998609B (en) | Multi-class commodity target detection method based on dense feature extraction and lightweight network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |