CN115937533A - Aeroponic tomato feature extraction method based on semantic segmentation - Google Patents


Info

Publication number
CN115937533A
Authority
CN
China
Prior art keywords
features; aeroponic; semantic segmentation; extracting; extracting features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211578932.5A
Other languages
Chinese (zh)
Other versions
CN115937533B (en)
Inventor
董俊
朱智佳
马凡
吴双
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Zhongke Deji Intelligent Technology Co ltd
Hefei Institutes of Physical Science of CAS
Original Assignee
Anhui Zhongke Deji Intelligent Technology Co ltd
Hefei Institutes of Physical Science of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Zhongke Deji Intelligent Technology Co ltd, Hefei Institutes of Physical Science of CAS filed Critical Anhui Zhongke Deji Intelligent Technology Co ltd
Priority to CN202211578932.5A priority Critical patent/CN115937533B/en
Publication of CN115937533A publication Critical patent/CN115937533A/en
Application granted granted Critical
Publication of CN115937533B publication Critical patent/CN115937533B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02P: Climate change mitigation technologies in the production or processing of goods
    • Y02P 60/00: Technologies relating to agriculture, livestock or agroalimentary industries
    • Y02P 60/20: Reduction of greenhouse gas [GHG] emissions in agriculture, e.g. CO2
    • Y02P 60/21: Dinitrogen oxide [N2O], e.g. using aquaponics, hydroponics or efficiency measures

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a semantic segmentation-based aeroponic tomato feature extraction method, and relates to the field of soilless culture. The method takes a picture of plant leaves as input and preprocesses it. The preprocessed picture is transmitted to an encoder for feature extraction: local abstract features of the picture are extracted through a ResNet network, a context module is added to the ResLayer structure of the ResNet network (before the residual fusion and after the last convolution), and the global context information of the picture is superimposed through the context module. The extracted features are then transmitted to a decoder for feature fusion and decoding: an ASPP optimized for small targets decodes coarse-grained features, a dynamic kernel updating mechanism converts the coarse-grained features into fine-grained features, and a main classifier maps the feature information to classification information to complete the classification. The method effectively improves segmentation precision and meets the leaf and root-system segmentation-precision requirements of subsequent plant growth state assessment.

Description

Aeroponic tomato feature extraction method based on semantic segmentation
Technical Field
The invention relates to the technical field of soilless culture, in particular to a semantic segmentation-based aeroponic tomato feature extraction method.
Background
Aeroponic cultivation (aeroponics for short) is an indispensable cultivation mode within soilless culture technology and is widely considered to match the direction of future intelligent agricultural production. An atomizing device turns the nutrient solution into fine droplets that are sprayed directly onto the plant's root system, supplying the water and nutrients the plant needs to grow. To reasonably control the irrigation timing and volume of regulated deficit irrigation, the physiological water demand of aeroponic tomatoes must be estimated accurately. Assessing that demand requires timely judgment of the growth state of the tomato leaves and roots, which makes the identification and segmentation of tomato features particularly important.
Disclosure of Invention
Technical problem to be solved
Addressing the defects of the prior art, the invention provides a semantic segmentation-based aeroponic tomato feature extraction method that effectively improves segmentation precision and meets the leaf and root-system segmentation-precision requirements of subsequent plant growth state assessment.
(II) technical scheme
To achieve the above purpose, the invention is realized by the following technical scheme:
in a first aspect, a method for extracting features of aeroponics tomatoes based on semantic segmentation is provided, which comprises the following steps:
inputting a picture of a plant leaf;
preprocessing an input picture;
transmitting the preprocessed picture to an encoder for feature extraction, wherein local abstract features of the picture are extracted through a ResNet network, a context module is added to the ResLayer structure of the ResNet network (before the residual fusion and after the last convolution), and the global context information of the picture is superimposed through the context module;
transmitting the extracted features to a decoder for feature fusion and decoding, wherein an ASPP optimized for small targets decodes coarse-grained features, a dynamic kernel updating mechanism converts the coarse-grained features into fine-grained features (improving classification accuracy on edge pixels and hard-to-classify pixels), and a main classifier maps the feature information to classification information, completing the classification.
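The four steps above can be sketched at the level of tensor shapes. The skeleton below only illustrates the data flow; the encoder and decoder bodies are placeholders, not the patented networks, and all sizes (8x downsampling, 256 channels) are assumptions for illustration:

```python
import numpy as np

def preprocess(img):
    # illustrative preprocessing: scale pixel values to [0, 1]
    return img.astype(np.float64) / 255.0

def encoder(x):
    # stand-in for the ResNet + context-module encoder:
    # downsample 8x spatially, widen to 256 channels
    h, w, _ = x.shape
    return np.zeros((h // 8, w // 8, 256))

def decoder(feats, n_classes=3):
    # stand-in for ASPP + dynamic kernel update + main classifier:
    # restore resolution and emit per-pixel class scores
    h, w, _ = feats.shape
    return np.zeros((h * 8, w * 8, n_classes))

img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
logits = decoder(encoder(preprocess(img)))
pred = logits.argmax(axis=-1)   # e.g. 0 = background, 1 = leaf, 2 = root
print(pred.shape)
```

With the real networks substituted for the placeholders, `pred` would hold a per-pixel label map for leaf and root-system segmentation.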
Preferably, the context module specifically includes:
the global context module is used for context modeling;
a bottleneck transform is used to capture inter-channel dependencies;
element-by-element addition is used for feature fusion.
Preferably, the extracted features are transmitted to a decoder for feature fusion and decoding, and an auxiliary classifier is added in this process; the auxiliary classifier is used to accelerate the optimization process and achieve rapid convergence.
Preferably, the expansion coefficient used for the expansion convolution layer on each branch in the ASPP structure is set to [1,3,5,8].
Preferably, a dynamic kernel self-updating module is added to the ASPP structure.
Preferably, the addition of the dynamic kernel self-updating module to the ASPP structure specifically comprises:
feature assembly: the feature map F is multiplied by mask M_{i-1} to form a new assembled feature F_K;
dynamic kernel self-update: first, element-wise multiplication is performed between F_K and K_{i-1}, then two gating values are computed, and a new group of kernels is computed from the two gates;
kernel interaction: a new group of kernels K_i is computed by multi-head attention and a feed-forward neural network, completing the update of the convolution kernels from K_{i-1} to K_i;
the convolution kernels K_i pass through FC-LN-ReLU to generate a new binary mask M_i;
the generation of the binary mask M_i is repeated until the convolution kernels have been updated for a prescribed number of rounds.
In a second aspect, a terminal device is provided, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor loads and executes the computer program, the semantic segmentation-based aeroponic tomato feature extraction method described above is carried out.
In a third aspect, a computer-readable storage medium storing a computer program is provided; when the program is executed by a processor, it implements the semantic segmentation-based aeroponic tomato feature extraction method described above.
(III) advantageous effects
The semantic segmentation-based aeroponic tomato feature extraction method fuses a self-attention module, an ASPP module with coprime dilation rates improved for small targets, and a dynamic kernel updating mechanism into a new semantic segmentation network, effectively improving segmentation precision and meeting the leaf and root-system segmentation-precision requirements of subsequent plant growth state assessment.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a diagram of the context module with a self-attention mechanism added in the feature extraction stage according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a comparison between NLBlock and GCBlock in the embodiment of the present invention;
FIG. 4 is a block diagram illustrating the addition of a dynamic kernel self-updating module in the ASPPHead according to the present invention;
FIG. 5 is a flow chart of the image pre-processing according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Referring to fig. 1 and 5, an embodiment of the invention provides a method including:
inputting a picture of a plant leaf;
preprocessing an input picture;
transmitting the preprocessed picture to an encoder for feature extraction, wherein local abstract features of the picture are extracted through a ResNet network, a context module is added to the ResLayer structure of the ResNet network (before the residual fusion and after the last convolution), and the global context information of the picture is superimposed through the context module;
transmitting the extracted features to a decoder for feature fusion and decoding, wherein an ASPP optimized for small targets decodes coarse-grained features, a dynamic kernel updating mechanism converts the coarse-grained features into fine-grained features (improving classification accuracy on edge pixels and hard-to-classify pixels), and a main classifier maps the feature information to classification information, completing the classification.
The traditional CNN-based semantic segmentation model DeepLabV3 largely follows an encoder-decoder framework: the CNN in the encoder performs feature extraction, gradually reducing the resolution of the feature map while enriching it with semantic information, and the CNN in the decoder then uses the encoder features as input to decode the final segmentation prediction.
Such a framework has the following problem:
besides semantic information, the semantic segmentation task needs detail information, so DeepLabV3 adopts an ASPP (atrous spatial pyramid pooling) structure to extract multi-scale information; however, ASPP ignores global context information, and its dilated-convolution rates are not optimized for any specific task such as leaf segmentation.
Referring to fig. 2, a context module based on the self-attention mechanism (global context block) is added in the feature extraction stage.
The module consists of the following three steps:
the global context module is used for context modeling;
a bottleneck transform is used to capture inter-channel dependencies;
element-by-element addition is used for feature fusion.
The module is inserted into the ResLayer structure of the feature extraction backbone (ResNet-50), before the residual fusion and after the last convolution.
Referring to fig. 3, GCBlock has the following advantages compared to the conventional self-attention module (e.g., NLBlock):
unlike the conventional non-local block (NLBlock) which has a large number of parameters, GCBlock replaces 1 × 1 convolution in the conventional transform module (Wv) with a bottleneck layer (depth separable convolution), greatly reducing the parameter values.
Research on the self-attention mechanism shows that the learned global context information is almost the same at different positions, i.e., a position-independent global context is learned. The traditional NLBlock is therefore simplified by computing a single global attention map shared by all positions, which means the query transform (Wq) can be omitted.
To further reduce the computation of this simplified non-local block, Wv is moved outside the global context module, and the linear transformation matrix (Wz) before the element-by-element addition is removed.
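A minimal NumPy sketch of the simplified block described above: one softmax attention map shared by all positions produces a single global context vector, a bottleneck transform (C -> C/r -> C) captures channel dependencies, and the result is fused by broadcast addition. Parameter names are illustrative, and layer normalization is omitted for brevity; this is not the exact patented module:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gc_block(x, w_k, w1, w2):
    """Simplified global-context block on an (N, C) feature map
    (N spatial positions, C channels)."""
    # 1) context modelling: one attention weight per position,
    #    shared by every query position (the Wq transform is dropped)
    attn = softmax(x @ w_k)               # (N,)
    context = attn @ x                    # (C,) global context vector
    # 2) bottleneck transform (C -> C/r -> C) captures channel dependencies
    hidden = np.maximum(context @ w1, 0)  # ReLU
    delta = hidden @ w2                   # (C,)
    # 3) fusion: the same context is added to every position
    return x + delta

rng = np.random.default_rng(0)
N, C, r = 16, 8, 4
x = rng.normal(size=(N, C))
y = gc_block(x,
             rng.normal(size=C),            # w_k: attention projection
             rng.normal(size=(C, C // r)),  # w1: bottleneck down
             rng.normal(size=(C // r, C)))  # w2: bottleneck up
print(y.shape)  # (16, 8)
```

Because the context vector is position independent, the per-position cost is one addition, which is what makes GCBlock cheaper than a full non-local block.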
According to the sizes of the leaf and root-system features, the dilated-convolution module in the ASPP is redesigned so that the ASPP is better suited to small-target detection.
In the original ASPP structure, the feature map is passed through four parallel dilated convolution layers to capture multi-scale information, each branch using a different dilation rate; the original setting is [1, 6, 12, 18]. For a small detection target such as a leaf, an overly large dilation rate hinders accurate segmentation of the target's edge information, so the rates are changed to [1, 3, 5, 8]. The original rates also suffer from a theoretical problem, the gridding effect; after the rates are changed to pairwise coprime values, the loss of local information caused by the gridding effect is greatly reduced.
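The effect of the rate change can be checked with a few lines of Python: listing the 1-D input offsets touched by a 3-tap dilated convolution for each rate set shows that [1, 3, 5, 8] samples the neighbourhood of a small target much more densely than [1, 6, 12, 18]. This is an illustrative back-of-envelope check, not the patent's evaluation:

```python
def sampled_offsets(rates, k=3):
    """1-D input offsets touched by a k-tap convolution at each dilation rate."""
    taps = range(-(k // 2), k // 2 + 1)
    return {t * r for r in rates for t in taps}

coprime = sampled_offsets([1, 3, 5, 8])
original = sampled_offsets([1, 6, 12, 18])

# offsets within +/-8 pixels of the centre, i.e. the scale of a small leaf edge
near_coprime = sorted(o for o in coprime if abs(o) <= 8)
near_original = sorted(o for o in original if abs(o) <= 8)
print(near_coprime)   # [-8, -5, -3, -1, 0, 1, 3, 5, 8]
print(near_original)  # [-6, -1, 0, 1, 6]
```

Nine sampled offsets near the centre versus five: the coprime rates leave fewer gaps at the scale of small-target edges, which is the intuition behind avoiding the gridding effect.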
Referring to fig. 4, a dynamic kernel update module is added to the ASPPHead to address the limited performance caused by differences between the input data and the training data; dynamic deformation of the kernels improves the flexibility and performance of the model.
Specifically, the dynamic self-update uses N kernels (N being the number of target classes to be detected; for leaf + root system + background, N is 3) and comprises the following three steps:
feature assembly: the feature map F is multiplied by mask M_{i-1} to form a new assembled feature F_K (red dashed box in fig. 4);
dynamic kernel self-update: first, element-wise multiplication is performed between F_K and K_{i-1}, then two gates are computed, and a new group of kernels is computed from the two gates;
kernel interaction: a new group of kernels K_i is computed through an MSA-FFN (multi-head self-attention plus feed-forward network), completing the update of the convolution kernels from K_{i-1} to K_i;
subsequently, the convolution kernels K_i pass through FC-LN-ReLU to generate a new binary mask M_i;
the generation of the binary mask M_i is repeated until the convolution kernels have been updated for a prescribed number of rounds.
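The three steps can be sketched in NumPy. The sketch below replaces the MSA-FFN kernel interaction with the identity and the FC-LN-ReLU mask head with a plain sigmoid, so it illustrates only the assembly/gating/mask loop, not the exact patented head; the gate projections w_f and w_k are illustrative:

```python
import numpy as np

def sigmoid(z):
    # clipped for numerical safety with unscaled random weights
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))

def kernel_update_step(feats, masks, kernels, w_f, w_k):
    """One round of dynamic kernel self-update (shape-level sketch).
    feats: (HW, C) flattened feature map, masks: (N, HW) soft masks,
    kernels: (N, C), one kernel per class."""
    # 1) feature assembly: mask-weighted pooling of the feature map
    f_assembled = masks @ feats                # (N, C)
    # 2) self-update: element-wise product, two gates, new kernels
    mixed = f_assembled * kernels              # (N, C)
    gate_f = sigmoid(mixed @ w_f)              # gate on assembled features
    gate_k = sigmoid(mixed @ w_k)              # gate on previous kernels
    kernels = gate_f * f_assembled + gate_k * kernels
    # 3) new masks from the updated kernels
    masks = sigmoid(kernels @ feats.T)         # (N, HW)
    return kernels, masks

rng = np.random.default_rng(1)
HW, C, N = 64, 8, 3                  # N = leaf + root system + background
feats = rng.normal(size=(HW, C))
masks = rng.uniform(size=(N, HW))
kernels = rng.normal(size=(N, C))
for _ in range(3):                   # repeat for a prescribed number of rounds
    kernels, masks = kernel_update_step(
        feats, masks, kernels,
        rng.normal(size=(C, C)), rng.normal(size=(C, C)))
print(kernels.shape, masks.shape)  # (3, 8) (3, 64)
```

Each round conditions the kernels on the current masked features, which is how the coarse-grained ASPP output is refined toward fine-grained, input-specific kernels.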
As another embodiment of the present invention, there is provided a terminal device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor loads and executes the computer program, the semantic segmentation-based aeroponic tomato feature extraction method of the above embodiments is carried out.
As a further embodiment of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the semantic segmentation-based aeroponic tomato feature extraction method of the above embodiments.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

Claims (8)

1. A method for extracting features of aeroponic tomatoes based on semantic segmentation is characterized by comprising the following steps:
inputting a picture of a plant leaf;
preprocessing an input picture;
transmitting the preprocessed picture to an encoder for feature extraction, wherein local abstract features of the picture are extracted through a ResNet network, a context module is added to the ResLayer structure of the ResNet network (before the residual fusion and after the last convolution), and the global context information of the picture is superimposed through the context module;
transmitting the extracted features to a decoder for feature fusion and decoding, wherein an ASPP optimized for small targets decodes coarse-grained features, a dynamic kernel updating mechanism converts the coarse-grained features into fine-grained features (improving classification accuracy on edge pixels and hard-to-classify pixels), and a main classifier maps the feature information to classification information to complete the classification.
2. The method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in claim 1, wherein the method comprises: the context module specifically includes:
the global context module is used for context modeling;
a bottleneck transform is used to capture inter-channel dependencies;
element-by-element addition is used for feature fusion.
3. The method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in claim 1, wherein an auxiliary classifier is added while the extracted features are transmitted to the decoder for feature fusion and decoding, the auxiliary classifier serving to accelerate the optimization process and speed up convergence.
4. The method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in claim 1, wherein the method comprises: the expansion coefficient employed by the expansion convolution layer on each branch in the ASPP structure is set to [1,3,5,8].
5. The method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in claim 4, wherein a dynamic kernel self-updating module is added to the ASPP structure.
6. The method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in claim 1, wherein adding the dynamic kernel self-updating module to the ASPP structure specifically comprises:
feature assembly: the feature map F is multiplied by mask M_{i-1} to form a new assembled feature F_K;
dynamic kernel self-update: first, element-wise multiplication is performed between F_K and K_{i-1}, then two gates are computed, and a new group of kernels is computed from the two gates;
kernel interaction: a new group of kernels K_i is computed by multi-head attention and a feed-forward neural network, updating the convolution kernels from K_{i-1} to K_i;
the convolution kernels K_i pass through FC-LN-ReLU to generate a new binary mask M_i;
the generation of the binary mask M_i is repeated until the convolution kernels have been updated for a prescribed number of rounds.
7. A terminal device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein, when the processor loads and executes the computer program, the method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in any one of claims 1 to 6 is carried out.
8. A computer-readable storage medium storing a computer program, which when executed by a processor implements a method for extracting features of aeroponic tomatoes based on semantic segmentation as claimed in any one of claims 1 to 6.
CN202211578932.5A 2022-12-05 2022-12-05 Semantic segmentation-based aeroponic tomato feature extraction method Active CN115937533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211578932.5A CN115937533B (en) 2022-12-05 2022-12-05 Semantic segmentation-based aeroponic tomato feature extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211578932.5A CN115937533B (en) 2022-12-05 2022-12-05 Semantic segmentation-based aeroponic tomato feature extraction method

Publications (2)

Publication Number Publication Date
CN115937533A true CN115937533A (en) 2023-04-07
CN115937533B CN115937533B (en) 2023-08-25

Family

ID=86653644

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211578932.5A Active CN115937533B (en) 2022-12-05 2022-12-05 Semantic segmentation-based aeroponic tomato feature extraction method

Country Status (1)

Country Link
CN (1) CN115937533B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020215236A1 (en) * 2019-04-24 2020-10-29 Harbin Institute of Technology (Shenzhen) Image semantic segmentation method and system
CN112287940A (en) * 2020-10-30 2021-01-29 Xi'an Polytechnic University Semantic segmentation method of attention mechanism based on deep learning
WO2021244621A1 (en) * 2020-06-04 2021-12-09 Huawei Technologies Co., Ltd. Scenario semantic parsing method based on global guidance selective context network
CN113780296A (en) * 2021-09-13 2021-12-10 Shandong University Remote sensing image semantic segmentation method and system based on multi-scale information fusion
CN113807355A (en) * 2021-07-29 2021-12-17 Beijing Technology and Business University Image semantic segmentation method based on coding and decoding structure
CN114937148A (en) * 2022-06-08 2022-08-23 South China University of Technology Small target feature enhanced image segmentation method and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG Xin; YU Chongchong; WANG Xin; CHEN Xiuxin: "Semantic segmentation of complex scenes fusing ASPP-Attention and context", Computer Simulation (计算机仿真), no. 09 *

Also Published As

Publication number Publication date
CN115937533B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
Xu et al. Light-YOLOv3: fast method for detecting green mangoes in complex scenes using picking robots
CN109800628B (en) Network structure for enhancing detection performance of SSD small-target pedestrians and detection method
CN110428428B (en) Image semantic segmentation method, electronic equipment and readable storage medium
CN108629288B (en) Gesture recognition model training method, gesture recognition method and system
CN112990116B (en) Behavior recognition device and method based on multi-attention mechanism fusion and storage medium
CN110929610A (en) Plant disease identification method and system based on CNN model and transfer learning
CN110032952B (en) Road boundary point detection method based on deep learning
CN112364931A (en) Low-sample target detection method based on meta-feature and weight adjustment and network model
Li et al. Pitaya detection in orchards using the MobileNet-YOLO model
AU2021105840A4 (en) Method for predicting typhoon track
CN116649191A (en) Remote fertilization and irrigation control system and method based on PLC
CN114187590A (en) Method and system for identifying target fruits under homochromatic system background
CN111783688B (en) Remote sensing image scene classification method based on convolutional neural network
CN115937533B (en) Semantic segmentation-based aeroponic tomato feature extraction method
CN115049786B (en) Task-oriented point cloud data downsampling method and system
CN116310128A (en) Dynamic environment monocular multi-object SLAM method based on instance segmentation and three-dimensional reconstruction
CN116309429A (en) Chip defect detection method based on deep learning
CN116246305A (en) Pedestrian retrieval method based on hybrid component transformation network
CN112507940B (en) Bone action recognition method based on differential guidance representation learning network
Si et al. Image semantic segmentation based on improved DeepLab V3 model
CN116152263A (en) CM-MLP network-based medical image segmentation method
CN112329697B (en) Improved YOLOv 3-based on-tree fruit identification method
CN114494284A (en) Scene analysis model and method based on explicit supervision area relation
CN114565639A (en) Target tracking method and system based on composite convolutional network
CN112396126A (en) Target detection method and system based on detection of main stem and local feature optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant