CN112036437A

CN112036437A - Rice seedling detection model based on improved YOLOV3 network and method thereof

Info

Publication number: CN112036437A
Application number: CN202010736191.3A
Authority: CN
Inventors: 王姗姗; 余山山; 张文毅
Original assignee: Nanjing Research Institute for Agricultural Mechanization Ministry of Agriculture
Current assignee: Nanjing Research Institute for Agricultural Mechanization Ministry of Agriculture
Priority date: 2020-07-28
Filing date: 2020-07-28
Publication date: 2020-12-04
Anticipated expiration: 2040-07-28
Also published as: CN112036437B

Abstract

The invention discloses a rice seedling detection model based on an improved YOLOV3 network and a method thereof, which solves the disadvantages of poor adaptability and robustness of traditional image processing methods for rice seedling detection and poor image processing algorithm effect; The main points of the scheme include a feature extraction module that performs multi-scale feature extraction on the input rice seedling image, and a multi-scale prediction module that predicts the position of rice seedlings according to the feature map extracted from the multi-scale features; the multi-scale prediction module includes a multi-scale prediction module. A multi-scale fusion feature building module for merging and constructing the extracted feature maps, a multi-scale seedling position prediction module for predicting the position of seedlings according to the feature maps constructed by fusion, the rice seedling detection model and method based on the improved YOLOV3 network of the present invention , to solve the problem of poor robustness and accuracy of rice seedling detection in complex paddy field environments with high and low weed densities, different lighting conditions, and lack of seedlings.

Description

Rice seedling detection model and method based on improved YOLOV3 network

技术领域technical field

本发明涉及人工智能技术领域，特别涉及基于改进YOLOV3网络的水稻秧苗检测模型及其方法。The invention relates to the technical field of artificial intelligence, in particular to a rice seedling detection model based on an improved YOLOV3 network and a method thereof.

背景技术Background technique

水稻是世界三大粮食作物之一，其栽培面积和总产量仅次于小麦，是全球半数以上人口的主食。目前水稻田采用的除草方式主要是喷洒化学药剂和人工除草，化学药剂的使用容易造成农药残留量超标、环境污染和生态链的破坏等问题，而人工除草耗时费力，效率低下且成本高昂。因此对于水稻机械化除草的需求越来越迫切。水稻机械化除草技术对于减少化学药剂的使用，减少生态环境的污染和提高粮食安全具有非常重大的意义。Rice is one of the three major food crops in the world. Its cultivation area and total output are second only to wheat, and it is the staple food for more than half of the world's population. At present, the weeding methods used in paddy fields are mainly spraying chemical agents and artificial weeding. The use of chemical agents can easily cause problems such as excessive pesticide residues, environmental pollution and damage to the ecological chain. Manual weeding is time-consuming, laborious, inefficient and costly. Therefore, the demand for mechanized weeding of rice is more and more urgent. Rice mechanized weeding technology is of great significance for reducing the use of chemical agents, reducing ecological environment pollution and improving food security.

水稻秧苗检测是水稻机械化除草的关键环节之一，并且对于精准农业中的精准除草、施肥具有重要的指导作用。通过水稻秧苗检测得到水稻秧苗的位置信息，可以指导除草机构的动作除去水稻封行前的行间杂草。Rice seedling detection is one of the key links in mechanized weeding of rice, and plays an important guiding role in precision weeding and fertilization in precision agriculture. The position information of the rice seedlings can be obtained by detecting the rice seedlings, which can guide the action of the weeding agency to remove the weeds between the rows before the rice is closed.

作物和杂草检测主要有2类方法，第一类是基于图像处理的方法，将作物从背景中分割出来，然后进行作物行中心线检测。第二类是基于深度学习的方法进行作物和杂草检测。由于水田环境非常复杂，水稻秧苗的检测影响因素较多，传统的图像处理方法对水稻秧苗检测的适应性和鲁棒性较差。尤其当水田图像在R，G和B像素强度上变化不显著的时候，当杂草很多或与农作物大小相似时，传统的图像处理算法效果不佳。There are two main types of methods for crop and weed detection. The first type is based on image processing. The crop is segmented from the background, and then the centerline of the crop line is detected. The second category is deep learning based methods for crop and weed detection. Due to the complex paddy field environment and many factors affecting the detection of rice seedlings, traditional image processing methods have poor adaptability and robustness to the detection of rice seedlings. Especially when the paddy field image does not change significantly in R, G and B pixel intensities, when there are many weeds or when the size is similar to the crops, the traditional image processing algorithm does not work well.

发明内容SUMMARY OF THE INVENTION

本发明的目的是提供一种基于改进YOLOV3网络的水稻秧苗检测模型及其方法，解决复杂水田环境中多种情况下的水稻秧苗检测鲁棒性和准确性较差的问题。The purpose of the present invention is to provide a rice seedling detection model and method based on the improved YOLOV3 network, which solves the problem of poor robustness and accuracy of rice seedling detection under various conditions in a complex paddy field environment.

本发明的上述技术目的是通过以下技术方案得以实现的：The above-mentioned technical purpose of the present invention is achieved through the following technical solutions:

一种基于改进YOLOV3网络的水稻秧苗检测模型，包括有对输入的水稻秧苗图像进行多尺度特征提取以获得多尺度的秧苗特征图的特征提取模块、根据多尺度的秧苗特征图对水稻秧苗位置进行预测的多尺度预测模块；A rice seedling detection model based on an improved YOLOV3 network, including a feature extraction module that performs multi-scale feature extraction on an input rice seedling image to obtain a multi-scale rice seedling feature map, and performs a multi-scale seedling feature map. Predicted multi-scale prediction module;

所述多尺度预测模块包括有对多尺度的秧苗特征图进行融合构建处理以获得多尺度的融合特征图的多尺度融合特征构建模块、根据融合特征图中对应的融合特征对秧苗位置进行预测的多尺度秧苗位置预测模块。The multi-scale prediction module includes a multi-scale fusion feature construction module that performs fusion construction processing on multi-scale seedling feature maps to obtain multi-scale fusion feature maps, and a multi-scale fusion feature construction module that predicts the position of the seedlings according to the fusion features corresponding to the fusion feature maps. Multi-scale seedling location prediction module.

作为优选，所述特征提取模块包括有对输入的水稻秧苗图像依次进行倍率采样提取处理以分别得到52尺度秧苗特征图、26尺度秧苗特征图、13尺度秧苗特征图的52尺度特征提取子模块、26尺度特征提取子模块、13尺度特征提取子模块。Preferably, the feature extraction module includes a 52-scale feature extraction sub-module for sequentially performing magnification sampling and extraction processing on the input rice seedling images to obtain a 52-scale seedling feature map, a 26-scale seedling feature map, and a 13-scale seedling feature map, respectively. 26-scale feature extraction sub-module, 13-scale feature extraction sub-module.

作为优选，所述多尺度融合特征构建模块包括有对不同尺度的秧苗特征图进行融合处理的52尺度融合特征构建子模块、26尺度融合特征构建子模块、13尺度融合特征构建子模块。Preferably, the multi-scale fusion feature construction module includes a 52-scale fusion feature construction sub-module, a 26-scale fusion feature construction sub-module, and a 13-scale fusion feature construction sub-module for fusing the seedling feature maps of different scales.

作为优选，所述52尺度特征提取子模块对52尺度特征提取子模块的输入秧苗特征图进行特征提取以获得包含有52尺度秧苗特征的52尺度秧苗特征图；所述26尺度特征提取子模块对26尺度特征提取子模块的输入秧苗特征图进行特征提取以获得包含有26尺度秧苗特征的26尺度秧苗特征图；所述13尺度特征提取子模块对为13尺度特征提取子模块的输入秧苗特征图进行特征提取以获得包含有13尺度秧苗特征的13尺度秧苗特征图。Preferably, the 52-scale feature extraction sub-module performs feature extraction on the input seedling feature map of the 52-scale feature extraction sub-module to obtain a 52-scale seedling feature map including the 52-scale seedling features; the 26-scale feature extraction sub-module is for The input seedling feature map of the 26-scale feature extraction sub-module performs feature extraction to obtain a 26-scale seedling feature map containing the 26-scale seedling features; the 13-scale feature extraction sub-module pair is the input seedling feature map of the 13-scale feature extraction sub-module Feature extraction is performed to obtain a 13-scale seedling feature map containing 13-scale seedling features.

一种基于改进YOLOV3网络的水稻秧苗检测模型的检测方法，包括有以下步骤：A detection method for a rice seedling detection model based on an improved YOLOV3 network, comprising the following steps:

获得输入的水稻秧苗图像；Obtain the input rice seedling image;

对水稻秧苗图像分别进行不同倍率下采样，通过特征提取模块获得52尺度秧苗特征图、26尺度秧苗特征图、13尺度秧苗特征图；The images of rice seedlings are down-sampled at different magnifications respectively, and the 52-scale seedling feature map, the 26-scale seedling feature map, and the 13-scale seedling feature map are obtained through the feature extraction module;

通过多尺度融合特征构建模块对获取的13尺度秧苗特征图、26尺度秧苗特征图、52尺度秧苗特征图依次进行特征融合以获得13尺度秧苗融合特征图、26尺度秧苗融合特征图、52尺度秧苗融合特征图；Through the multi-scale fusion feature building module, feature fusion is performed on the obtained 13-scale seedling feature map, 26-scale seedling feature map, and 52-scale seedling feature map in turn to obtain a 13-scale seedling fusion feature map, a 26-scale seedling fusion feature map, and a 52-scale seedling feature map. fusion feature map;

通过多尺度秧苗位置预测模块依次对获得的13尺度秧苗融合特征图、26尺度秧苗融合特征图、52尺度秧苗融合特征图进行预测以获得13尺度秧苗预测结果、26尺度秧苗预测结果、52尺度秧苗预测结果。Through the multi-scale seedling position prediction module, the obtained 13-scale seedling fusion feature map, 26-scale seedling fusion feature map, and 52-scale seedling fusion feature map are successively predicted to obtain 13-scale seedling prediction results, 26-scale seedling prediction results, and 52-scale seedling prediction results. forecast result.

作为优选，特征提取模块进行特征提取的具体步骤如下：Preferably, the specific steps of feature extraction performed by the feature extraction module are as follows:

对输入的水稻秧苗图经过10个卷积层及3个残差层得到52尺度特征提取子模块的输入卷积层，通过52尺度特征提取子模块提取获得52尺度秧苗特征；The input convolution layer of the 52-scale feature extraction sub-module is obtained through 10 convolution layers and 3 residual layers for the input rice seedling image, and the 52-scale seedling features are extracted through the 52-scale feature extraction sub-module;

对52尺度特征提取子模块的输出层经过3*3/2*512的卷积核进行卷积得到26尺度特征提取子模块的输入卷积层，通过26尺度特征提取子模块提取获得26尺度秧苗特征；The output layer of the 52-scale feature extraction sub-module is convolved with a 3*3/2*512 convolution kernel to obtain the input convolution layer of the 26-scale feature extraction sub-module, and the 26-scale feature extraction sub-module is extracted to obtain 26-scale seedlings feature;

对26尺度特征提取子模块的输出层经过3*3/2*512的卷积核进行卷积得到13尺度特征提取子模块的输入卷积层，通过13尺度特征提取子模块提取获得13尺度秧苗特征。The output layer of the 26-scale feature extraction sub-module is convolved with a convolution kernel of 3*3/2*512 to obtain the input convolution layer of the 13-scale feature extraction sub-module, and the 13-scale seedlings are extracted by the 13-scale feature extraction sub-module. feature.

作为优选，多尺度融合特征构建模块进行特征融合构建具体步骤如下：Preferably, the multi-scale fusion feature building module performs the following specific steps for feature fusion construction:

对26尺度特征提取子模块的输出层进行下采样得到13*13*512特征图，并与13尺度特征提取子模块的输出层进行张量拼接，得到13尺度秧苗融合特征图；The output layer of the 26-scale feature extraction sub-module is downsampled to obtain a 13*13*512 feature map, and tensor splicing is performed with the output layer of the 13-scale feature extraction sub-module to obtain a 13-scale seedling fusion feature map;

对52尺度特征提取子模块的输出层进行下采样后得到26*26*256特征图，对13尺度特征提取子模块的输出层进行上采样获得26*26*512特征图，将下采样得到的26*26*256特征图、上采样获得的26*26*512特征图与26尺度特征提取子模块的输出层进行张量拼接，得到26尺度秧苗融合特征图；After downsampling the output layer of the 52-scale feature extraction sub-module, a 26*26*256 feature map is obtained, and the output layer of the 13-scale feature extraction sub-module is upsampled to obtain a 26*26*512 feature map. The 26*26*256 feature map, the 26*26*512 feature map obtained by upsampling, and the output layer of the 26-scale feature extraction sub-module are spliced by tensor, and the 26-scale seedling fusion feature map is obtained;

对输入的水稻秧苗图片进行特征提取获得的104尺度输出秧苗特征图进行下采样得到52*52*128特征图，对26尺度特征提取子模块的输出层进行上采样获得52*52*512特征图，将下采样得到的52*52*128特征图、上采样获得的52*52*512特征图与52尺度特征提取子模块的输出层进行张量拼接，得到52尺度秧苗融合特征图。The 104-scale output seedling feature map obtained by feature extraction from the input rice seedling image is down-sampled to obtain a 52*52*128 feature map, and the output layer of the 26-scale feature extraction sub-module is upsampled to obtain a 52*52*512 feature map. , the 52*52*128 feature map obtained by downsampling, the 52*52*512 feature map obtained by upsampling and the output layer of the 52-scale feature extraction sub-module are tensor-spliced to obtain a 52-scale seedling fusion feature map.

作为优选，多尺度秧苗位置预测模型的具体预测步骤如下：Preferably, the specific prediction steps of the multi-scale seedling position prediction model are as follows:

13尺度融合特征构建子模块构建获得的13尺度秧苗融合特征经过一组卷积核和1个3*3*512卷积核及1个1*1*18卷积核的卷积得到13尺度的预测结果13*13*18；The 13-scale seedling fusion feature constructed by the 13-scale fusion feature construction sub-module is obtained by convolution of a set of convolution kernels, a 3*3*512 convolution kernel and a 1*1*18 convolution kernel to obtain a 13-scale fusion feature. The prediction result is 13*13*18;

26尺度融合特征构建子模块构建获得的26尺度秧苗融合特征经过一组卷积核和1个3*3*512卷积核及1个1*1*18卷积核的卷积得到26尺度的预测结果26*26*18；The 26-scale seedling fusion feature constructed by the 26-scale fusion feature construction sub-module is obtained by convolution of a set of convolution kernels, a 3*3*512 convolution kernel and a 1*1*18 convolution kernel to obtain a 26-scale fusion feature. The prediction result is 26*26*18;

52尺度融合特征构建子模块构建获得的52尺度秧苗融合特征经过一组卷积核和1个3*3*256卷积核及1个1*1*18卷积核的卷积得到52尺度的预测结果52*52*18。The 52-scale seedling fusion feature constructed by the 52-scale fusion feature construction sub-module is obtained by convolution of a set of convolution kernels, a 3*3*256 convolution kernel and a 1*1*18 convolution kernel to obtain a 52-scale fusion feature. The predicted result is 52*52*18.

综上所述，本发明具有以下有益效果：To sum up, the present invention has the following beneficial effects:

基于改进Yolov3网络的水稻秧苗检测模型能够自动提取水稻秧苗特征，该模型在原本的特征提取网络里对接近水稻秧苗尺寸的52尺度特征图采用52尺度特征提取子模块替代原本的残差单元进行加宽网络，在构建各个尺度的融合特征时加入了不同尺度的采样特征，融合后的特征增加了对低层的位置信息和高层的语义信息的结合，得到的各个尺度的水稻秧苗融合特征层次更加丰富，改进后的水稻秧苗检测模型对复杂水田环境中高低杂草密度分布，不同光照条件，水稻秧苗缺失情况下的水稻秧苗检测具有较好的准确性和鲁棒性。The rice seedling detection model based on the improved Yolov3 network can automatically extract the characteristics of rice seedlings. In the original feature extraction network, the model uses the 52-scale feature extraction sub-module to replace the original residual unit for the 52-scale feature map that is close to the rice seedling size. In the wide network, sampling features of different scales are added when constructing fusion features of various scales. The fusion features increase the combination of low-level location information and high-level semantic information, and the resulting fusion feature levels of rice seedlings at various scales are more abundant. , the improved rice seedling detection model has better accuracy and robustness for the high and low weed density distribution in complex paddy field environment, different light conditions, and the absence of rice seedlings.

附图说明Description of drawings

图1为检测模型的结构示意框图；Fig. 1 is the structural schematic block diagram of the detection model;

图2为检测模型的网络结构图；Fig. 2 is the network structure diagram of the detection model;

图3为本发明检测方法的流程示意图；Fig. 3 is the schematic flow chart of the detection method of the present invention;

图4为基于改进Yolov3和Yolov3水稻秧苗检测模型的测试p-r曲线图。Figure 4 is a test p-r curve diagram based on the improved Yolov3 and Yolov3 rice seedling detection models.

具体实施方式Detailed ways

以下结合附图对本发明作进一步详细说明。The present invention will be further described in detail below with reference to the accompanying drawings.

根据一个或多个实施例，公开了一种基于改进YOLOV3网络的水稻秧苗检测模型，如图1所示，包括有对输入的水稻秧苗图像进行多尺度水稻秧苗特征提取以获得多尺度的秧苗特征图的特征提取模块、根据特征提取模块提取的多尺度的秧苗特征图对水稻秧苗位置进行预测的多尺度预测模块。According to one or more embodiments, a rice seedling detection model based on an improved YOLOV3 network is disclosed, as shown in FIG. 1 , which includes performing multi-scale rice seedling feature extraction on an input rice seedling image to obtain multi-scale rice seedling features The feature extraction module of the map, and the multi-scale prediction module for predicting the position of rice seedlings according to the multi-scale seedling feature map extracted by the feature extraction module.

如图1所示，特征提取模块包括有对输入的水稻秧苗图像依次进行倍率采样提取处理的52尺度特征提取子模块、26尺度特征提取子模块、13尺度特征提取子模块。为表述清楚，现举例输入的水稻秧苗图像采用416*416尺度的图像，进行采样提取时，分别进行8倍、16倍和32倍下采样。As shown in Figure 1, the feature extraction module includes a 52-scale feature extraction sub-module, a 26-scale feature extraction sub-module, and a 13-scale feature extraction sub-module that sequentially perform magnification sampling and extraction processing on the input rice seedling image. For the sake of clarity, the input image of rice seedlings is an image of 416*416 scale. When sampling and extraction, down-sampling is performed by 8 times, 16 times and 32 times respectively.

多尺度预测模块包括有对特征提取模块提取的不同尺度的秧苗特征图进行融合构建处理以获得多尺度的融合特征图的多尺度融合特征构建模块、根据融合特征图中的多尺度的融合特征对秧苗位置进行预测的多尺度秧苗位置预测模块。其中多尺度融合特征构建模块具体的包括有52尺度融合特征构建子模块、26尺度融合特征构建子模块、13尺度融合特征构建子模块，多尺度秧苗位置预测模块具体的包括有52尺度位置预测子模块、26尺度位置预测子模块、13尺度位置预测子模块。The multi-scale prediction module includes a multi-scale fusion feature construction module that fuses and constructs the seedling feature maps of different scales extracted by the feature extraction module to obtain a multi-scale fusion feature map. A multi-scale seedling position prediction module for seedling position prediction. Among them, the multi-scale fusion feature building module specifically includes a 52-scale fusion feature building sub-module, a 26-scale fusion feature building sub-module, and a 13-scale fusion feature building sub-module. The multi-scale seedling position prediction module specifically includes a 52-scale position predictor. module, 26-scale position prediction sub-module, 13-scale position prediction sub-module.

52尺度特征提取子模块进行提取的输入秧苗特征图具体为：对输入的水稻秧苗图像经过10个卷积层和3个残差层后得到52尺度特征提取子模块的输入卷积层作为52尺度特征提取子模块的输入秧苗特征图。52尺度特征提取子模块对其输入秧苗特征图进行特征提取得到包含有52尺度秧苗特征的52尺度秧苗特征图。The input seedling feature map extracted by the 52-scale feature extraction sub-module is specifically: the input convolutional layer of the 52-scale feature extraction sub-module is obtained after 10 convolution layers and 3 residual layers for the input rice seedling image as the 52-scale input convolution layer. The input seedling feature map of the feature extraction submodule. The 52-scale feature extraction sub-module performs feature extraction on its input seedling feature map to obtain a 52-scale seedling feature map containing 52-scale seedling features.

26尺度特征提取子模块进行提取的输入秧苗特征图具体为：对52尺度特征提取子模块的输出层即输出的52尺度秧苗特征图，经过3*3/2*512的卷积核进行卷积得到26尺度特征提取子模块的输入卷积层，得到的即为26尺度特征提取子模块的输入秧苗特征图。26尺度特征提取子模块对其输入秧苗特征图进行特征提取获得包含有26尺度秧苗特征的26尺度秧苗特征图。The input seedling feature map extracted by the 26-scale feature extraction sub-module is specifically: the output layer of the 52-scale feature extraction sub-module, that is, the output 52-scale seedling feature map, is convolved with a 3*3/2*512 convolution kernel. The input convolution layer of the 26-scale feature extraction sub-module is obtained, and the input seedling feature map of the 26-scale feature extraction sub-module is obtained. The 26-scale feature extraction sub-module performs feature extraction on its input seedling feature map to obtain a 26-scale seedling feature map containing 26-scale seedling features.

13尺度特征提取子模块进行提取的输入秧苗特征图具体为：对26尺度特征提取子模块的输出层即输出的26尺度秧苗特征图，经过3*3/2*512的卷积核进行卷积得到13尺度特征提取子模块的输入卷积层，得到的即为13尺度特征提取子模块的输入秧苗特征图。通过13尺度特征提取子模块提取获得包含有13尺度秧苗特征的13尺度秧苗特征图。三个尺度的特征提取模块依次进行提取。The input seedling feature map extracted by the 13-scale feature extraction sub-module is specifically: the output layer of the 26-scale feature extraction sub-module, that is, the output 26-scale seedling feature map, is convolved with a 3*3/2*512 convolution kernel. The input convolution layer of the 13-scale feature extraction sub-module is obtained, and the input seedling feature map of the 13-scale feature extraction sub-module is obtained. The 13-scale seedling feature map containing the 13-scale seedling features is extracted by the 13-scale feature extraction sub-module. The feature extraction modules of the three scales are extracted sequentially.

13尺度融合特征构建子模块融合构建的特征图包括有13尺度特征提取子模块的输出层即提取得到的13尺度秧苗特征图，以及对26尺度特征提取子模块的输出层即提取得到的26尺度秧苗特征图进行下采样得到的13*13*512特征图，将得到的13尺度秧苗特征图及13*13*512特征图这两个特征图进行张量拼接得到13尺度秧苗融合特征图。The feature map constructed by the fusion of the 13-scale feature construction sub-module includes the output layer of the 13-scale feature extraction sub-module, which is the 13-scale seedling feature map extracted, and the output layer of the 26-scale feature extraction sub-module, which is the extracted 26-scale feature map. The 13*13*512 feature map is obtained by down-sampling the seedling feature map. The obtained 13-scale seedling feature map and the 13*13*512 feature map are tensor spliced to obtain the 13-scale seedling fusion feature map.

同理，26尺度融合特征构建子模块融合构建的秧苗特征图包括有26尺度特征提取子模块的输出层，还包括有对52尺度特征提取子模块的输出层即提取得到的52尺度秧苗特征图进行下采样得到26*26*256特征图以及对13尺度特征提取子模块的输出层进行上采样得到26*26*512特征图，通过将下采样得到的26*26*256特征图、上采样得到的26*26*512特征图以及26尺度特征提取子模块的输出层进行张量拼接得到26尺度秧苗融合特征图。Similarly, the seedling feature map constructed by the fusion of the 26-scale feature construction sub-module includes the output layer of the 26-scale feature extraction sub-module, and also includes the output layer of the 52-scale feature extraction sub-module, that is, the extracted 52-scale seedling feature map. Perform downsampling to obtain a 26*26*256 feature map and upsample the output layer of the 13-scale feature extraction sub-module to obtain a 26*26*512 feature map. The obtained 26*26*512 feature map and the output layer of the 26-scale feature extraction sub-module are spliced by tensor to obtain the 26-scale seedling fusion feature map.

同理，52尺度融合特征构建子模块融合构建的秧苗特征图包括有52尺度特征提取子模块的输出层，还包括有对输入的水稻秧苗图片进行特征提取获得的104尺度输出秧苗特征图进行下采样得到52*52*128特征图以及对26尺度特征提取子模块的输出层进行上采样得到52*52*512特征图，通过将下采样得到的52*52*128特征图、上采样得到的52*52*512特征图以及52尺度特征提取子模块的输出层进行张量拼接得到52尺度秧苗融合特征图。Similarly, the seedling feature map constructed by the 52-scale fusion feature construction sub-module includes the output layer of the 52-scale feature extraction sub-module, and also includes the 104-scale output seedling feature map obtained by feature extraction from the input rice seedling image. Sampling to obtain a 52*52*128 feature map and upsampling the output layer of the 26-scale feature extraction sub-module to obtain a 52*52*512 feature map. The 52*52*512 feature map and the output layer of the 52-scale feature extraction sub-module are spliced by tensor to obtain the 52-scale seedling fusion feature map.

多尺度秧苗位置预测模块的13尺度位置预测子模块通过将13尺度秧苗融合特征经过一组卷积核，和1个3*3*512卷积核及1个1*1*18卷积核的卷积得到13尺度的预测结果13*13*18。The 13-scale position prediction sub-module of the multi-scale seedling position prediction module passes the 13-scale seedling fusion features through a set of convolution kernels, and a 3*3*512 convolution kernel and a 1*1*18 convolution kernel. The convolution obtains a prediction result of 13 scales 13*13*18.

26尺度位置预测子模块通过将26尺度秧苗融合特征经过一组卷积核，和1个3*3*512卷积核及1个1*1*18卷积核的卷积得到26尺度的预测结果26*26*18。The 26-scale position prediction sub-module obtains 26-scale prediction by convolution of 26-scale seedling fusion features through a set of convolution kernels, and a 3*3*512 convolution kernel and a 1*1*18 convolution kernel The result is 26*26*18.

52尺度位置预测子模块通过将52尺度秧苗融合特征经过一组卷积核，和1个3*3*256卷积核及1个1*1*18卷积核的卷积得到52尺度的预测结果52*52*18。The 52-scale position prediction sub-module obtains a 52-scale prediction by passing the 52-scale seedling fusion feature through a set of convolution kernels, and convolution with a 3*3*256 convolution kernel and a 1*1*18 convolution kernel The result is 52*52*18.

原始的Yolov3网络模型在每个尺度特征融合构建处理的时候，采用小尺度特征图上采样和该尺度特征图进行融合的方式，本发明基于改进后的Yolov3模型采用小尺度特征图上采样，大尺度特征图下采样和该尺度特征图进行融合的方式，融合后的特征增加了对低层的位置信息和高层的语义信息的结合，得到的各个尺度的水稻秧苗融合特征层次更加丰富。The original Yolov3 network model adopts the method of small-scale feature map upsampling and fusion of the scale feature map when each scale feature is fused and constructed. The present invention uses small-scale feature map upsampling based on the improved Yolov3 model. The scale feature map downsampling and the scale feature map are fused. The fused features increase the combination of low-level location information and high-level semantic information, and the resulting fusion feature levels of rice seedlings at various scales are more abundant.

根据一个或多个实施例，公开了一种基于改进YOLOV3网络的水稻秧苗检测模型的检测方法，如图2及图3所示，包括有以下步骤：According to one or more embodiments, a detection method for a rice seedling detection model based on an improved YOLOV3 network is disclosed, as shown in Figures 2 and 3, including the following steps:

获得输入的水稻秧苗图像；Obtain the input rice seedling image;

对水稻秧苗图像分别进行不同倍率下采样，并通过特征提取模块获得52尺度秧苗特征图、26尺度秧苗特征图、13尺度秧苗特征图；The images of rice seedlings were down-sampled at different magnifications, and the feature maps of 52-scale seedlings, 26-scale seedling feature maps, and 13-scale seedling feature maps were obtained through the feature extraction module;

具体的，其中特征提取模块进行特征提取的具体步骤如下：Specifically, the specific steps for the feature extraction module to perform feature extraction are as follows:

具体的，多尺度融合特征构建模块进行特征融合构建具体步骤如下：Specifically, the specific steps of the multi-scale fusion feature construction module for feature fusion construction are as follows:

52尺度特征提取子模块为Inception模块，采用三路卷积加宽网络：第一路卷积采用1*1*64卷积核降低输入秧苗特征图的通道数，然后采用3*3*128和3*3*64两个卷积核进行卷积提取特征；第二路先采用1*1*64卷积核降低输入秧苗特征图的通道数，再采用3*3*128卷积核进行卷积提取特征；第三路通过1*1*64卷积降低输入秧苗特征图的通道数。通过Inception模块三路卷积处理后获得的52尺度秧苗特征图与上/下采样得到的52*52*128特征图、52*52*512特征图进行张量拼接完成52尺度特征融合。其中，Inception模块优选设置为4个，代替原本的残差单元进行加宽网络，能够提取更多特征，可有效减小参数和训练时间。The 52-scale feature extraction sub-module is the Inception module, which uses a three-way convolution to widen the network: the first convolution uses a 1*1*64 convolution kernel to reduce the number of channels of the input seedling feature map, and then uses 3*3*128 and 3*3*64 two convolution kernels are used to extract features by convolution; in the second way, the 1*1*64 convolution kernel is used to reduce the number of channels of the input seedling feature map, and then the 3*3*128 convolution kernel is used for convolution. Product extraction features; the third channel reduces the number of channels of the input seedling feature map through 1*1*64 convolution. The 52-scale seedling feature map obtained after the three-way convolution processing of the Inception module, the 52*52*128 feature map and the 52*52*512 feature map obtained by up/down sampling are tensor spliced to complete the 52-scale feature fusion. Among them, the number of Inception modules is preferably set to 4, instead of the original residual unit to widen the network, more features can be extracted, and parameters and training time can be effectively reduced.

进一步的，多尺度秧苗位置预测模型的具体预测步骤如下：Further, the specific prediction steps of the multi-scale seedling position prediction model are as follows:

将13尺度秧苗融合特征图的13尺度秧苗融合特征经过一组卷积核和1个3*3*512卷积核及1个1*1*18卷积核的卷积得到13尺度的预测结果13*13*18；The 13-scale seedling fusion feature of the 13-scale seedling fusion feature map is subjected to the convolution of a set of convolution kernels, a 3*3*512 convolution kernel and a 1*1*18 convolution kernel to obtain a 13-scale prediction result. 13*13*18;

将26尺度秧苗融合特征图的26尺度秧苗融合特征经过一组卷积核和1个3*3*512卷积核及1个1*1*18卷积核的卷积得到26尺度的预测结果26*26*18；The 26-scale seedling fusion feature of the 26-scale seedling fusion feature map is subjected to the convolution of a set of convolution kernels, a 3*3*512 convolution kernel and a 1*1*18 convolution kernel to obtain a 26-scale prediction result. 26*26*18;

将52尺度秧苗融合特征图的52尺度秧苗融合特征经过一组卷积核和1个3*3*256卷积核及1个1*1*18卷积核的卷积得到52尺度的预测结果52*52*18。The 52-scale seedling fusion feature of the 52-scale seedling fusion feature map is subjected to the convolution of a set of convolution kernels, a 3*3*256 convolution kernel and a 1*1*18 convolution kernel to obtain a 52-scale prediction result. 52*52*18.

为表述清楚，现举一实例，对基于改进的Yolov3和Yolov3的水稻秧苗检测模型使用水稻秧苗测试集进行测试，测试后分别绘制p-r曲线，如图4(a)、(b)所示，(a)图为改进Yolov3的p-r曲线，(b)图为Yolov3的p-r曲线，当置信度阈值是0.5的时候，基于改进Yolov3和Yolov3网络的水稻秧苗检测模型的精度分别是0.82和0.48，因此基于改进Yolov3的水稻秧苗检测算法在精度上提升了大约34％。In order to express clearly, an example is given. The rice seedling detection model based on the improved Yolov3 and Yolov3 is tested using the rice seedling test set, and p-r curves are drawn respectively after the test, as shown in Figure 4(a), (b), a) The picture shows the p-r curve of the improved Yolov3, and (b) the picture shows the p-r curve of the Yolov3. When the confidence threshold is 0.5, the accuracy of the rice seedling detection model based on the improved Yolov3 and Yolov3 networks is 0.82 and 0.48 respectively, so based on The improved Yolov3 rice seedling detection algorithm improves the accuracy by about 34%.

本具体实施例仅仅是对本发明的解释，其并不是对本发明的限制，本领域技术人员在阅读完本说明书后可以根据需要对本实施例做出没有创造性贡献的修改，但只要在本发明的权利要求范围内都受到专利法的保护。This specific embodiment is only an explanation of the present invention, and it does not limit the present invention. Those skilled in the art can make modifications without creative contribution to the present embodiment as needed after reading this specification, but as long as the rights of the present invention are used All claims are protected by patent law.

Claims

1. A rice seedling detection model based on an improved YOLOV3 network is characterized in that: the system comprises a feature extraction module for performing multi-scale feature extraction on an input rice seedling image to obtain a multi-scale seedling feature map, and a multi-scale prediction module for predicting the position of a rice seedling according to the multi-scale seedling feature map;

the multi-scale prediction module comprises a multi-scale fusion characteristic construction module for performing fusion construction processing on the multi-scale seedling characteristic diagram to obtain a multi-scale fusion characteristic diagram, and a multi-scale seedling position prediction module for predicting the seedling position according to the corresponding fusion characteristic in the fusion characteristic diagram.

2. The improved YOLOV3 network-based rice seedling detection model of claim 1, wherein: the characteristic extraction module comprises a 52-scale characteristic extraction submodule, a 26-scale characteristic extraction submodule and a 13-scale characteristic extraction submodule which are used for sequentially carrying out multiplying factor sampling extraction processing on the input rice seedling image so as to respectively obtain a 52-scale seedling characteristic diagram, a 26-scale seedling characteristic diagram and a 13-scale seedling characteristic diagram.

3. The improved YOLOV3 network-based rice seedling detection model of claim 2, wherein: the multi-scale fusion feature construction module comprises a 52-scale fusion feature construction submodule for performing fusion processing on seedling feature maps with different scales, a 26-scale fusion feature construction submodule and a 13-scale fusion feature construction submodule.

4. The rice seedling detection model based on the improved YOLOV3 network as claimed in claim 3, wherein: the 52-scale feature extraction sub-module performs feature extraction on the input seedling feature map of the 52-scale feature extraction sub-module to obtain a 52-scale seedling feature map containing 52-scale seedling features; the 26-scale feature extraction sub-module performs feature extraction on the input seedling feature map of the 26-scale feature extraction sub-module to obtain a 26-scale seedling feature map containing 26-scale seedling features; and the 13-scale feature extraction sub-module performs feature extraction on the input seedling feature map of the 13-scale feature extraction sub-module to obtain a 13-scale seedling feature map containing 13-scale seedling features.

5. A detection method of a rice seedling detection model based on an improved YOLOV3 network is characterized by comprising the following steps:

acquiring an input rice seedling image;

respectively carrying out down-sampling with different multiplying powers on the rice seedling images, and obtaining a 52-scale seedling characteristic diagram, a 26-scale seedling characteristic diagram and a 13-scale seedling characteristic diagram through a characteristic extraction module;

sequentially performing feature fusion on the obtained 13-scale seedling feature map, 26-scale seedling feature map and 52-scale seedling feature map through a multi-scale fusion feature construction module to obtain a 13-scale seedling fusion feature map, a 26-scale seedling fusion feature map and a 52-scale seedling fusion feature map;

and predicting the obtained 13-scale seedling fusion characteristic graph, 26-scale seedling fusion characteristic graph and 52-scale seedling fusion characteristic graph in sequence through a multi-scale seedling position prediction module to obtain a 13-scale seedling prediction result, a 26-scale seedling prediction result and a 52-scale seedling prediction result.

6. The improved YOLOV3 network-based rice seedling detection method as claimed in claim 5, wherein the feature extraction module performs the following steps:

the input convolution layer of the 52-scale feature extraction submodule is obtained by passing the input rice seedling image through 10 convolution layers and 3 residual error layers, and the 52-scale seedling features are obtained by extraction of the 52-scale feature extraction submodule;

convolving the output layer of the 52-scale feature extraction submodule by a convolution kernel of 3 × 3/2 × 512 to obtain an input convolution layer of the 26-scale feature extraction submodule, and extracting by the 26-scale feature extraction submodule to obtain 26-scale seedling features;

and (3) convolving the output layer of the 26-scale feature extraction submodule by a convolution kernel of 3 × 3/2 × 512 to obtain an input convolution layer of the 13-scale feature extraction submodule, and extracting by the 13-scale feature extraction submodule to obtain 13-scale seedling features.

7. The improved YOLOV3 network-based rice seedling detection method as claimed in claim 6, wherein the multi-scale fusion feature construction module performs the specific steps of feature fusion construction as follows:

carrying out down-sampling on the output layer of the 26-scale feature extraction sub-module to obtain a 13 x 512 feature map, and carrying out tensor splicing on the feature map and the output layer of the 13-scale feature extraction sub-module to obtain a 13-scale seedling fusion feature map;

carrying out down-sampling on an output layer of the 52-scale feature extraction sub-module to obtain 26 × 256 feature maps, carrying out up-sampling on an output layer of the 13-scale feature extraction sub-module to obtain 26 × 512 feature maps, and carrying out tensor splicing on the 26 × 256 feature maps obtained by down-sampling, the 26 × 512 feature maps obtained by up-sampling and the output layer of the 26-scale feature extraction sub-module to obtain 26-scale seedling fusion feature maps;

and carrying out down-sampling on 104-scale output seedling feature maps obtained by carrying out feature extraction on the input rice seedling pictures to obtain 52 x 128 feature maps, carrying out up-sampling on output layers of the 26-scale feature extraction sub-modules to obtain 52 x 512 feature maps, and carrying out tensor splicing on the 52 x 128 feature maps obtained by down-sampling, the 52 x 512 feature maps obtained by up-sampling and the output layers of the 52-scale feature extraction sub-modules to obtain 52-scale seedling fusion feature maps.

8. The improved YOLOV3 network-based rice seedling testing method as claimed in claim 7, wherein the multi-scale seedling position prediction model comprises the following steps:

the 13-scale seedling fusion features constructed by the 13-scale fusion feature construction submodule are subjected to convolution of a set of convolution kernels, 13 × 512 convolution kernels and 1 × 18 convolution kernel to obtain a 13-scale prediction result 13 × 18;

the 26-scale seedling fusion features constructed by the 26-scale fusion feature construction submodule are subjected to convolution of a set of convolution kernels, 13 × 512 convolution kernels and 1 × 18 convolution kernel to obtain a 26-scale prediction result 26 × 18;

the 52-scale seedling fusion features constructed by the 52-scale fusion feature construction submodule are subjected to convolution of a set of convolution kernels, 13 × 256 convolution kernel and 1 × 18 convolution kernel to obtain a 52-scale prediction result 52 × 18.