CN116579446A - Method for estimating high-precision wheat grain yield by using deep learning and phenotype characteristics - Google Patents
- Publication number: CN116579446A
- Application number: CN202211503201.4A
- Authority
- CN
- China
- Prior art keywords: wheat, ears, yield, model, ear
- Legal status (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
- Granted
Classifications
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes
- G06N20/00—Machine learning
- G06N3/02, G06N3/08—Neural networks; learning methods
- G06Q50/02—ICT specially adapted for agriculture, fishing, forestry, mining
- G06T7/0002—Image analysis; inspection of images, e.g. flaw detection
- G06T7/10—Segmentation; edge detection
- G06V10/26—Segmentation of patterns in the image field
- G06V10/40—Extraction of image or video features
- G06V10/774—Generating sets of training patterns
- G06V10/82—Image or video recognition using neural networks
- G06V20/188—Terrestrial scenes; vegetation
- G06T2207/10004—Still image; photographic image
- G06T2207/20081—Training; learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30181, G06T2207/30188—Earth observation; vegetation; agriculture
- G06T2207/30242—Counting objects in image
- Y02A40/10—Adaptation technologies in agriculture (climate-change adaptation)
Abstract
A method for estimating high-precision wheat grain yield using deep learning and phenotypic features comprises the following steps: step 1, photographing and collecting phenotypic feature data of the wheat; step 2, after the image collection of step 1 is completed, sampling, counting, and weighing the wheat ears in the sampling areas; step 3, preprocessing the images collected in step 1; step 4, feeding the preprocessed images of step 3 into a deep learning model for iterative training to segment wheat ears, and evaluating model performance; step 5, extracting phenotypic features of the wheat ears from the segmented images of step 4; step 6, establishing a relationship between the phenotypic features extracted in step 5 and kernel yield, and using it to estimate kernel yield. The invention establishes a direct relationship between crop phenotypic features and kernel yield; the method is robust to interference from external environmental factors, and the yield estimates are accurate.
Description
Technical Field
The invention relates to the technical field of wheat grain yield estimation, and in particular to a method for estimating high-precision wheat grain yield using deep learning and phenotypic characteristics.
Background
Field-scale yield is an important parameter in agricultural decision systems, and mapping the spatial distribution of field yield is essential for precision agricultural management, as it can guide site-specific fertilization, irrigation, pest control, and related work. Although total field-scale yield is readily available after grain harvest, the spatial distribution of yield within a field remains difficult to obtain.
By platform, remote sensing comprises two main approaches: satellite remote sensing and unmanned aerial vehicle (UAV) remote sensing. Satellite remote sensing observes crops from space and has been widely used for crop yield estimation over the past decades. However, although satellites can cover large geographic areas, their coarse ground resolution significantly limits the yield detail obtainable at small scales (such as the field scale). UAVs, thanks to their flexible flight altitude, can provide images with high spatial resolution and thus enable small-scale yield estimation; in this respect, UAV remote sensing has an advantage over satellite remote sensing for estimating small-scale crop yields.
In the prior art, yield prediction from UAV observations relies mainly on empirical relationships between measured field yield and canopy features (so-called canopy-feature-based methods), the underlying principle being that the state of the canopy affects crop yield.
Commonly used canopy features fall into three categories extracted from images: spectral, structural, and physiological. Spectral features are the most widely used in yield estimation and usually take the form of vegetation indices; many studies have estimated crop yield using single or multiple vegetation indices. Structural features mainly include canopy height, canopy coverage, and plant density; they are often used to improve the accuracy of yield prediction and have been shown to correlate clearly with yield. In addition, a few studies have used crop physiological characteristics as auxiliary features to improve yield estimation accuracy, such as crop water status, canopy temperature, and chlorophyll content. Taken together, these studies demonstrate the effectiveness of canopy-feature-based methods for UAV yield estimation.
However, the prior art also has significant practical limitations in UAV field yield estimation. Fundamentally, crop kernel yield is directly determined by its yield components (thousand-kernel weight, kernels per ear, and ear number), whereas canopy-feature-based methods establish only an indirect relationship between crop canopy features and kernel yield. Such indirect relationships are susceptible to interference from environmental factors, including weeds, uneven soil between rows, and crop shading; vegetation indices are prone to saturation at high crop densities; and many photosynthesis-related canopy features may lose observational sensitivity during the reproductive phase of the crop, such as flowering or grain filling, because of canopy color changes, all of which can severely degrade the accuracy of yield estimation.
Accordingly, those skilled in the art have focused on developing a method for estimating high-precision wheat grain yield using deep learning and phenotypic characteristics that solves the above problems of the prior art.
Disclosure of Invention
In view of the above drawbacks of the prior art, the invention aims to solve the following technical problems: prior-art methods establish only an indirect relationship between crop canopy features and kernel yield; this relationship is susceptible to interference from environmental factors; vegetation indices are prone to saturation at high crop densities; and many photosynthesis-related canopy features may lose observational sensitivity during the reproductive phase because of canopy color changes, seriously affecting the accuracy of yield estimation.
To achieve the above object, the present invention provides a method for estimating high precision wheat grain yield using deep learning and phenotypic characteristics, comprising the steps of:
step 1, photographing and collecting phenotypic feature data of the wheat;
step 2, after the image collection of step 1 is completed, sampling, counting, and weighing the wheat ears in the sampling areas;
step 3, preprocessing the images collected in step 1;
step 4, feeding the preprocessed images of step 3 into a deep learning model for iterative training to segment wheat ears, and evaluating model performance;
step 5, extracting phenotypic features of the wheat ears from the segmented images of step 4;
step 6, establishing a relationship between the phenotypic features extracted in step 5 and kernel yield, and using it to estimate kernel yield;
in step 1, the phenotypic feature data of the wheat are photographed and collected at a certain height above the ground;
the image collection in step 1 produces two sets of data: the first set is photographed at random locations across the wheat production area, but must not cover the sampling areas, and the photographed images must not overlap;
the second set of data in step 1 consists of images of the sampling areas of the wheat production area;
in step 1, the first set of data serves as the training dataset for training the deep learning model, and the second set serves as validation data for verifying the model's ear-segmentation performance;
step 2 must be carried out after the photographing of step 1 is finished, and the sampling areas must cover different yield gradients;
in step 2, during the wheat maturity period, a sampling frame of a certain size is selected within each sampling area, all wheat ears within the frame are collected, and the ears are counted to obtain the ear number;
after the ear number is obtained in step 2, the ears undergo a preliminary drying treatment to facilitate subsequent threshing;
in step 2, threshing is performed after the preliminary drying, followed by a second drying to obtain the dry weight of the kernels; the kernel yield within the sampling frame is then obtained after normalization using the water content;
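The moisture normalization at the end of step 2 can be sketched as follows. The patent does not give the normalization formula, so the exact form and the 12.5% standard moisture basis (a common convention for wheat) are assumptions for illustration:

```python
def normalized_kernel_yield(weight_g, water_content, standard_mc=0.125):
    """Convert a sampling frame's measured kernel weight to a common
    moisture basis (hypothetical form; the patent only states that yield
    is normalized using the water content).

    weight_g      : kernel weight (g) measured after the second drying
    water_content : residual moisture fraction of the measured kernels
    standard_mc   : assumed standard moisture basis (12.5%, not stated
                    in the patent)
    """
    dry_matter = weight_g * (1.0 - water_content)   # pure dry matter
    return dry_matter / (1.0 - standard_mc)         # weight at standard moisture
```

Expressing every frame's yield at the same moisture basis makes the samples comparable across drying conditions.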
the preprocessing in step 3 covers three aspects; the first is: tiling the raw images collected in step 1 into smaller blocks, so that oversized images do not slow down the subsequent steps, and manually selecting a certain number of relatively high-quality images;
the second aspect of the preprocessing in step 3 is: processing the images of the second data set from step 1 to crop out only the part inside the sampling frame, so that the cropped image fully covers the sampling frame while ears outside the frame are removed;
the third aspect of the preprocessing in step 3 is: manually annotating the images selected in the first aspect and the cropped images of the second aspect, marking all wheat ears on each image; during annotation, the wheat awns are ignored and not counted;
the deep learning model in step 4 generates feature maps and uses a fully convolutional network to produce masks of the wheat ear instances;
in step 4, model robustness and convergence speed are improved through two training techniques: data augmentation and transfer learning;
in step 4, after the model is trained, the second data set collected in step 1 is fed into the model (after the preprocessing of step 3), and the wheat ear instances on each image block are segmented;
in the model performance evaluation of step 4, each model prediction is classified as one of three outcomes: true positive (TP), false positive (FP), or false negative (FN);
in step 4, predicted instances are matched to ground-truth instances using an intersection-over-union (IoU) threshold, where IoU is the ratio of the intersection to the union of the predicted instance P and the true instance G (formula (1)): IoU = |P ∩ G| / |P ∪ G|;
model performance is evaluated using precision (formula (2)) and recall (formula (3)): Precision = TP / (TP + FP); Recall = TP / (TP + FN);
in step 4, since precision and recall each evaluate the model from only one side, the F1 score is used to evaluate overall model performance (formula (4)): F1 = 2 × Precision × Recall / (Precision + Recall);
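The evaluation of step 4 can be sketched with the standard definitions of IoU, precision, recall, and F1. This is a minimal illustration; the function names and the pixel-set representation of masks are mine, not from the patent:

```python
def iou(pred_pixels, true_pixels):
    """Formula (1): intersection over union of a predicted and a
    ground-truth ear instance, each given as a set of pixel coordinates."""
    pred, true = set(pred_pixels), set(true_pixels)
    union = pred | true
    return len(pred & true) / len(union) if union else 0.0

def precision_recall_f1(tp, fp, fn):
    """Formulas (2)-(4): precision limits false alarms, recall limits
    missed ears, and F1 balances the two."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```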
the phenotypic features in step 5 comprise the ear number, the ear size, and the abnormal ear index;
the ear number and ear size in step 5 are obtained from the ear segmentation results of step 4;
the abnormal ears in step 5 are defined as follows: healthy ears should be golden yellow when the wheat is ripe, whereas unhealthy ears tend to remain green; these green-tinged ears are therefore defined as abnormal ears;
the abnormal ear index in step 5 effectively screens for productive ears; ear fertility is highly related to wheat kernel yield, and when fertility is poor (for example, premature ripening or malnutrition), ears in the field often exhibit abnormal spectral (color) characteristics;
in step 5, the abnormal state of an ear can be judged using a green-yellow difference index (DGYI);
the DGYI in step 5 is essentially a normalization of the contrast between green and yellow light over the total visible-light reflection, where yellow light is regarded as the combination of red and green light (formula (5));
in formula (5), R, G, B, and Y represent the reflection of red, green, blue, and yellow light, respectively;
the DGYI values over all pixels of an ear instance are averaged to obtain that ear's overall DGYI, and the relative frequency distribution of DGYI over all ears can then be plotted;
in step 5, an abnormal ear index is calculated for each image block to represent the proportion of abnormal ears in that block (formula (6));
in formula (6), n and i denote, respectively, the total number of ears and the number of abnormal ears on the corresponding image block (or sampling frame);
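The DGYI and the abnormal ear index can be sketched as below. Formula (5) is not reproduced in this text, so the expression here (yellow approximated from red and green, contrast normalized by total visible reflection) and the abnormality threshold are assumptions for illustration only:

```python
def dgyi(r, g, b):
    """Green-yellow difference index for one pixel (sketch of formula (5)).
    Yellow is approximated as the mean of red and green (assumption), and
    the green-yellow contrast is normalized by the total reflection R+G+B."""
    y = (r + g) / 2.0                   # assumed yellow approximation
    total = r + g + b
    return (g - y) / total if total else 0.0

def abnormal_ear_index(ear_mean_dgyi, threshold=0.05):
    """Formula (6): proportion i/n of abnormal (green-tinged) ears on an
    image block; an ear is flagged abnormal when its mean DGYI exceeds
    the (assumed) threshold."""
    n = len(ear_mean_dgyi)
    i = sum(1 for v in ear_mean_dgyi if v > threshold)
    return i / n if n else 0.0
```

A green-tinged ear has G well above R, giving a positive DGYI, while a golden ear has R and G nearly balanced, giving a value near zero.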
in step 5, the relative error (RE) is used to evaluate the accuracy of the two features extracted from the ear segmentation results, ear number and ear size (formula (7)): RE = |V_Pre − V_GT| / V_GT;
where V_Pre and V_GT denote the predicted and true values of the ear number (or ear size), respectively;
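Formula (7) is elided in this text; its standard form, with the absolute value assumed so that over- and under-prediction are penalized alike, is simply:

```python
def relative_error(v_pre, v_gt):
    """Formula (7): relative error of the predicted ear number (or ear
    size) v_pre against the ground-truth value v_gt."""
    return abs(v_pre - v_gt) / v_gt
```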
in step 6, relationships between the three phenotypic features and kernel yield are established through machine learning regression algorithms;
in step 6, owing to the limited amount of data, model validation is performed by leave-one-out cross-validation;
in step 6, the loss function of all three models is the mean squared error (MSE), formula (8): MSE = (1/n) Σ (E_i − S_i)²;
in step 6, model evaluation uses the coefficient of determination (R²) and the relative root mean square error (rRMSE), expressed by formulas (9) and (10), respectively: R² = 1 − Σ (E_i − S_i)² / Σ (S_i − S̄)²; rRMSE = sqrt((1/n) Σ (E_i − S_i)²) / S̄;
in formulas (8), (9), and (10), n denotes the total number of samples in the validation set, E_i and S_i denote the i-th estimated and observed grain yields, respectively, and S̄ denotes the mean observed grain yield;
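The leave-one-out validation and the metrics of formulas (8)-(10) can be sketched in plain Python. The `fit`/`predict` callables stand in for the patent's regressors (multiple linear regression, support vector regression, random forest), which are not implemented here:

```python
import math

def mse(est, obs):
    """Formula (8): mean squared error between estimated and observed yields."""
    return sum((e - s) ** 2 for e, s in zip(est, obs)) / len(obs)

def r_squared(est, obs):
    """Formula (9): coefficient of determination R^2."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((e - s) ** 2 for e, s in zip(est, obs))
    ss_tot = sum((s - mean_obs) ** 2 for s in obs)
    return 1.0 - ss_res / ss_tot

def rrmse(est, obs):
    """Formula (10): root mean square error relative to the observed mean."""
    mean_obs = sum(obs) / len(obs)
    return math.sqrt(mse(est, obs)) / mean_obs

def leave_one_out_predictions(features, yields, fit, predict):
    """Leave-one-out cross-validation: hold out each sample once, fit the
    model on the remaining samples, and collect the held-out prediction."""
    preds = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = yields[:i] + yields[i + 1:]
        model = fit(train_x, train_y)
        preds.append(predict(model, features[i]))
    return preds
```

The held-out predictions are then scored with `r_squared` and `rrmse` against the observed yields, exactly as the patent describes for its three regressors.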
further, the photographing height in step 1 must be above the height of the wheat, and the collected images are orthographic images;
further, the image collection in step 1 should be performed in periods with good weather, low wind speed, and good illumination;
further, in step 1, before the second data set is photographed, the wheat plants inside the sampling frame must be handled carefully to avoid disturbing their original state, and the surroundings of the sampling area must be cleared;
further, the manual ear annotation in the third aspect of step 3 requires at least two operators working together, one responsible for drawing the masks and the other for checking the annotation results, so that subjective errors are avoided as far as possible;
further, the precision of formula (2) in step 4 is the proportion of correctly predicted ear instances among all predictions; its main purpose is to limit false alarms;
further, the recall of formula (3) in step 4 is the proportion of correctly predicted ear instances among all ground-truth instances; its main purpose is to limit missed detections;
further, in step 4, the specific data augmentation methods are image rotation (by 90, 180, and 270 degrees) and flipping (up-down and left-right);
further, in step 4, transfer learning is implemented by initializing the network with weights pre-trained on the COCO dataset;
further, in step 4, when evaluating the model, the classification confidence-score threshold is set to 0.6 to balance precision and recall;
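The augmentation recipe of step 4 (90/180/270-degree rotations plus up-down and left-right flips) can be sketched on a plain list-of-rows image; a real pipeline would apply the same transforms to the ear masks as well:

```python
def rotate90(image):
    """Rotate an image (list of pixel rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def augment(image):
    """Return the five augmented views used in step 4: rotations by 90,
    180, and 270 degrees, plus up-down and left-right flips."""
    r90 = rotate90(image)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    flip_ud = image[::-1]                    # up-down flip
    flip_lr = [row[::-1] for row in image]   # left-right flip
    return [r90, r180, r270, flip_ud, flip_lr]
```

These geometric transforms preserve ear shapes and labels exactly, so each annotated image yields six training samples (the original plus five views).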
further, the machine learning regression algorithms adopted in step 6 comprise multiple linear regression, support vector regression, and random forest regression;
further, in step 6, before the models are run, the dataset is normalized and randomly shuffled to improve model robustness;
With the above scheme, the method for estimating high-precision wheat grain yield using deep learning and phenotypic characteristics has the following advantages:
(1) The method uses actual phenotypic features of the wheat, namely ear number, ear size, and abnormal ear index, and estimates yield from a direct relationship established between crop phenotypic features and kernel yield. It is robust to interference from external environmental factors, is unaffected by the crop being in the flowering or grain-filling period, and achieves high-precision estimation even at high crop density; its estimates are more accurate and it has good anti-interference properties.
(2) The method establishes a more direct relationship with kernel yield based on phenotypic features; these features avoid noise caused by soil, weed shadows, and the like in the image, and are extracted directly from the organs directly related to kernel yield, so the yield estimation accuracy of the method is significantly higher than that of the prior art.
In summary, the disclosed method estimates yield from the actual phenotypic features of the wheat, based on a direct relationship between crop phenotypic features and kernel yield; it is robust to external environmental interference and extracts features directly from the organs related to kernel yield, so its yield estimates are more accurate and it has good anti-interference properties.
The conception, specific technical scheme, and technical effects of the invention are further described below in conjunction with specific embodiments, so that its objects, features, and effects can be fully understood.
Drawings
FIG. 1 is a flow chart of the method of the present invention for estimating grain yield based on phenotypic characteristics;
FIG. 2 (a) orthomosaic image of the Dongying station, (b) the wheat cultivation area of the Dongying station, (c) 10 sampling frames in a sampling area, (d) image of a sampling frame, (e) wheat ears collected in a sampling frame, (f) wheat kernels processed from a sampling frame;
FIG. 3 is a graph showing the relative frequency distribution of DGYI (green-yellow difference index) values of example 1 over the 400 image blocks in the training dataset;
FIG. 4 shows an example of the ear abnormality index of embodiment 1 of the present invention; (a) an image block instance on the training set, (b) four ear anomalies on the image are highlighted with red circles, (c) a marked ear mask, wherein normal ears are represented by blue polygons and ear anomalies are represented by red polygons;
FIG. 5 (a) is a relationship (red dot) between the number of ears of wheat and the grain yield observed in example 1 of the present invention, wherein the dark dotted line represents the fit line; (b) A relationship (red dot) between the observed spike count and the marked spike count, wherein the dark dotted line represents a fit line and the light dotted line represents a 1:1 relationship line;
FIG. 6 is an example of the spike segmentation using Mask R-CNN according to example 1 of the present invention; (a) and (d) show the original image blocks of relatively dense and sparse ears of wheat, respectively, (b) and (e) ear marking examples, (c) and (f) ear segmentation result display (green polygon);
FIG. 7 is a relationship between the three model predicted grain yields and the observed grain yield sums of example 1 of the present invention; (a) multiple linear regression; (b) support vector regression; (c) random forest regression.
Detailed Description
The following describes a number of preferred embodiments of the present invention to make its technical content clearer and easier to understand. The present invention may be embodied in many different forms; the embodiments described here are exemplary, and the scope of the invention is not limited to the embodiments set forth herein.
Example 1 estimation of wheat grain yield once was accomplished using the method of the present invention
This example 1 was carried out at the Dongying field station, Shandong Province, China (118°55'17.91"N, 37°40'17.94"E);
the Dongying station covers about 300 hectares and is planted with a number of different crops, including corn, wheat, sorghum, rice, soybean and the like; an orthomosaic image of the Dongying station is shown in fig. 2a;
the terrain at the Dongying station is flat, with an average elevation of about 6 meters; because the area lies on the alluvial plain of the Yellow River, it is a typical coastal saline-alkali land. About 30 hectares of wheat are planted at the station, divided into 18 fields; fig. 2b shows the wheat cultivation area of the Dongying station, including four unmanned aerial vehicle flight areas (blue boxes) and four sampling areas (orange boxes);
The wheat planted at the Dongying station is affected by saline-alkali stress to different degrees, and its average yield in 2020 was about 2500 kg/ha;
the flow of the estimation method for estimating the high-precision wheat grain yield based on the deep learning and the phenotypic characteristics, which is performed in the present embodiment 1, is shown in fig. 1;
firstly, carrying out step 1, shooting and collecting phenotype characteristic data of wheat grains;
the shooting and collection method adopted in this example is unmanned aerial vehicle flight photography; the flight experiment took place on 27-28 May 2020, and the shooting time was chosen between 11 a.m. and 1 p.m., when the weather conditions were good, the wind speed low, and the illumination favorable;
the unmanned aerial vehicle used was a commercial UAV manufactured in Shenzhen, China, and the images were captured with the UAV's onboard camera lens;
the images acquired by the unmanned aerial vehicle are orthographic images of 5472 × 3648 pixels with red, green and blue channels; the flight height was set at 3 meters above the ground, giving a ground resolution of 0.45 mm for the captured images;
in step 1, two sets of data are acquired. The first set serves as the training dataset for the deep learning model; in this example 1, 112 unmanned aerial vehicle images were captured at random positions within the four flight areas (the blue unmanned aerial vehicle flight areas shown in fig. 2b), without covering the sampling areas inside the flight areas; there is no overlap between the images, and the boundary effect of the field was taken into account;
In this example 1, the second set of data comprises 40 unmanned aerial vehicle images, each with one of the 40 sampling frames placed at the image center; before the images were captured, the wheat plants inside the sampling frames were handled carefully so as not to disturb their original state, and the wheat plants around the sampling frames were pushed away to clear the surrounding environment, as shown in fig. 2c and fig. 2d; fig. 2c shows 10 sampling frames in a sampling area; FIG. 2d is an image of one sampling frame;
in this example 1, the first set of data is used as a training data set for training the deep learning model, and the second set of data is used as verification data for verifying the performance of the model in ear segmentation, and extracting phenotypic characteristics and predicted yield in subsequent experiments;
the two data sets are summarized in the table below;
Table 1 Overview of the two data sets acquired by the unmanned aerial vehicle
Step 2 is then carried out; step 2 must be performed after the shooting in step 1 is completed, and the sampling areas must cover different yield gradients;
step 2 of this example 1 was performed on 28-31 May 2020; the wheat ear sampling test was conducted on the four sampling areas A, B, C and D shown in fig. 2b; these four carefully designed sampling areas approximately cover an increasing yield gradient (A < B < C < D), and boundary effects were taken into account;
During the wheat maturation period, 10 sampling frames of 0.50 m × 0.50 m were arranged in each of the 4 sampling areas, and all wheat ears in all sampling frames were collected, as shown in fig. 2c and 2d;
the wheat ears in each frame were counted after collection to obtain the ear number per frame, as shown in fig. 2e; the ear samples were then placed in an oven and baked at a constant 75 °C for two hours to facilitate the subsequent manual dehulling, as shown in fig. 2f;
after dehulling, the wheat kernels were baked in the oven at 75 °C for a further 48 hours to obtain the kernel dry weight; after normalization to a 13% moisture content, the kernel yield in the sampling frames was between 1334 and 5839 kg/ha;
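The per-frame yield computation can be sketched as follows. This is an illustrative helper, not the patent's own code, and it assumes that "normalization with 13% moisture content" means scaling the oven-dry weight to a 13% moisture basis; the 0.25 m² frame area comes from the 0.50 m × 0.50 m sampling frames.

```python
# Sketch: convert oven-dry kernel weight per sampling frame to kg/ha at 13% moisture.
# Assumption: normalization means dry weight / (1 - 0.13); helper name is illustrative.

FRAME_AREA_M2 = 0.50 * 0.50          # sampling frame area (m^2)
STANDARD_MOISTURE = 0.13             # standard grain moisture content

def frame_yield_kg_per_ha(dry_weight_g: float) -> float:
    """Convert oven-dry kernel weight (grams) in one frame to kg/ha at 13% moisture."""
    weight_at_13pct_g = dry_weight_g / (1.0 - STANDARD_MOISTURE)
    kg_per_m2 = weight_at_13pct_g / 1000.0 / FRAME_AREA_M2
    return kg_per_m2 * 10_000        # 1 ha = 10,000 m^2
```

With this convention, 100 g of dry kernels in one frame corresponds to roughly 4600 kg/ha, which is within the 1334-5839 kg/ha range reported above.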
Step 3, preprocessing the images captured in step 1; the preprocessing in step 3 has three aspects. The first aspect is: splitting the original data images captured in step 1, to prevent oversized images from reducing the efficiency of subsequent steps, and manually selecting a certain number of relatively high-quality images;
the training data of this example 1 are separated from the 112 original unmanned aerial vehicle images in the first set of data;
The unmanned aerial vehicle image size in this example is kept below 1200 × 1200 pixels, because larger images exceed the GPU memory of the graphics card, and the original unmanned aerial vehicle images clearly exceed that capacity; the original images must therefore first be divided into image blocks;
in this example, the 112 original images in the training set were divided into image blocks of 1200 × 1200 pixels or smaller (blocks cut at the image edges may be smaller); finally, 400 relatively high-quality image blocks were manually selected for training the wheat ear segmentation model, as shown in table 1;
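The tiling described above can be sketched in a few lines; this is an assumed implementation (the patent does not give code), producing non-overlapping blocks of at most 1200 × 1200 pixels, with smaller blocks at the image edges.

```python
# Sketch: split a large orthomosaic into blocks of at most `tile` x `tile` pixels.
# Edge blocks may be smaller, matching the description in the text.

def tile_bounds(width: int, height: int, tile: int = 1200):
    """Yield (left, top, right, bottom) boxes covering the image without overlap."""
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, min(left + tile, width), min(top + tile, height))

# For a 5472 x 3648 UAV image this produces a 5 x 4 grid of 20 blocks.
boxes = list(tile_bounds(5472, 3648))
```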
the second aspect of the pretreatment of step 3 is: processing the image shot in the second set of data in the step 1, and separating out only part in the sampling frame, so that the separated image completely covers the sampling frame and wheat ears except the sampling frame are removed;
in this example 1, the validation data are separated from the 40 unmanned aerial vehicle images in the second set of data; on these 40 original images, only the part inside the sampling frame is separated out, such that the separated image block completely covers the sampling frame while the wheat ears outside the frame are removed; the 40 separated sampling-frame image blocks are about 1000 to 1200 pixels in width and height, as shown in fig. 2d;
The third aspect of the preprocessing in step 3 is: manually marking the manually selected images from the first aspect and the processed images from the second aspect, marking all wheat ears on the images; during marking, all wheat awns are ignored and not counted;
in this example 1, all wheat ears on the 440 image blocks were manually marked; the marking software used for the ear masks was "labelme"; during the marking process, the wheat awns were all ignored and not included within the masks; two operators completed the ear marking together to minimize subjective errors as much as possible: one was responsible for marking the masks and the other for checking the marking results; approximately 85,000 wheat ears were marked across the 440 images over about one month;
step 4, substituting the preprocessed image in the step 3 into a deep learning model for iterative training to divide wheat ears, and evaluating the model performance;
in this example 1, in step 4, before phenotypic feature extraction, Mask R-CNN, the most classical instance segmentation algorithm, is adopted to segment the wheat ears;
Mask R-CNN works in two stages: a region proposal stage and a classification stage. The region proposal stage is carried out by a region proposal network (RPN), in which the model predicts a series of candidate bounding boxes; in the classification stage, the candidate boxes are classified into the predefined classes by softmax regression, and the masks of the instance objects are delineated by semantic segmentation;
Specifically, this example 1 uses ResNet-101, a deep residual network, to generate the feature maps, and uses a fully convolutional network (FCN) to delineate the masks of the ear instances;
the deep learning model in this example 1 was built on the TensorFlow framework, and the model was trained for 80,000 iterations (200 epochs) on the 400 training image blocks; the learning rate was set to 5×10^-3 initially, then reduced to 5×10^-4 after 10,000 iterations and to 5×10^-5 after 30,000 iterations;
Considering that objects are dense in the ear recognition scene, the maximum numbers of proposal boxes for the non-maximum suppression operation were set to 1200 and 600, respectively;
in this example 1, considering that the wheat ears in the training set are approximately 8-90 pixels in size, the base anchor size was reduced from the Mask R-CNN default of 256 pixels to 32 pixels;
Two model training techniques, data augmentation and transfer learning, were employed to improve model robustness and fitting speed; the specific data augmentation methods were image rotation (90, 180 and 270 degree rotations) and flipping (up-down and left-right flips); for transfer learning, network weights pre-trained on the COCO dataset were adopted; after training, the model was used to segment the ear instances on the 40 image blocks of the second set of data;
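The augmentation described above (three rotations plus two flips) can be sketched as follows; this is an illustrative NumPy version, not the patent's actual training pipeline, returning the original image plus five transformed variants.

```python
# Sketch: the rotation/flip augmentation named in the text, applied to one image array.
import numpy as np

def augment(image: np.ndarray):
    """Return the original image plus 90/180/270 degree rotations and both flips."""
    variants = [image]
    for k in (1, 2, 3):                  # 90, 180, 270 degree rotations
        variants.append(np.rot90(image, k))
    variants.append(np.flipud(image))    # up-down flip
    variants.append(np.fliplr(image))    # left-right flip
    return variants
```

In a real pipeline the same transform must of course be applied to the ear masks as well, so that masks stay aligned with the augmented images.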
The model performance evaluation in step 4 classifies the model predictions into three outcomes: true positive (TP), false positive (FP) and false negative (FN);
in step 4, the intersection over union (IoU) threshold, i.e. the overlap between a predicted instance and a true instance, can be expressed by formula (1);
the specific model performance is evaluated with accuracy (formula (2)) and recall (formula (3)), as follows;
the accuracy of formula (2) is the proportion of correctly predicted ear instances among all predictions, and mainly serves to limit false positives;
the recall of formula (3) is the proportion of correctly predicted ear instances among all true values, and serves to limit missed detections;
in step 4, since accuracy and recall each evaluate the model in only one respect, the F1 score is used to evaluate the overall performance of the model (formula (4));
in step 4, when evaluating the model, the confidence score threshold was set to 0.6 in order to balance accuracy and recall;
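The quantities in formulas (1)-(4) follow the standard definitions the text describes; a minimal sketch (assumed implementation, not the patent's code) is:

```python
# Sketch of the step-4 evaluation: IoU between two binary masks, then
# precision, recall and F1 from TP/FP/FN counts.
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Formula (1): intersection over union of two boolean masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(inter / union) if union else 0.0

def precision_recall_f1(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp)    # formula (2): limits false positives
    recall = tp / (tp + fn)       # formula (3): limits missed detections
    f1 = 2 * precision * recall / (precision + recall)   # formula (4)
    return precision, recall, f1
```

A predicted instance whose IoU with a marked instance exceeds the chosen threshold counts as a TP; unmatched predictions are FPs and unmatched marked ears are FNs.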
After the model training phase, the Mask R-CNN algorithm was used to segment the ear instances on the validation data; besides the truth values obtained from image marking, the observed truth values were also used to evaluate the instance segmentation results, see table 2 below. For the marking-based evaluation, the model achieved high overall performance, with an F1 score of 0.87; because accuracy and recall were relatively balanced (0.86 and 0.89, respectively), the model's predictions achieved some mutual cancellation between false positives and false negatives on the ear instances, as shown in fig. 6. For the observation-based evaluation, the model's results were considerably worse, with an F1 score of only 0.79, mainly due to a steep drop in recall from 0.89 to 0.73; the main reason for this is that the unmanned aerial vehicle images can only cover about 81% of the wheat ears in the field;
Although false negatives and false positives on ear instances weaken the model's performance, the ear number and ear size are only partially affected, owing to the mutual cancellation between the two; in the marking-based evaluation, the relative errors of the ear number and ear size remained small, 0.04 and 0.09 respectively (see table 2); in the observation-based evaluation, the relative error of the ear number was also only 0.15;
Table 2 Performance of Mask R-CNN model on validation dataset
Step 5, extracting the phenotype characteristics of the wheat ears from the images subjected to the cutting of the wheat ears in the step 4;
after the ear segmentation with Mask R-CNN, three phenotypic features must be extracted: the ear number, the ear size, and the ear abnormality index; it should be noted that these three features all relate to the wheat yield components (thousand-kernel weight, kernels per ear, and ear density);
first, because the ground resolution of the unmanned aerial vehicle images is essentially fixed, the ear number can be converted directly into the ear density;
second, the ear size is made up of the ear length and width, and this example 1 found that the number of kernels per ear is highly positively correlated with the ear length;
third, the ear abnormality index can effectively screen out the effective ears; ear fertility is highly related to wheat kernel yield, and when wheat fertility is poor (such as under premature senescence or malnutrition), the ears in the field often exhibit abnormal spectral characteristics (or color characteristics);
specifically, healthy ears should be golden yellow when the wheat is mature, whereas unhealthy ears tend to appear green; this example 1 therefore defines these green ears as abnormal ears and distinguishes them from normal ears by means of the ear abnormality index, reducing the bias in yield estimation;
FIG. 3 shows the relative frequency distribution of DGYI (green-yellow difference index) values over the 400 image blocks in the training dataset; the dashed circle in fig. 3 indicates a small peak in the distribution, and the vertical dashed line indicates the threshold DGYI value of 0.04;
the ear number and ear size can be obtained directly from the ear segmentation result, while the ear abnormality index must be further computed from that result;
first, the green-yellow difference index (difference between green and yellow index, DGYI) is used to determine the abnormal status of an ear. DGYI is essentially a normalization of the contrast between green and yellow light over the total visible light reflection, where yellow light can be regarded as a combination of red and green light, as expressed by formula (5);
where R, G, B and Y in formula (5) represent the reflection of red, green, blue and yellow light, respectively;
then, the DGYI values over all pixels of an ear instance are averaged to obtain the overall DGYI of that ear; the relative frequency distribution of DGYI for all ears on the 400 image blocks in the training dataset is shown in fig. 3; a small peak (dashed circle) beside the main peak can easily be found in this distribution, and the threshold between this small peak and the main peak lies at 0.04; on examining the training-set images, this example 1 found that ears with DGYI values greater than 0.04 were essentially the abnormal ears to be removed; therefore, when an ear's DGYI value on the image exceeds 0.04, it is added to the ear anomaly count (EAC), as shown in formula (11);
In step 5, an ear abnormality index is computed for each image block to represent the proportion of abnormal ears in that block, as expressed by formula (6);
where n and i in formula (6) denote the total number of ears and the ear index on the corresponding image block (or sampling frame), respectively;
in step 5, the relative error (RE) is used to evaluate the accuracy of the ear number and ear size extracted from the ear segmentation result, as expressed by formula (7);
where V_Pre and V_GT represent the predicted value and the true value of the ear number (or ear size), respectively;
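The step-5 computations can be sketched as follows. The exact form of formula (5) is an assumption: the text says yellow can be regarded as a combination of red and green, and here it is taken as their mean so that greener ears get positive DGYI values on a scale consistent with the 0.04 threshold; the helper names are illustrative.

```python
# Sketch: DGYI per ear (formula (5), assumed form), ear abnormality index per
# image block (formula (6)), and relative error (formula (7)).
import numpy as np

def dgyi(rgb: np.ndarray) -> float:
    """Mean green-yellow difference index over an ear instance's RGB pixels."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    y = (r + g) / 2.0                           # yellow as a red-green combination (assumed)
    per_pixel = (g - y) / (r + g + b + 1e-9)    # normalized by total visible reflection
    return float(per_pixel.mean())

def ear_anomaly_index(dgyi_values, threshold: float = 0.04) -> float:
    """Formula (6): proportion of abnormal ears (DGYI > threshold) in one image block."""
    eac = sum(1 for v in dgyi_values if v > threshold)   # ear anomaly count (EAC)
    return eac / len(dgyi_values)

def relative_error(v_pre: float, v_gt: float) -> float:
    """Formula (7): relative error between predicted and true value."""
    return abs(v_pre - v_gt) / v_gt
```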
fig. 4 shows an example of the ear anomaly index of the embodiment 1, in which (a) in fig. 4 represents an example of an image block on the training set, (b) four ear anomalies on the image are highlighted with red circles, and (c) a marked ear mask, in which normal ears are represented by blue polygons and ear anomalies are represented by red polygons;
step 6, establishing the relation between the three phenotype characteristics and the grain yield through a machine learning regression algorithm;
the machine learning regression algorithms of this example 1 were used to establish the relationship between the three phenotypic features and the grain yield; the three regression algorithms are multiple linear regression, support vector regression, and random forest regression, all implemented in the Python language; owing to the limited data volume (only 40 samples in the validation dataset), leave-one-out cross-validation was adopted for model validation; before model fitting, the dataset was normalized and randomly shuffled to improve model robustness;
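The leave-one-out scheme can be sketched with the simplest of the three regressors, multiple linear regression via least squares; this is an assumed implementation (the support vector and random forest variants are omitted for brevity), where each of the 40 samples is predicted from a model fit on the other 39.

```python
# Sketch: leave-one-out cross-validation with multiple linear regression.
# X holds the three ear features per sample; y the observed grain yields.
import numpy as np

def loocv_linear(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Predict each sample's grain yield from a model fit on the other n-1 samples."""
    n = len(y)
    preds = np.empty(n)
    Xb = np.column_stack([X, np.ones(n)])   # add intercept column
    for i in range(n):
        keep = np.arange(n) != i            # leave sample i out
        coef, *_ = np.linalg.lstsq(Xb[keep], y[keep], rcond=None)
        preds[i] = Xb[i] @ coef
    return preds
```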
In step 6, the loss function of all three models is the mean squared error, as expressed by formula (8);
in step 6, model evaluation uses the coefficient of determination (R^2) and the relative root mean square error (rRMSE), expressed by formulas (9) and (10), respectively;
where n in formulas (8), (9) and (10) denotes the total number of samples in the validation set; E_i and S_i are the i-th estimated and observed grain yield, respectively; and S̄ is the mean observed grain yield;
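The metrics in formulas (8)-(10) follow their standard definitions as described; a minimal sketch (assumed implementation) is:

```python
# Sketch of the step-6 metrics: MSE (formula (8)), R^2 (formula (9)),
# and rRMSE as a percentage of the mean observed yield (formula (10)).
import numpy as np

def mse(e: np.ndarray, s: np.ndarray) -> float:
    """Mean squared error between estimates e and observations s."""
    return float(np.mean((e - s) ** 2))

def r2(e: np.ndarray, s: np.ndarray) -> float:
    """Coefficient of determination."""
    ss_res = np.sum((s - e) ** 2)
    ss_tot = np.sum((s - s.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rrmse(e: np.ndarray, s: np.ndarray) -> float:
    """Root mean square error relative to the observed mean, in percent."""
    return float(np.sqrt(mse(e, s)) / s.mean() * 100.0)
```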
in step 6, this example 1 performed a preliminary analysis of the data using the observed grain yield, the observed ear number, and the ear number obtained from image marking;
first, the linear regression between the observed ear number and the observed kernel yield shows a high correlation between the two, with R^2 reaching 0.93 (fig. 5a);
FIG. 5 (a) shows the relationship (small dots) between the observed ear number and the kernel yield, where the dark dashed line is the fit line; (b) the relationship (small dots) between the observed ear number and the marked ear number, where the dark dashed line is the fit line and the light dashed line is the 1:1 line;
This strong relationship demonstrates the reliability of estimating kernel yield from ear-related phenotypic features; the ear number observed in the 40 sampling frames and the ear number obtained from the corresponding image markings were 11098 and 9024, respectively;
this difference indicates that the unmanned aerial vehicle images can observe only about 81% (9024/11098) of the wheat ears in an actual field;
finally, a highly consistent relationship was also obtained between the observed ear number and the marked ear number, with an R^2 of 0.93 (fig. 5b); this stable relationship ensures that the ear-related phenotypic features can be used to estimate kernel yield even though there is a sizable difference between the observed and marked ear numbers; moreover, the difference between the two ear numbers increases with the number of ears in the sampling frame (light dashed line in fig. 5b);
In step 6, driven by the three ear phenotypic features, three machine learning regression models were used to estimate the grain yield; overall, the estimation results of the three models differ little, as shown in fig. 7;
FIG. 7 shows the relationship between the grain yield predicted by the three models and the observed grain yield; (a) multiple linear regression; (b) support vector regression; (c) random forest regression; the dark dashed line is the fit line;
Support vector regression gave the best estimation result (R^2 = 0.89 and rRMSE = 16.01%), whereas multiple linear regression gave the worst (R^2 = 0.84 and rRMSE = 22.78%);
this indicates that a purely linear relationship does not describe well the link between the three phenotypic features and grain yield; for example, the ear number also carries information about the ear size: the ear number can indicate that the plants are under a certain stress (such as the saline-alkali stress in this example 1), in which case the ears are often malnourished and consequently smaller;
thus, the relationship between ear number and kernel yield is not simply linear; finally, the regression models predicted the grain yield well, even though the F1 score of the ear segmentation was only 0.79 (based on the observed truth values).
Comparative example, unmanned aerial vehicle grain yield estimation on a field scale using prior art
Table 3 lists the grain yield estimation results of 10 prior-art studies using different sensors, different crops, different canopy or phenotypic features, and different numbers of input features; the yield estimation results of these prior-art studies are reproduced in the table as reported;
Table 3 simple comparison of unmanned aerial vehicle seed yield estimation studies on a field scale
The symbols in table 3 above mean: MS: multispectral; TIR: thermal infrared; HS: hyperspectral; NIR: near infrared; VI: vegetation index; R: correlation coefficient; the symbol "+" indicates a multi-temporal feature.
Comprehensive analysis: comparing the data of the inventive example with the comparative examples in table 3, the method of the present invention achieves R^2 = 0.89 and rRMSE = 16.01%, the highest correlation among all the comparative examples; among the comparative examples with fewer input features than the present method, the best correlation is only R^2 = 0.82, still far below the present method; moreover, the present invention uses only 3 input features, far fewer than the maximum of 381 features input in the comparative examples; the method thus achieves excellent high-precision estimation of wheat grain yield with a small number of input features, and the accuracy of this example 1 significantly exceeds the prior art for the same task of wheat grain yield estimation.
In summary, according to the technical scheme, the actual phenotypic characteristics of wheat are utilized to estimate the yield; is based on the direct relationship established between the phenotypic characteristics of crops and the yield of seeds; the method is not easy to be interfered by external environment factors, and can directly extract the characteristics on organs directly related to the grain yield, so that the yield estimation result realized by the method is more accurate, and the method has good anti-interference characteristics.
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by a person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.
Claims (10)
1. A method for estimating high precision wheat grain yield using deep learning and phenotypic characteristics, comprising the steps of:
step 1, shooting and collecting phenotype characteristic data of wheat kernels;
step 2, after shooting and collecting in the step 1 are completed, sampling, counting and weighing wheat ears in a sampling area;
step 3, preprocessing the image shot and collected in the step 1;
step 4, substituting the preprocessed image in the step 3 into a deep learning model for iterative training to divide wheat ears, and evaluating the model performance;
step 5, extracting the phenotype characteristics of the wheat ears from the images subjected to the wheat ear segmentation in the step 4;
and 6, under the drive of the wheat ear phenotype characteristics extracted in the step 5, establishing a relation between the phenotype characteristics and the kernel yield, so as to estimate the kernel yield.
2. The method for estimating a yield of wheat grain with high precision according to claim 1,
step 1, shooting and collecting phenotype characteristic data of wheat grains at a certain height from the ground;
the shooting and collection in step 1 is divided into two sets of data; the first set is captured randomly in different areas of the wheat-growing region, without covering the sampling areas, and the captured images should not overlap;
the second set of data in the step 1 is to collect images of a sampling area of a wheat producing area;
in the step 1, the first set of data is used as a training data set for training the deep learning model, and the second set of data is used as verification data for verifying the performance of the model in the wheat head segmentation.
3. The method for estimating a yield of wheat grain with high precision according to claim 1,
step 2 must be performed after the shooting of step 1 is finished, and the sampling areas must cover different yield gradients;
step 2, selecting a certain sampling range in a sampling area during the mature period of wheat, collecting all wheat ears in the sampling range, and counting the wheat ears to obtain the number of the wheat ears;
After the wheat spike number is obtained in the step 2, the wheat spike needs to be subjected to preliminary dehydration treatment, so that the follow-up shelling treatment is facilitated;
and step 2, carrying out unshelling treatment after primary dehydration, carrying out secondary dehydration after unshelling to obtain the dry weight of the seeds, and obtaining the yield of the seeds in a sampling range after normalization by using the water content.
4. The method for estimating a yield of wheat grain with high precision according to claim 1,
the pretreatment in the step 3 is divided into three aspects; the first aspect is: dividing the original data image shot and collected in the step 1, and preventing the oversized image from influencing the working efficiency of the subsequent step; and manually picking a certain number of relatively high quality images;
the second aspect of the pretreatment of step 3 is: processing the image shot in the second set of data in the step 1, and separating out only part in the sampling frame, so that the separated image completely covers the sampling frame and wheat ears except the sampling frame are removed;
the third aspect of the preprocessing in step 3 is: manually marking the manually selected images from the first aspect and the processed images from the second aspect, marking all wheat ears on the images; during marking, all wheat awns are ignored and not counted.
5. The method for estimating a yield of wheat grain with high precision according to claim 1,
the deep learning model in step 4 generates a feature map, and a fully convolutional network is adopted to delineate the masks of the wheat ear instances;
in step 4, model robustness and fitting speed are improved through two model training techniques, data augmentation and transfer learning;
in step 4, after the model is trained, the second set of data captured in step 1, preprocessed as in step 3, is input into the model, and the wheat ear instances on the image blocks are segmented;
the model performance evaluation in the step 4 may determine the model prediction result as three results, namely True Positive (TP), false Positive (FP) and False Negative (FN);
the step 4, the determined intersection set threshold (intersection over union, ioU), i.e. the intersection set between the predicted instance and the true instance, can be expressed by formula (1);
the specific model performance is evaluated based on accuracy (formula (2)) and recall (formula (3)), and the formulas (2) and (3) are as follows;
in the step 4, since the accuracy and recall rate evaluate only the model in one aspect, the F1score is used to evaluate the overall performance formula (4) of the model;
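The evaluation described above can be sketched as follows. This is an illustrative sketch, not the patented implementation: formulas (1)-(4) are not reproduced in the claims, so the standard definitions of IoU, precision, recall and F1 are assumed, and masks are modeled as sets of pixel coordinates.

```python
def iou(pred_mask, true_mask):
    """Assumed formula (1): |A ∩ B| / |A ∪ B| over pixel sets."""
    union = len(pred_mask | true_mask)
    return len(pred_mask & true_mask) / union if union else 0.0

def match_instances(pred_masks, true_masks, iou_threshold=0.5):
    """Greedily match predictions to ground truth; a prediction whose IoU
    with an unmatched ground-truth ear reaches the threshold is a TP."""
    tp, matched = 0, set()
    for p in pred_masks:
        for j, t in enumerate(true_masks):
            if j not in matched and iou(p, t) >= iou_threshold:
                tp += 1
                matched.add(j)
                break
    fp = len(pred_masks) - tp   # predictions with no matching ear
    fn = len(true_masks) - tp   # ears the model missed
    return tp, fp, fn

def precision(tp, fp):
    return tp / (tp + fp)       # assumed formula (2): limits false alarms

def recall(tp, fn):
    return tp / (tp + fn)       # assumed formula (3): limits missed ears

def f1_score(p, r):
    return 2 * p * r / (p + r)  # assumed formula (4): overall performance
```

The greedy matching here is a common simplification; evaluation toolkits typically match each prediction to the ground-truth instance with the highest IoU instead.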
6. The method for estimating wheat grain yield with high precision according to claim 1,
the phenotypic features in the step 5 comprise the number of wheat ears, the size of the wheat ears and the abnormal wheat ear index;
the number and size of the wheat ears in the step 5 can be obtained from the wheat ear segmentation results of the step 4;
the abnormal wheat ears in step 5 are defined as follows: when wheat is ripe, healthy wheat ears should be golden yellow, while unhealthy wheat ears tend to appear green; these green ears are therefore defined as abnormal ears;
the abnormal wheat ear index in the step 5 can effectively screen out effective wheat ears; ear fertility is highly related to wheat grain yield, and when wheat fertility is poor (for example, due to premature ripening or malnutrition), ears in the field often exhibit abnormal spectral (or color) characteristics;
the abnormal state of the wheat ears in the step 5 can be judged using the green-yellow difference index (difference between green and yellow index, DGYI);
the DGYI in step 5 is essentially the contrast between green and yellow light normalized by the total visible-light reflection, where yellow light can be regarded as the sum of red and green light (formula (5));
wherein R, G, B and Y in the formula (5) represent the reflection of red, green, blue and yellow light, respectively;
the DGYI values over all pixels of a wheat ear instance are averaged to obtain the overall DGYI of that ear, and the relative frequency distribution of DGYI over all ears can be plotted;
in the step 5, an abnormal ear index is calculated for each image block to represent the proportion of abnormal ears in that image block, which can be expressed by formula (6);
7. The method for estimating wheat grain yield with high precision according to claim 1,
wherein in the formula (6), n and i represent the total number of wheat ears and the number of abnormal wheat ears, respectively, on the corresponding image block (or sampling frame);
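A minimal sketch of the DGYI and abnormal-ear-index computations above. Formulas (5) and (6) are not reproduced in the claims, so this assumes one plausible reading: with yellow taken as Y = R + G per the claim text, the green-yellow contrast normalized by total visible reflection is DGYI = (2G − Y)/(R + G + B) = (G − R)/(R + G + B); the abnormality threshold of 0.0 is illustrative only.

```python
def pixel_dgyi(r, g, b):
    """Assumed formula (5): green-yellow contrast over total reflection,
    with yellow taken as Y = R + G per the claim text."""
    total = r + g + b
    y = r + g
    return (2 * g - y) / total if total else 0.0

def ear_dgyi(pixels):
    """Average DGYI over all pixels of one segmented ear instance."""
    return sum(pixel_dgyi(r, g, b) for r, g, b in pixels) / len(pixels)

def abnormal_ear_index(ear_dgyis, threshold=0.0):
    """Assumed formula (6): i abnormal (greenish) ears out of n ears
    on one image block (or sampling frame)."""
    n = len(ear_dgyis)
    i = sum(1 for d in ear_dgyis if d > threshold)
    return i / n
```

Under this form, a greenish (abnormal) ear with G > R yields a positive DGYI, while a golden-yellow (healthy) ear with R > G yields a negative one, which matches the color contrast described in the claims.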
in the step 5, the relative error (RE) is used to evaluate the accuracy of the two features, ear number and ear size, extracted from the wheat ear segmentation results, which can be expressed by formula (7);
wherein V_Pre and V_GT represent the predicted value and the true value of the wheat ear number (or wheat ear size), respectively;
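The relative-error check can be sketched as follows. Formula (7) is not reproduced in the claims, so the usual definition RE = |V_Pre − V_GT| / V_GT is assumed, and the ear counts below are made up for illustration.

```python
def relative_error(v_pre, v_gt):
    """Assumed formula (7): relative error of a predicted ear count
    (or ear size) against the ground truth."""
    return abs(v_pre - v_gt) / v_gt

# Hypothetical example: 47 ears predicted in a sampling frame that
# actually contains 50 ears -> RE of 6%.
re_count = relative_error(47, 50)
```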
in the step 6, relations between the three phenotypic features and grain yield are established through machine learning regression algorithms;
in the step 6, due to the limited amount of data, model validation is performed by leave-one-out cross-validation;
in the step 6, the loss functions of the three models are all the mean squared error, which can be expressed by formula (8);
in the step 6, the model evaluation uses the coefficient of determination (R²) and the relative root mean square error (rRMSE), which can be expressed by formula (9) and formula (10), respectively;
wherein n in the above formulas (8), (9) and (10) represents the total number of samples in the validation set; E_i and S_i are the i-th estimated and observed grain yield, respectively; and S̄ is the average observed grain yield.
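An illustrative sketch of this validation and evaluation loop, not the patented implementation: a single-feature linear regressor stands in for the three regression algorithms named in the claims, and the standard definitions of MSE, R² and rRMSE = RMSE / S̄ are assumed for formulas (8)-(10), since the formulas themselves are not reproduced here.

```python
def fit_simple_linreg(x, y):
    """Closed-form least squares for y = a*x + b (one feature only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def loocv_predict(x, y):
    """Leave-one-out: predict each sample from a model fit on the rest."""
    preds = []
    for i in range(len(x)):
        a, b = fit_simple_linreg(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a * x[i] + b)
    return preds

def mse(est, obs):
    """Assumed formula (8): mean squared error over n samples."""
    return sum((e - s) ** 2 for e, s in zip(est, obs)) / len(obs)

def r2(est, obs):
    """Assumed formula (9): coefficient of determination."""
    mean_s = sum(obs) / len(obs)
    ss_res = sum((s - e) ** 2 for e, s in zip(est, obs))
    ss_tot = sum((s - mean_s) ** 2 for s in obs)
    return 1 - ss_res / ss_tot

def rrmse(est, obs):
    """Assumed formula (10): RMSE relative to the mean observed yield."""
    mean_s = sum(obs) / len(obs)
    return mse(est, obs) ** 0.5 / mean_s
```

With a perfectly linear relation between the feature and yield, every held-out sample is predicted exactly, giving MSE and rRMSE of 0 and an R² of 1.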
8. The method for estimating wheat grain yield with high precision according to claim 1,
the shooting height in the step 1 must be higher than the wheat, and the captured images are orthographic images;
in the step 1, image acquisition should be carried out during periods with good weather, low wind speed and good illumination conditions;
in the step 1, before capturing the second set of data, the wheat plants inside the sampling frame must be handled carefully to avoid disturbing their original state, and the environment around the sampling area must be cleared;
the manual annotation of wheat ears in the third aspect of the step 3 requires at least two operators working together, one responsible for drawing the masks and the other for checking the annotation results, so that subjective errors can be avoided as far as possible.
9. The method for estimating wheat grain yield with high precision according to claim 1,
the precision formula (2) in the step 4 is the proportion of correctly predicted wheat ear instances among all predictions, and its main purpose is to limit false alarms;
the recall formula (3) in the step 4 is the proportion of correctly predicted wheat ear instances among all ground-truth instances, and its main purpose is to limit missed detections;
in the step 4, the specific data augmentation methods are image rotation (by 90, 180 and 270 degrees) and flipping (up-down and left-right);
in the step 4, transfer learning is implemented by adopting network weights pre-trained on the COCO dataset;
in the step 4, the classification confidence score threshold is set to 0.6 in order to balance precision and recall when evaluating the model.
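The augmentation operations named above can be sketched as follows. The image is modeled as a row-major 2D list purely for illustration; a real training pipeline would apply the same transforms to the annotation masks, and the rotation direction (clockwise here) is an assumption since the claims do not specify it.

```python
def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_ud(img):
    """Flip up-down (reverse row order)."""
    return img[::-1]

def flip_lr(img):
    """Flip left-right (reverse each row)."""
    return [row[::-1] for row in img]

def augment(img):
    """Original plus the five variants used in training: rotations by
    90/180/270 degrees and up-down / left-right flips."""
    r90 = rot90(img)
    r180 = rot90(r90)
    r270 = rot90(r180)
    return [img, r90, r180, r270, flip_ud(img), flip_lr(img)]
```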
10. The method for estimating wheat grain yield with high precision according to claim 1,
in the step 6, the machine learning regression algorithms adopted include multiple linear regression, support vector regression and random forest regression;
in the step 6, before model training, the dataset is normalized and randomly shuffled to improve the robustness of the model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211503201.4A CN116579446B (en) | 2022-11-28 | 2022-11-28 | Method for estimating high-precision wheat grain yield by using phenotypic characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116579446A true CN116579446A (en) | 2023-08-11 |
CN116579446B CN116579446B (en) | 2024-03-26 |
Family
ID=87532831
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211503201.4A Active CN116579446B (en) | 2022-11-28 | 2022-11-28 | Method for estimating high-precision wheat grain yield by using phenotypic characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116579446B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111259925A (en) * | 2020-01-08 | 2020-06-09 | 安徽大学 | Method for counting field wheat ears based on K-means clustering and width mutation algorithm |
CN111832507A (en) * | 2020-07-20 | 2020-10-27 | 安徽大学 | Wheat head top spectrum information-based wheat scab remote sensing identification method |
US20210100173A1 (en) * | 2018-02-16 | 2021-04-08 | 9282181 Canada Inc. | System and method for growing plants and monitoring growth of plants |
CN113222991A (en) * | 2021-06-16 | 2021-08-06 | 南京农业大学 | Deep learning network-based field ear counting and wheat yield prediction |
CN113689490A (en) * | 2021-09-01 | 2021-11-23 | 崔宁博 | Fruit tree leaf area index inversion method based on random forest algorithm |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111798327A (en) * | 2020-06-24 | 2020-10-20 | 安徽大学 | Construction method and application of wheat yield calculation model based on hyperspectral image |
CN112345458A (en) * | 2020-10-22 | 2021-02-09 | 南京农业大学 | Wheat yield estimation method based on multispectral image of unmanned aerial vehicle |
2022-11-28: application CN202211503201.4A filed; granted as patent CN116579446B (status: Active)
Non-Patent Citations (4)
Title |
---|
SUN ZHIGANG等: "The Optimal Phenological Phase of Maize for Yield Prediction with High-Frequency UAV Remote Sensing", 《REMOTE SENSING》, vol. 14, no. 07, pages 1 - 18 * |
TORRES-TELLO等: "Identifying Useful Features in Multispectral Images with Deep Learning for Optimizing Wheat Yield Prediction", 《 2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS)》, pages 1 - 5 * |
WANG DONGLIANG et al.: "Multivariate Analysis and Evaluation of Agronomic Traits and Quality Characteristics of Wheat", 《NEW AGRICULTURE》, pages 21 - 22 *
XIAO LUJIE: "Hyperspectral Remote Sensing Monitoring of the Growth, Physiology and Yield of Winter Wheat under Drought Stress", 《CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, AGRICULTURAL SCIENCE AND TECHNOLOGY》, pages 043 - 7 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Di Gennaro et al. | A low-cost and unsupervised image recognition methodology for yield estimation in a vineyard | |
US20230292647A1 (en) | System and Method for Crop Monitoring | |
Calou et al. | The use of UAVs in monitoring yellow sigatoka in banana | |
Garcia-Ruiz et al. | Sugar beet (Beta vulgaris L.) and thistle (Cirsium arvensis L.) discrimination based on field spectral data | |
CN111462223B (en) | Sentinel-2 image-based planting area identification method for soybeans and corns in Jianghuai region | |
CN110889394A (en) | Rice lodging recognition method based on deep learning UNet network | |
CN112906477B (en) | Irrigation prescription map inversion method based on unmanned aerial vehicle spectral data | |
Kumari et al. | Soybean cropland mapping using multi-temporal sentinel-1 data | |
Tubau Comas et al. | Automatic apple tree blossom estimation from UAV RGB imagery | |
CN117193347B (en) | Unmanned aerial vehicle flight height control method and device, electronic equipment and storage medium | |
Kutugata et al. | Seed rain potential in late-season weed escapes can be estimated using remote sensing | |
Sankaran et al. | Can High-Resolution Satellite Multispectral Imagery Be Used to Phenotype Canopy Traits and Yield Potential in Field Conditions? | |
Reza et al. | Automatic counting of rice plant numbers after transplanting using low altitude uav images | |
CN112329733B (en) | Winter wheat growth monitoring and analyzing method based on GEE cloud platform | |
CN116579446B (en) | Method for estimating high-precision wheat grain yield by using phenotypic characteristics | |
Gong et al. | Vineyard identification in an oak woodland landscape with airborne digital camera imagery | |
CN115035423B (en) | Hybrid rice parent and parent identification extraction method based on unmanned aerial vehicle remote sensing image | |
CN115165773A (en) | Rice area extraction method based on Google Earth Engine | |
Ashourloo et al. | Wheat yield prediction based on Sentinel-2, regression, and machine learning models in Hamedan, Iran | |
CN114782835A (en) | Crop lodging area proportion detection method and device | |
Marsujitullah et al. | Health Analysis of Rice Plants Based on the Normalized Difference Vegetation Index (NDVI) Value in Image of Unmanned Aircraft (Case Study of Merauke—Papua Selatan) | |
Papić et al. | On Olive Groves Analysis using UAVs | |
Rojo et al. | Estimating photosynthetically active radiation intercepted by almond and walnut trees using uav-captured aerial images and solar zenith angle | |
Ye et al. | Inter-relationships between canopy features and fruit yield in citrus as detected by airborne multispectral imagery | |
Champagne et al. | Exploiting spectral variation from crop phenology for agricultural land-use classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||