CN117789037A - Crop growth period prediction method and device - Google Patents

Crop growth period prediction method and device

Info

Publication number
CN117789037A
CN117789037A
Authority
CN
China
Prior art keywords
crop
image
prediction
features
loss function
Prior art date
Legal status
Pending
Application number
CN202410185298.1A
Other languages
Chinese (zh)
Inventor
Yu Jingxin
Ren Ni
Lyu Zhiyuan
Li Youli
Wu Qian
Shan Feifei
Liu Changbin
Current Assignee
Jiangsu Academy of Agricultural Sciences
Intelligent Equipment Technology Research Center of Beijing Academy of Agricultural and Forestry Sciences
Original Assignee
Jiangsu Academy of Agricultural Sciences
Intelligent Equipment Technology Research Center of Beijing Academy of Agricultural and Forestry Sciences
Priority date
Filing date
Publication date
Application filed by Jiangsu Academy of Agricultural Sciences, Intelligent Equipment Technology Research Center of Beijing Academy of Agricultural and Forestry Sciences filed Critical Jiangsu Academy of Agricultural Sciences
Priority to CN202410185298.1A priority Critical patent/CN117789037A/en
Publication of CN117789037A publication Critical patent/CN117789037A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of image recognition and provides a crop growth period prediction method and device. The method comprises the following steps: acquiring a crop image to be detected, wherein the crop image to be detected comprises at least one type of crop; and inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result. The crop prediction model is obtained by performing multi-task training on a target classification network, taking as training features the microscopic features of the crops in sample crop images, the planting spacing between different crops, and the time sequence information corresponding to the sample crop images, with a multi-task learning loss function as the training function; the multi-task learning loss function is determined based on a cross entropy loss function and a mean square error loss function. The method can accurately identify different types of crops early in the seedling stage and improves the accuracy of crop classification and growth prediction, thereby providing powerful support for seedling stage management in greenhouse agriculture.

Description

Crop growth period prediction method and device
Technical Field
The invention relates to the technical field of image recognition, in particular to a crop growth period prediction method and device.
Background
Crop identification and growth monitoring are key technologies for the intellectualization of facility agriculture: they enable automatic acquisition and analysis of information such as the type, quantity, growth condition and maturity of crops in a facility, and provide a scientific basis for production management, market prediction and risk assessment in facility agriculture. However, identifying crops and monitoring their growth in facilities is difficult and complex because of characteristics such as high planting density, complex crop appearance, long crop growth periods and changeable growing environments.
In the related art, traditional crop identification and growth monitoring methods mainly focus on the later stages of crop growth; the lack of traceability and quality control of crop varieties leads to low crop safety and traceability. Moreover, when existing prediction models adopt end-to-end prediction from acquired data to crop growth parameters, the extracted features have weak characterization ability, so the generalization and prediction abilities of the models are insufficient.
Disclosure of Invention
The invention provides a crop growth period prediction method and device to overcome the defects of the prior art, namely that monitoring focuses mainly on the later stages of crop growth, that crop varieties lack traceability and quality control so that crop safety and traceability are low, and that the features extracted by existing prediction models have weak characterization ability so that the generalization and prediction abilities of the models are insufficient, thereby improving the accuracy of crop classification and growth prediction.
The invention provides a crop growth period prediction method, which comprises the following steps:
acquiring a crop image to be detected, wherein the crop image to be detected comprises at least one type of crop;
inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result;
the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multi-tasking learning loss function is determined based on a cross entropy loss function and a mean square error loss function.
According to the crop growth period prediction method provided by the invention, the crop prediction model is obtained through the following steps:
acquiring a sample crop image, wherein the sample crop image comprises at least one type of crop;
extracting mask features from the sample crop images and segmenting them to obtain segmented images of different crops, and determining separable-seedling-stage images according to the statistical analysis quantities of each crop in the segmented images;
classifying the images in the seedling stage according to a self-attention mechanism to obtain classified images, and respectively extracting microscopic features of each crop, planting intervals among different crops and time sequence information from the classified images;
performing feature fusion on the microscopic features and the planting spacing to obtain fusion features;
and performing multi-task training on the target classification network according to the fusion characteristics, performing time sequence judgment on the target classification network through the time sequence information, and obtaining a crop prediction model under the condition that the target classification network is converged so as to realize the category prediction and the growing period prediction of crops.
According to the crop growth period prediction method provided by the invention, extracting mask features from the sample crop image and segmenting them to obtain segmented images of different crops comprises the following steps:
and generating an instance mask from the pixel level of the sample crop image according to an SOLOv2 network, and optimizing the instance mask according to a mask loss function to obtain the segmented image.
According to the crop growth period prediction method provided by the invention, determining the separable-seedling-stage images according to the statistical analysis quantities of each crop in the segmented images comprises the following steps:
sequentially performing morphology-based image erosion, image dilation and noise filtering on the segmented image to obtain a filtered image;
counting the number of each crop and the area of the corresponding region in the filtered image with a connected domain labelling algorithm based on depth-first search to obtain statistical data;
calculating the average height and leaf area index of each crop in the statistical data based on an image color space conversion algorithm, and determining the segmented image as a separable-seedling-stage image when the average height exceeds a height threshold and the leaf area index exceeds an index threshold;
wherein the height threshold and the index threshold are determined based on the growth cycle, climatic features, historical data and expert knowledge of each crop in the segmented image.
According to the crop growth period prediction method provided by the invention, classifying the separable-seedling-stage images according to a self-attention mechanism comprises the following steps:
extracting global features from the separable-seedling-stage images according to an image classification model of the Vision Transformer, and classifying the separable-seedling-stage images according to the global features to obtain leaf vegetable classification images and fruit vegetable classification images; wherein the image classification model of the Vision Transformer is constructed based on a self-attention mechanism.
According to the crop growth period prediction method provided by the invention, respectively extracting the microscopic features of each crop, the planting spacing between different crops and the time sequence information from the classified images comprises the following steps:
extracting the microscopic features from the classified images based on a first convolutional neural network, and extracting the planting spacing from the classified images based on a second convolutional neural network; the time sequence information is extracted from the classified images based on a time series prediction network.
According to the crop growth period prediction method provided by the invention, after the crop prediction model is obtained, the method further comprises the following steps:
and sequentially carrying out operations of model pruning, quantization and distillation on the crop prediction model to obtain an optimized crop prediction model.
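As an illustration of the quantization step only, the following is a minimal sketch using TensorFlow Lite post-training quantization; the pruning and distillation operations described above would respectively precede and follow it, and are not shown.

```python
import tensorflow as tf

def quantize_model(model: tf.keras.Model) -> bytes:
    """Post-training quantization sketch for a trained crop prediction model."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]  # weight quantization
    return converter.convert()  # serialized, quantized TFLite flatbuffer
```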
The invention also provides a crop growth period prediction device, which comprises:
the image acquisition module is used for acquiring an image of a crop to be detected, wherein the image of the crop to be detected comprises at least one type of crop;
the prediction module is used for inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result;
the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multi-tasking learning loss function is determined based on a cross entropy loss function and a mean square error loss function.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the crop growth phase prediction method as described above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method of crop growth phase prediction as described in any of the above.
According to the crop growth period prediction method and device, the crop prediction model, trained with the microscopic features of each crop, the planting spacing between different crops and the time sequence information corresponding to the sample crop images as training features and with the multi-task learning loss function, performs different prediction tasks on the crop image to be detected, yielding more accurate crop category identification results and crop growth period prediction results. Different types of crops can thus be accurately identified early in the seedling stage, and the accuracy of crop classification and growth prediction is improved, thereby providing powerful support for seedling stage management in greenhouse agriculture.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a crop growth period prediction method provided by the invention;
FIG. 2 is a second flow chart of the crop growth phase prediction method according to the present invention;
FIG. 3 is a schematic structural diagram of a crop growth period prediction device according to the present invention;
FIG. 4 is a second schematic structural diagram of a crop growth period prediction device according to the present invention;
fig. 5 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The crop growth phase prediction method and apparatus of the present invention are described below with reference to fig. 1 to 4.
Fig. 1 is a schematic flow chart of the crop growth period prediction method provided by the present invention; as shown in fig. 1, the method comprises the following steps:
Step 110, obtaining an image of a crop to be detected, wherein the image of the crop to be detected comprises at least one type of crop.
In this step, the crop image to be measured includes a crop picture in a designated planting area (for example, a greenhouse or a farming greenhouse, etc.) photographed by the camera, and image photographing time information.
In this embodiment, monitoring cameras are installed at the top of the greenhouse; the number, positions and shooting angles of the cameras are determined according to the size and shape of the greenhouse so that the cameras cover the whole cultivation area; the cameras are calibrated using their internal and external parameters to eliminate distortion and aberration in the images; finally, overhead pictures of the greenhouse are acquired at fixed time intervals (e.g., daily or weekly) and the current time stamp is recorded.
In this embodiment, information such as the size and shape of the greenhouse, and the range of the cultivation area can be obtained from the BIM (Building Information Modeling, building information model); or using a measurement tool to make a field measurement.
In the embodiment, a proper monitoring camera is selected, and the number, the positions and the shooting angles of the cameras are determined to acquire an image of the crop to be detected; specifically, according to the environmental conditions of the greenhouse, waterproof, dustproof, antifogging and shockproof cameras are selected to ensure the definition and stability of images. According to the imaging principle and model of the camera, proper parameters such as resolution, focal length, field angle and the like are selected so as to ensure the quality and information quantity of the image.
In this embodiment, the camera may be an RGB camera with a resolution of not less than 1920×1080 (1080P), which provides clear image quality, a focal length of 2.8 mm, and a field of view of not less than 90°; a higher-resolution camera, for example a 2K (2560×1440) or 4K (3840×2160) camera, may also be selected.
In this embodiment, the storage format of the RGB camera should retain the original information of the image while compressing its size; suitable formats include JPEG (Joint Photographic Experts Group) and PNG (Portable Network Graphics).
In this embodiment, it is optional to mount at least one camera at four corners and at a central position of the greenhouse, ensuring that the shooting angle of the camera can cover more views.
In this embodiment, camera calibration is used to acquire the internal parameters (intrinsic parameters) and external parameters (extrinsic parameters) of the camera and to eliminate distortion and aberration in the image. The internal parameters include the principal point, focal length and distortion coefficients of the camera, and the external parameters include the rotation and translation of the camera. The objective of camera calibration is to solve the following equation:
x=K[R∣t]X
where x is an image coordinate, X is a world coordinate, K is the intrinsic matrix, and R and t are the extrinsic parameters. The above equation is solved using world coordinates and their corresponding image coordinates; the world coordinates can be obtained from a calibration object such as a checkerboard or a circle grid. Feature points on the calibration object are detected using the functions findChessboardCorners or findCirclesGrid in the OpenCV library, and the camera parameters are solved using calibrateCamera or solvePnP.
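As a concrete illustration, the following is a minimal OpenCV calibration sketch under the assumptions above; the checkerboard size and the image paths are illustrative placeholders rather than values from the patent.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners of the checkerboard (assumed)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []  # world coordinates X, image coordinates x
for path in glob.glob("calib/*.jpg"):  # hypothetical calibration images
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Solve x = K[R|t]X for the intrinsics K, the distortion coefficients,
# and the per-view extrinsics R, t; then undistort a sample image.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
undistorted = cv2.undistort(cv2.imread("calib/sample.jpg"), K, dist)
```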
In this embodiment, by optimizing the number, position and shooting angle of the cameras, the cameras are enabled to cover the whole cultivation area of the greenhouse; in particular, the problem is modeled as a combinatorial optimization problem, a set of camera positions is selected from a larger pool of camera positions given a maximum number of cameras, the square error between the desired coverage and the actual coverage is minimized, and the nonlinear cost function is converted to a mixed integer linear programming problem. Using the camera lens model, projecting the view of the camera onto a 3D voxel map to calculate coverage scores, which makes optimizing the problem viable in real environments; wherein the mixed integer linear programming problem can be solved using a tool such as Gurobi or CPLEX.
In this embodiment, at fixed time intervals (daily), an overhead picture of the greenhouse is taken and the current time stamp is recorded; the shooting action of the camera is triggered by a timer or a sensor, and the pictures are acquired and saved using the VideoCapture or imwrite functions in the OpenCV library. The naming format of the pictures is YYYYMMDDHHMMSS.jpg, where YYYY is the year, MM the month, DD the date, HH the hour, MM the minute and SS the second; the metadata of a picture (time stamp, camera number, camera position, etc.) can be written and read using tools such as ExifTool or PyExifTool.
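A minimal timed-capture sketch consistent with this naming scheme is given below; the camera index and save directory are illustrative assumptions.

```python
import time
from datetime import datetime
import cv2

cap = cv2.VideoCapture(0)      # greenhouse camera (assumed device index)
interval = 24 * 60 * 60        # fixed interval: one overhead shot per day

while True:
    ok, frame = cap.read()
    if ok:
        # Name the picture YYYYMMDDHHMMSS.jpg as described above.
        stamp = datetime.now().strftime("%Y%m%d%H%M%S")
        cv2.imwrite(f"snapshots/{stamp}.jpg", frame)
    time.sleep(interval)
```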
Step 120, inputting the crop image to be detected into a crop prediction model to obtain a crop category prediction result and a crop growth period prediction result; the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multitasking learning loss function is determined based on the cross entropy loss function and the mean square error loss function.
In this step, the microscopic features include different growing parts of the crop, for example the branches, leaves and fruits of the plant, as well as the height, shape, color, smell, etc. of different crops.
In this embodiment, the timing information includes different growth phases of the crop, for example, a time stamp corresponding to a germination phase, a time stamp corresponding to a seedling phase, a time stamp corresponding to a flowering phase, and a time stamp corresponding to a fruiting phase.
In this embodiment, the multi-task learning loss function may be a weighted sum of the loss functions of the plurality of tasks, e.g., the multi-task learning loss function includes a cross entropy loss function for the loss function as a common crop discrimination task and a mean square error loss function for the loss function as a crop growth maturity prediction task.
In this embodiment, the target classification network comprises a convolutional neural network or a deep convolutional neural network, for example a CNN (convolutional neural network).
In this embodiment, for each separable-seedling-stage picture, a GoogLeNet network based on the Inception module is used to realize the fusion of multi-dimensional data and multi-task learning. The multi-dimensional data include the image color histogram, the first-level crop classification result, microscopic crop features, macroscopic planting spacing, time sequence information and the like; the multiple tasks include common crop discrimination and crop growth maturity prediction. According to the characteristics of the data set, this embodiment selects suitable hyperparameters such as the loss function, optimizer and learning rate, and suitable strategies such as data augmentation, regularization and pre-training, to train and optimize the CNN network and improve the performance and generalization ability of the model, so as to accurately predict the category and growth period of crops.
According to the crop growth period prediction method provided by the embodiment of the invention, the crop prediction model obtained by training the microscopic features of each crop, the planting intervals among different crops and the time sequence information corresponding to the sample crop image is used as training features and the multi-task learning loss function to predict different tasks on the crop image to be detected, so that more accurate crop category identification results and crop growth period prediction results are obtained, different types of crops can be accurately identified at the early stage of the crop seedling period, the accuracy of crop classification and growth prediction is improved, and therefore powerful support is provided for seedling period management of greenhouse agriculture.
In some embodiments, the crop prediction model is obtained by: acquiring a sample crop image, wherein the sample crop image comprises at least one type of crop; extracting mask features from the sample crop image and segmenting them to obtain segmented images of different crops, and determining separable-seedling-stage images according to the statistical analysis quantities of each crop in the segmented images; classifying the separable-seedling-stage images according to a self-attention mechanism to obtain classified images, and respectively extracting the microscopic features of each crop, the planting spacing between different crops and the time sequence information from the classified images; performing feature fusion on the microscopic features and the planting spacing to obtain fusion features; and performing multi-task training on the target classification network according to the fusion features, performing time sequence judgment on the target classification network through the time sequence information, and obtaining the crop prediction model when the target classification network converges, so as to realize crop category prediction and growth period prediction.
In this embodiment, the sample crop image includes a crop picture of a greenhouse or a cultivation greenhouse, etc. captured by the camera, and the image capturing time information, and the obtaining way and the size setting are consistent with the technical means for obtaining the crop image to be detected in the above embodiment, which is not described in detail.
In this embodiment, image segmentation and classification are achieved for each sample crop image by using an advanced deep-learning-based object segmentation network that generates instance masks directly from the pixel level and automatically learns the features of the image, where each instance corresponds to one crop.
In this embodiment, the advanced target segmentation network based on deep learning includes a convolutional neural network such as a SOLOv2 (Segmenting Objects by Locations v 2) network.
In this embodiment, the statistical analysis amount of each crop includes the number, category, and area of each crop in the sample crop image, and also includes statistics of the height and leaf area index of each crop.
In this embodiment, after the crop is transplanted into the cultivation area, the transplanting time stamp T_0 is first recorded, and the crop quantity in the cultivation area is dynamically counted and analyzed using an image processing algorithm. For each segmented image, noise and interference in the image are eliminated using morphology-based image erosion and dilation operations, and the number and area of crops in the image are then counted using a connected domain labelling algorithm based on depth-first search. Next, an image color space conversion algorithm is used to calculate, from the counted number and area of crops, the discrimination indices of the separable seedling stage, such as the average height and leaf area index, to further determine whether the current image is a separable-seedling-stage image.
In this embodiment, each separable-seedling-stage image is classified using a network constructed on a self-attention mechanism; for example, a ViT-G/14 network based on the Transformer structure automatically learns the global features in the separable-seedling-stage images and classifies them. The self-attention-based neural network classifies the crops in the facility into common leafy vegetables and fruit vegetables.
In this embodiment, after the classified images (for example, leaf vegetable classification images and fruit vegetable classification images) are obtained, the microscopic features of the crops are identified, the planting spacing is detected and the time sequence information is calculated from them; the target classification network is then trained on the features fused from the microscopic features and the planting spacing, while the multiple pieces of time sequence information of the crops serve as labels to predict the growth periods corresponding to the different classification results, giving the growth period prediction results of the different crops.
In this embodiment, the target classification network is a GoogLeNet network based on the Inception module; for each separable-seedling-stage picture, the fusion of multi-dimensional data and multi-task learning are realized by the following steps:
(1) Using the input layer of the GoogLeNet network, the pixel values of an image are converted into a matrix X of size N×D, where N is the number of blocks of the image and D is the dimension of each block.
Specifically, an image of 224×224 size is used and divided into 64 segments, each segment having a dimension of 64.
(2) A position coding vector is added to the block representation vector using the position coding layer of the GoogLeNet network to add position information; a learnable position coding vector of size D is added to the representation vector of each block to obtain a matrix X′ of size N×D.
(3) The block representation vector is fused with the multi-dimensional data using the Inception modules of the GoogLeNet network to obtain a new block representation vector.
Specifically, the present invention uses 9 Inception modules, each containing 1×1, 3×3 and 5×5 convolution layers and a 3×3 max pooling layer, with the 1×1 convolution layers serving as dimension-reduction operations; the outputs of the modules are concatenated into a new block representation matrix X″.
(4) The block representation vector is mapped to the output vectors of multiple tasks using the multi-task learning layer of the GoogLeNet network, where the output vector of each task has as many entries as that task has categories.
In this embodiment, the multiple tasks may include two tasks, namely common crop discrimination and crop growth maturity prediction. For the common crop discrimination task, a fully connected layer and a Softmax layer map the block representation matrix X″ into a matrix Y of size N×C, where C is the number of common crop categories; for the crop growth maturity prediction task, a fully connected layer and a linear regression layer map the block representation matrix X″ into a matrix Z of size N×R, where R is the dimension of the growth maturity prediction value.
(5) For each separable-seedling-stage picture, the output of the GoogLeNet network, i.e., the multi-task output vectors, is used to realize common crop identification and crop growth maturity prediction.
Specifically, this embodiment uses a voting mechanism that weights the per-block output vectors by the block areas to obtain a scalar representing the probability that the picture belongs to a certain class or takes a certain predicted value. The following formula represents this process:

P = (∑_{i=1}^{N} O_i·A_i) / (∑_{i=1}^{N} A_i)

where P is the output probability of the picture, O_i is the output vector of the i-th block, A_i is the area of the i-th block, and N is the number of blocks.
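A minimal NumPy sketch of this area-weighted voting is given below; the block count and class count are illustrative assumptions.

```python
import numpy as np

def vote(outputs: np.ndarray, areas: np.ndarray) -> np.ndarray:
    """P = sum_i(O_i * A_i) / sum_i(A_i): area-weighted average of block outputs."""
    return (outputs * areas[:, None]).sum(axis=0) / areas.sum()

P = vote(np.random.rand(64, 5), np.random.rand(64))  # 64 blocks, 5 classes
```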
In this embodiment, a specific method of training and optimizing the target classification network (CNN network) is as follows:
(1) For the loss function, a multi-task learning loss function, namely a weighted sum of the loss functions of a plurality of tasks is adopted; specifically, the cross entropy loss function is used as the loss function of a common crop distinguishing task, and the mean square error loss function is used as the loss function of a crop growth maturity prediction task.
(2) A learnable weight vector is used to balance the contributions of the loss functions of the different tasks, which is expressed by the following equation:

L = ∑_{j=1}^{M} w_j·L_j

where L is the multi-task loss function, w_j is the weight of the loss function of the j-th task, L_j is the loss function of the j-th task, and M is the number of tasks.
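The following is a hedged Keras sketch of such a weighted multi-task loss, with cross entropy for the crop discrimination task and mean square error for the maturity prediction task; note that Keras `loss_weights` are static, so the learnable weight vector described above would require a custom training loop, and all layer sizes and weight values here are illustrative assumptions.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(1024,))  # fused block features (assumed size)
hidden = tf.keras.layers.Dense(256, activation="relu")(inputs)
cls_out = tf.keras.layers.Dense(5, activation="softmax", name="crop_class")(hidden)
reg_out = tf.keras.layers.Dense(1, name="maturity")(hidden)

model = tf.keras.Model(inputs, [cls_out, reg_out])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    # L = w_1 * CrossEntropy + w_2 * MSE, the weighted sum described above
    loss={"crop_class": "sparse_categorical_crossentropy", "maturity": "mse"},
    loss_weights={"crop_class": 0.6, "maturity": 0.4},  # illustrative w_j
)
```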
(3) For the optimizer, an Adam optimizer is used to update the parameters of the CNN network, with an initial learning rate of 0.001, an attenuation factor of 0.9, a momentum factor of 0.99, an epsilon value of 1e-8 and a weight decay factor of 0.0005. The update is represented by:

θ_t = θ_{t−1} − α·( m̂_t / (√v̂_t + ε) + λ·θ_{t−1} )

where θ_t is the parameter at the t-th iteration, α is the learning rate, m̂_t is the first moment estimate at the t-th iteration, v̂_t is the second moment estimate at the t-th iteration, ε is a small constant, and λ is the weight decay factor.
(4) For the learning rate, an exponential decay strategy is used to dynamically adjust the learning rate; this embodiment uses an initial learning rate of 0.001, a decay rate of 0.95 and a decay step of 1000. The process is represented by the following formula:

α_t = α_0 · γ^⌊t/s⌋

where α_t is the learning rate at the t-th iteration, α_0 is the initial learning rate, γ is the decay rate, s is the decay step, and ⌊·⌋ is the round-down (floor) function.
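This staircase schedule maps directly onto tf.keras, as in the sketch below using the values quoted above.

```python
import tensorflow as tf

schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001,  # alpha_0
    decay_steps=1000,             # s
    decay_rate=0.95,              # gamma
    staircase=True,               # apply the floor(t/s) in the formula
)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule, epsilon=1e-8)
```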
In this embodiment, a "feature-crop-growth period" three-stage cascaded neural network is constructed according to the different data sources and tasks; specifically, the TensorFlow 2.6 deep learning framework is used to build the following network models:
(1) Feature network: this embodiment uses the advanced deep-learning-based target segmentation network SOLOv2 to realize segmentation and classification of the monitoring-site images, with each instance corresponding to one crop; the SOLOv2 network is constructed and trained using the Keras API.
Specifically, a pre-trained ResNet50 is adopted as the backbone network, Focal Loss as the loss function, an RPN as the candidate region generator and Mask R-CNN as the instance segmenter; techniques such as multi-scale training and testing and gradient accumulation are also used to improve the performance and stability of the model. This embodiment uses a ViT-G/14 network based on the Transformer structure to classify the overhead images of the sunlight greenhouse, dividing the facility crops into common leafy vegetables and fruit vegetables; the ViT-G/14 network is constructed and trained with the TensorFlow Hub API, using the pre-trained ViT-G/14 as the classifier together with a cross entropy loss function, an AdamW optimizer, a learning rate decay strategy, and regularization methods such as Dropout and Label Smoothing. This embodiment uses a deep-learning-based ResNet network to identify microscopic features in the side-shot images; the ResNet network is constructed and trained with the Keras API, using a pre-trained ResNet50 as the feature extractor, a fully connected layer and a Softmax layer as the classifier, a cross entropy loss function, an SGD optimizer, and techniques such as batch normalization and Dropout. Detection of macroscopic features in the side-shot images is realized with a deep-learning-based YOLOX network, constructed and trained with the TensorFlow Object Detection API, using a pre-trained YOLOX-s as the target detector, YOLO Loss as the loss function, an SGD optimizer, Mosaic data augmentation, EMA model averaging and the like.
(2) Crop network: the time sequence features of the crops are calculated using a deep-learning-based BiLSTM network, constructed and trained with the Keras API; a bidirectional LSTM layer serves as the time sequence modeller, and a fully connected layer and a Sigmoid layer as the regressor, with a mean square error loss function, an Adam optimizer, a learning rate decay strategy and techniques such as Dropout and Batch Normalization.
(3) Growth period network: this embodiment uses the GoogLeNet network based on the Inception module to realize the fusion of multi-dimensional data and multi-task learning; the multi-dimensional data include the image color histogram, the first-level crop classification result, microscopic crop features, macroscopic planting spacing, time sequence information and the like; the multiple tasks include common crop discrimination and crop growth maturity prediction. The GoogLeNet network is constructed and trained with the Keras API, using the pre-trained GoogLeNet for feature fusion, multiple auxiliary classifiers and a main classifier, a weighted cross entropy loss function, an RMSProp optimizer, batch normalization and Dropout.
(4) The training parameters of this embodiment are set as follows: the batch size is dynamically set to 8 or 4 according to the network size, the momentum factor is set to 0.95, the cosine annealing decay coefficient is 0.005 and the initial learning rate is 0.001; during training the learning rate is dynamically optimized using an Adam optimizer with cosine annealing.
Fig. 2 is a second flow chart of the crop growth period prediction method provided by the invention. In the embodiment shown in fig. 2, a sample crop image is collected first, the region of interest of the image is segmented, the crops are monitored according to the segmentation result, a seedling stage threshold is counted, and the separable seedling stage T_n is determined from the transplanting time T_0 corresponding to the segmentation result. According to the seedling stage threshold and the separable seedling stage T_n, first-level crop classification is carried out; microscopic features (corresponding to crop feature identification), planting spacing (corresponding to macroscopic planting spacing) and time sequence information are respectively extracted according to the classification result, and second-level crop classification with multi-dimensional data fusion is carried out according to these features, so as to realize crop type identification and growth parameter monitoring.
According to the crop growth period prediction method provided by the embodiment of the invention, the crop prediction model is constructed as a "feature-crop-growth period" three-stage cascaded neural network. The model breaks through the traditional static and isolated visual detection method: a shooting time stamp is introduced, time sequence information is fused, and multi-dimensional data fusion is realized, forming a reliable seedling-stage crop identification model. A specific crop is described from the three levels of time sequence change, microscopic features and macroscopic morphology, making the image of the crop full and three-dimensional, while the effective information in crop pictures is fully mined, improving the accuracy and robustness of the model.
In some embodiments, extracting mask features from a sample crop image and segmenting them to obtain segmented images of different crops comprises: generating an instance mask from the pixel level of the sample crop image according to a SOLOv2 network, and optimizing the instance mask according to a mask loss function to obtain the segmented image.
In this embodiment, for each sample crop image, the SOLOv2 network is used to directly generate instance masks from the pixel level and automatically learn the features of the image, without predefining regions of interest (ROIs), so as to achieve image segmentation and classification. Specifically, an overhead picture, i.e., a sample crop image, is obtained from the greenhouse and the current time stamp is recorded; for each overhead picture, the SOLOv2 network realizes image segmentation and classification. The SOLOv2 network structure is expressed as:
SOLOv2(I) = FPN(Backbone(I)) = {F_2, F_3, F_4, F_5, F_6}

where I is the input image, {F_2, …, F_6} is the output feature pyramid, F_i is the feature map of the i-th level, Backbone is the backbone network, and FPN is the feature pyramid network.
In this embodiment, the SOLOv2 network consists of the following parts:
(1) A backbone network of a fully convolutional network (FCN) is used to extract multi-scale feature maps, i.e., features of different levels, from the input image. The invention uses ResNet-50 as the backbone network, which outputs features of 5 levels, namely P2, P3, P4, P5 and P6, with resolutions of 1/4, 1/8, 1/16, 1/32 and 1/64 of the input image respectively. The output of the backbone network can be expressed as:

{P_2, P_3, P_4, P_5, P_6} = Backbone(I)
(2) A feature pyramid network (FPN) is used to fuse the features of different levels and enhance their expressive ability; the SOLOv2 network uses a BiFPN structure to realize adaptive feature fusion through a self-attention mechanism and weighted feature fusion. The weighted feature fusion of BiFPN is represented by the formula:

O = ∑_i ( w_i / (ε + ∑_j w_j) ) · I_i

where I_i are the input features of the different levels, w_i are learnable non-negative fusion weights, and ε is a small constant.
(3) Category branches, used to predict the category probability of each pixel.
Specifically, the category branches consist of 5 parallel convolution layers, each corresponding to a level of features, with C output channels, where C is the number of crop categories. The outputs of the category branches are 5 category probability maps of sizes H/4×W/4×C, H/8×W/8×C, H/16×W/16×C, H/32×W/32×C and H/64×W/64×C, where H and W are the height and width of the input image. The category probability maps output by the category branches are expressed as:

Ĉ_i = CategoryBranch(F_i)

where Ĉ_i is the category probability map of the i-th level, and CategoryBranch is the category branch.
(4) The loss function of the category branch is the Focal Loss, which is used to address class imbalance. The Focal Loss is defined as:

FL = −(1/N) ∑_{i=1}^{N} ∑_{c=1}^{C} y_{i,c} · (1 − p_{i,c})^γ · log(p_{i,c})

where N is the total number of pixels, C is the number of crop categories, y_{i,c} is the true label of the c-th category of the i-th pixel, p_{i,c} is the predicted probability of the c-th category of the i-th pixel, and γ is an adjustment parameter that reduces the weight of easily classified samples.
(5) Mask branches, used to predict the mask probability of each pixel.
Specifically, the mask branches consist of 5 parallel convolution layers, each corresponding to a level of features, with S×S output channels, where S is the size of the mask. The outputs of the mask branches are 5 mask probability maps of sizes H/4×W/4×S×S, H/8×W/8×S×S, H/16×W/16×S×S, H/32×W/32×S×S and H/64×W/64×S×S. The output of the mask branch may be expressed as:

M̂_i = MaskBranch(F_i)

where M̂_i is the mask probability map of the i-th level, and MaskBranch is the mask branch. The loss function of the mask branch is the Dice loss, used to measure the similarity between the predicted mask and the real mask; it is expressed as:

L_Dice = 1 − (2·∑_i p_i·m_i) / (∑_i p_i² + ∑_i m_i²)

where N is the total number of pixels, m_i is the true mask value of the i-th pixel, and p_i is the predicted mask value of the i-th pixel.
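A minimal TensorFlow sketch of these two branch losses, as defined above, is given below; the γ value and the clipping constant are illustrative assumptions.

```python
import tensorflow as tf

def focal_loss(y_true, p_pred, gamma=2.0, eps=1e-7):
    """FL = -(1/N) * sum_i sum_c y_ic * (1 - p_ic)^gamma * log(p_ic)."""
    p = tf.clip_by_value(p_pred, eps, 1.0 - eps)
    return -tf.reduce_mean(
        tf.reduce_sum(y_true * tf.pow(1.0 - p, gamma) * tf.math.log(p), axis=-1))

def dice_loss(m_true, p_pred, eps=1e-7):
    """L_Dice = 1 - 2*sum(p_i*m_i) / (sum(p_i^2) + sum(m_i^2))."""
    inter = tf.reduce_sum(p_pred * m_true)
    denom = tf.reduce_sum(tf.square(p_pred)) + tf.reduce_sum(tf.square(m_true))
    return 1.0 - 2.0 * inter / (denom + eps)
```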
In this embodiment, the training process of the SOLOv2 network is as follows:
First, the input image is scaled to a fixed size of 800×800, and data augmentation is performed by random cropping, horizontal flipping, color jittering, etc.; second, the input image and the corresponding ground truth are fed into the SOLOv2 network, the outputs of the category branch and the mask branch are calculated and compared with the ground truth to compute the loss function, and the parameters of the network are updated with a stochastic gradient descent (SGD) optimizer so as to minimize the loss function.
In this embodiment, the super parameters (Hyperparameters) used by the training process of the SOLOv2 network are as follows: the initial learning rate (Initial Learning rate) was 0.01, the Momentum (Momentum) was 0.9, the Weight Decay (Weight Decay) was 0.0001, the Batch size (Batch size) was 16, the training round number (epochs) was 36, and the learning rate Decay strategy (Learning Rate Decay Policy) was Cosine Annealing.
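A sketch of this optimizer setup in tf.keras is given below; the steps-per-epoch figure is an assumed, dataset-dependent value, and the 0.0001 weight decay would be applied per layer (e.g., via kernel regularizers), since the SGD optimizer in TF 2.6 has no decoupled weight-decay argument.

```python
import tensorflow as tf

steps_per_epoch = 500  # assumption; depends on dataset size and batch size 16
schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.01,        # initial learning rate quoted above
    decay_steps=36 * steps_per_epoch,  # cosine annealing over 36 epochs
)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)
```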
According to the crop growth period prediction method provided by the embodiment of the invention, the example mask is generated from the pixel level of the sample crop image through the SOLOv2 network, and the example mask is optimized according to the mask loss function, so that the segmented image is obtained, the workload and the cost of data marking are greatly reduced, and the data utilization efficiency and quality are improved.
In some embodiments, determining the separable-seedling-stage images from the statistical analysis quantities of each crop in the segmented images comprises: sequentially performing morphology-based image erosion, image dilation and noise filtering on the segmented image to obtain a filtered image; counting the number of crops and the area of the corresponding regions in the filtered image with a connected domain labelling algorithm based on depth-first search to obtain statistical data; calculating the average height and leaf area index of each crop in the statistical data based on an image color space conversion algorithm, and determining the segmented image as a separable-seedling-stage image when the average height exceeds a height threshold and the leaf area index exceeds an index threshold; wherein the height threshold and the index threshold are determined based on the growth cycle, climatic features, historical data and expert knowledge of each crop in the segmented image.
In this embodiment, after the crop is transplanted into the cultivation area, the transplanting time stamp T_0 is first recorded, and the crop quantity in the cultivation area is dynamically counted and analyzed using an image processing algorithm; for each segmented image, noise and interference in the image are eliminated using morphology-based image erosion and dilation operations, and the number and area of crops in the image are then counted using a connected domain labelling algorithm based on depth-first search. The specific steps are as follows:
(1) For each segmented image, a binarized instance mask map (Instance Mask Map) is generated using the SOLOv2 network output, i.e., the class probability map and the mask probability map.
(2) For each pixel, if its class probability is greater than a threshold and its mask probability is greater than a threshold, it is marked as 1, otherwise marked as 0, resulting in a binarized instance mask map M of size H x W, where H and W are the height and width of the input image.
(3) For each binarized instance mask map, morphology-based image erosion and dilation operations are used to eliminate noise and interference in the image.
Specifically, the image erosion operation compares each pixel in the image with a structuring element: if any pixel in the area covered by the structuring element is 0, the original pixel is set to 0; otherwise it is kept unchanged. The erosion operation can eliminate small white areas in the image, i.e., isolated noise points. The image dilation operation compares each pixel in the image with a structuring element: if any pixel in the area covered by the structuring element is 1, the original pixel is set to 1; otherwise it is kept unchanged. The dilation operation can fill small black areas in the image, i.e., small cracks or holes.
In this embodiment, the functions erode and dilate in the OpenCV library are used to realize the image erosion and dilation operations; a rectangular structuring element of size 3×3 is used, and the erosion operation is performed first, followed by the dilation operation, to obtain the denoised instance mask map M′. The calculation is represented by the following formula:

M′ = dilate(erode(M, K), K)

where M is the binarized instance mask map, M′ is the denoised instance mask map, K is the structuring element, and erode and dilate are the image erosion and dilation operations.
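A minimal OpenCV sketch of this denoising step follows, using the 3×3 rectangular structuring element described above.

```python
import cv2
import numpy as np

K = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # 3x3 rectangle

def denoise_mask(M: np.ndarray) -> np.ndarray:
    """M' = dilate(erode(M, K), K): erosion removes isolated white noise,
    dilation then restores the surviving crop regions."""
    return cv2.dilate(cv2.erode(M, K), K)
```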
(4) For each denoised example mask map, counting the number and the area of crops in the image by using a connected domain marking algorithm based on depth-first search; the connected domain labeling algorithm is an algorithm that assigns each pixel in an image a label such that pixels having the same label belong to the same connected domain, i.e., adjacent regions having the same pixel value.
Specifically, the connected domain labelling algorithm is implemented with the function connectedComponents in the OpenCV library. Four-connectivity (4-neighbour) is used as the pixel adjacency relation, i.e., only the pixels adjacent in the four directions above, below, left and right of a pixel are considered. With depth-first search (DFS), each pixel is traversed from the upper left corner of the image; if an unlabelled pixel with value 1 is found, it is taken as the starting point of a new connected domain and assigned a new label, and DFS marks the adjacent pixels with value 1 with the same label while recording the number of pixels of the connected domain, i.e., the area of the crop. This yields a label map L of size H×W and a list N of crop numbers and areas. The calculation is represented by the following formulas:

L, N = connectedComponents(M′)

where M′ is the denoised instance mask map, L is the label map, N is the list of crop numbers and areas, and connectedComponents is the connected domain labelling algorithm.

N = {(n_1, a_1), (n_2, a_2), …, (n_k, a_k)}

where n_i is the label of the i-th connected domain, a_i is the number of pixels (i.e., the area) of the i-th connected domain, and k is the total number of connected domains.
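A sketch of this counting step is given below; it uses OpenCV's connectedComponentsWithStats, which additionally returns the per-component pixel areas needed for the list N.

```python
import cv2
import numpy as np

def count_crops(M_denoised: np.ndarray):
    """Return the crop count and the (label, area) list N described above."""
    num, labels, stats, _ = cv2.connectedComponentsWithStats(
        M_denoised.astype(np.uint8), connectivity=4)  # 4-neighbour adjacency
    # Component 0 is the background; components 1..num-1 are crop instances.
    N = [(i, int(stats[i, cv2.CC_STAT_AREA])) for i in range(1, num)]
    return num - 1, N
```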
In this embodiment, for each segmented image, a cvtColor function is used to implement the image color space conversion algorithm, and the HSV, RGB and CMYK color histograms of each instance are calculated separately; a calcHist function then computes the image color histograms using an image histogram analysis algorithm. Finally, according to the growth cycle and climatic features of the crops, combined with historical data and expert knowledge, the separable-seedling-stage thresholds, such as the average crop height and leaf area index, are determined; when the crop quantity reaches the separable-seedling-stage threshold, the picture is recorded as separable seedling stage and the time stamp T_n is stored.
In this embodiment, for each binarized instance mask map, the cvtColor function implements the conversion of the image color space, and the HSV, RGB and CMYK color histograms of each instance are calculated separately.
It should be noted that an image color space conversion algorithm converts one color representation into another; in this embodiment, the conversion is implemented with the function cvtColor in the OpenCV library. The following formulas represent the conversion of the image color space:
I_HSV = cvtColor(M, COLOR_BGR2HSV)
I_RGB = cvtColor(M, COLOR_BGR2RGB)
I_CMYK = cvtColor(M, COLOR_BGR2CMYK)

where M is the binarized instance mask map; I_HSV, I_RGB and I_CMYK are the converted images; cvtColor is the image color space conversion function; and COLOR_BGR2HSV, COLOR_BGR2RGB and COLOR_BGR2CMYK are the conversion flags.
In this embodiment, for each converted image, a calcHist function is used to calculate the image color histogram with an image histogram analysis algorithm. The image histogram is a graph of pixel intensity (on the x-axis) versus number of pixels (on the y-axis); the x-axis contains all available gray levels, and the y-axis gives the number of pixels having a particular gray value. The calculation of the image color histograms is represented by the following formulas:
H_HSV = calcHist(I_HSV, [0,1,2], M, [180,256,256], [0,180,0,256,0,256])
H_RGB = calcHist(I_RGB, [0,1,2], M, [256,256,256], [0,256,0,256,0,256])
H_CMYK = calcHist(I_CMYK, [0,1,2,3], M, [256,256,256,256], [0,256,0,256,0,256,0,256])

where I_HSV, I_RGB and I_CMYK are the converted images; H_HSV, H_RGB and H_CMYK are the image color histograms; calcHist is the image histogram analysis function; [0,1,2] and [0,1,2,3] are the indices of the color channels; M is the binarized instance mask map; [180,256,256], [256,256,256] and [256,256,256,256] are the sizes of the histograms; and [0,180,0,256,0,256], [0,256,0,256,0,256] and [0,256,0,256,0,256,0,256] are the ranges of the histograms.
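A hedged sketch of the conversion and histogram steps follows; note that OpenCV provides no COLOR_BGR2CMYK flag, so only the HSV and RGB parts are shown as directly runnable, and the image path is an illustrative placeholder.

```python
import cv2

img = cv2.imread("snapshots/sample.jpg")  # illustrative path
mask = None                               # or the binarized instance mask M

I_HSV = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
I_RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

H_HSV = cv2.calcHist([I_HSV], [0, 1, 2], mask,
                     [180, 256, 256], [0, 180, 0, 256, 0, 256])
H_RGB = cv2.calcHist([I_RGB], [0, 1, 2], mask,
                     [256, 256, 256], [0, 256, 0, 256, 0, 256])
```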
In this embodiment, the separable-seedling-stage threshold, the average crop height, the leaf area index and the like are determined based on the growth cycle and climatic features of the crops combined with historical data and expert knowledge. When the crop quantity reaches the separable-seedling-stage threshold, the picture is recorded as separable seedling stage and the time stamp T_n is stored. Whether an image is in the separable seedling stage can be judged by the following formulas:
(1) Calculate the average height of the crops:

H_avg = (1/N) ∑_{i=1}^{N} H_i

where H_avg is the average height of the crops, H_i is the height of the i-th crop, and N is the number of crops.
(2) Calculate the leaf area index of the crops:

L_avg = (1/N) ∑_{i=1}^{N} L_i

where L_avg is the average leaf area index of the crops, and L_i is the leaf area index of the i-th crop.
In this embodiment, whether the separable-seedling-stage thresholds are reached is judged by assuming that θ_H and θ_L are the height and leaf area index thresholds of the separable seedling stage; for example, if the current crop image satisfies H_avg ≥ θ_H and L_avg ≥ θ_L, the current crop image is determined to be a separable-seedling-stage image.
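A minimal sketch of this threshold test is given below; all inputs are illustrative placeholders.

```python
import numpy as np

def is_separable_seedling_stage(heights, lai, theta_H, theta_L) -> bool:
    """True when H_avg >= theta_H and L_avg >= theta_L."""
    return np.mean(heights) >= theta_H and np.mean(lai) >= theta_L
```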
According to the crop growth period prediction method provided by the embodiment of the invention, by sequentially performing morphology-based image erosion, image dilation and noise filtering on the segmented images and then counting the number and corresponding area of crops in the filtered images with a depth-first-search-based connected domain labelling algorithm, accurate statistical data are obtained, which can improve the accuracy and efficiency of separable-seedling-stage image identification.
In some embodiments, classifying the separable-seedling-stage images according to a self-attention mechanism to obtain classified images includes: extracting global features from the separable-seedling-stage images according to an image classification model of the Vision Transformer, and classifying the separable-seedling-stage images according to the global features to obtain leaf vegetable classification images and fruit vegetable classification images; wherein the image classification model of the Vision Transformer is constructed based on a self-attention mechanism.
In this embodiment, the image classification model of the Vision Transformer comprises a ViT-G/14 network based on the Transformer structure, through which the global features of the image are automatically learned and the classification of the image is achieved, as follows:
(1) For each separable-seedling-stage picture, the input layer of the ViT-G/14 network converts the pixel values of the image into a matrix X of size N×D, where N is the number of blocks of the image and D is the dimension of each block; this embodiment uses a 16×16 convolution kernel to divide the image into 196 blocks, each block having a dimension of 768. The following formula represents this process:
X = conv(I, W, S)

where I is the matrix of pixel values of the image, W is the weight matrix of the convolution kernel, S is the stride of the convolution kernel, and conv is the convolution operation.
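A sketch of this patch-embedding step in tf.keras follows: a 16×16 convolution with stride 16 turns a 224×224 image into 196 block vectors of dimension 768, matching the figures quoted above.

```python
import tensorflow as tf

image = tf.random.uniform((1, 224, 224, 3))        # placeholder input I
patch_embed = tf.keras.layers.Conv2D(
    filters=768, kernel_size=16, strides=16)       # W with stride S = 16
X = tf.reshape(patch_embed(image), (1, 196, 768))  # N = 196 blocks, D = 768
```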
(2) For each block, adding a position coding vector to the representation vector of the block using the position coding layer of the ViT-G/14 network to add position information; the present embodiment uses a learnable position-coded vector of size D, which is added to the representation vector of the block to obtain a matrix X' of size nxd. The following formula represents this process:
X′=X+P;
wherein X is a block representation matrix, P is a position coding matrix, and X' is a block representation matrix after adding position information.
(3) For each block, the relevance of the block to the other blocks is calculated using the multi-head self-attention layer of the ViT-G/14 network, yielding a new block representation vector. This embodiment uses 12 self-attention heads, each of dimension 64: the block representation matrix X′ is divided into 12 sub-matrices of size N×64; for each sub-matrix, the self-attention output is computed to obtain a sub-matrix of size N×64, and finally the 12 sub-matrices are concatenated to obtain a matrix X″ of size N×D, specifically:

X″ = concat(head_1, head_2, …, head_12);

where head_i is the output of the ith self-attention head and concat is the concatenation operation.
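A minimal sketch of this multi-head self-attention step (PyTorch is assumed; nn.MultiheadAttention internally performs the per-head split and the final concatenation):

```python
import torch
import torch.nn as nn

# 12 heads over D = 768 gives a per-head dimension of 64, as above.
attn = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)

x_prime = torch.randn(1, 196, 768)                  # X': blocks + position codes
x_dprime, _ = attn(x_prime, x_prime, x_prime)       # X'': self-attention output
print(x_dprime.shape)                               # torch.Size([1, 196, 768])
```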
(4) For each block, the block representation vector is non-linearly transformed using the feed-forward layer of the ViT-G/14 network to obtain a new block representation vector; in this embodiment, two fully connected layers and a ReLU activation function transform the block representation matrix X″ into a matrix X‴ of size N×D, specifically:

X‴ = ReLU(W_2 X″ + b_2)W_1 + b_1;

where X″ is the output of the multi-head self-attention layer, W_1, W_2, b_1, b_2 are learnable parameters, and ReLU is the activation function.
(5) For each block, the block representation vector is added to the output of the feed-forward layer through a residual connection of the ViT-G/14 network, followed by layer normalization, to obtain a new block representation vector; in this embodiment, a learnable layer normalization vector of size D is applied to the summed representation to obtain a matrix X″″ of size N×D, specifically:

X″″ = LayerNorm(X′ + X‴);

where X′ is the output of the position encoding layer, X‴ is the output of the feed-forward layer, and LayerNorm is the layer normalization operation. Repeating the above steps yields a matrix X″″′ of size N×D, which is taken as the output of the encoder of the ViT-G/14 network, specifically:

X″″′ = Encoder(X, P, W, S, W_1, W_2, b_1, b_2);

where X is the block representation matrix, P is the position encoding matrix, W is the weight matrix of the convolution kernel, S is the stride of the convolution kernel, W_1, W_2, b_1, b_2 are learnable parameters, and Encoder is the encoder of the ViT-G/14 network.
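Steps (3) to (5) can be sketched together as one encoder block (a simplified reading of the embodiment: the hidden size of 3072 and the single residual over X′ + X‴, as written above, are assumptions of this sketch):

```python
import torch
import torch.nn as nn

# One encoder block: self-attention (X''), feed-forward transform (X'''),
# then residual connection and layer normalization (X'''').
class EncoderBlock(nn.Module):
    def __init__(self, d: int = 768, heads: int = 12, hidden: int = 3072):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, d))
        self.norm = nn.LayerNorm(d)

    def forward(self, x_prime: torch.Tensor) -> torch.Tensor:
        x2, _ = self.attn(x_prime, x_prime, x_prime)   # X''
        x3 = self.ffn(x2)                              # X'''
        return self.norm(x_prime + x3)                 # X'''' = LN(X' + X''')

block = EncoderBlock()
out = block(torch.randn(1, 196, 768))
print(out.shape)  # torch.Size([1, 196, 768])
```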
(6) The block representation vectors are mapped to a vector of size C using the classification layer of the ViT-G/14 network, where C is the number of crop categories; in this embodiment, one fully connected layer and a Softmax function map the block representation matrix X″″′ into a matrix Y of size N×C, specifically:

Y = Softmax(X″″′ W_3 + b_3);

where X″″′ is the output of the encoder, W_3, b_3 are learnable parameters, and Softmax is the Softmax function.
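A sketch of this classification layer (C = 2 classes, leaf versus fruit vegetable, is an illustrative assumption):

```python
import torch
import torch.nn as nn

# Map each block representation to C crop-category scores, Y = Softmax(XW + b).
C = 2
classify = nn.Linear(768, C)

x_enc = torch.randn(1, 196, 768)            # encoder output
y = torch.softmax(classify(x_enc), dim=-1)  # Y: (1, N=196, C)
print(y.shape)                              # torch.Size([1, 196, 2])
```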
(7) For each segmentable seedling-stage image, the output of the ViT-G/14 network, i.e. the per-block prediction labels, is used to judge whether the image belongs to leaf vegetables or fruit vegetables; in this embodiment, a voting mechanism weights the prediction label of each block by the block's area to obtain a scalar representing the probability that the image belongs to leaf vegetables or fruit vegetables, specifically:

P = (Σ_{i=1}^{N} Z_i A_i) / (Σ_{i=1}^{N} A_i);

where P is the prediction probability of the image, Z_i is the prediction label of the ith block, A_i is the area of the ith block, and N is the number of blocks.
(8) The prediction probability of the ViT-G/14 network is used to determine the first-level crop classification of the image; in this embodiment, the prediction probability of the image is compared with a threshold τ′ to obtain a scalar indicating whether the image belongs to leaf vegetables or fruit vegetables, specifically:

C = fruit vegetable, if P ≥ τ′; C = leaf vegetable, if P < τ′;

where C is the first-level crop classification of the image, P is the prediction probability of the image, and τ′ is the threshold.
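Steps (7) and (8) can be sketched as follows (the labels, areas and threshold value are illustrative assumptions; the formulas follow the reconstruction above):

```python
import numpy as np

# Area-weighted voting P = sum(Z_i * A_i) / sum(A_i), then thresholding
# against tau' to pick the first-level crop class.
def first_level_class(labels: np.ndarray, areas: np.ndarray,
                      tau: float = 0.5) -> str:
    p = float(np.sum(labels * areas) / np.sum(areas))  # prediction probability
    return "fruit vegetable" if p >= tau else "leaf vegetable"

z = np.array([1, 1, 0, 1])          # Z_i: per-block labels (1 = fruit vegetable)
a = np.array([40., 25., 10., 30.])  # A_i: block areas
print(first_level_class(z, a))      # fruit vegetable (P is about 0.905)
```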
According to the crop growth period prediction method provided by the embodiment of the invention, the image classification model of the vision Transformer extracts global features from the segmentable seedling-stage images and combines them with the geometric information in the segmented images to form a richer and more complete crop feature representation, thereby improving the accuracy and robustness of crop classification.
In some embodiments, extracting the micro-features of each crop, the planting spacing between different crops, and the timing information from the classified images respectively includes: extracting the micro-features from the classified images based on a first convolutional neural network, extracting the planting spacing from the classified images based on a second convolutional neural network, and extracting the timing information from the classified images based on a time series prediction network.
In this embodiment, the first convolutional neural network comprises a ResNet (residual network), the second convolutional neural network comprises a YOLOX network, and the time series prediction network comprises a BiLSTM (Bi-directional Long Short-Term Memory) network.
In this embodiment, for each classified image obtained from the segmentable seedling-stage images, microscopic crop features, macroscopic planting spacing and timing information are identified from the leaf vegetable and fruit vegetable crops respectively, with the following specific steps:
(1) For leaf vegetable crops, identification of microscopic crop features is realized with a deep-learning ResNet network; in this embodiment, a ResNet-50 network pre-trained on the ImageNet dataset serves as the feature extractor: the image is input into the network to obtain a feature vector of size 2048, which a fully connected layer and a Softmax layer map to a vector of size C, where C is the number of micro-feature categories. The identification process is represented by the following formula:
F=Softmax(ResNet-50(I)W+b);
where I is the matrix of pixel values of the image, ResNet-50 is the feature extractor, W and b are learnable parameters, Softmax is the Softmax function, and F is the classification vector of the micro-features.
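A sketch of this feature-extraction pipeline, F = Softmax(ResNet-50(I)W + b) (torchvision is assumed; C = 10 micro-feature classes is an illustrative assumption):

```python
import torch
import torch.nn as nn
from torchvision import models

# An ImageNet-pretrained ResNet-50 is used as a frozen 2048-d feature
# extractor, followed by a new fully connected layer over C classes.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()   # expose the 2048-d feature vector
backbone.eval()

C = 10
head = nn.Linear(2048, C)

image = torch.randn(1, 3, 224, 224)    # I
with torch.no_grad():
    feat = backbone(image)             # (1, 2048)
f = torch.softmax(head(feat), dim=-1)  # F: (1, C)
print(f.shape)                         # torch.Size([1, 10])
```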
(2) For fruit vegetable crops, macroscopic crop features such as planting spacing and row count are detected with a deep-learning YOLOX network that learns image features automatically; in this embodiment, a YOLOX-S network pre-trained on the COCO dataset serves as the target detector: the image is input into the network to obtain a matrix of size N×5, where N is the number of detected targets and 5 covers each target's position and category information. The detection of planting spacing is represented by the following formula:
D=YOLOX-S(I);
where I is the matrix of pixel values of the image, YOLOX-S is the target detector, and D is the detection matrix of macroscopic features.
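The YOLOX-S detector itself is not reproduced here; the sketch below assumes its N×5 output D and derives a planting-spacing statistic from the detected box centres (the nearest-neighbour metric is an illustrative choice, not specified by the embodiment):

```python
import numpy as np

# Derive planting spacing as the mean nearest-neighbour distance between
# detected plant centres, given an N x 5 detection matrix (x, y, w, h, class).
def mean_plant_spacing(detections: np.ndarray) -> float:
    centres = detections[:, :2]                 # (N, 2) box centres
    diff = centres[:, None, :] - centres[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)        # pairwise distances
    np.fill_diagonal(dist, np.inf)              # ignore self-distance
    return float(dist.min(axis=1).mean())       # mean nearest-neighbour distance

d = np.array([[10, 10, 4, 4, 0], [10, 40, 4, 4, 0], [10, 70, 4, 4, 0]], float)
print(mean_plant_spacing(d))  # 30.0 (plants spaced 30 px apart in a row)
```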
(3) For each segmentable seedling-stage image, the maturity stage is regressed from the crop's temporal features using a deep-learning BiLSTM network; this embodiment models the image sequence with a bidirectional long short-term memory network (BiLSTM) to learn the temporal relationships between images. A sliding window of size T divides the image sequence into subsequences of T images each; each subsequence is input into the BiLSTM network to obtain a matrix of size T×H, where H is the dimension of the hidden state, and a fully connected layer with a linear regression layer maps the temporal feature matrix to a vector of size R, where R is the number of timing features. The extraction of the timing information is represented by the following formula:
S = Linear(BiLSTM(I_1, I_2, …, I_T)W_4 + b_4);

where I_1, I_2, …, I_T is a subsequence of the image sequence, BiLSTM is the temporal modeler, W_4, b_4 are learnable parameters, Linear is the linear regression layer, and S is the regression vector of the timing features.
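A sketch of this windowed BiLSTM regression (the per-image feature dimension of 512, window T = 8 and R = 1, read as days to maturity, are illustrative assumptions):

```python
import torch
import torch.nn as nn

# A window of T per-image feature vectors is run through a bidirectional
# LSTM; the final hidden state is regressed onto R timing features.
T, feat_dim, H, R = 8, 512, 128, 1
bilstm = nn.LSTM(feat_dim, H, batch_first=True, bidirectional=True)
regress = nn.Linear(2 * H, R)   # 2*H: forward + backward hidden states

window = torch.randn(1, T, feat_dim)   # I_1, ..., I_T (as feature vectors)
out, _ = bilstm(window)                # (1, T, 2*H)
s = regress(out[:, -1, :])             # S: (1, R)
print(s.shape)                         # torch.Size([1, 1])
```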
According to the crop growth period prediction method provided by the embodiment of the invention, microscopic features are extracted from the classified images by the first convolutional neural network and planting spacing by the second convolutional neural network, which improves target localization and recognition while reducing the model's computation and memory footprint and increasing its speed; the timing information extracted by the time series prediction network enables regression of the crop maturity stage, improving the precision and stability of crop growth prediction.
In some embodiments, after deriving the crop prediction model, the method further comprises: and sequentially carrying out model pruning, quantization and distillation on the crop prediction model to obtain an optimized crop prediction model.
In this embodiment, the complexity and running time of the model are reduced, and its efficiency and portability improved, by applying model compression and acceleration techniques such as model pruning, quantization and distillation, with the following specific steps:
(1) For model pruning, this embodiment adopts a sensitivity-based pruning method to delete unimportant parameters from the model, reducing its size and computation. The sensitivity of each parameter, i.e. the degree to which the parameter affects model performance, is calculated with the following formula:

S_i = |L(θ) - L(θ - Δ_i)|;

where S_i is the sensitivity of the ith parameter, L(θ) is the loss function value of the model at the original parameters θ, and L(θ - Δ_i) is the loss function value after subtracting a small perturbation Δ_i from the ith parameter.
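A minimal sketch of sensitivity-based pruning under this formula (the loss function, pruning ratio and perturbation size are illustrative assumptions; a practical implementation would vectorize the loop):

```python
import torch

# Perturb each parameter in turn, measure |L(theta) - L(theta - delta)|,
# and zero the parameters whose sensitivity falls below a percentile cutoff.
def sensitivity_prune(param: torch.Tensor, loss_fn, delta=1e-3, ratio=0.3):
    base = loss_fn(param).item()               # L(theta)
    sens = torch.zeros_like(param)
    flat = param.view(-1)                      # shares storage with param
    for i in range(flat.numel()):
        old = flat[i].item()
        flat[i] = old - delta                  # theta - Delta_i
        sens.view(-1)[i] = abs(base - loss_fn(param).item())
        flat[i] = old                          # restore
    cutoff = torch.quantile(sens.view(-1), ratio)
    param[sens < cutoff] = 0.0                 # prune low-sensitivity weights
    return sens

w = torch.tensor([0.5, -2.0, 0.01, 1.5])
sensitivity_prune(w, loss_fn=lambda p: (p ** 2).sum(), ratio=0.5)
print(w)  # small-sensitivity weights zeroed
```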
(2) For model quantization, this embodiment adopts uniform quantization to represent the model's parameters with fewer bits, reducing its storage space and memory footprint. The quantized value of each parameter, i.e. its approximation after quantization, is calculated with the following formula:

Q_i = round((θ_i - min(θ)) / (max(θ) - min(θ)) × (2^b - 1));

where Q_i is the quantized value of the ith parameter, θ_i is the original value of the ith parameter, min(θ) and max(θ) are the minimum and maximum parameter values, b is the number of quantization bits, and round is the rounding function. This embodiment uses b = 8 bits, quantizing the model's parameters into 8-bit unsigned integers, i.e. integers between 0 and 255.
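A sketch of this uniform quantization with b = 8 (NumPy is assumed; the formula follows the reconstruction above):

```python
import numpy as np

# Map each parameter to an unsigned integer in [0, 2^b - 1] (0..255 for b = 8).
def quantize(theta: np.ndarray, b: int = 8) -> np.ndarray:
    lo, hi = theta.min(), theta.max()
    scale = (2 ** b - 1) / (hi - lo)
    return np.round((theta - lo) * scale).astype(np.uint8)

params = np.array([-0.8, -0.1, 0.0, 0.4, 1.2])
print(quantize(params))  # [  0  89 102 153 255]
```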
(3) For model distillation, this embodiment uses knowledge distillation to transfer knowledge from a larger teacher model into a smaller student model, improving generalization ability and inference speed, and calculates the distillation loss function of the model, i.e. the discrepancy between the model and the larger teacher model, with the following formula:
L_distill = αL_hard + (1 - α)T^2 L_soft;

where L_distill is the distillation loss function of the model, α is the loss weight, L_hard is the hard-label loss, i.e. the difference between the model and the true labels, L_soft is the soft-label loss, i.e. the difference between the model and the teacher model, and T is the temperature parameter used to smooth the soft-label probability distribution.
In this embodiment, knowledge distillation is performed with weight α = 0.5, temperature parameter T = 2, and a GoogLeNet network trained on the ImageNet dataset as the teacher model.
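A sketch of this distillation loss with α = 0.5 and T = 2 (the KL-divergence form of L_soft over temperature-softened distributions is the usual choice and is assumed here; the logits are random placeholders):

```python
import torch
import torch.nn.functional as F

# L_distill = alpha * L_hard + (1 - alpha) * T^2 * L_soft
def distill_loss(student_logits, teacher_logits, labels, alpha=0.5, T=2.0):
    l_hard = F.cross_entropy(student_logits, labels)
    l_soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                      F.softmax(teacher_logits / T, dim=-1),
                      reduction="batchmean")
    return alpha * l_hard + (1 - alpha) * (T ** 2) * l_soft

student = torch.randn(4, 2)   # 4 images, 2 classes
teacher = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])
print(distill_loss(student, teacher, labels))
```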
According to the crop growth period prediction method provided by the embodiment of the invention, the crop prediction model is subjected to model pruning, quantization and distillation in sequence to obtain the optimized crop prediction model, so that the complexity and the running time of the model are reduced, the efficiency and the portability of the model are improved, and the model can adapt to different hardware platforms and scene requirements.
FIG. 3 is a schematic structural diagram of a crop growth period prediction apparatus according to the present invention. In the embodiment shown in FIG. 3, the crop growth period prediction apparatus includes a facility greenhouse monitoring camera, an image acquisition module, a first-level crop classification module, a second-level crop classification module, and a model construction module. The facility greenhouse monitoring camera is configured with image parameters and device parameters: the image parameters include resolution, timestamp, sharpness, color cast, storage format and image size; the device parameters include battery voltage, location information, device number, time, protocol number and memory size.

The image acquisition module is specifically used for: detecting whether the input image is complete; if not (abnormal, blurred, etc.), re-shooting; if the image is complete, performing foreground segmentation to determine the planting area, recording the transplanting timestamp T_0 of each input image, and storing the planting area and the transplanting timestamp T_0 in a database.

The first-level crop classification module is specifically used for: retrieving a planting area and its corresponding timing information from the database, using the color space histogram of the planting area to determine whether each crop index in the image (height and leaf area index) reaches the segmentable seedling-stage threshold; if so, the image is determined to be a segmentable seedling-stage image, its timing information is updated, and it is input into the ViT self-attention network for subsequent classification.

The second-level crop classification module is specifically used for: classifying the seedling-stage images to determine fruit vegetable classification images and leaf vegetable classification images, extracting microscopic features from the leaf vegetable classification images, extracting planting spacing from the fruit vegetable classification images, and calculating crop maturity from the timing information; the microscopic features and planting spacing are then fused and input into a CNN classifier for training, while the timing information is used for temporal discrimination of the classification results, so as to train the growth period prediction model.

The model construction module is specifically used for: manually annotating the images with the LabelImg tool, and applying eight kinds of data augmentation preprocessing (color balance, brightness adjustment, gray-level balance enhancement, left-right mirroring, color enhancement, image cropping, horizontal deformation and vertical deformation) to under-represented labels to increase the diversity and balance of the data. Ordinary RGB images are used as input, so no special sensor or equipment is needed, which reduces the difficulty and cost of data acquisition, ensures data quality and consistency, and avoids data distortion and loss caused by equipment differences and faults. Furthermore, for the different vision tasks, this example randomly extracts 60% of the data as the training set, 20% as the validation set, and 20% as the test set.
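A sketch of the 60/20/20 random split described above (the dataset size and seed are illustrative assumptions; indices stand in for labelled images):

```python
import numpy as np

# Shuffle image indices once, then cut at 60% and 80% of the dataset.
rng = np.random.default_rng(seed=42)
idx = rng.permutation(1000)                   # 1000 labelled images
train, val, test = np.split(idx, [600, 800])  # 60% / 20% / 20%
print(len(train), len(val), len(test))        # 600 200 200
```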
The crop growth period prediction apparatus provided by the present invention will be described below, and the crop growth period prediction apparatus described below and the crop growth period prediction method described above may be referred to correspondingly to each other.
Fig. 4 is a second schematic structural diagram of a crop growth period prediction apparatus according to the present invention. As shown in fig. 4, the crop growth period prediction apparatus includes: an image acquisition module 410 and a prediction module 420.
An image acquisition module 410, configured to acquire an image of a crop to be detected, where the image of the crop to be detected includes at least one type of crop;
the prediction module 420 is configured to input a crop image to be detected into a crop prediction model to obtain a crop category prediction result and a crop growth period prediction result;
the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multitasking learning loss function is determined based on the cross entropy loss function and the mean square error loss function.
According to the crop growth period prediction device provided by the embodiment of the invention, the crop prediction model obtained by training the microscopic features of each crop, the planting intervals among different crops and the time sequence information corresponding to the sample crop image is used for predicting different tasks of the crop image to be detected by using the training features and the multi-task learning loss function, so that more accurate crop category identification results and crop growth period prediction results are obtained, different types of crops can be accurately identified at the early stage of the crop seedling period, the accuracy of crop classification and growth prediction is improved, and therefore, powerful support is provided for seedling period management of greenhouse agriculture.
Fig. 5 is a schematic structural diagram of an electronic device according to the present invention, and as shown in fig. 5, the electronic device may include: processor 510, communication interface (Communications Interface) 520, memory 530, and communication bus 540, wherein processor 510, communication interface 520, memory 530 complete communication with each other through communication bus 540. Processor 510 may invoke logic instructions in memory 530 to perform a crop growth phase prediction method comprising: acquiring a crop image to be detected, wherein the crop image to be detected comprises at least one type of crop; inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result; the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multitasking learning loss function is determined based on the cross entropy loss function and the mean square error loss function.
Further, the logic instructions in the memory 530 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially, or in the part contributing to the prior art, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of performing the method of crop growth phase prediction provided by the methods described above, the method comprising: acquiring a crop image to be detected, wherein the crop image to be detected comprises at least one type of crop; inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result; the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multitasking learning loss function is determined based on the cross entropy loss function and the mean square error loss function.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the crop growth period prediction method provided by the methods above, the method comprising: acquiring a crop image to be detected, wherein the crop image to be detected comprises at least one type of crop; inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result; the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multitasking learning loss function is determined based on the cross entropy loss function and the mean square error loss function.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for predicting the growth period of a crop, comprising:
acquiring a crop image to be detected, wherein the crop image to be detected comprises at least one type of crop;
inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result;
the crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multi-tasking learning loss function is determined based on a cross entropy loss function and a mean square error loss function.
2. The method for predicting the growth period of crops according to claim 1, wherein the crop prediction model is obtained by:
acquiring a sample crop image, wherein the sample crop image comprises at least one type of crop;
extracting mask features from the sample crop images, dividing the mask features to obtain divided images of different crops, and determining an image capable of dividing seedling stage according to the statistical analysis quantity of each crop in the divided images;
Classifying the images in the seedling stage according to a self-attention mechanism to obtain classified images, and respectively extracting microscopic features of each crop, planting intervals among different crops and time sequence information from the classified images;
performing feature fusion on the micro features and the planting intervals to obtain fusion features;
and performing multi-task training on the target classification network according to the fusion characteristics, performing time sequence judgment on the target classification network through the time sequence information, and obtaining a crop prediction model under the condition that the target classification network is converged so as to realize the category prediction and the growing period prediction of crops.
3. The method of claim 2, wherein extracting mask features from the sample crop image and segmenting, obtaining segmented images of different crops comprises:
and generating an instance mask from the pixel level of the sample crop image according to an SOLOv2 network, and optimizing the instance mask according to a mask loss function to obtain the segmented image.
4. The method of predicting a growing period of a crop of claim 2, wherein determining a segmentable seedling stage image from the statistical analysis of each crop in the segmented image comprises:
Sequentially performing morphology-based image erosion, image dilation and noise filtering on the segmented image to obtain a filtered image;
the number of each crop in the filtered image and the area of the corresponding area are respectively counted by a connected domain marking algorithm based on depth-first search to obtain statistical data;
calculating the average height and leaf area index of each crop in the statistical data based on an image color space conversion algorithm, and determining the segmented image as the image in the seedling stage when the average height exceeds a height threshold and the leaf area index exceeds an index threshold;
wherein the elevation threshold and the index threshold are determined based on growth cycles, climatic features, historical data and expert knowledge of each crop in the segmented image.
5. The method of claim 2, wherein classifying the segmentable images according to a self-attention mechanism to obtain classified images comprises:
extracting global features from the images in the seedling stage according to an image classification model of the vision Transformer, and classifying the images in the seedling stage according to the global features to obtain leaf vegetable classification images and fruit vegetable classification images; wherein the image classification model of the vision Transformer is constructed based on a self-attention mechanism.
6. The method according to claim 2, wherein the extracting of the micro-features of each crop, the planting spacing between different crops, and the timing information from the classification image, respectively, comprises:
extracting the micro-features from the classified images based on a first convolutional neural network, and extracting the planting intervals from the classified images based on a second convolutional neural network; the timing information is extracted from the classified images based on a time series prediction network.
7. The method of crop growth phase prediction according to claim 2, wherein after the obtaining of the crop prediction model, the method further comprises:
and sequentially carrying out operations of model pruning, quantization and distillation on the crop prediction model to obtain an optimized crop prediction model.
8. A crop growth phase prediction apparatus, comprising:
the image acquisition module is used for acquiring an image of a crop to be detected, wherein the image of the crop to be detected comprises at least one type of crop;
the prediction module is used for inputting the crop image to be detected into a crop prediction model to obtain a crop type prediction result and a crop growth period prediction result;
The crop prediction model is obtained by performing multi-task training on a target classification network by taking microscopic features of crops in a sample crop image, planting intervals among different crops and time sequence information corresponding to the sample crop image as training features and taking a multi-task learning loss function as a training function; the multi-tasking learning loss function is determined based on a cross entropy loss function and a mean square error loss function.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the crop growth phase prediction method of any one of claims 1 to 7 when the program is executed.
10. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor implements the crop growth phase prediction method of any of claims 1 to 7.
CN202410185298.1A 2024-02-19 2024-02-19 Crop growth period prediction method and device Pending CN117789037A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410185298.1A CN117789037A (en) 2024-02-19 2024-02-19 Crop growth period prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410185298.1A CN117789037A (en) 2024-02-19 2024-02-19 Crop growth period prediction method and device

Publications (1)

Publication Number Publication Date
CN117789037A true CN117789037A (en) 2024-03-29

Family

ID=90392775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410185298.1A Pending CN117789037A (en) 2024-02-19 2024-02-19 Crop growth period prediction method and device

Country Status (1)

Country Link
CN (1) CN117789037A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117973442A (en) * 2024-04-01 2024-05-03 青岛科技大学 Lithium ion battery SOC estimation method based on hybrid neural network



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination