CN114821182A - Rice growth stage image recognition method - Google Patents

Rice growth stage image recognition method

Info

Publication number
CN114821182A
CN114821182A (application CN202210494136.7A)
Authority
CN
China
Prior art keywords
swin
odrl
transformer
rice
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210494136.7A
Other languages
Chinese (zh)
Inventor
吴琪
吴云志
曾涛
乐毅
张友华
余克健
胡楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Agricultural University AHAU
Original Assignee
Anhui Agricultural University AHAU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Agricultural University AHAU filed Critical Anhui Agricultural University AHAU
Priority to CN202210494136.7A
Publication of CN114821182A
Legal status: Pending

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/24 — Classification techniques
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/045 — Combinations of networks
    • G06N3/08 — Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image recognition method for the rice growth stage, comprising the following steps: step 1, collecting rice pictures in the field and from the network as a data set; step 2, dividing the data set into a training set and a test set, and performing preprocessing and data enhancement; step 3, constructing an ODRL-Swin Transformer model; step 4, setting the configuration parameters of the ODRL-Swin Transformer model to the optimal configuration parameters through training; and step 5, outputting the final rice prediction recognition result from the ODRL-Swin Transformer model. The method can improve the accuracy of rice image recognition.

Description

Rice growth stage image recognition method
Technical Field
The invention relates to the field of crop image identification methods, in particular to a rice growth stage image identification method.
Background
To cultivate rice well, its growth stages need to be known, and corresponding planting measures must be taken at each stage. In current smart agriculture, images are generally collected at each growth stage of the rice and then recognized to determine which growth stage the rice is currently in, and corresponding planting measures are taken based on the recognition result. How to determine the current growth state of rice by recognizing rice growth stage images is therefore an important factor in the intelligent management and planting of rice.
As computer technology continues to evolve, more and more of it can be integrated with other fields. In agriculture, accurate, rapid and convenient recognition technology can reduce labor costs and has a positive influence on crop yield. In the prior art, convolutional networks are mostly used as models for detecting and identifying crops; they are convenient to use, but models built mainly on convolutional networks suffer from low accuracy. Transformer-based models perform better than traditional convolutional networks, but training them requires a large data set as support. In agriculture, data sets are often scarce: most must be collected and manually annotated, and producing a large data set requires a substantial investment of manpower and material resources, making it costly. A recognition model that achieves high accuracy on a small-scale data set is therefore currently lacking.
Disclosure of Invention
The invention aims to provide a rice growth stage image recognition method that solves the prior-art problems of low rice recognition accuracy and the need for large data sets during training.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a rice growth stage image recognition method comprises the following steps:
step 1, acquiring a plurality of image data of rice at different growth stages as a data set;
step 2, dividing the data set obtained in the step 1 into a training set and a testing set, respectively preprocessing the data in the training set and the testing set, and then respectively enhancing the data;
step 3, based on the Swin Transformer model, in which the Swin Transformer module consists of block (patch) partition, linear embedding and a plurality of Swin Transformer Blocks, adding optimized dense relative localization (ODRL) to the Swin Transformer Blocks in the Swin Transformer model and adding a dense relative localization loss to the loss function, thereby constructing the ODRL-Swin Transformer model;
the original Swin Transformer module divides the feature map into non-overlapping windows according to the window size, and the elements in a window are called blocks; by learning the relative positions between blocks, optimized dense relative localization enables the Swin Transformer to fuse more spatial information without additional annotation information. In the ODRL-Swin Transformer model, optimized dense relative localization densely samples blocks within an original Swin Transformer window: during sampling, two blocks in the window are selected at random (called an embedding pair), the geometric relative position distance of the embedding pair is calculated, and a multilayer perceptron (MLP) predicts that relative position distance, thereby collecting spatial information; a spatial relative position loss added to the Swin Transformer loss function further guides the calculation of the relative position of the embedding pair;
step 4, training the ODRL-Swin Transformer model constructed in step 3 with the training set from step 2, and adjusting the parameters of the ODRL-Swin Transformer model by combining the training result with the test set from step 2 until the parameters of the ODRL-Swin Transformer model reach the optimal configuration;
and step 5, inputting the rice growth stage image data to be identified into the ODRL-Swin Transformer model under the optimal configuration parameters obtained in step 4, and outputting the predicted recognition result of the rice growth stage image through the ODRL-Swin Transformer model.
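As a rough illustration of the embedding-pair sampling described above, the sketch below draws a random pair of block positions from a window grid and computes a normalized relative-position target. The function names are hypothetical, and the simple |difference|/size normalization stands in for the patent's piecewise sign/α mapping, which appears only as an image in the original filing.

```python
import random

def sample_embedding_pair(H, W, rng=random):
    """Pick two random block positions inside an H x W window grid (an 'embedding pair')."""
    i, j = rng.randrange(H), rng.randrange(W)
    p, h = rng.randrange(H), rng.randrange(W)
    return (i, j), (p, h)

def target_offset(i, j, p, h, H, W):
    """Normalized relative-position target in [0, 1]^2 (simplified stand-in for the
    patent's piecewise mapping with breakpoint alpha)."""
    return abs(i - p) / H, abs(j - h) / W
```

During training the MLP is asked to reproduce these targets from the pair of block embeddings, which is what forces the network to encode spatial layout.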
Further, the preprocessing in step 2 comprises removing duplicate and damaged pictures from the obtained image data of rice at different growth stages, deleting unmatched information in the annotation files, and dividing the data set into a training set, a test set and a validation set in the proportion 7:2:1.
Further, the data enhancement in step 2 includes Mosaic data augmentation, random flipping, scaling, and random cropping.
Furthermore, regarding step 3: the detection accuracy of the Swin Transformer is greatly improved compared with that of a conventional convolutional network, but its detection accuracy on small-scale data sets is not high; optimized dense relative localization and the dense relative localization loss enable the Swin Transformer to perform well on small-scale data sets too.
Further, during the training in step 4, the training set data are input into the ODRL-Swin Transformer model to obtain an output result, the error between the output result and the test set is calculated, and the configuration parameters of the ODRL-Swin Transformer model are then adjusted based on the error until it meets expectations, at which point the configuration parameters of the ODRL-Swin Transformer model are the optimal configuration parameters.
The method detects rice pictures to be recognized based on the ODRL-Swin Transformer model, thereby detecting the different stages of rice. The ODRL-Swin Transformer model used here adds optimized dense relative localization to the Swin Transformer Block of the Swin Transformer, and outputs the final predicted recognition result of the rice growth stage.
The method ensures the accuracy of rice image recognition, achieves accurate recognition of rice from a small-scale data set, and efficiently and accurately identifies the rice growth stage, so that farmers can take the most reasonable planting measures and improve rice yield.
Drawings
FIG. 1 is a block flow diagram of the method of the present invention.
FIG. 2 is a structural diagram of the ODRL-Swin Transformer model of the present invention.
FIG. 3 is a structural diagram of the Optimized Dense Relative Localization part of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
As shown in fig. 1, the image recognition method for rice growth stage of the present invention comprises the following steps:
(1) preparing a data set:
collecting picture data of rice at different growth stages on site and on line, thereby constructing a data set;
(2) processing the data set:
Firstly, the specific labels and the number of images of each rice growth stage are obtained from the data set in step 1, and duplicate and abnormal data are removed. The data are then preprocessed: duplicate and damaged pictures are removed from the collected image data of rice at different growth stages, unmatched information in the annotation files is deleted, and the data set is divided into a training set, a test set and a validation set in the proportion 7:2:1.
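The 7:2:1 split can be sketched as follows. This is an illustrative helper (names and seed are assumptions, not the patent's code):

```python
import random

def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle a list of samples and split it into train / test / validation subsets
    in the given proportions (7:2:1 by default, as in the description)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(ratios[0] * n)
    n_test = int(ratios[1] * n)
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])
```

A fixed seed keeps the split reproducible across training runs.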
(3) Data enhancement:
Data enhancement is performed on the data in the training set and the test set respectively, including Mosaic data augmentation, random flipping (RandomFlip), scaling (Resize) and random cropping (RandomCrop). After data enhancement, the data are padded (Pad) to avoid feature loss and preserve the features of the rice data set.
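Of these augmentations, Mosaic is the least standard, so a minimal sketch may help: four images are cropped and tiled onto one 2 × 2 canvas. The helper names are hypothetical, and real Mosaic implementations typically also randomize the join point and merge the annotations, which is omitted here.

```python
import numpy as np

def center_crop(img, size):
    """Crop a square `size` x `size` patch from the center of an (H, W, C) image."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def mosaic4(imgs, size=64):
    """Mosaic augmentation sketch: crop four images and tile them on a 2 x 2 canvas."""
    assert len(imgs) == 4
    s = size // 2
    tiles = [center_crop(im, s) for im in imgs]
    top = np.concatenate(tiles[:2], axis=1)     # upper half: left | right
    bottom = np.concatenate(tiles[2:], axis=1)  # lower half: left | right
    return np.concatenate([top, bottom], axis=0)
```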
(4) An ODRL-Swin Transformer model is constructed on the basis of Swin Transformer:
Optimized dense relative localization (ODRL) is added to the Swin Transformer Block in the Swin Transformer model, and a dense relative localization loss is added to the loss function, thereby constructing the ODRL-Swin Transformer model.
In an original Swin Transformer Block, the feature map obtained after block partition and linear embedding is divided into non-overlapping windows according to the window size, and the elements in a window are called blocks. By learning relative positions between blocks, optimized dense relative localization enables the Swin Transformer to fuse more spatial information without additional annotation information. In our ODRL-Swin Transformer model, optimized dense relative localization is implemented by densely sampling blocks in the original Swin Transformer window: two blocks in the window are randomly selected as an embedding pair, the geometric relative position distance of the pair is calculated, and an MLP predicts that distance, thereby collecting spatial information. The implementation is as follows:
given an image x, a Swin Transformer Block in a Swin Transformer model divides a feature graph obtained by Block division and linear embedding by the size of a window 7 × 7, and divides an input feature graph into H × W windows with the same size, wherein H is the number of rows of the divided windows, and W is the number of columns of the divided windows. One of which may be denoted G x ={e i,j } 1≤i≤H,1≤j≤W In which e is i,j ∈R D ,e i,j Representing an embedded block, D is the dimension dividing the window space. i denotes an embedded block of an ith row, and j denotes a jth column of the second embedded block.
For each G_x, the optimized dense relative localization module randomly samples embedding pairs, and for each sampled pair (e_{i,j}, e_{p,h}) computes a 2D normalized translation offset (t_u, t_v)^T ∈ [0, 1]^2 from the row offset (p − i) and the column offset (h − j). [The offset formula is rendered only as an image in the original publication and is not reproduced here.] It is a piecewise function built on sign(), which returns 1 for a positive argument, −1 for a negative argument and 0 for zero; α is the breakpoint of the piecewise function, with a default value of 4. Here p denotes the row index and h the column index of the second embedded block in the pair.
The selected embedding vectors e_{i,j} and e_{p,h} are concatenated, and the concatenated vector is input into the multilayer perceptron (MLP) of the optimized dense relative localization module. The MLP has two hidden layers and two output neurons and predicts the relative distance between position (i, j) and position (p, h) on the grid, where d_u is the predicted distance along the abscissa and d_v the predicted distance along the ordinate. The structure is shown in FIG. 3, and the calculation is:

(d_u, d_v)^T = f(e_{i,j}, e_{p,h})^T
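A minimal NumPy sketch of this predictor f follows: two hidden layers, two output neurons, fed with the concatenation of the two block embeddings. The names, hidden width and initialization are assumptions; in practice this head would be a small trainable module inside the network.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(d_in, d_hidden=64):
    """Two hidden layers and two output neurons (d_u, d_v), as in the description."""
    dims = [d_in, d_hidden, d_hidden, 2]
    return [(rng.standard_normal((a, b)) * 0.02, np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def mlp_forward(params, x):
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)   # ReLU on the hidden layers
    return x                          # (d_u, d_v)

def predict_offset(params, e_ij, e_ph):
    """Predict the relative offset of an embedding pair from the concatenated vectors."""
    return mlp_forward(params, np.concatenate([e_ij, e_ph]))
```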
Let B denote the batch of pictures processed simultaneously during parallel computation of the model. The dense relative localization loss provided by the invention is:

L_drloc = (1/|B|) Σ_{x∈B} Σ_{(i,j),(p,h)} ( |d_u − t_u| + |d_v − t_v| )

This loss is added to the standard cross-entropy loss L_ce of the Swin Transformer, giving the final total loss:

L = L_ce + λ · L_drloc

λ = 0.5 is used initially in the present invention. Introducing this regularization loss enables the Swin Transformer to learn spatial information without additional manual annotation, so the spatial relationships in an image can be learned effectively without relying on a large data set.
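The loss combination above can be written directly as code. This is a sketch of the stated formulas (an L1 penalty on the predicted offsets plus a weighted sum with cross entropy); the function names are assumptions:

```python
import numpy as np

def drloc_loss(pred, target):
    """L1 dense relative localization loss, averaged over a batch of sampled pairs.
    `pred` and `target` are (N, 2) arrays of (d_u, d_v) and (t_u, t_v)."""
    pred, target = np.asarray(pred), np.asarray(target)
    return np.abs(pred - target).sum(axis=-1).mean()

def total_loss(ce_loss, pred, target, lam=0.5):
    """Total loss = standard cross entropy + lambda * dense relative localization loss,
    with lambda = 0.5 as stated in the description."""
    return ce_loss + lam * drloc_loss(pred, target)
```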
According to the invention, an Optimized Dense Relative Localization module is added to the Swin Transformer Block in the Swin Transformer model, turning it into an ODRL-Swin Transformer Block; the structure of the modified ODRL-Swin Transformer network is shown in FIG. 2. By densely sampling multiple embedding pairs per image and requiring the network to predict their relative positions, the Swin Transformer can learn spatial information without additional manual annotation.
(5) Training an ODRL-Swin transducer model, and setting the optimal configuration parameters of the model:
The output of the trained ODRL-Swin Transformer model is taken as the predicted recognition result of the rice growth stage. The classification error and regression error between the model output and the test set are calculated, and the configuration parameters of the trained ODRL-Swin Transformer model are adjusted to the optimal configuration according to the validation and test results. The rice growth stage images to be identified are then input into the ODRL-Swin Transformer model with the optimal configuration parameters, and the model outputs the final rice prediction recognition result.
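The parameter-adjustment loop described here, train, evaluate against the test set, adjust until the error meets expectations, can be sketched generically. This is a hypothetical helper, not the patent's procedure; `train_fn` and `eval_fn` stand in for model training and test-set error evaluation:

```python
def tune(train_fn, eval_fn, configs, target_error):
    """Try candidate configurations in order; keep the one with the lowest test error,
    stopping early once the error meets the target."""
    best_cfg, best_err = None, float("inf")
    for cfg in configs:
        model = train_fn(cfg)          # train a model under this configuration
        err = eval_fn(model)           # error against the held-out test set
        if err < best_err:
            best_cfg, best_err = cfg, err
        if best_err <= target_error:   # expectation met: stop adjusting
            break
    return best_cfg, best_err
```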
By recognizing images of rice at different stages, the rice image data to be identified are input into the final ODRL-Swin Transformer model, which outputs the detection result, i.e. the growth stage to which the input image belongs, thereby achieving accurate recognition and detection.
The embodiments of the present invention are described only for the preferred embodiments of the present invention, and not for the limitation of the concept and scope of the present invention, and various modifications and improvements made to the technical solution of the present invention by those skilled in the art without departing from the design concept of the present invention shall fall into the protection scope of the present invention, and the technical content of the present invention which is claimed is fully set forth in the claims.

Claims (5)

1. A rice growth stage image recognition method is characterized by comprising the following steps:
step 1, acquiring a plurality of image data of rice at different growth stages as a data set;
step 2, dividing the data set obtained in the step 1 into a training set and a testing set, respectively preprocessing the data in the training set and the testing set, and then respectively enhancing the data;
step 3, based on the Swin Transformer model, in which the Swin Transformer module consists of block (patch) partition, linear embedding and a plurality of Swin Transformer Blocks; adding optimized dense relative localization to the Swin Transformer Block in the Swin Transformer model, and adding a dense relative localization loss to the loss function, thereby constructing the ODRL-Swin Transformer model;
the original Swin Transformer module divides the feature map into non-overlapping windows according to the window size, and the elements in a window are called blocks; by learning the relative positions between blocks, optimized dense relative localization enables the Swin Transformer to fuse more spatial information without additional annotation information. In the ODRL-Swin Transformer model, optimized dense relative localization densely samples blocks within an original Swin Transformer window: during sampling, two blocks in the window are selected at random (called an embedding pair), the geometric relative position distance of the embedding pair is calculated, and a multilayer perceptron (MLP) predicts that relative position distance, thereby collecting spatial information; a spatial relative position loss added to the Swin Transformer loss function further guides the calculation of the relative position of the embedding pair;
step 4, training the ODRL-Swin Transformer model constructed in step 3 with the training set from step 2, and adjusting the parameters of the ODRL-Swin Transformer model by combining the training result with the test set from step 2 until the parameters of the ODRL-Swin Transformer model reach the optimal configuration;
and step 5, inputting the rice growth stage image data to be identified into the ODRL-Swin Transformer model under the optimal configuration parameters obtained in step 4, and outputting the predicted recognition result of the rice growth stage image through the ODRL-Swin Transformer model.
2. The rice growth stage image recognition method as claimed in claim 1, wherein the preprocessing in step 2 comprises removing duplicate and damaged pictures from the obtained image data of rice at different growth stages, deleting unmatched information in the annotation files, and dividing the data set into a training set, a test set and a validation set in the proportion 7:2:1.
3. The rice growth stage image recognition method as claimed in claim 1, wherein the data enhancement in step 2 comprises Mosaic data augmentation, random flipping, scaling and random cropping.
4. The rice growth stage image recognition method as claimed in claim 1, wherein, regarding step 3, the detection accuracy of the Swin Transformer is greatly improved compared with that of a conventional convolutional network, but its detection accuracy on small-scale data sets is not high; optimized dense relative localization and the dense relative localization loss enable the Swin Transformer to perform well on small-scale data sets too.
5. The rice growth stage image recognition method as claimed in claim 1, wherein during the training in step 4, the training set data are input into the ODRL-Swin Transformer model to obtain an output result, the error between the output result and the test set is calculated, and the configuration parameters of the ODRL-Swin Transformer model are then adjusted based on the error until it meets expectations, at which point the configuration parameters of the ODRL-Swin Transformer model are the optimal configuration parameters.
CN202210494136.7A 2022-05-05 2022-05-05 Rice growth stage image recognition method Pending CN114821182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210494136.7A CN114821182A (en) 2022-05-05 2022-05-05 Rice growth stage image recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210494136.7A CN114821182A (en) 2022-05-05 2022-05-05 Rice growth stage image recognition method

Publications (1)

Publication Number Publication Date
CN114821182A (en) 2022-07-29

Family

ID=82512188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210494136.7A Pending CN114821182A (en) 2022-05-05 2022-05-05 Rice growth stage image recognition method

Country Status (1)

Country Link
CN (1) CN114821182A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863403A (en) * 2023-07-11 2023-10-10 仲恺农业工程学院 Crop big data environment monitoring method and device and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351790A1 (en) * 2016-06-06 2017-12-07 The Climate Corporation Data assimilation for calculating computer-based models of crop growth
CN109740483A (en) * 2018-12-26 2019-05-10 南宁五加五科技有限公司 A kind of rice growing season detection method based on deep-neural-network
US20210183045A1 (en) * 2018-08-30 2021-06-17 Ntt Data Ccs Corporation Server of crop growth stage determination system, growth stage determination method, and storage medium storing program
CN113505810A (en) * 2021-06-10 2021-10-15 长春工业大学 Pooling vision-based method for detecting weed growth cycle by using Transformer
CN113610108A (en) * 2021-07-06 2021-11-05 中南民族大学 Rice pest identification method based on improved residual error network
CN114066820A (en) * 2021-10-26 2022-02-18 武汉纺织大学 Fabric defect detection method based on Swin-Transformer and NAS-FPN

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351790A1 (en) * 2016-06-06 2017-12-07 The Climate Corporation Data assimilation for calculating computer-based models of crop growth
US20210183045A1 (en) * 2018-08-30 2021-06-17 Ntt Data Ccs Corporation Server of crop growth stage determination system, growth stage determination method, and storage medium storing program
CN109740483A (en) * 2018-12-26 2019-05-10 南宁五加五科技有限公司 A kind of rice growing season detection method based on deep-neural-network
CN113505810A (en) * 2021-06-10 2021-10-15 长春工业大学 Pooling vision-based method for detecting weed growth cycle by using Transformer
CN113610108A (en) * 2021-07-06 2021-11-05 中南民族大学 Rice pest identification method based on improved residual error network
CN114066820A (en) * 2021-10-26 2022-02-18 武汉纺织大学 Fabric defect detection method based on Swin-Transformer and NAS-FPN

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAHUI LIU et al.: "Efficient Training of Visual Transformers with Small Datasets", Retrieved from the Internet <URL:https://arxiv.org/abs/2106.03746v2> *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863403A (en) * 2023-07-11 2023-10-10 仲恺农业工程学院 Crop big data environment monitoring method and device and electronic equipment
CN116863403B (en) * 2023-07-11 2024-01-02 仲恺农业工程学院 Crop big data environment monitoring method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN106874688B (en) Intelligent lead compound based on convolutional neural networks finds method
CN111739075A (en) Deep network lung texture recognition method combining multi-scale attention
CN111582401B (en) Sunflower seed sorting method based on double-branch convolutional neural network
CN101520847A (en) Pattern identification device and method
CN110245683B (en) Residual error relation network construction method for less-sample target identification and application
CN110287806A (en) A kind of traffic sign recognition method based on improvement SSD network
CN113256636A (en) Bottom-up parasite species development stage and image pixel classification method
CN114155474A (en) Damage identification technology based on video semantic segmentation algorithm
CN111652039A (en) Hyperspectral remote sensing ground object classification method based on residual error network and feature fusion module
CN112036249A (en) Method, system, medium and terminal for end-to-end pedestrian detection and attribute identification
CN115984543A (en) Target detection algorithm based on infrared and visible light images
CN115311502A (en) Remote sensing image small sample scene classification method based on multi-scale double-flow architecture
CN114821182A (en) Rice growth stage image recognition method
CN111242028A (en) Remote sensing image ground object segmentation method based on U-Net
Shi et al. Vision-based apple quality grading with multi-view spatial network
CN112949723A (en) Endometrium pathology image classification method
CN116630700A (en) Remote sensing image classification method based on introduction channel-space attention mechanism
CN114519402A (en) Citrus disease and insect pest detection method based on neural network model
CN115511798A (en) Pneumonia classification method and device based on artificial intelligence technology
Jin et al. Intelligent tea sorting system based on computer vision
CN111144422A (en) Positioning identification method and system for aircraft component
CN117593514B (en) Image target detection method and system based on deep principal component analysis assistance
CN117765410B (en) Remote sensing image double-branch feature fusion solid waste identification method and system and electronic equipment
CN116258914B (en) Remote Sensing Image Classification Method Based on Machine Learning and Local and Global Feature Fusion
CN111881743B (en) Facial feature point positioning method based on semantic segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination