CN117351356A - Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle - Google Patents


Info

Publication number
CN117351356A
Authority
CN
China
Prior art keywords
feature
image
convolution
representing
module
Prior art date
Legal status
Granted
Application number
CN202311364873.6A
Other languages
Chinese (zh)
Other versions
CN117351356B (en)
Inventor
张建华
潘攀
周国民
胡林
王健
樊景超
Current Assignee
Sanya National Academy Of Southern Propagation Chinese Academy Of Agricultural Sciences
Agricultural Information Institute of CAAS
Original Assignee
Sanya National Academy Of Southern Propagation Chinese Academy Of Agricultural Sciences
Agricultural Information Institute of CAAS
Priority date
Filing date
Publication date
Application filed by Sanya National Academy Of Southern Propagation Chinese Academy Of Agricultural Sciences, Agricultural Information Institute of CAAS filed Critical Sanya National Academy Of Southern Propagation Chinese Academy Of Agricultural Sciences
Priority to CN202311364873.6A priority Critical patent/CN117351356B/en
Publication of CN117351356A publication Critical patent/CN117351356A/en
Application granted granted Critical
Publication of CN117351356B publication Critical patent/CN117351356B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/188Vegetation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the field of field crop disease detection, and in particular to a method for detecting diseases of field crops and their closely related species from the viewing angle of an unmanned aerial vehicle. By introducing a mechanism that dynamically adjusts the spatial receptive field, the method better detects small-size disease targets from the unmanned aerial vehicle viewing angle. A GSConv hybrid convolution module is introduced into the feature pyramid, which reduces the computation and parameter count of the model and makes it more suitable for running on unmanned aerial vehicle hardware. A rotatable marking box is further introduced and combined with the calculation of feature confidence within the marking box, so that disease localization and detection in any direction is achieved while the interference caused by introducing excessive background information is reduced, improving the accuracy and robustness of detecting crop and closely related species diseases from the unmanned aerial vehicle viewing angle in the field. This solves the problem in the prior art of low accuracy when an unmanned aerial vehicle detects small and dense disease targets on crops in a complex field environment.

Description

Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle
Technical Field
The invention relates to the field of field crop disease detection, and in particular to a method for detecting diseases of field crops and their closely related species from the viewing angle of an unmanned aerial vehicle.
Background
In the traditional disease-resistance identification process, researchers must go to relatively remote disease trial fields to observe, detect and count the disease conditions of crops and their closely related species one by one, which is time-consuming, labor-intensive and highly subjective. With the growing demand for large-scale crop disease-resistance identification, unmanned aerial vehicles are now also used to intelligently detect diseases of large-scale field crops and their closely related species, so as to accurately locate crop diseases in the field and provide a basis for intelligent evaluation and identification of crop disease resistance.
However, because of complex backgrounds in the paddy field such as water, silt, algae, weeds, bird droppings and shadow reflections, and because diseases appear at a small size from the unmanned aerial vehicle viewing angle, the existing technology achieves low accuracy when detecting small and dense crop disease targets in a complex field environment.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides a method for detecting diseases of field crops and their closely related species from the viewing angle of an unmanned aerial vehicle, which solves the problem in the prior art of low accuracy when an unmanned aerial vehicle detects small and dense targets on crops in a complex field environment.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a field crop and near-edge seed disease detection method under an unmanned aerial vehicle visual angle comprises the following steps:
s1, acquiring an initial image of a field crop, and performing scaling and normalization processing on the initial image to obtain a preprocessed image;
s2, sequentially passing the preprocessed image through a plurality of self-adaptive convolution kernels to obtain a plurality of convolution images, extracting spatial relations of the plurality of convolution images to obtain attention features, and multiplying the attention features and the preprocessed image element by element to obtain feature images;
s3, carrying out feature extraction on a P5 feature layer of the feature image through a GSConv module, carrying out convolution operation on the extracted features and a P4 feature layer and a P3 feature layer of the feature image step by step through an Upsampling module and a VOV0GSCSP module to obtain a P3 out feature image, and combining the P3 out feature image with the P4 feature layer and the P5 feature layer of the feature image step by step to generate a P4 out feature image and a P5 out feature image respectively;
s4, decoding the P3_out feature map, the P4_out feature map and the P5_out feature map to obtain a plurality of marking boxes, and carrying out parameter adjustment on the marking boxes by adopting a KLD loss function;
s5, setting a confidence coefficient broad value, and deleting a frame with the confidence coefficient smaller than the confidence coefficient broad value by using a BCE loss function;
and S6, removing overlapped marked boxes in the pre-output image by adopting a non-maximum value suppression algorithm to generate a final image, and outputting the final image.
Preferably, in step S2, the method specifically includes the steps of:
s21, acquiring a pretreatment image of a field crop;
s22, setting the kernel size, the expansion rate and the receptive field size of the depth separable convolution of the preprocessed image; the kernel size k, the dilation rate d and the dilation of the receptive field RF of the ith depth separable convolution are defined as follows:
k_{i-1} ≤ k_i, d_1 = 1, d_{i-1} < d_i ≤ RF_{i-1}
RF_1 = k_1, RF_i = d_i(k_i − 1) + RF_{i-1}
In the above formulas, k_i and k_{i-1} represent the kernel sizes of the i-th and (i−1)-th depth-separable convolutions respectively, d_i and d_{i-1} represent their dilation rates, and RF_i and RF_{i-1} represent their receptive field sizes;
s23, sequentially convoluting the preprocessed images according to the defined kernel size, the expansion rate and the receptive field size, and performing space feature vector fusion on the convolved images to generate a plurality of convolved images; the calculation formula of the convolution image is as follows:
U_0 = X, U_i = F_i^{dw}(U_{i-1}), Ũ_i = F_i^{1×1}(U_i), for i ∈ [1, N]
In the above formulas, U_i is the image after the i-th depth-separable convolution, U_0 = X is the preprocessed image, F_i^{dw}(·) denotes the depth-wise convolution with kernel size k_i and dilation rate d_i, Ũ_i is the i-th image feature obtained by spatial feature vector channel fusion, i.e. the i-th convolution image, F_i^{1×1}(·) denotes the channel fusion of the spatial feature vectors of the image U_i after the i-th depth-wise convolution through a 1×1 convolution layer, and N denotes the number of convolution kernels;
s24, connecting the plurality of convolution images to generate a related image; the expression of the associated image is:
in the above-mentioned method, the step of,representing an associated image +.>Representing an ith convolution image;
s25, extracting spatial relations of the associated images through average pooling and maximum pooling to obtain spatial pooling characteristics;
s26, converting the space pooling characteristics after the average pooling and the maximum pooling into a plurality of space attention diagrams; the expression for converting the spatial pooling feature into N spatial attention patterns is:
in the above-mentioned method, the step of,representing spatial attention, fun>Representing the transformation of 2 channels, i.e., 2 spatial pooling features that are averaged pooled and maximally pooled, into N spatial attention patterns;
s27, activating a plurality of space attention patterns through a sigmoid activation function to generate an activated image;
s28, performing mask weighting on the features in the plurality of activated images, and fusing through a convolution layer to obtain concerned features; the expression for the feature of interest is:
in the above equation, S represents an image of a feature of interest,representing convolution operations +.>Representing the i-th activation image,for the ith image feature fused by the space feature vector channel, N represents the number of convolution kernels;
s29, performing element-by-element multiplication on the attention feature and the input feature to obtain a feature image.
Preferably, in step S3, the method specifically includes the steps of:
s31, generating a P5 feature map by passing a P5 feature layer of the feature image through a GSConv module;
s32, combining the P5 feature map with a feature layer P4 of the feature image through an Upsampling module, and extracting features of the feature layer P4 by using a VOV0GSCSP module to generate a P4 feature map;
s33, combining the P4 feature map with a feature layer P3 of the feature image through an Upsampling module, and extracting features of the P4 feature map by using a VOV0GSCSP module to obtain a P3 out feature map;
s34, combining the P3 out feature map with the P4 feature layer of the feature image after passing through the GSConv module, and extracting features of the P4 out feature map by using the VOV0GSCSP module;
and S35, combining the P4_out feature map with a P5 feature layer of the feature image after passing through a GSConv module, and extracting features of the P5_out feature map by using a VOV0GSCSP module.
Preferably, in step S3, the specific steps of the GSConv module operation are as follows:
s301, setting the channel number of the input characteristic as C1;
s302, carrying out standard convolution on the characteristic image and a C1 channel to generate a C2/2 characteristic vector;
s303, performing depth separable convolution on the feature image to generate another C2/2 feature vector;
s304, connecting the two C2/2 feature vectors through a Concat module to obtain fusion features;
s305, the fusion characteristic is subjected to a shuffle operation to obtain an output characteristic.
Preferably, in step S4, the method specifically includes the steps of:
s41, decoding the P3_out characteristic diagram, the P4_out characteristic diagram and the P5_out characteristic diagram to obtain a plurality of marked boxes; the expression of the marking box parameter is:
H=(x,y,w,h,θ)
In the above formula, H denotes a marking box, and x, y, w, h and θ denote the abscissa of the marking box center, the ordinate of the marking box center, the width of the marking box, the height of the marking box and the rotation angle of the marking box respectively, where −90° ≤ θ ≤ 90°; the width w of the marking box is defined as its longest side, and θ denotes the angle through which the x-axis rotates to reach the w side.
S42, parameter adjustment is carried out on the marking box by adopting the KLD loss function.
Preferably, in step S42, the calculation formula of the KLD loss function is:
L_KLD = 1 − 1/(τ + f(D_kl(N_p' ∥ N_t')))
wherein, f(D_kl(N_p' ∥ N_t')) = ln(D_kl(N_p' ∥ N_t') + 1)
wherein, D_kl(N_p' ∥ N_t') = (1/2)(μ_p − μ_t)^T Σ_t^{-1}(μ_p − μ_t) + (1/2)tr(Σ_t^{-1}Σ_p) + (1/2)ln(|Σ_t|/|Σ_p|) − 1, with Σ^{1/2} = M·diag(w/2, h/2)·M^T and M the rotation matrix determined by θ
In the above formulas, 1/(τ + f(D_kl)) represents the weight of the KL divergence, τ is an adjustment factor, f(D) represents a nonlinear function of D_kl(N_p' ∥ N_t'), N_p' represents the predicted disease feature distribution, N_t' represents the actual disease feature distribution, D_kl(N_p' ∥ N_t') represents the KL divergence between the two probability distributions N_p' and N_t', μ_p and μ_t are the mean vectors of N_p' and N_t' respectively, T is the transpose symbol, Σ_t^{-1} and Σ_t^{1/2} respectively represent the inverse and the square root of the positive definite symmetric matrix of the actual disease feature distribution, tr represents the sum of the diagonal elements of a matrix, M is a linear transformation matrix, ln is the natural logarithm, Σ_p and Σ_t are the covariance matrices of N_p' and N_t' respectively, θ represents the angle through which the x-axis rotates to the w side of the marking box, and w and h represent the width and height of the marking box respectively.
Preferably, in step S5, the method specifically includes the steps of:
s51, setting a confidence value broad value T;
s52, sequentially calculating the confidence coefficient of each marking box through a BCE function;
s53, judging the confidence coefficient and the confidence coefficient wide value corresponding to each marking square frame in the feature image in sequence;
if it isReserving a marked box corresponding to the confidence coefficient to generate a pre-output image;
if it isThe marked box corresponding to the confidence level is deleted to generate a pre-output image.
Compared with the prior art, the invention provides a method for detecting diseases of field crops and their closely related species from the unmanned aerial vehicle viewing angle, with the following beneficial effects:
1. By dynamically adjusting the spatial receptive field, the invention better detects small-size disease targets from the unmanned aerial vehicle viewing angle. At the same time, the GSConv hybrid convolution module introduced into the feature pyramid reduces the computation and parameter count of the model, making it more suitable for running on unmanned aerial vehicle hardware. A rotatable marking box is further introduced and combined with a KL-divergence-based weight calculation for the features within the marking box, so that disease features are detected with higher probability by comparing them with the disease features in the trained model.
2. By introducing a rotatable marking box construction method, the invention achieves disease localization and detection in any direction while reducing the interference caused by introducing excessive background information, enhances the network's ability to extract disease feature information, solves the problem of low disease target detection accuracy from the field unmanned aerial vehicle viewing angle, and improves the accuracy and robustness of detecting diseases of crops and their closely related species in the field from the unmanned aerial vehicle viewing angle.
3. By defining rules under which the convolution kernel size is automatically and dynamically determined from the input image, the invention allows the model to adaptively use different large kernels, and effectively weights and spatially combines the features processed by a series of convolution kernels, so that the receptive field of each target can be dynamically adjusted in space as needed, better meeting the requirements for detecting diseases of crops and their closely related species under complex field conditions.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart of a method for detecting diseases of field crops and closely related seeds under the view angle of an unmanned aerial vehicle;
FIG. 2 is a flow chart of a method of obtaining a feature image according to the present invention;
FIG. 3 is a schematic diagram of the detection method of the present invention;
FIG. 4 is a schematic diagram of the present invention for generating P3_out, P4_out and P5_out feature maps;
FIG. 5 is a schematic representation of the marking of disease features using marking boxes in the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention may become more readily apparent, a more particular description of the invention is given below with reference to the accompanying drawings and the detailed description, so that the process of how the technical means are applied to solve the technical problems and achieve the technical effects can be fully understood and implemented.
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods in the following embodiments may be implemented by a program instructing the relevant hardware, and thus the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM and optical storage) containing computer-usable program code.
Fig. 1 to Fig. 5 illustrate an embodiment of the present invention. In order to solve the problems that dense targets and small targets are detected poorly and that detection accuracy and speed cannot both be satisfied, and to meet the requirements for detecting diseases of crops and their closely related species, the spatial receptive field of the targets is adjusted dynamically so as to better detect small-size disease targets from the unmanned aerial vehicle viewing angle. At the same time, a GSConv hybrid convolution module is introduced into the feature pyramid to reduce the computation and parameter count of the model, and a rotated-box detection method with a rotation-angle dimension θ is provided, which reduces the interference caused by introducing excessive background information while achieving disease localization in any direction and enhances the network's ability to extract disease feature information, so as to solve the problem of low disease target detection accuracy from the unmanned aerial vehicle viewing angle in the field and to improve the accuracy and robustness of detecting diseases of crops and their closely related species in the field from the unmanned aerial vehicle viewing angle.
In order to realize the thought, the invention provides a method for detecting diseases of field crops and near-edge seeds thereof under the view angle of an unmanned aerial vehicle, which comprises the following steps:
s1, acquiring an initial image of a field crop, and performing scaling and normalization processing on the initial image to obtain a preprocessed image;
the formula for image scaling of the initial image is:
x = S_x · x_0, y = S_y · y_0
In the above formulas, x and y are the x-axis and y-axis coordinates after scaling of the initial image, x_0 and y_0 are the x-axis and y-axis coordinates of the initial image before scaling, and S_x and S_y are the scaling coefficients of the initial image in the x-axis and y-axis directions respectively;
the formula for normalizing the initial image is:
z̃_i = γ·x̂_i + β
wherein, x̂_i = (z_i − u)/σ, u = (1/n)·Σ_{j=1}^{n} z_j, σ = sqrt((1/n)·Σ_{j=1}^{n}(z_j − u)^2)
In the above formulas, z̃_i is the pixel value of the i-th neuron of the neural network layer obtained by normalizing the initial image, γ and β are constants, x̂_i is an intermediate value, z_i is the pixel value of the i-th neuron of the neural network layer for the initial image, u represents the mean, σ is the standard deviation, and n represents the number of neurons in the neural network layer. By normalizing the input z_i of each neuron or channel, they are guaranteed to follow a standard normal distribution with mean 0 and variance 1, which helps speed up training of the neural network, improve gradient flow and improve performance.
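As a non-limiting illustration of step S1, the following Python sketch scales an initial image and normalizes it to zero mean and unit variance; the 640×640 target size, the use of OpenCV/NumPy and the defaults γ = 1, β = 0 are assumptions for the example, not values fixed by the patent.

```python
import cv2
import numpy as np

def preprocess(image_bgr, target_hw=(640, 640), gamma=1.0, beta=0.0, eps=1e-6):
    """Scale the initial UAV image and normalize it to zero mean / unit variance."""
    h0, w0 = image_bgr.shape[:2]
    th, tw = target_hw
    # coordinate scaling: x = S_x * x0, y = S_y * y0
    s_x, s_y = tw / w0, th / h0
    resized = cv2.resize(image_bgr, (tw, th), interpolation=cv2.INTER_LINEAR)

    z = resized.astype(np.float32)
    u = z.mean()                      # mean of the pixel values
    sigma = z.std()                   # standard deviation
    z_hat = (z - u) / (sigma + eps)   # standard normal distribution (mean 0, variance 1)
    return gamma * z_hat + beta, (s_x, s_y)
```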
S2, sequentially passing the preprocessed image through a plurality of self-adaptive convolution kernels to obtain a plurality of convolution images, extracting spatial relations of the plurality of convolution images to obtain attention features, and multiplying the attention features and the preprocessed image element by element to obtain feature images;
in order to dynamically acquire convolution kernels of different sizes according to different sizes of the preprocessed images so as to realize convolution on the preprocessed images, thereby obtaining a receptive field capable of being dynamically adjusted, in step S2, the method specifically comprises the following steps:
s21, acquiring a pretreatment image of a field crop;
s22, setting the kernel size, the expansion rate and the receptive field size of the depth separable convolution of the preprocessed image; the kernel size k, the dilation rate d and the dilation of the receptive field RF of the ith depth separable convolution are defined as follows:
k_{i-1} ≤ k_i, d_1 = 1, d_{i-1} < d_i ≤ RF_{i-1}
RF_1 = k_1, RF_i = d_i(k_i − 1) + RF_{i-1}
In the above formulas, k_i and k_{i-1} represent the kernel sizes of the i-th and (i−1)-th depth-separable convolutions respectively, d_i and d_{i-1} represent their dilation rates, and RF_i and RF_{i-1} represent their receptive field sizes;
s23, sequentially convoluting the preprocessed images according to the defined kernel size, the expansion rate and the receptive field size, and performing space feature vector fusion on the convolved images to generate a plurality of convolved images; the calculation formula of the convolution image is as follows:
U_0 = X, U_i = F_i^{dw}(U_{i-1}), Ũ_i = F_i^{1×1}(U_i), for i ∈ [1, N]
In the above formulas, U_i is the image after the i-th depth-separable convolution and U_0 = X is the preprocessed image; F_i^{dw}(·) denotes the depth-wise convolution with kernel size k_i and dilation rate d_i, so that the input preprocessed image passes through the convolution kernels in sequence and rich background information features are acquired from different areas of the preprocessed image X; Ũ_i is the i-th image feature obtained by spatial feature vector channel fusion, i.e. the i-th convolution image; F_i^{1×1}(·) denotes the channel fusion of the spatial feature vectors of the image U_i after the i-th depth-wise convolution through a 1×1 convolution layer; and N denotes the number of convolution kernels;
the increased kernel size and expansion ratio ensures fast expansion of the receptive field while setting an upper limit on the expansion ratio to avoid the expansion convolution introducing gaps between feature maps, which makes subsequent kernel selection easier while also significantly reducing the number of parameters, while using a series of split-depth convolutions with different receptive fields to obtain context information features in different ranges, allowing channel mixing for each spatial feature vector.
S24, connecting the plurality of convolution images to generate an associated image; the associated image is expressed as:
Ũ = [Ũ_1; Ũ_2; …; Ũ_N]
In the above formula, Ũ represents the associated image and Ũ_i represents the i-th convolution image;
s25, extracting spatial relations of the associated images through average pooling and maximum pooling to obtain spatial pooling characteristics; the expressions for average pooling and maximum pooling of the associated images are respectively:
SA_avg = P_avg(Ũ), SA_max = P_max(Ũ)
In the above formulas, SA_avg represents the image obtained after average pooling, P_avg(·) represents the average pooling operation on the associated image Ũ, SA_max represents the image obtained after maximum pooling, and P_max(·) represents the maximum pooling operation on the associated image Ũ;
s26, converting the space pooling characteristics after the average pooling and the maximum pooling into a plurality of space attention diagrams; the expression for converting the spatial pooling feature into N spatial attention patterns is:
in the above-mentioned method, the step of,representing spatial attention, fun>Representing the transformation of 2 channels, i.e., 2 spatial pooling features that are averaged pooled and maximally pooled, into N spatial attention patterns;
s27, activating a plurality of space attention patterns through a sigmoid activation function to generate an activated image; the expression for the activation image is:
in the above-mentioned method, the step of,representing the ith activation image,/->For the ith space attention, the characteristic of a large convolution kernel sequence generated by decomposing an activation function, and sigma (+) is a sigmoid activation function;
s28, performing mask weighting on the features in the plurality of activated images, and fusing through a convolution layer to obtain concerned features; the expression for the feature of interest is:
in the above equation, S represents an image of a feature of interest,representing convolution operations +.>Representing the i-th activation image,for the ith image feature fused by the space feature vector channel, N represents the number of convolution kernels;
s29, performing element-by-element multiplication on the attention feature and the input feature to obtain a feature image; the expression of the feature image is:
Y=X·S
in the above expression, Y represents a feature image, X represents a preprocessed image, and S represents an image of a feature of interest.
A spatial kernel selection mechanism is adopted to enhance the network's attention to the most relevant spatial context regions, selecting feature maps from large convolution kernels of different scales. This mechanism helps the network focus on important spatial context information and improves its ability to attend to the spatial background regions most relevant to the detection target.
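A compact, non-authoritative PyTorch sketch of steps S21-S29 with N = 2 branches is given below; the kernel sizes (5 and 7), dilation rates (1 and 3), the halved channel width of each branch and the 7×7 convolution used for the spatial attention maps are assumptions for illustration and do not reproduce the exact hyper-parameters of the patented model.

```python
import torch
import torch.nn as nn

class SelectiveKernelAttention(nn.Module):
    """Dynamic spatial receptive field via two stacked dilated depth-wise convolutions
    followed by spatial kernel selection (steps S21-S29), sketched for N = 2 branches."""

    def __init__(self, channels):
        super().__init__()
        # S22/S23: depth-wise convolutions with k = (5, 7), d = (1, 3) -> receptive fields 5 and 23
        self.dw1 = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dw2 = nn.Conv2d(channels, channels, 7, padding=9, dilation=3, groups=channels)
        # per-branch 1x1 channel fusion producing the convolution images
        self.pw1 = nn.Conv2d(channels, channels // 2, 1)
        self.pw2 = nn.Conv2d(channels, channels // 2, 1)
        # S26: map the 2 pooled maps (average, maximum) to N = 2 spatial attention maps
        self.sa = nn.Conv2d(2, 2, 7, padding=3)
        # S28: fuse the attended branches back to the input channel count
        self.fuse = nn.Conv2d(channels // 2, channels, 1)

    def forward(self, x):
        u1 = self.dw1(x)                  # U_1
        u2 = self.dw2(u1)                 # U_2, further expanding the receptive field
        t1, t2 = self.pw1(u1), self.pw2(u2)
        cat = torch.cat([t1, t2], dim=1)                              # S24: associated image
        avg = cat.mean(dim=1, keepdim=True)                           # S25: average pooling
        mx, _ = cat.max(dim=1, keepdim=True)                          # S25: maximum pooling
        attn = torch.sigmoid(self.sa(torch.cat([avg, mx], dim=1)))    # S26/S27
        s = self.fuse(t1 * attn[:, 0:1] + t2 * attn[:, 1:2])          # S28: mask weighting + fuse
        return x * s                                                  # S29: element-wise product

x = torch.randn(1, 64, 80, 80)
y = SelectiveKernelAttention(64)(x)   # output has the same shape as x
```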
S3, performing feature extraction on the P5 feature layer of the feature image through a GSConv module, performing convolution operations step by step on the extracted features together with the P4 and P3 feature layers of the feature image through an Upsampling module and a VoVGSCSP module to obtain a P3_out feature map, and combining the P3_out feature map step by step with the P4 and P5 feature layers of the feature image to generate a P4_out feature map and a P5_out feature map respectively;
in order to infiltrate the information generated by the standard convolution into the information generated by the depth separable convolution, the hidden connection of each channel is reserved as much as possible with lower complexity, and the balance of the model accuracy and the speed is better realized, in step S3, the method specifically comprises the following steps:
s31, generating a P5 feature map by passing a P5 feature layer of the feature image through a GSConv module;
s32, combining the P5 feature map with a feature layer P4 of the feature image through an Upsampling module, and extracting features of the feature layer P4 by using a VOV0GSCSP module to generate a P4 feature map;
s33, combining the P4 feature map with a feature layer P3 of the feature image through an Upsampling module, and extracting features of the P4 feature map by using a VOV0GSCSP module to obtain a P3 out feature map;
s34, combining the P3 out feature map with the P4 feature layer of the feature image after passing through the GSConv module, and extracting features of the P4 out feature map by using the VOV0GSCSP module;
and S35, combining the P4_out feature map with a P5 feature layer of the feature image after passing through a GSConv module, and extracting features of the P5_out feature map by using a VOV0GSCSP module.
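The wiring of steps S31-S35 can be sketched as follows, with GSConv and VoVGSCSP treated as opaque building blocks supplied by factory callables (a GSConv sketch appears after the module description below); the channel widths, the nearest-neighbour upsampling and the stride-2 max pooling on the downward path are assumptions for the example, not details fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimNeck(nn.Module):
    """Sketch of the S31-S35 feature-pyramid wiring; gsconv and vovgscsp are
    factory callables building the GSConv and VoVGSCSP blocks."""

    def __init__(self, c3, c4, c5, gsconv, vovgscsp):
        super().__init__()
        self.gs5 = gsconv(c5, c4)              # S31: P5 layer -> P5 feature map
        self.vov4 = vovgscsp(c4 + c4, c4)      # S32: upsample + fuse with P4 layer
        self.vov3 = vovgscsp(c4 + c3, c3)      # S33: upsample + fuse with P3 layer -> P3_out
        self.gs3 = gsconv(c3, c3)              # S34
        self.vov4_out = vovgscsp(c3 + c4, c4)  # S34: fuse with P4 layer -> P4_out
        self.gs4 = gsconv(c4, c4)              # S35
        self.vov5_out = vovgscsp(c4 + c5, c5)  # S35: fuse with P5 layer -> P5_out

    def forward(self, p3, p4, p5):
        f5 = self.gs5(p5)                                                            # S31
        f4 = self.vov4(torch.cat([F.interpolate(f5, scale_factor=2.0), p4], 1))      # S32
        p3_out = self.vov3(torch.cat([F.interpolate(f4, scale_factor=2.0), p3], 1))  # S33
        # stride-2 max pooling stands in here for a strided GSConv on the downward path
        p4_out = self.vov4_out(torch.cat([F.max_pool2d(self.gs3(p3_out), 2), p4], 1))  # S34
        p5_out = self.vov5_out(torch.cat([F.max_pool2d(self.gs4(p4_out), 2), p5], 1))  # S35
        return p3_out, p4_out, p5_out

# exercising the wiring with plain 3x3 convolutions as placeholder blocks
conv = lambda cin, cout: nn.Conv2d(cin, cout, 3, padding=1)
neck = SlimNeck(128, 256, 512, conv, conv)
outs = neck(torch.randn(1, 128, 80, 80), torch.randn(1, 256, 40, 40), torch.randn(1, 512, 20, 20))
print([o.shape for o in outs])
```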
In order to reduce the number of convolutions in the GSConv module, the number of channels of the first convolution is doubled and the convolution result is then split in half along the channel dimension, which saves one convolution. In step S3, the GSConv module operates as follows:
s301, setting the channel number of the input characteristic as C1;
s302, carrying out standard convolution on the characteristic image and a C1 channel to generate a C2/2 characteristic vector;
s303, performing depth separable convolution on the feature image to generate another C2/2 feature vector;
s304, connecting the two C2/2 feature vectors through a Concat module to obtain fusion features;
s305, the fusion feature is subjected to a shuffle operation to obtain an output feature; the number of channels finally output is C2.
The mathematical expression of the GSConv module operation is as follows:
X_out = f_shuffle(cat(f_conv(X), f_dsc(f_conv(X))))
In the above formula, X_out represents the output fused feature, f_shuffle(·) represents the shuffle operation, f_conv(·) represents a standard convolution, f_dsc(·) represents a depth-separable convolution, and cat represents the feature fusion operation performed by the Concat module.
The input feature image is split into two branches after one convolution: one branch is kept unchanged, while the other branch is processed by the GSConv module; the information of the two branches is merged by a Concat module and then fed into a single convolution operation, which saves computation.
The Concat module is an operation for merging multiple feature maps. Suppose there are two feature maps A and B with dimensions (H1, W1, C1) and (H2, W2, C2) respectively, where H, W and C denote the height, width and number of channels. When the Concat operation is performed, the two feature maps are connected along the channel dimension (which requires H1 = H2 and W1 = W2), and the dimensions of the connected feature map are (H1, W1, C1+C2); this means the new feature map contains all the channel information of the two original feature maps along the channel dimension. Mathematically, the Concat operation can be expressed as concatenated_feature_map = concat(A, B), where A and B are the two feature maps to be connected. In practical implementation, the Concat operation connects the two feature maps along the channel dimension, i.e. the channel dimension of B is appended to the channel dimension of A, generating a new feature map that contains all the channel information of the original feature maps A and B.
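A minimal PyTorch sketch of the GSConv operation X_out = f_shuffle(cat(f_conv(X), f_dsc(f_conv(X)))) is given below; the 3×3 and 5×5 kernel sizes, the BatchNorm/SiLU layers and the interleaving channel shuffle are common choices assumed for illustration rather than details taken from the patent.

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """GSConv sketch: a standard convolution to C2/2 channels, a depth-wise convolution
    on that result, concatenation, then a channel shuffle (steps S301-S305)."""

    def __init__(self, c1, c2, k=3, s=1):
        super().__init__()
        self.conv = nn.Sequential(                      # standard conv -> C2/2 channels
            nn.Conv2d(c1, c2 // 2, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c2 // 2), nn.SiLU())
        self.dw = nn.Sequential(                        # depth-wise conv -> another C2/2
            nn.Conv2d(c2 // 2, c2 // 2, 5, 1, 2, groups=c2 // 2, bias=False),
            nn.BatchNorm2d(c2 // 2), nn.SiLU())

    def forward(self, x):
        y1 = self.conv(x)
        y2 = self.dw(y1)
        y = torch.cat([y1, y2], dim=1)                  # Concat -> C2 channels
        # channel shuffle: interleave the two halves so that standard-convolution
        # information penetrates the depth-wise branch and vice versa
        b, c, h, w = y.shape
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

out = GSConv(64, 128)(torch.randn(1, 64, 40, 40))   # -> shape (1, 128, 40, 40)
```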
S4, decoding the P3_out feature map, the P4_out feature map and the P5_out feature map to obtain a plurality of marking boxes, and carrying out parameter adjustment on the marking boxes by adopting a KLD loss function;
in order to solve the problem of inaccurate detection caused by the fact that crops and near-edge diseases of the crops are at any angle under the view angle of the unmanned aerial vehicle, a rotating frame detection method is provided, the detection of the disease in any direction is realized, meanwhile, the interference caused by excessive introduced background information is reduced, the capability of extracting disease characteristic information of a network is enhanced, and in the step S4, the method specifically comprises the following steps:
s41, decoding the P3_out characteristic diagram, the P4_out characteristic diagram and the P5_out characteristic diagram to obtain a plurality of marked boxes; the expression of the marking box parameter is:
H=(x,y,w,h,θ)
In the above formula, H denotes a marking box, and x, y, w, h and θ denote the abscissa of the marking box center, the ordinate of the marking box center, the width of the marking box, the height of the marking box and the rotation angle of the marking box respectively, where −90° ≤ θ ≤ 90°; to prevent ambiguity, a long-side definition is used: the width w of the marking box is defined as its longest side, the adjacent side is h, and θ denotes the angle through which the x-axis rotates to reach the w side.
S42, carrying out parameter adjustment on the marking box by adopting a KLD loss function; the calculation formula of the KLD loss function is:
L_KLD = 1 − 1/(τ + f(D_kl(N_p' ∥ N_t')))
wherein, f(D_kl(N_p' ∥ N_t')) = ln(D_kl(N_p' ∥ N_t') + 1)
wherein, D_kl(N_p' ∥ N_t') = (1/2)(μ_p − μ_t)^T Σ_t^{-1}(μ_p − μ_t) + (1/2)tr(Σ_t^{-1}Σ_p) + (1/2)ln(|Σ_t|/|Σ_p|) − 1, with Σ^{1/2} = M·diag(w/2, h/2)·M^T and M the rotation matrix determined by θ
In the above formulas, 1/(τ + f(D_kl)) represents the weight of the KL divergence; τ is an adjustment factor used to adjust that weight, and a larger τ increases the weight of the KL divergence; f(D) represents a nonlinear function of D_kl(N_p' ∥ N_t') used to adjust the weight of the KL divergence at different stages of model training and to transform the distance D so that the loss becomes smoother, and it measures the difference between the model output and the target distribution. The overall loss function consists of a Reg part, an Obj part and a Cls part, and the final loss is the combination of the three; the KL divergence is the loss used by the Reg part to measure the difference between the predicted result and the ground-truth label, providing a feedback signal that tells the model how far its prediction is from the actual target, so that the model can gradually adjust its parameters according to the gradient of the loss function through optimization algorithms such as back propagation and gradient descent. N_p' represents the predicted disease feature distribution, N_t' represents the actual disease feature distribution, and D_kl(N_p' ∥ N_t') represents the KL divergence between the two probability distributions N_p' and N_t'; μ_p and μ_t are the mean vectors of N_p' and N_t' respectively, T is the transpose symbol, Σ_t^{-1} and Σ_t^{1/2} respectively represent the inverse and the square root of the positive definite symmetric matrix of the actual disease feature distribution, tr represents the sum of the diagonal elements of a matrix, M is a linear transformation matrix, ln is the natural logarithm, Σ_p and Σ_t are the covariance matrices of N_p' and N_t' respectively, θ represents the angle through which the x-axis rotates to the w side of the marking box, and w and h represent the width and height of the marking box respectively;
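The following sketch illustrates the idea of steps S41-S42 under the assumption that the loss follows the standard Gaussian KLD formulation for rotated boxes: each box (x, y, w, h, θ) is converted to a 2-D Gaussian N(μ, Σ) and the regression loss is 1 − 1/(τ + ln(1 + D_kl)); the choices f(·) = ln(1 + ·), τ = 1 and θ expressed in radians are assumptions of the example.

```python
import torch

def box_to_gaussian(box):
    """(x, y, w, h, theta) -> mean mu and covariance Sigma = M diag(w/2, h/2)^2 M^T."""
    x, y, w, h, theta = box.unbind(-1)          # theta is taken in radians here
    mu = torch.stack([x, y], dim=-1)
    c, s = torch.cos(theta), torch.sin(theta)
    m = torch.stack([c, -s, s, c], dim=-1).reshape(*theta.shape, 2, 2)   # rotation matrix M
    lam = torch.diag_embed(torch.stack([(w / 2) ** 2, (h / 2) ** 2], dim=-1))
    return mu, m @ lam @ m.transpose(-1, -2)

def kld_loss(pred, target, tau=1.0, eps=1e-7):
    """1 - 1/(tau + ln(1 + D_kl)) with the Gaussian KL divergence D_kl(N_p || N_t)."""
    mu_p, sig_p = box_to_gaussian(pred)
    mu_t, sig_t = box_to_gaussian(target)
    sig_t_inv = torch.inverse(sig_t)
    d = (mu_p - mu_t).unsqueeze(-1)
    term_mu = 0.5 * (d.transpose(-1, -2) @ sig_t_inv @ d).squeeze(-1).squeeze(-1)
    term_tr = 0.5 * (sig_t_inv @ sig_p).diagonal(dim1=-2, dim2=-1).sum(-1)
    term_ln = 0.5 * torch.log(torch.det(sig_t).clamp(min=eps) /
                              torch.det(sig_p).clamp(min=eps))
    d_kl = term_mu + term_tr + term_ln - 1.0
    return 1.0 - 1.0 / (tau + torch.log1p(d_kl.clamp(min=0)))

pred = torch.tensor([[50., 50., 40., 10., 0.30]])
gt   = torch.tensor([[52., 49., 38., 12., 0.25]])
print(kld_loss(pred, gt))
```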
s5, setting a confidence coefficient broad value, and deleting a frame with the confidence coefficient smaller than the confidence coefficient broad value by using a BCE loss function;
according to the confidence coefficient of each marked box calculated by the BCE function, the probability that the features in the box are disease features can be obtained, and at the moment, the box with higher disease feature probability needs to be selected, and in step S5, the method specifically comprises the following steps:
s51, setting a confidence value broad value T; the confidence value of the wide value T is generally larger than or equal to 0.5;
s52, sequentially calculating the confidence coefficient of each marking box through a BCE function; the formula for the BCE function is:
L_BCE = −[y·log(p) + (1−y)·log(1−p)]
In the above formula, L_BCE denotes the BCE function, whose computed result is the confidence; y denotes the true label, with a binary label value of 0 or 1; and p denotes the prediction probability, i.e. the probability that the content of the marking box belongs to class 1. This formula is typically used to measure the difference between the model's prediction and the true label in a binary classification task: if the sample belongs to class 1 (y = 1), the loss is −y·log(p); if it belongs to class 0 (y = 0), the loss is −(1−y)·log(1−p). The goal is to minimize this loss so that the model's prediction probability p is as close as possible to the true label y.
S53, comparing in sequence the confidence corresponding to each marking box in the feature image with the confidence threshold;
if the confidence is greater than or equal to the threshold T, the marking box corresponding to that confidence is retained to generate the pre-output image;
if the confidence is less than the threshold T, the marking box corresponding to that confidence is deleted to generate the pre-output image;
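Steps S51-S53 can be sketched as follows; the sigmoid-based confidence, the default threshold T = 0.5 and the optional BCE training loss are illustrative assumptions consistent with the description above rather than the patented implementation.

```python
import torch
import torch.nn.functional as F

def filter_boxes(boxes, conf_logits, labels=None, threshold=0.5):
    """Score each marking box and keep those whose confidence is >= threshold (S51-S53).
    When labels are given, the BCE loss used to train the confidence head is also returned."""
    conf = torch.sigmoid(conf_logits)          # predicted probability p for each box
    keep = conf >= threshold                   # S53: compare with the threshold T
    loss = None
    if labels is not None:                     # L_BCE = -[y log p + (1 - y) log(1 - p)]
        loss = F.binary_cross_entropy_with_logits(conf_logits, labels.float())
    return boxes[keep], conf[keep], loss

boxes = torch.tensor([[50., 50., 40., 10., 0.3], [20., 20., 8., 4., -0.5]])
kept, scores, _ = filter_boxes(boxes, torch.tensor([2.0, -1.0]))   # keeps only the first box
```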
thus, a plurality of pre-output images with disease features are obtained, the disease features are marked out by the marking boxes in the output images, and the detection personnel can conveniently and quickly identify the disease positions.
And S6, removing overlapped marked boxes in the pre-output image by adopting a non-maximum value suppression algorithm to generate a final image, and outputting the final image.
According to the invention, a dynamically adjustable receptive field is introduced into the backbone, and the GSConv module and the VoVGSCSP module are introduced into the Neck layer to improve the adaptability of the modules and achieve a lightweight design; a disease detection model for crops and their closely related species is established by constructing rotatable marking boxes and calculating their confidence, thereby achieving high-precision recognition of disease features.
It should be noted that, in the system provided in the foregoing embodiment, when implementing the functions thereof, only the division of the foregoing functional modules is used as an example, in practical application, the foregoing functional allocation may be implemented by different functional modules, that is, the internal structure of the device is divided into different functional modules, so as to implement all or part of the functions described above. In addition, the system and method embodiments provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the system and method embodiments are detailed in the method embodiments, which are not repeated herein.
The foregoing embodiments describe the invention in detail, and specific examples are used herein to explain the principles and implementations of the invention; the above embodiments are only intended to help understand the method of the invention and its core ideas. Meanwhile, since those skilled in the art may make changes to the specific implementations and application scope in accordance with the ideas of the invention, the contents of this description should not be construed as limiting the invention.

Claims (7)

1. A method for detecting diseases of field crops and their closely related species from an unmanned aerial vehicle viewing angle, characterized by comprising the following steps:
s1, acquiring an initial image of a field crop, and performing scaling and normalization processing on the initial image to obtain a preprocessed image;
s2, sequentially passing the preprocessed image through a plurality of self-adaptive convolution kernels to obtain a plurality of convolution images, extracting spatial relations of the plurality of convolution images to obtain attention features, and multiplying the attention features and the preprocessed image element by element to obtain feature images;
s3, carrying out feature extraction on a P5 feature layer of the feature image through a GSConv module, carrying out convolution operation on the extracted features and a P4 feature layer and a P3 feature layer of the feature image step by step through an Upsampling module and a VOV0GSCSP module to obtain a P3 out feature image, and combining the P3 out feature image with the P4 feature layer and the P5 feature layer of the feature image step by step to generate a P4 out feature image and a P5 out feature image respectively;
s4, decoding the P3_out feature map, the P4_out feature map and the P5_out feature map to obtain a plurality of marking boxes, and carrying out parameter adjustment on the marking boxes by adopting a KLD loss function;
s5, setting a confidence coefficient broad value, and deleting a frame with the confidence coefficient smaller than the confidence coefficient broad value by using a BCE loss function;
and S6, removing overlapped marked boxes in the pre-output image by adopting a non-maximum value suppression algorithm to generate a final image, and outputting the final image.
2. The method according to claim 1, wherein in step S2, the method specifically comprises the steps of:
s21, acquiring a pretreatment image of a field crop;
s22, setting the kernel size, the expansion rate and the receptive field size of the depth separable convolution of the preprocessed image; the kernel size k, the dilation rate d and the dilation of the receptive field RF of the ith depth separable convolution are defined as follows:
k_{i-1} ≤ k_i, d_1 = 1, d_{i-1} < d_i ≤ RF_{i-1}
RF_1 = k_1, RF_i = d_i(k_i − 1) + RF_{i-1}
In the above formulas, k_i and k_{i-1} represent the kernel sizes of the i-th and (i−1)-th depth-separable convolutions respectively, d_i and d_{i-1} represent their dilation rates, and RF_i and RF_{i-1} represent their receptive field sizes;
s23, sequentially convoluting the preprocessed images according to the defined kernel size, the expansion rate and the receptive field size, and performing space feature vector fusion on the convolved images to generate a plurality of convolved images; the calculation formula of the convolution image is as follows:
U_0 = X, U_i = F_i^{dw}(U_{i-1}), Ũ_i = F_i^{1×1}(U_i), for i ∈ [1, N]
In the above formulas, U_i is the image after the i-th depth-separable convolution, U_0 = X is the preprocessed image, F_i^{dw}(·) denotes the depth-wise convolution with kernel size k_i and dilation rate d_i, Ũ_i is the i-th image feature obtained by spatial feature vector channel fusion, i.e. the i-th convolution image, F_i^{1×1}(·) denotes the channel fusion of the spatial feature vectors of the image U_i after the i-th depth-wise convolution through a 1×1 convolution layer, and N denotes the number of convolution kernels;
s24, connecting the plurality of convolution images to generate a related image; the expression of the associated image is:
in the above-mentioned method, the step of,representing an associated image +.>Representing an ith convolution image;
s25, extracting spatial relations of the associated images through average pooling and maximum pooling to obtain spatial pooling characteristics;
s26, converting the space pooling characteristics after the average pooling and the maximum pooling into a plurality of space attention diagrams; the expression for converting the spatial pooling feature into N spatial attention patterns is:
in the above-mentioned method, the step of,representing spatial attention, fun>Representing the transformation of 2 channels, i.e., 2 spatial pooling features that are averaged pooled and maximally pooled, into N spatial attention patterns;
s27, activating a plurality of space attention patterns through a sigmoid activation function to generate an activated image;
s28, performing mask weighting on the features in the plurality of activated images, and fusing through a convolution layer to obtain concerned features; the expression for the feature of interest is:
in the above equation, S represents an image of a feature of interest,representing convolution operations +.>Representing the ith activation image,/->Is the ithThe image features are fused through the space feature vector channels, and N represents the number of convolution kernels;
s29, performing element-by-element multiplication on the attention feature and the input feature to obtain a feature image.
3. The method according to claim 1, wherein in step S3, the method specifically comprises the steps of:
s31, generating a P5 feature map by passing a P5 feature layer of the feature image through a GSConv module;
s32, combining the P5 feature map with a feature layer P4 of the feature image through an Upsampling module, and extracting features of the feature layer P4 by using a VOV0GSCSP module to generate a P4 feature map;
s33, combining the P4 feature map with a feature layer P3 of the feature image through an Upsampling module, and extracting features of the P4 feature map by using a VOV0GSCSP module to obtain a P3 out feature map;
s34, combining the P3 out feature map with the P4 feature layer of the feature image after passing through the GSConv module, and extracting features of the P4 out feature map by using the VOV0GSCSP module;
and S35, combining the P4_out feature map with a P5 feature layer of the feature image after passing through a GSConv module, and extracting features of the P5_out feature map by using a VOV0GSCSP module.
4. A detection method according to claim 3, wherein in step S3, the GSConv module operates as follows:
s301, setting the channel number of the input characteristic as C1;
s302, carrying out standard convolution on the characteristic image and a C1 channel to generate a C2/2 characteristic vector;
s303, performing depth separable convolution on the feature image to generate another C2/2 feature vector;
s304, connecting the two C2/2 feature vectors through a Concat module to obtain fusion features;
s305, the fusion characteristic is subjected to a shuffle operation to obtain an output characteristic.
5. The method according to claim 1, wherein in step S4, the method specifically comprises the steps of:
s41, decoding the P3_out characteristic diagram, the P4_out characteristic diagram and the P5_out characteristic diagram to obtain a plurality of marked boxes; the expression of the marking box parameter is:
H=(x,y,w,h,θ)
In the above formula, H denotes a marking box, and x, y, w, h and θ denote the abscissa of the marking box center, the ordinate of the marking box center, the width of the marking box, the height of the marking box and the rotation angle of the marking box respectively, where −90° ≤ θ ≤ 90°; the width w of the marking box is defined as its longest side, and θ denotes the angle through which the x-axis rotates to reach the w side.
S42, parameter adjustment is carried out on the marking box by adopting the KLD loss function.
6. The detection method according to claim 5, wherein in step S42, the calculation formula of the KLD loss function is:
L_KLD = 1 − 1/(τ + f(D_kl(N_p' ∥ N_t')))
wherein, f(D_kl(N_p' ∥ N_t')) = ln(D_kl(N_p' ∥ N_t') + 1)
wherein, D_kl(N_p' ∥ N_t') = (1/2)(μ_p − μ_t)^T Σ_t^{-1}(μ_p − μ_t) + (1/2)tr(Σ_t^{-1}Σ_p) + (1/2)ln(|Σ_t|/|Σ_p|) − 1, with Σ^{1/2} = M·diag(w/2, h/2)·M^T and M the rotation matrix determined by θ
In the above formulas, 1/(τ + f(D_kl)) represents the weight of the KL divergence, τ is an adjustment factor, f(D) represents a nonlinear function of D_kl(N_p' ∥ N_t'), N_p' represents the predicted disease feature distribution, N_t' represents the actual disease feature distribution, D_kl(N_p' ∥ N_t') represents the KL divergence between the two probability distributions N_p' and N_t', μ_p and μ_t are the mean vectors of N_p' and N_t' respectively, T is the transpose symbol, Σ_t^{-1} and Σ_t^{1/2} respectively represent the inverse and the square root of the positive definite symmetric matrix of the actual disease feature distribution, tr represents the sum of the diagonal elements of a matrix, M is a linear transformation matrix, ln is the natural logarithm, Σ_p and Σ_t are the covariance matrices of N_p' and N_t' respectively, θ represents the angle through which the x-axis rotates to the w side of the marking box, and w and h represent the width and height of the marking box respectively.
7. The method according to claim 1, wherein in step S5, the method specifically comprises the steps of:
s51, setting a confidence value broad value T;
s52, sequentially calculating the confidence coefficient of each marking box through a BCE function;
s53, judging the confidence coefficient and the confidence coefficient wide value corresponding to each marking square frame in the feature image in sequence;
if it isReserving a marked box corresponding to the confidence coefficient to generate a pre-output image;
if it isThen delete the confidenceThe corresponding marked boxes to generate a pre-output image.
CN202311364873.6A 2023-10-20 2023-10-20 Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle Active CN117351356B (en)


Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311364873.6A CN117351356B (en) 2023-10-20 2023-10-20 Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle

Publications (2)

Publication Number Publication Date
CN117351356A true CN117351356A (en) 2024-01-05
CN117351356B CN117351356B (en) 2024-05-24

Family

ID=89359156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311364873.6A Active CN117351356B (en) 2023-10-20 2023-10-20 Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle

Country Status (1)

Country Link
CN (1) CN117351356B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230073541A1 (en) * 2020-01-27 2023-03-09 Matthew Charles King System and method for performing machine vision recognition of dynamic objects
CN111968087A (en) * 2020-08-13 2020-11-20 中国农业科学院农业信息研究所 Plant disease area detection method
US20210224581A1 (en) * 2020-09-25 2021-07-22 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus, and device for fusing features applied to small target detection, and storage medium
US20220309674A1 (en) * 2021-03-26 2022-09-29 Nanjing University Of Posts And Telecommunications Medical image segmentation method based on u-net
CN114359654A (en) * 2021-12-06 2022-04-15 重庆邮电大学 YOLOv4 concrete apparent disease detection method based on position relevance feature fusion
CN114419051A (en) * 2021-12-08 2022-04-29 西安电子科技大学 Method and system for adapting to multi-task scene containing pixel-level segmentation
CN114937151A (en) * 2022-05-06 2022-08-23 西安电子科技大学 Lightweight target detection method based on multi-receptive-field and attention feature pyramid
CN116229292A (en) * 2023-01-18 2023-06-06 吉林大学 Inspection system and method based on unmanned aerial vehicle road surface inspection disease

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SHUAI YANG 等: "Maize-YOLO:A New High-Precision and Real-Time Method for Maize Pest Detection", MDPI, 10 March 2023 (2023-03-10), pages 1 - 13 *
YAN Juan; FANG Zhijun; GAO Yongbin: "3D object detection combining mixed-domain attention and dilated convolution", Journal of Image and Graphics, no. 06, 16 June 2020 (2020-06-16), pages 157 - 170 *
ZHANG Kuan; TENG Guowei; FAN Tao; LI Cong: "Multi-scale object detection algorithm based on densely connected FPN", Computer Applications and Software, no. 01, 12 January 2020 (2020-01-12), pages 171 - 177 *

Also Published As

Publication number Publication date
CN117351356B (en) 2024-05-24

Similar Documents

Publication Publication Date Title
CN109949255B (en) Image reconstruction method and device
CN112766199B (en) Hyperspectral image classification method based on self-adaptive multi-scale feature extraction model
CN112488210A (en) Three-dimensional point cloud automatic classification method based on graph convolution neural network
Wang et al. SSRNet: In-field counting wheat ears using multi-stage convolutional neural network
CN113627472B (en) Intelligent garden leaf feeding pest identification method based on layered deep learning model
CN111325381A (en) Multi-source heterogeneous farmland big data yield prediction method, system and device
CN110136162B (en) Unmanned aerial vehicle visual angle remote sensing target tracking method and device
CN113191489B (en) Training method of binary neural network model, image processing method and device
Der Yang et al. Real-time crop classification using edge computing and deep learning
CN111781599B (en) SAR moving ship target speed estimation method based on CV-EstNet
CN116091951A (en) Method and system for extracting boundary line between farmland and tractor-ploughing path
CN109712149A (en) A kind of image partition method based on wavelet energy and fuzzy C-mean algorithm
CN116500611A (en) Deep learning-based radar wave surface image sea wave parameter inversion method
CN110288050B (en) Hyperspectral and LiDar image automatic registration method based on clustering and optical flow method
CN115115601A (en) Remote sensing ship target detection method based on deformation attention pyramid
Zhang et al. Hawk‐eye‐inspired perception algorithm of stereo vision for obtaining orchard 3D point cloud navigation map
Lu et al. Image recognition of rice leaf diseases using atrous convolutional neural network and improved transfer learning algorithm
CN117351356B (en) Field crop and near-edge seed disease detection method under unmanned aerial vehicle visual angle
CN111832508A (en) DIE _ GA-based low-illumination target detection method
CN115294562B (en) Intelligent sensing method for operation environment of plant protection robot
CN116953702A (en) Rotary target detection method and device based on deduction paradigm
Xu et al. Cucumber flower detection based on YOLOv5s-SE7 within greenhouse environments
CN112465736B (en) Infrared video image enhancement method for port ship monitoring
KR20220168875A (en) A device for estimating the lodging area in rice using AI and a method for same
CN117649602B (en) Image processing method and system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant