CN112149518A - Pine cone detection method based on BEGAN and YOLOV3 models - Google Patents

Pine cone detection method based on BEGAN and YOLOV3 models

Info

Publication number
CN112149518A
Authority
CN
China
Prior art keywords
yolov3
yolov3 model
model
pine cone
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010912858.0A
Other languages
Chinese (zh)
Inventor
张怡卓
于慧伶
蒋大鹏
张健
罗泽
葛奕麟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Shengdong Technology Development Co ltd
Original Assignee
Jiangsu Shengdong Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Shengdong Technology Development Co ltd filed Critical Jiangsu Shengdong Technology Development Co ltd
Priority to CN202010912858.0A priority Critical patent/CN112149518A/en
Publication of CN112149518A publication Critical patent/CN112149518A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pine cone detection method based on the BEGAN and YOLOV3 models. The method first collects pine cone images at a plurality of different time nodes and performs data enhancement with both traditional image enhancement techniques and the BEGAN deep learning method; the data-enhanced pine cone images are divided into a plurality of cells by a YOLOV3 model, and a densely connected network structure, modified with a constructed bottleneck-structure dense layer, is introduced into the YOLOV3 model; the detection scales of the YOLOV3 model are expanded, and the loss function of the YOLOV3 model is optimized with the DIoU algorithm, so that the overall performance of pine cone detection can be effectively improved.

Description

Pine cone detection method based on BEGAN and YOLOV3 models
Technical Field
The invention relates to the technical field of pine cone recognition, in particular to a pine cone detection method based on BEGAN and YOLOV3 models.
Background
Real-time detection of pine cones in Korean pine forests is not only the data basis for mechanized pine cone picking, but also one of the important methods for estimating the yield of a Korean pine forest. In recent years, deep learning methods applied to images of fruit on trees have reached a certain detection accuracy, but these methods suffer from limited reference detection data, slow speed and low detection accuracy, which results in poor overall performance of pine cone detection.
Disclosure of Invention
The invention aims to provide a pine cone detection method based on BEGAN and YOLOV3 models, which improves the overall performance of pine cone detection.
In order to achieve the above object, the present invention provides a pine cone detection method based on the BEGAN and YOLOV3 models, comprising:
collecting pine cone images under a plurality of different time nodes, and performing data enhancement by using a traditional image enhancement technology and a BEGAN deep learning method;
dividing the pine cone image after data enhancement by using a YOLOV3 model, and introducing a dense connection network structure in the YOLOV3 model;
expanding the detection proportion of the Yolov3 model, and optimizing a loss function of the Yolov3 model by using a DIoU algorithm.
Dividing the pine cone image after data enhancement by using a Yolov3 model, and introducing a dense connection network structure in the Yolov3 model, wherein the method comprises the following steps:
dividing the input pine cone image into a plurality of cells by using a YOLOV3 model, and acquiring a plurality of bounding box information and 5 data values corresponding to the bounding box information by taking the cells with pine cones as units.
The method includes the steps of dividing the pine cone image after data enhancement by using a YOLOV3 model, and introducing a dense connection network structure into the YOLOV3 model, and further includes the following steps:
and dividing 23 residual modules in a backbone network used by the YOLOV3 model into 5 groups, modifying a dense connection network structure by using the constructed bottleneck structure dense layer, and replacing any three groups of residual modules in the 5 groups of residual modules.
The method comprises the steps of expanding the detection proportion of the YOLOV3 model, and optimizing a loss function of the YOLOV3 model by using a DIoU algorithm, wherein the method comprises the following steps:
and respectively carrying out up-sampling on 32-time down-sampling, 16-time down-sampling and 8-time down-sampling, and then connecting the up-sampling with the output of the second group of residual error modules in the YOLOV3 model after the dense connection network structure is introduced to obtain a feature fusion target detection layer with 4-time down-sampling.
Wherein, expanding the detection proportion of the YOLOV3 model, and optimizing the loss function of the YOLOV3 model by using a DIoU algorithm, further comprises:
and constructing the coordinate error, the confidence error and the classification error into a loss function of the YOLOV3 model, and optimizing the loss function according to the Euclidean distance and the diagonal distance of the central point between the corresponding prediction frame and the target frame.
In the pine cone detection method based on the BEGAN and YOLOV3 models of the present invention, pine cone images are first collected at a plurality of different time nodes and data enhancement is performed with both traditional image enhancement techniques and the BEGAN deep learning method; the data-enhanced pine cone images are divided into a plurality of cells by a YOLOV3 model, and a densely connected network structure, modified with a constructed bottleneck-structure dense layer, is introduced into the YOLOV3 model; the detection scales of the YOLOV3 model are expanded, and the loss function of the YOLOV3 model is optimized with the DIoU algorithm, so that the overall performance of pine cone detection can be effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic step diagram of a pine cone detection method based on the models of BEGAN and YOLOV3 according to the present invention.
Fig. 2 is a schematic structural diagram of a BEGAN network provided by the present invention.
Fig. 3 is a schematic structural diagram of the backbone network provided by the present invention after being divided.
Fig. 4 is a schematic structural diagram of a basic dense layer provided by the present invention.
Fig. 5 is a schematic diagram of a dense connection network structure after a bottleneck structure dense layer is added.
Fig. 6 is a diagram of an improved darknet-53 backbone network architecture provided by the present invention.
FIG. 7 is a schematic diagram of a method for improving a scale detection module provided by the present invention.
FIG. 8 is a schematic structural diagram of the entire detection method provided by the present invention.
FIG. 9 is a P-R curve obtained from recall and precision provided by the present invention.
Fig. 10 is a graph comparing the AP of YOLOV3 with the densely connected network introduced against that of the original YOLOV3.
FIG. 11 is a comparison of the expanded detection scale provided by the present invention.
FIG. 12 is a comparison graph of the loss function using the DIoU optimization provided by the present invention.
FIG. 13 is a graph of data enhancement contrast provided by the present invention.
FIG. 14 is a P-R plot of different data sets provided by the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Referring to fig. 1, the present invention provides a pine cone detection method based on the models of BEGAN and YOLOV3, including:
s101, collecting a plurality of pine cone images under different time nodes, and performing data enhancement by using a traditional image enhancement technology and a BEGAN deep learning method.
Specifically, a camera is used to acquire images at a resolution of 5312×2988 pixels, and the acquired images are manually annotated for the experiments. The acquisition site is located in a forest farm; image data are collected on both cloudy and sunny days, and the acquisition times include 8 a.m., 1 p.m. and 3 p.m. Some images are collected at different angles of the same position to account for recognition performance at different angles. The original data set contains 800 collected pine cone images. Data enhancement uses two methods: traditional image enhancement techniques and the BEGAN deep learning method.
The GAN network model consists of two networks, a generator G and a discriminator D. The generator G receives a random noise vector z and generates an image G(z); the function of the D network is to judge whether an image is real, its input is an image x, and its output D(x) represents the probability that x is a real image. Through adversarial training, the generator G and the discriminator D play a minimax game and finally reach a Nash equilibrium. In the optimal state, G can generate images G(z) that are realistic enough to pass for real, and D(G(z)) = 0.5; the trained GAN network can then realize data enhancement by generating images with the generator G.
In the original GAN, the data distribution produced by the generator is expected to be as close as possible to the real data distribution; when the generated distribution equals the real distribution, the generator G is considered to produce samples indistinguishable from real data, i.e. it has acquired the ability to generate convincing data. From this starting point, researchers have designed various loss functions to bring the generated distribution of G as close as possible to the real distribution. Instead of estimating probability distributions in this way, BEGAN directly measures the difference between the generated data distribution pg and the real data distribution px by computing the distance between the reconstruction errors of the two distributions. To estimate this error, BEGAN employs a discriminator with an autoencoder structure, whose network structure is shown in FIG. 2.
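As an illustration of this training scheme, the following is a minimal PyTorch sketch of one BEGAN update step; the generator and autoencoder-discriminator architectures, the optimizers and the λ_k value are illustrative assumptions and do not reproduce the exact network of FIG. 2.

```python
# Minimal BEGAN training-step sketch (illustrative architectures, not the patent's exact network).
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a noise vector z to a 3x64x64 image."""
    def __init__(self, z_dim=64, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, ch * 8 * 8), nn.ELU(),
            nn.Unflatten(1, (ch, 8, 8)),
            nn.Upsample(scale_factor=2), nn.Conv2d(ch, ch, 3, padding=1), nn.ELU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(ch, ch, 3, padding=1), nn.ELU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(ch, 3, 3, padding=1), nn.Tanh(),
        )
    def forward(self, z):
        return self.net(z)

class AEDiscriminator(nn.Module):
    """BEGAN discriminator: an autoencoder whose reconstruction error scores images."""
    def __init__(self, ch=32, h_dim=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ELU(),   # 64 -> 32
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ELU(),  # 32 -> 16
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ELU(),  # 16 -> 8
            nn.Flatten(), nn.Linear(ch * 8 * 8, h_dim),
        )
        self.dec = Generator(z_dim=h_dim, ch=ch)  # decoder mirrors the generator
    def forward(self, x):
        return self.dec(self.enc(x))

def recon_loss(d, x):
    # L1 reconstruction error of the autoencoder discriminator
    return (x - d(x)).abs().mean()

def began_step(g, d, opt_g, opt_d, real, z, k, gamma=0.4, lambda_k=0.001):
    """One BEGAN update; gamma trades generated-image diversity against quality (0.4 in this work)."""
    fake = g(z)
    loss_real = recon_loss(d, real)
    loss_fake = recon_loss(d, fake.detach())
    loss_d = loss_real - k * loss_fake          # discriminator objective
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    loss_g = recon_loss(d, fake)                # generator objective
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # Proportional control of the equilibrium between the two reconstruction losses.
    k = float(min(max(k + lambda_k * (gamma * loss_real.item() - loss_fake.item()), 0.0), 1.0))
    m_global = loss_real.item() + abs(gamma * loss_real.item() - loss_fake.item())
    return k, m_global

# Hypothetical usage:
# g, d = Generator(), AEDiscriminator()
# opt_g = torch.optim.Adam(g.parameters(), lr=1e-4)
# opt_d = torch.optim.Adam(d.parameters(), lr=1e-4)
# k = 0.0
# k, m = began_step(g, d, opt_g, opt_d, real_batch, torch.randn(len(real_batch), 64), k)
```

Here the variable k balances the two reconstruction losses, and the measure m_global can be monitored to judge convergence.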
The BEGAN network controls the quality and diversity of the generated images through a hyperparameter γ ranging from 0 to 1: as γ increases, the diversity of the generated images improves but the image quality decreases; here γ is set to 0.4. The pine cone images generated by the BEGAN network have a size of 64×64. All pine cone samples in the acquired original data images are extracted and uniformly resized to 64×64 to construct the training data set of the BEGAN network. Each batch of pine cone samples generated by the BEGAN network contains 64 different pine cone samples, and comparison with the pine cone samples of the original images shows that the images generated by the BEGAN network fully inherit the characteristics of real images. Since each generated image is a single pine cone, each picture is taken from the original data set in turn, the BEGAN-generated images are resized to replace the pine cone regions in the original picture, and the replaced image is placed back into the data set, as sketched below. In this manner, the present invention creates an augmented data set containing 1600 images.
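The patch-replacement step described above might look roughly like the following sketch; the annotation format (a list of pixel boxes per image), the directory layout and the value range of the generator output are assumptions rather than details given in the text.

```python
# Hypothetical sketch of the patch-replacement augmentation described above.
from pathlib import Path
import torch
from PIL import Image
from torchvision.transforms.functional import to_pil_image

def augment_with_began(image_path, boxes, generator, z_dim=64, out_dir="augmented"):
    """Replace every annotated pine cone region with a BEGAN-generated 64x64 patch."""
    img = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        for (x1, y1, x2, y2) in boxes:                # assumed (x1, y1, x2, y2) pixel boxes
            z = torch.randn(1, z_dim)
            patch = generator(z)[0]                   # 3x64x64 tensor, assumed in [-1, 1]
            patch = to_pil_image((patch + 1) / 2)     # rescale to [0, 1] for PIL
            patch = patch.resize((x2 - x1, y2 - y1))  # match the annotated box size
            img.paste(patch, (x1, y1))
    out = Path(out_dir) / Path(image_path).name
    out.parent.mkdir(parents=True, exist_ok=True)
    img.save(out)                                     # the augmented copy joins the data set
    return out
```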
S102, dividing the pine cone image after data enhancement by using a YOLOV3 model, and introducing a dense connection network structure into the YOLOV3 model.
Specifically, YOLOV3 is a one-stage detection algorithm: it does not require a region proposal stage, but directly regresses bounding box coordinates and the probability of each category. During operation, the network divides the input picture into S × S cells and then produces output in units of cells. If the center of an object, i.e. a pine cone, falls in a cell, that cell is responsible for predicting the object. Each cell needs to predict B bounding boxes, and each bounding box contains 5 data values: x, y, w, h and confidence.
(x, y) is the displacement of the bounding box center point relative to the cell, and the final predicted (x, y) is normalized. Suppose the picture width is w_i and the height is h_i, the center coordinates of the bounding box are (x_c, y_c), and the cell coordinates are (x_col, y_row); then (x, y) is calculated as follows:

x = x_c · S / w_i − x_col

y = y_c · S / h_i − y_row
Here, (w, h) represents the ratio of the bounding box to the whole picture. Assuming the width and height of the predicted bounding box are (w_b, h_b), (w, h) is calculated as follows:

w = w_b / w_i

h = h_b / h_i
the confidence is composed of two parts, namely whether a target exists in the grid or not and the accuracy of the bounding box. The confidence is calculated as follows:
Figure BDA0002663932600000055
if the bounding box contains an object, then pr (object) is 1, otherwise pr (object) is 0;
Figure BDA0002663932600000056
for the area of intersection of the predicted bounding box and the real region of the object, this value is also at [0,1 ]]In the above paragraph.
In addition to the confidence, each cell also outputs C probability values indicating that the object belongs to each class, so the output dimension of the final network is S × S × (B × 5 + C).
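For illustration, a small sketch of this cell encoding is given below, following the formulas above; the tensor layout (B boxes followed by C class probabilities per cell) and the choice to let every box of the responsible cell carry the target are assumptions.

```python
# Sketch of the S x S x (B*5 + C) target encoding described above (layout assumed).
import torch

def encode_target(boxes, labels, img_w, img_h, S=13, B=3, C=1):
    """boxes: list of (x_c, y_c, w_b, h_b) in pixels; labels: class indices."""
    target = torch.zeros(S, S, B * 5 + C)
    for (x_c, y_c, w_b, h_b), cls in zip(boxes, labels):
        x_col = int(x_c * S / img_w)          # cell containing the box center
        y_row = int(y_c * S / img_h)
        x = x_c * S / img_w - x_col           # normalized offset inside the cell
        y = y_c * S / img_h - y_row
        w = w_b / img_w                       # size as a fraction of the picture
        h = h_b / img_h
        for b in range(B):                    # each of the B boxes of this cell predicts the object
            target[y_row, x_col, b * 5: b * 5 + 5] = torch.tensor([x, y, w, h, 1.0])
        target[y_row, x_col, B * 5 + cls] = 1.0   # class probability
    return target

# e.g. a single pine cone centered at (200, 150) with size 40x60 in a 416x416 image:
t = encode_target([(200, 150, 40, 60)], [0], 416, 416)
```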
The backbone network Darknet-53 used by YOLOV3 is composed of 23 residual modules in total, and each residual module consists of two convolution layers and a shortcut connection. The residual modules are divided into 5 groups containing 1, 2, 8, 8 and 4 residual modules respectively, as shown in FIG. 3. The densely connected network structure is modified with the constructed bottleneck-structure dense layer, and any three of the 5 groups of residual modules are replaced. A densely connected network can improve the flow of information and gradients throughout the network; its principle is as follows. Suppose the input is X_0 and each layer of the network implements a non-linear transformation H_i(·), where i denotes the i-th layer. Let the output of the i-th layer be denoted X_i; then:

X_i = H_i([X_0, X_1, ..., X_{i−1}])
A densely connected network typically comprises a plurality of dense modules, and one dense module consists of n dense layers. The specific structure of the basic dense layer is shown in FIG. 4. Unlike the common post-activation arrangement, the dense layer uses a pre-activation mechanism: a batch normalization layer and an activation function layer (ReLU) perform the activation operation before the convolution layer, after which a 3 × 3 convolution outputs the feature maps.
Assume the input X_0 of a dense module has m feature maps and each dense layer outputs k feature maps. According to the principle of the dense network, the input of the n-th dense layer contains m + (n−1) × k feature maps, so a direct 3 × 3 convolution brings a huge amount of computation. A bottleneck structure can be adopted to reduce the computation; the main method is to add 1×1 convolution layers in the original dense module to reduce the number of feature maps. In the bottleneck-structure dense layer built here, 2k feature maps are first obtained through a 1×1 convolution layer, and then k feature maps are output through a 3×3 convolution layer, as shown in FIG. 5.
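A minimal PyTorch sketch of such a bottleneck-structure dense layer, assuming the pre-activation order described above (batch normalization and ReLU before each convolution), might look as follows:

```python
# Bottleneck-structure dense layer: pre-activation, 1x1 conv to 2k maps, then 3x3 conv to k maps.
import torch
import torch.nn as nn

class BottleneckDenseLayer(nn.Module):
    def __init__(self, in_channels, k):
        super().__init__()
        self.layer = nn.Sequential(
            nn.BatchNorm2d(in_channels), nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, 2 * k, kernel_size=1, bias=False),   # 1x1 -> 2k feature maps
            nn.BatchNorm2d(2 * k), nn.ReLU(inplace=True),
            nn.Conv2d(2 * k, k, kernel_size=3, padding=1, bias=False),  # 3x3 -> k feature maps
        )
    def forward(self, x):
        return self.layer(x)
```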
In order to balance detection speed and detection accuracy, the residual modules with outputs of 208×208 and 104×104 in the original darknet-53 network are retained, and the three groups of residual modules with outputs of 52×52, 26×26 and 13×13 are replaced by dense modules, each consisting of 4 bottleneck-structure dense layers; the network output dimensions remain consistent with the original darknet-53 network, as shown in FIG. 6.
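Continuing the sketch above (and using its imports and BottleneckDenseLayer), a dense module of 4 bottleneck-structure dense layers that concatenates all preceding feature maps could be written as below; the optional 1×1 transition convolution used to match the channel count of the replaced residual group is an assumption, since FIG. 6 is not reproduced here.

```python
# Dense module of 4 bottleneck dense layers; the transition conv is an assumed way
# to keep the output channel count consistent with the replaced darknet-53 group.
class DenseModule(nn.Module):
    def __init__(self, in_channels, k, num_layers=4, out_channels=None):
        super().__init__()
        self.layers = nn.ModuleList(
            [BottleneckDenseLayer(in_channels + i * k, k) for i in range(num_layers)]
        )
        concat_channels = in_channels + num_layers * k
        self.transition = (
            nn.Conv2d(concat_channels, out_channels, kernel_size=1, bias=False)
            if out_channels else nn.Identity()
        )
    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))  # X_i = H_i([X_0, ..., X_{i-1}])
        return self.transition(torch.cat(features, dim=1))
```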
S103, expanding the detection proportion of the YOLOV3 model, and optimizing a loss function of the YOLOV3 model by using a DIoU algorithm.
Specifically, for most convolutional neural networks, shallow features are needed to distinguish small targets and deep features are needed to distinguish large targets. YOLOV3 uses the multi-scale feature fusion idea of FPN to detect at three scales with feature map sizes of 13 × 13, 26 × 26 and 52 × 52, connecting adjacent scales by 2× up-sampling of the feature map. Since the pine cones to be detected are mostly small targets, the scale detection module in YOLOV3 is improved here. The original YOLOV3 detects at three scales in total: 32× down-sampling, 16× down-sampling and 8× down-sampling. The 4× down-sampled feature map in the network contains more fine-grained features and position information of small targets; fusing it with the high-level feature maps for detection can improve the precision of small-target detection.
The method for improving the scale detection module is shown in FIG. 7: the feature map used at the third detection scale in the original network is up-sampled and then concatenated with the output of the second group of residual modules in YOLOV3. In this way, a feature-fusion target detection layer at 4× down-sampling is established, and the three detection scales of the original YOLOV3 are expanded to four.
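A rough sketch of this fourth detection scale is given below; the channel counts, the 1×1 reduction convolution and the shape of the detection head are illustrative assumptions, with only the up-sample-and-concatenate pattern taken from the description.

```python
# Added fourth detection scale: up-sample the 8x branch and fuse it with the
# 4x-down-sampled backbone output (104x104 for a 416x416 input).
import torch
import torch.nn as nn

class FourthScaleHead(nn.Module):
    def __init__(self, deep_channels, shallow_channels, num_anchors=3, num_classes=1):
        super().__init__()
        self.reduce = nn.Conv2d(deep_channels, 128, kernel_size=1)
        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
        self.head = nn.Sequential(
            nn.Conv2d(128 + shallow_channels, 256, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1),
            nn.Conv2d(256, num_anchors * (5 + num_classes), kernel_size=1),
        )
    def forward(self, deep_feat, shallow_feat):
        x = self.upsample(self.reduce(deep_feat))   # e.g. 52x52 -> 104x104
        x = torch.cat([x, shallow_feat], dim=1)     # fuse with the 104x104 features
        return self.head(x)                         # 4x-down-sampled detection output

# e.g. deep_feat: (1, 256, 52, 52), shallow_feat: (1, 128, 104, 104)
out = FourthScaleHead(256, 128)(torch.randn(1, 256, 52, 52), torch.randn(1, 128, 104, 104))
```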
The loss function is used to evaluate the model. The loss function of YOLOV3 uses binary cross entropy and consists of three parts: coordinate error, confidence error and classification error:
loss = loss_coord + loss_noobj + loss_classes
The coordinate error comprises two parts: the error of the box center point and the error of the box width and height:

loss_coord = λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(x_i − x̂_i)² + (y_i − ŷ_i)²] + λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(√w_i − √ŵ_i)² + (√h_i − √ĥ_i)²]
the confidence error consists of two parts, namely the confidence error when an object exists in the prediction frame and the confidence error when no object exists in the prediction frame:
Figure BDA0002663932600000072
the classification error is expressed as follows:
Figure BDA0002663932600000073
in the above loss function, λ represents a weight. The coordinate error is a large proportion of the total loss, λcoordSet to 5. In the confidence error, λ when an object is present in a frame is predicted to represent the difference between the presence and absence of the object obj1, predicting λ when there is no object in the framenoobj0.5, coefficient of the classification error term λclassesFixed to 1.
The confidence error in the YOLOV3 loss function is calculated based on IoU. IoU represents the intersection-over-union of the prediction box and the target box; when the prediction box is A and the target box is B:

IoU = |A ∩ B| / |A ∪ B|
IoU is widely used as an evaluation index in target detection tasks, but it has some disadvantages. If the prediction box and the target box do not intersect, IoU = 0 by definition, which cannot reflect the distance between the prediction box and the target box; at the same time, the position error and confidence error in the loss function cannot return a gradient, which affects the training of the network. When the intersection areas of the target box and the prediction box are equal but their distances differ, the calculated IoU values are equal, which cannot accurately reflect how well the two boxes coincide, and the performance of the network is also reduced. To solve this problem, GIoU is calculated from the minimum convex set of the prediction box and the target box; if the minimum convex set of A and B is C:

GIoU = IoU − |C \ (A ∪ B)| / |C|
when A and B are not coincident, the farther they are, the closer the GIoU approaches-1, so the loss function can be expressed using 1-GIoU, which better reflects the coincidence of A, B. But when a is within B, GIoU will be completely degraded to IoU. Aiming at the GIoU algorithm, a DIoU improvement method is also provided:
Figure BDA0002663932600000081
in the above loss function, bgtRepresents the center points of A and B, and p represents B and BgtC represents the diagonal distance of the smallest rectangle that can cover both a and B. DIoU can directly minimize the distance between a and B and therefore converges much faster than GIoU. The DIoU inherits the excellent characteristics of IoU and avoids the disadvantage of IoU, and is a good choice in the 2D/3D computer vision task based on IoU as an index, and the DIoU is introduced into the loss function of YOLOV3 to improve the detection accuracy.
The experimental scheme of the proposed work is shown in FIG. 8: data enhancement is performed on the collected raw data through the BEGAN network, the improved YOLOV3 model is then trained to convergence, and finally the visual detection results and evaluation indices of the model are examined.
The experimental model was built with PyTorch. The environment and parameters of model training are as follows: an Intel i7-8750H CPU, 16 GB of RAM, an Nvidia 1070 GPU and the Ubuntu 18.04 operating system. The training and testing sets were split 8:2, images were scaled to 416 × 416 before training, and the initialization parameters of the network are shown in Table 1:
TABLE 1 network initial parameters
[Table 1 appears only as an image in the original document.]
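The data preparation stated above (an 8:2 train/test split and resizing to 416 × 416) might be scripted roughly as follows; the file layout, image format and random seed are assumptions, and none of the values in Table 1 are used.

```python
# Hedged sketch of the 8:2 split and 416x416 resizing (file layout assumed).
import random
from pathlib import Path
from PIL import Image

def split_and_resize(image_dir, out_dir, ratio=0.8, size=(416, 416), seed=0):
    paths = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(paths)
    n_train = int(len(paths) * ratio)
    for split, subset in (("train", paths[:n_train]), ("test", paths[n_train:])):
        dst = Path(out_dir) / split
        dst.mkdir(parents=True, exist_ok=True)
        for p in subset:
            Image.open(p).convert("RGB").resize(size).save(dst / p.name)
```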
To verify the performance of the proposed method, a comparative experiment was performed on the collected original data set with the improved YOLOV3, the original YOLOV3, SSD and Faster R-CNN. The F1 values and detection speeds of the four models are shown in Table 2, where the F1 value is taken as the maximum value and the detection speed is taken as the average speed over the entire data set.
TABLE 2 F1 values and detection speeds of the four models
[Table 2 appears only as an image in the original document.]
The improved YOLOV3 model achieves an F1 value of 0.923 and a detection speed of 7.9 ms, improvements of 1.31% and 38.2% respectively over the original YOLOV3, and is clearly better than SSD and Faster R-CNN. The P-R curves obtained from recall and precision during training are shown in FIG. 9; the P-R curve of the improved YOLOV3 model has a distinct advantage over the other models in that its balance point is closer to the coordinates (1, 1), indicating higher model performance.
Three improvements were made to the original YOLOV3 model, and to explore the effect of different improvements on the model, we performed several comparative experiments, with only one improvement added to each experiment.
After dense modules are introduced into the backbone network of YOLOV3, the computation of the model is greatly reduced: the computation of YOLOV3 with dense modules introduced is 40.48 BFLOPs, compared with 65.86 BFLOPs for the original YOLOV3, which is the main reason for the improvement in detection speed. As shown in FIG. 10, the AP value of the YOLOV3 model with dense modules is similar to that of the original YOLOV3 model, indicating that the detection accuracy is not greatly affected by the dense modules.
As shown in FIG. 11, the AP value of the model with only dense modules introduced is 91.8%, while the AP value of the model with both dense modules and the added detection scale is 92.9%, an improvement of 2.3%, because detection at 4 scales can accurately detect most small targets. The loss value of the model with the added detection scale begins to stabilize after 34,000 training steps, whereas the model with only dense modules introduced needs about 36,000 training steps to converge. Meanwhile, the final loss of the model with only dense modules is about 1.41, while the final loss of the model with the added detection scale is about 1.06, a reduction of 0.35; this shows that adding one detection scale gives the model a faster convergence speed and a better convergence result.
FIG. 12 shows the effect of DIoU loss on model accuracy, and using the DIoU loss, the AP value of the model increased from 92.7% to 93.4%, which is a 1.5% increase.
Introducing the dense modules improves the generalization capability of the model, allowing it to detect targets that the original YOLOV3 model cannot detect, while adding the detection scale and using the DIoU loss improve the detection precision to a certain extent. Under conditions of a small data set and complicated, variable data, the improved YOLOV3 model performs excellently in small-target detection and generalization.
To analytically validate the effectiveness of data enhancement using BEGAN, a comparison was made on the original image dataset and the augmented dataset using the modified YOLOV3 model. As shown in fig. 13, after data enhancement by BEGAN, the AP value of the model increased from 93.4% to 95.3%, which is a 2% increase. This shows that the diversity of the training data set is enriched by using the BEGAN to generate the image data, and the robustness of the detection model can be effectively enhanced.
To further analyze the effect of the size of the image data set on the model, the BEGAN-enhanced data set was used as a reference, and 400, 800 and 1200 images were randomly selected from it to form new data sets, on which the improved YOLOV3 model was trained to obtain the corresponding P-R curves, as shown in FIG. 14. The experimental results show that the data size has a great influence on the detection capability of the model: with a data set of 400 images, the detection capability of the model is very weak, and the detection performance gradually improves as the size of the training set increases. Furthermore, the improvement from data enhancement is limited; when the number of images in the training set exceeds 1200, the rate of improvement in model performance begins to slow as the number of images increases.
In the pine cone detection method based on the BEGAN and YOLOV3 models of the present invention, pine cone images are first collected at a plurality of different time nodes and data enhancement is performed with both traditional image enhancement techniques and the BEGAN deep learning method; the data-enhanced pine cone images are divided into a plurality of cells by a YOLOV3 model, and a densely connected network structure, modified with a constructed bottleneck-structure dense layer, is introduced into the YOLOV3 model; the detection scales of the YOLOV3 model are expanded, and the loss function of the YOLOV3 model is optimized with the DIoU algorithm, so that the overall performance of pine cone detection can be effectively improved.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. A pine cone detection method based on BEGAN and YOLOV3 models is characterized by comprising the following steps:
collecting pine cone images under a plurality of different time nodes, and performing data enhancement by using a traditional image enhancement technology and a BEGAN deep learning method;
dividing the pine cone image after data enhancement by using a YOLOV3 model, and introducing a dense connection network structure in the YOLOV3 model;
expanding the detection proportion of the Yolov3 model, and optimizing a loss function of the Yolov3 model by using a DIoU algorithm.
2. The method for detecting pine cones based on the BEGAN and YOLOV3 models as claimed in claim 1, wherein the pine cone image after data enhancement is divided by using YOLOV3 model, and dense connection network structure is introduced into the YOLOV3 model, comprising:
dividing the input pine cone image into a plurality of cells by using a YOLOV3 model, and acquiring a plurality of bounding box information and 5 data values corresponding to the bounding box information by taking the cells with pine cones as units.
3. The pine cone detection method based on the BEGAN and YOLOV3 models as claimed in claim 2, wherein the pine cone image after data enhancement is divided by using a YOLOV3 model, and a dense connection network structure is introduced into the YOLOV3 model, further comprising:
and dividing 23 residual modules in a backbone network used by the YOLOV3 model into 5 groups, modifying a dense connection network structure by using the constructed bottleneck structure dense layer, and replacing any three groups of residual modules in the 5 groups of residual modules.
4. The method of claim 3, wherein the expanding the detection ratio of the YOLOV3 model and optimizing the loss function of the YOLOV3 model using DIoU algorithm comprises:
and respectively carrying out up-sampling on 32-time down-sampling, 16-time down-sampling and 8-time down-sampling, and then connecting the up-sampling with the output of the second group of residual error modules in the YOLOV3 model after the dense connection network structure is introduced to obtain a feature fusion target detection layer with 4-time down-sampling.
5. The method of claim 4, wherein the detection ratio of the YOLOV3 model is expanded and the loss function of the YOLOV3 model is optimized by a DIoU algorithm, and further comprising:
and constructing the coordinate error, the confidence error and the classification error into a loss function of the YOLOV3 model, and optimizing the loss function according to the Euclidean distance and the diagonal distance of the central point between the corresponding prediction frame and the target frame.
CN202010912858.0A 2020-10-29 2020-10-29 Pine cone detection method based on BEGAN and YOLOV3 models Withdrawn CN112149518A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010912858.0A CN112149518A (en) 2020-10-29 2020-10-29 Pine cone detection method based on BEGAN and YOLOV3 models

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010912858.0A CN112149518A (en) 2020-10-29 2020-10-29 Pine cone detection method based on BEGAN and YOLOV3 models

Publications (1)

Publication Number Publication Date
CN112149518A true CN112149518A (en) 2020-12-29

Family

ID=73889238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010912858.0A Withdrawn CN112149518A (en) 2020-10-29 2020-10-29 Pine cone detection method based on BEGAN and YOLOV3 models

Country Status (1)

Country Link
CN (1) CN112149518A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113329000A (en) * 2021-05-17 2021-08-31 山东大学 Privacy protection and safety monitoring integrated system based on smart home environment
CN114627143A (en) * 2021-10-12 2022-06-14 深圳宏芯宇电子股份有限公司 Image processing method and device, terminal equipment and readable storage medium


Similar Documents

Publication Publication Date Title
CN110263705B (en) Two-stage high-resolution remote sensing image change detection system oriented to remote sensing technical field
CN110135267B (en) Large-scene SAR image fine target detection method
CN110991311B (en) Target detection method based on dense connection deep network
CN111179217A (en) Attention mechanism-based remote sensing image multi-scale target detection method
CN111739075A (en) Deep network lung texture recognition method combining multi-scale attention
CN111368769B (en) Ship multi-target detection method based on improved anchor point frame generation model
CN109559297A (en) A method of generating the Lung neoplasm detection of network based on 3D region
CN114463637B (en) Winter wheat remote sensing identification analysis method and system based on deep learning
CN115223063B (en) Deep learning-based unmanned aerial vehicle remote sensing wheat new variety lodging area extraction method and system
CN110555841A (en) SAR image change detection method based on self-attention image fusion and DEC
CN112149518A (en) Pine cone detection method based on BEGAN and YOLOV3 models
CN115965862A (en) SAR ship target detection method based on mask network fusion image characteristics
CN116824585A (en) Aviation laser point cloud semantic segmentation method and device based on multistage context feature fusion network
CN116168240A (en) Arbitrary-direction dense ship target detection method based on attention enhancement
CN114821341A (en) Remote sensing small target detection method based on double attention of FPN and PAN network
CN111222534A (en) Single-shot multi-frame detector optimization method based on bidirectional feature fusion and more balanced L1 loss
CN114743023B (en) Wheat spider image detection method based on RetinaNet model
CN115272819A (en) Small target detection method based on improved Faster-RCNN
CN116758363A (en) Weight self-adaption and task decoupling rotary target detector
CN115249329A (en) Apple leaf disease detection method based on deep learning
CN110363287B (en) Neural network design method for memory calculation and indoor presence or absence of people
Yang et al. Small aircraft target detection using cascade FP-CNN in remote sensing images
CN113158806A (en) OTD (optical time Domain _ Logistic) -based SAR (synthetic Aperture Radar) data ocean target detection method
CN114005001B (en) X-ray image detection method and system based on deep learning
Zhang et al. Detection of rotating small targets in remote sensing images based on improved yolov5s

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201229

WW01 Invention patent application withdrawn after publication