CN115019226A - Tea leaf picking and identifying method based on improved YoloV4 model - Google Patents

Tea leaf picking and identifying method based on improved YoloV4 model

Info

Publication number: CN115019226A
Application number: CN202210523294.0A (filed by Yunnan Agricultural University)
Authority: CN (China)
Prior art keywords: feature, model, tea, layer, picking
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 王白娟, 杨贺凯, 蔡小波, 吴奇, 刘晓慧, 邓秀娟, 袁文侠, 张世浩, 杨春华
Current and original assignee: Yunnan Agricultural University

Classifications

    • G06V20/41 — Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06N3/02, G06N3/08 — Neural networks; learning methods
    • G06V10/40 — Extraction of image or video features
    • G06V10/764 — Image or video recognition using classification, e.g. of video objects
    • G06V10/766 — Image or video recognition using regression, e.g. by projecting features on hyperplanes
    • G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/806 — Fusion of extracted features
    • G06V10/82 — Image or video recognition using neural networks
    • G06V2201/07 — Target detection
    • Y02P90/30 — Computing systems specially adapted for manufacturing (climate change mitigation in the production or processing of goods)


Abstract

The invention discloses a tea leaf picking and identifying method based on an improved YoloV4 model, and belongs to the technical field of image target detection. The method optimizes the feature extraction of the traditional model and reduces the model calculation amount, so that image identification can be realized on a small control board; this reduces the size of the identification module of the picking equipment and makes the method convenient to apply to the classified picking of high-grade tea leaves.

Description

Tea leaf picking and identifying method based on improved YoloV4 model
Technical Field
The invention belongs to the technical field of image target detection, and particularly relates to a tea leaf picking and identifying method based on an improved YoloV4 model.
Background
At present, with the continuous growth of tea demand, large-scale tea plantations have gradually adopted automatic tea picking equipment. Existing tea picking equipment cuts closely spaced tea leaves with a compound cutting blade under manual assistance; its picking efficiency is high and it greatly reduces the labor intensity of tea picking. High-quality tea leaves are divided into multiple picking grades, such as one bud with one leaf, one bud with two leaves, and one bud with three leaves. However, existing tea picking equipment generally has no recognition function: it picks indiscriminately and does not select the picked tea leaves by quality, so high-quality famous teas still need to be picked manually. To improve the identification accuracy of picking equipment, some equipment applies a traditional visual identification algorithm to grade the picked tea from images. However, traditional visual identification algorithms have large models and large amounts of computation, and place high demands on the identification hardware: an industrial personal computer must be configured for image identification, which makes the picking equipment too bulky to work in densely planted tea gardens. Grading the tea leaves to be picked through a cloud server places high demands on the network, and the network infrastructure of some mountain tea gardens is poor, which affects the normal operation of the picking equipment.
Disclosure of Invention
In order to overcome the problems in the background art, the invention provides a tea leaf picking and identifying method based on an improved YoloV4 model, which optimizes the feature extraction of the traditional model and reduces the model calculation amount, so that image identification can be realized on a small control board; this reduces the size of the identification module of the picking equipment and makes the method convenient to apply to the classified picking of high-grade tea leaves.
In order to achieve this purpose, the invention is realized by the following technical scheme. A tea leaf picking and identifying method based on an improved YoloV4 model comprises the following steps:
Step 1: collecting tea picture samples, and manually labeling the tea pictures to complete the data set;
Step 2: dividing the initial picture sample set and the labeled picture sample set into a training set, a verification set and a test set;
Step 3: constructing a tea leaf picking grade target recognition model, wherein the model is an improved YoloV4 target recognition model in which the CSPDarkNet53 feature capture network is replaced with a MobilenetV3 feature extraction network;
Step 4: importing the feature values captured by the feature capture network MobilenetV3 into the feature layer for 3 convolution operations, importing the feature layer into a spatial pyramid pooling layer, and pooling the feature layer with maximum pooling layers of different sizes;
Step 5: stacking the pooled results and convolving 3 more times, up-sampling the convolved feature layer, stacking it with feature layer 1 and feature layer 2 of the trunk feature extraction network to realize feature fusion, and performing the second-stage down-sampling after the feature pyramid is built;
Step 6: setting a loss function, adding a cosine annealing decay function, and iteratively training the tea picking grade target identification model with the training set until the loss function converges, to obtain a trained tea picking grade target identification model;
Step 7: evaluating the performance of the trained tea picking grade target identification model with the verification set, and testing again with the test set after the evaluation reaches the standard;
Step 8: importing the evaluated tea leaf picking grade target identification model into a controller, and performing real-time video prediction on the tea leaves being picked.
Further, step 3 comprises the following steps:
Step 3.1: setting the convolution blocks in the YoloV4 trunk feature network to Depthwise-Separable-Convolution, adopting the Bneck structure, and setting the activation function to H-swish;
Step 3.2: scaling the input layer pictures to a uniform size for input into the feature capture network;
Step 3.3: changing the original picture into 224 × 224 × 3 as the first feature layer by using the convolution network conv2d structure in MobilenetV3;
Step 3.4: changing the first feature layer into 112 × 112 × 16 as the second feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.5: changing the second feature layer into 56 × 56 × 24 as the third feature layer by using the residual network bneck5 × 5 structure in MobilenetV3;
Step 3.6: changing the third feature layer into 28 × 28 × 40 as the fourth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.7: changing the fourth feature layer into 14 × 14 × 112 as the fifth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.8: changing the sixth feature layer into 1 × 1 × 1280 as the feature output layer by using the pooling structure and the conv2d NBN convolution structure in MobilenetV3.
Further, step 6 comprises the following steps:
Step 6.1: setting the loss function according to the data set;
Step 6.2: setting the number of iterations to 10000;
Step 6.3: dividing the training into two stages, a freezing stage and an unfreezing stage, with the first 5000 iterations set as the freezing stage and the last 5000 iterations as the unfreezing stage;
Step 6.4: completing the settings and starting training, saving the trained model after 10000 iterations, drawing the loss function curve over the iterations, and selecting the optimal model as the tea leaf picking grade target identification model according to the loss function curve.
Further, step 6.1 comprises the following steps:
Step 6.1.1: using y_true to extract, from the feature layer, the positions of the points where targets really exist and the classes corresponding to those points;
Step 6.1.2: repeatedly calculating and updating the cluster centers according to the objects assigned to each cluster;
Step 6.1.3: for each image, calculating the IoU between all real boxes and predicted boxes;
Step 6.1.4: calculating LOSS_CIoU as the regression loss function.
Further, the calculation formula of step 6.1.4 is:
$$\mathrm{LOSS}_{\mathrm{CIoU}} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$

in the formula, $\rho^2(b, b^{gt})$ represents the Euclidean distance between the center points of the prediction box and the real box, $c$ represents the diagonal distance of the smallest closure area that can contain both the prediction box and the real box, and $\alpha$ and $v$ are penalty terms for the aspect ratio; the formula of $\alpha$ is as follows:

$$\alpha = \frac{v}{(1 - IoU) + v}$$

the formula of $v$ is as follows:

$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$

in the formula, $w^{gt}$ and $h^{gt}$ are the width and height of the real box, respectively, and $w$ and $h$ are the width and height of the prediction box, respectively.
The invention has the beneficial effects that: the tea leaf picking and identifying method based on an improved YoloV4 model optimizes the feature extraction of the traditional model and reduces the model calculation amount, so that image identification can be realized on a small control board; this reduces the size of the identification module of the picking equipment and makes the method convenient to apply to the classified picking of high-grade tea leaves.
Drawings
FIG. 1 is a flow chart of the steps of the present invention;
FIG. 2 is a diagram of the original YoloV4 framework;
FIG. 3 is a schematic structural diagram of MobilenetV3 according to the present invention;
FIG. 4 is a block diagram of the improved YoloV4 of the present invention;
FIG. 5 is a graph of an example loss function;
FIG. 6 is a graph comparing model performance.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, to facilitate understanding by those skilled in the art.
The invention discloses a tea leaf picking and identifying method based on an improved YoloV4 model, which comprises the following steps:
step 1: collecting a tea picture sample, and manually labeling the tea picture to complete data set manufacturing;
In this embodiment, tea picture samples are collected, and images of the different picking grades of tea (one bud with one leaf, one bud with two leaves, one bud with three leaves) are manually labeled with LabelImg, ensuring that the tea leaves are located in the center of the labeling frame; the XML file generated for each picture is stored in the label folder, completing the data set;
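As an illustration of how such LabelImg annotations can be consumed, the following is a minimal sketch of reading one PASCAL-VOC-style XML file back into box records; the class name shown in the comment is a hypothetical example, not a label taken from the patent's data set.

```python
import xml.etree.ElementTree as ET

def read_voc_annotation(xml_path):
    """Return (filename, [(class_name, xmin, ymin, xmax, ymax), ...])
    from one LabelImg-generated PASCAL VOC XML file."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")  # e.g. "one_bud_one_leaf" (hypothetical label)
        bb = obj.find("bndbox")
        boxes.append((name,
                      int(float(bb.findtext("xmin"))), int(float(bb.findtext("ymin"))),
                      int(float(bb.findtext("xmax"))), int(float(bb.findtext("ymax")))))
    return filename, boxes
```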
Step 2: dividing the initial picture sample set and the real-frame picture sample set into a training set, a verification set and a test set in an appropriate proportion;
In this embodiment, the data set is randomly divided into a training set, a verification set and a test set at a ratio of 6:2:2; the three sets are independent of one another. In specific identification, the training set is used to train the model, the verification set is used to verify the performance of the trained model, and the test set is used for drawing the loss function curve.
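A minimal sketch of the random 6:2:2 split described above (the fixed seed is an assumption added for reproducibility):

```python
import random

def split_dataset(samples, seed=42):
    """Shuffle and split sample paths into train/val/test at 6:2:2."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (samples[:n_train],                    # training set
            samples[n_train:n_train + n_val],     # verification set
            samples[n_train + n_val:])            # test set
```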
Step 3: a tea leaf picking grade target recognition model is constructed. The model is an improved YoloV4 target recognition model in which the CSPDarkNet53 feature capture network is replaced with a MobilenetV3 feature extraction network; the structure of the MobilenetV3 adopted in this embodiment is shown in FIG. 3.
The traditional YoloV4 target recognition model uses CSPDarkNet53 as its feature extraction network; the structure of the original YoloV4 model is shown in FIG. 2. The CSPDarkNet53 feature extraction network has a large model, poor pertinence and high requirements on the processing and analyzing equipment. Therefore, the CSPDarkNet53 feature capture network is replaced with the MobilenetV3 feature extraction network, which optimizes the feature extraction of the traditional model and reduces the model computation, so that image recognition can be realized on a small control board; this reduces the size of the recognition module of the picking equipment and makes the model convenient to apply to the classified picking of high-grade tea leaves.
Step 3.1: setting a volume block in a YoloV4 backbone feature network as Depthwise-partial-volume, adopting a Bneck structure, and setting an activation function as H-swish; in a trunk feature extraction network CSPDarknet53 of YoloV4, a convolution block is DarknetConv2D _ BN _ Mish, an activation function is MISh, the convolution block is replaced by Depthwise-partial-consistent, a Bneck structure is adopted, and an H-swish activation function is used for replacing the MISh activation function in CSPDarknet53, the structure of MobileneetV 3 of the invention is schematically shown in FIG. 3, the frame diagram of improved YbilenetV 4 of the invention is shown in FIG. 4, and the feature capture network of larger CSPDarknet53 in YOLOV4 is replaced by MobileneetV 3.
Step 3.2: setting the input layer pictures as uniform size to be input into a feature capture network;
step 3.3: changing the original picture into 224 × 3 as a first feature layer by using a convolution network conv2d structure in MobilenetV 3;
step 3.4: changing the first feature layer into 112 × 16 as a second feature layer by using a residual network bneck3 × 3 structure in MobilenetV 3;
step 3.5: changing the second feature layer into 56 × 24 as a third feature layer by using a residual network bneck5 × 5 structure in MobilenetV 3;
step 3.6: changing the third feature layer into 28 × 40 as a fourth feature layer by using a residual network bneck3 × 3 structure in MobilenetV 3;
step 3.7: changing the third feature layer into 14 × 112 as a fifth feature layer by using a residual network bneck3 × 3 structure in MobilenetV 3;
step 3.8: the sixth feature layer was changed to 1 × 1280 as a feature output layer using pool pooling structure in MobilenetV3 and convolutional network conv2d, NBN structure.
Step 4: the feature values captured by the feature capture network MobilenetV3 are imported into the feature layer for 3 convolution operations, the feature layer is imported into the spatial pyramid pooling layer, and the feature layer is pooled with maximum pooling layers of different sizes;
In this embodiment, the feature values captured by the feature capture network MobilenetV3 are imported into the feature layer for 3 convolution operations, the feature layer is imported into the spatial pyramid pooling layer (SPP), and the feature layer is pooled with maximum pooling layers of different sizes (5, 9, 13).
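A minimal sketch of this SPP step under the stated pooling sizes: each max pooling uses stride 1 and padding k // 2 so the spatial size is preserved, and the pooled maps are stacked with the input along the channel axis.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Spatial pyramid pooling: pool at several kernel sizes, then stack."""
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernel_sizes])

    def forward(self, x):
        # output channels = input channels * (1 + len(kernel_sizes))
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)
```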
Step 5: the pooled results are stacked and convolved 3 more times, the convolved feature layer is up-sampled and stacked with feature layer 1 and feature layer 2 of the trunk feature extraction network to realize feature fusion, and the second-stage down-sampling is performed after the feature pyramid is built. The purpose of the repeated up-sampling and down-sampling is to stack the feature maps and obtain better features.
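The following is a minimal sketch of one such fusion step (upsample the deeper map, then stack it onto the shallower one); the 1 × 1 convolution and the example shapes in the comments are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_up(deep, shallow, conv1x1):
    """Upsample the deeper feature map and stack it onto the shallower one."""
    up = F.interpolate(conv1x1(deep), scale_factor=2, mode="nearest")
    return torch.cat([up, shallow], dim=1)

# illustrative shapes (assumed, not taken from the patent):
# deep = torch.randn(1, 512, 13, 13); shallow = torch.randn(1, 256, 26, 26)
# conv1x1 = nn.Conv2d(512, 256, 1)
# fuse_up(deep, shallow, conv1x1).shape -> torch.Size([1, 512, 26, 26])
```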
Step 6: a loss function is set, a cosine annealing decay function is added, and the tea picking grade target identification model is iteratively trained with the training set until the loss function converges, to obtain a trained tea picking grade target identification model;
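A minimal sketch of attaching the cosine annealing decay to the training loop, using PyTorch's built-in scheduler; the one-layer model, the batch, and the learning rate are stand-in assumptions.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 16, 3)      # stand-in for the improved YoloV4 model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10000)

for step in range(10000):                 # iteration count from step 6.2
    x = torch.randn(1, 3, 32, 32)         # stand-in training batch
    loss = model(x).abs().mean()          # stand-in for the CIoU-based loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                      # learning rate follows the cosine curve
```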
In this embodiment, step 6.1: the loss function is set according to the data set.
Step 6.1.1: y_true is used to extract, from the feature layer, the positions of the points where targets really exist and the classes corresponding to those points;
Step 6.1.2: the cluster centers are repeatedly calculated and updated according to the objects assigned to each cluster (a sketch of this clustering loop follows the list);
Step 6.1.3: for each image, the IoU between all real boxes and predicted boxes is calculated;
Step 6.1.4: LOSS_CIoU is calculated as the loss function.
The IoU parameter calculated in step 6.1.3 is the intersection-over-union ratio, the most common index in target detection. IoU can be used to distinguish positive samples from negative samples and to evaluate the distance between the prediction box and the real box, but it cannot accurately reflect the degree of overlap between the real box and the prediction box. LOSS_CIoU additionally considers the distance between the prediction box and the real box, the overlap rate, the scale, and a penalty term, so that the prediction box regression becomes more stable.
The calculation formula of step 6.1.4 is:

$$\mathrm{LOSS}_{\mathrm{CIoU}} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$

in the formula, $\rho^2(b, b^{gt})$ represents the Euclidean distance between the center points of the prediction box and the real box, $c$ represents the diagonal distance of the smallest closure area that can contain both the prediction box and the real box, and $\alpha$ and $v$ are penalty terms for the aspect ratio; the formula of $\alpha$ is as follows:

$$\alpha = \frac{v}{(1 - IoU) + v}$$

the formula of $v$ is as follows:

$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$

in the formula, $w^{gt}$ and $h^{gt}$ are the width and height of the real box, respectively, and $w$ and $h$ are the width and height of the prediction box, respectively.
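A direct translation of these formulas into code may help; the following sketch assumes boxes given as (center-x, center-y, width, height) tensors of shape (N, 4), and adds a small epsilon for numerical safety, which the formulas themselves do not mention.

```python
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """CIoU loss for boxes given as (cx, cy, w, h), shape (N, 4)."""
    # corner coordinates of both boxes
    px1, py1 = pred[:, 0] - pred[:, 2] / 2, pred[:, 1] - pred[:, 3] / 2
    px2, py2 = pred[:, 0] + pred[:, 2] / 2, pred[:, 1] + pred[:, 3] / 2
    tx1, ty1 = target[:, 0] - target[:, 2] / 2, target[:, 1] - target[:, 3] / 2
    tx2, ty2 = target[:, 0] + target[:, 2] / 2, target[:, 1] + target[:, 3] / 2
    # plain IoU term
    iw = (torch.min(px2, tx2) - torch.max(px1, tx1)).clamp(min=0)
    ih = (torch.min(py2, ty2) - torch.max(py1, ty1)).clamp(min=0)
    inter = iw * ih
    union = pred[:, 2] * pred[:, 3] + target[:, 2] * target[:, 3] - inter + eps
    iou = inter / union
    # rho^2: squared center distance; c^2: squared diagonal of the closure box
    rho2 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    cw = torch.max(px2, tx2) - torch.min(px1, tx1)
    ch = torch.max(py2, ty2) - torch.min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps
    # aspect-ratio penalty v and its weight alpha
    v = (4 / math.pi ** 2) * (torch.atan(target[:, 2] / target[:, 3])
                              - torch.atan(pred[:, 2] / pred[:, 3])) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v
```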
Step 6.2: setting the number of iterations to 10000;
step 6.3: the training is divided into two stages, namely a freezing stage and a thawing stage, and the first 5000 iterations are set as the freezing stage and the second 5000 iterations are set as the thawing stage. The video memory occupied by the feature extraction network is set to be small when the feature extraction network is not changed in the first 5000 times of freezing stages, the network is only subjected to fine adjustment, the main trunk of the model is not frozen when the feature extraction network is changed in the second 5000 times of unfreezing stages, all parameters of the network are changed, and the video memory occupied by the network is increased. By adopting the iteration mode, the data volume needing to be processed during training is reduced, and the requirement of the model on the processor is reduced.
Step 6.4: and finishing setting and starting training, storing the trained model after 10000 times of iteration, drawing a loss function curve in the iteration process, and selecting the optimal model as a tea leaf picking grade target identification model according to the loss function curve.
In this embodiment, the loss function curves are shown in FIG. 5 and are divided into a train loss curve and a val loss curve: the train loss curve represents the loss value over the whole training set, and the val loss curve represents the loss value over the whole test set. When the model is trained, the calculated loss function curves have approximately the following relationships:
when train loss decreases while val loss is stable: the network is overfitting;
when train loss is stable while val loss decreases: the data set has serious problems; check whether the label files contain annotation errors or whether the data set is of poor quality, and reselect and re-annotate the tea sample pictures;
when train loss decreases and val loss decreases: training is normal, and the model can be selected as the tea picking grade target identification model.
Step 7: the performance of the trained tea picking grade target identification model is evaluated with the verification set, and tested again with the test set after the evaluation reaches the standard;
In this embodiment, the verification set is used to evaluate the performance of the trained tea picking grade target identification model; the comparative evaluation of the improved algorithm against the original YoloV4 results is shown in Table 1.
TABLE 1: Evaluation comparing the improved algorithm with the original YoloV4 test results
According to the experimental results, the redesigned tea picking grade target identification model is improved in both detection speed and accuracy: compared with the original model, the accuracy is improved by 6.89%, and the detection speed of the model is improved by a factor of 6.4. Therefore, the improved YoloV4 can detect tea leaf picking grades more accurately and effectively across different detection scenes.
Step 8: the evaluated tea leaf picking grade target identification model is imported into a controller, and real-time video prediction is performed on the tea leaves being picked.
In this embodiment, a Raspberry Pi is adopted as the controller. The Raspberry Pi is an ARM-based microcomputer motherboard; its small size makes it convenient to install in small tea picking equipment, where it automatically identifies the picking grade of high-quality tea leaves. After the internal system of the Raspberry Pi is installed, the recognition system adopts the Raspbian OS provided officially for the Raspberry Pi; development is carried out with the Python libraries in Raspbian OS, the camera detection algorithm is written and called on the basis of the model calling algorithm, and real-time video detection and identification are performed by connecting a camera sensor to the Raspberry Pi's USB port.
The model generated in step 6.4 is placed into the designated folder on the Raspberry Pi for model calling; the generated model is imported into the Raspberry Pi, the Raspberry Pi is connected to the camera through its USB interface, the camera and the previously generated model program are called in the Raspberry Pi environment, and all operations of real-time video prediction are realized by calling the model on the pictures captured by the camera in real time.
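A minimal sketch of such a real-time video loop with OpenCV on the Raspberry Pi; the `detect` function is a hypothetical placeholder for inference with the trained improved-YoloV4 model, not the patent's actual calling code.

```python
import cv2

def detect(frame):
    """Hypothetical placeholder: inference with the trained improved-YoloV4
    model, returning a list of (x1, y1, x2, y2, label) detections."""
    return []

cap = cv2.VideoCapture(0)                      # USB camera on the Raspberry Pi
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    for (x1, y1, x2, y2, label) in detect(frame):
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("tea picking grade", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):      # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```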
Finally, it is noted that the above preferred embodiments are merely illustrative of the technical solutions of the invention and not restrictive, and that, although the invention has been described in detail with reference to the above preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention.

Claims (5)

1. A tea leaf picking and identifying method based on an improved YoloV4 model, characterized by comprising the following steps:
Step 1: collecting tea picture samples, and manually labeling the tea pictures to complete the data set;
Step 2: dividing the initial picture sample set and the labeled picture sample set into a training set, a verification set and a test set;
Step 3: constructing a tea leaf picking grade target recognition model, wherein the model is an improved YoloV4 target recognition model in which the CSPDarkNet53 feature capture network is replaced with a MobilenetV3 feature extraction network;
Step 4: importing the feature values captured by the feature capture network MobilenetV3 into the feature layer for 3 convolution operations, importing the feature layer into a spatial pyramid pooling layer, and pooling the feature layer with maximum pooling layers of different sizes;
Step 5: stacking the pooled results and convolving 3 more times, up-sampling the convolved feature layer, stacking it with feature layer 1 and feature layer 2 of the trunk feature extraction network to realize feature fusion, and performing the second-stage down-sampling after the feature pyramid is built;
Step 6: setting a loss function, adding a cosine annealing decay function, and iteratively training the tea picking grade target identification model with the training set until the loss function converges, to obtain a trained tea picking grade target identification model;
Step 7: evaluating the performance of the trained tea picking grade target identification model with the verification set, and testing again with the test set after the evaluation reaches the standard;
Step 8: importing the evaluated tea leaf picking grade target identification model into a controller, and performing real-time video prediction on the tea leaves being picked.
2. The tea leaf picking and identifying method based on the improved YoloV4 model as claimed in claim 1, wherein step 3 comprises the following steps:
Step 3.1: setting the convolution blocks in the YoloV4 trunk feature network to Depthwise-Separable-Convolution, adopting the Bneck structure, and setting the activation function to H-swish;
Step 3.2: scaling the input layer pictures to a uniform size for input into the feature capture network;
Step 3.3: changing the original picture into 224 × 224 × 3 as the first feature layer by using the convolution network conv2d structure in MobilenetV3;
Step 3.4: changing the first feature layer into 112 × 112 × 16 as the second feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.5: changing the second feature layer into 56 × 56 × 24 as the third feature layer by using the residual network bneck5 × 5 structure in MobilenetV3;
Step 3.6: changing the third feature layer into 28 × 28 × 40 as the fourth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.7: changing the fourth feature layer into 14 × 14 × 112 as the fifth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.8: changing the sixth feature layer into 1 × 1 × 1280 as the feature output layer by using the pooling structure and the conv2d NBN convolution structure in MobilenetV3.
3. The tea leaf picking and identifying method based on the improved YoloV4 model as claimed in claim 1, wherein step 6 comprises the following steps:
Step 6.1: setting the loss function according to the data set;
Step 6.2: setting the number of iterations to 10000;
Step 6.3: dividing the training into two stages, a freezing stage and an unfreezing stage, with the first 5000 iterations set as the freezing stage and the last 5000 iterations as the unfreezing stage;
Step 6.4: completing the settings and starting training, saving the trained model after 10000 iterations, drawing the loss function curve over the iterations, and selecting the optimal model as the tea leaf picking grade target identification model according to the loss function curve.
4. The tea leaf picking and identifying method based on the improved YoloV4 model as claimed in claim 3, wherein step 6.1 comprises the following steps:
Step 6.1.1: using y_true to extract, from the feature layer, the positions of the points where targets really exist and the classes corresponding to those points;
Step 6.1.2: repeatedly calculating and updating the cluster centers according to the objects assigned to each cluster;
Step 6.1.3: for each image, calculating the IoU between all real boxes and predicted boxes;
Step 6.1.4: calculating LOSS_CIoU as the regression loss function.
5. The tea leaf picking and identifying method based on the improved YoloV4 model as claimed in claim 4, wherein the calculation formula of step 6.1.4 is:

$$\mathrm{LOSS}_{\mathrm{CIoU}} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$

in the formula, $\rho^2(b, b^{gt})$ represents the Euclidean distance between the center points of the prediction box and the real box, $c$ represents the diagonal distance of the smallest closure area that can contain both the prediction box and the real box, and $\alpha$ and $v$ are penalty terms for the aspect ratio; the formula of $\alpha$ is as follows:

$$\alpha = \frac{v}{(1 - IoU) + v}$$

the formula of $v$ is as follows:

$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$

in the formula, $w^{gt}$ and $h^{gt}$ are the width and height of the real box, respectively, and $w$ and $h$ are the width and height of the prediction box, respectively.
CN202210523294.0A 2022-05-13 2022-05-13 Tea leaf picking and identifying method based on improved YoloV4 model Pending CN115019226A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210523294.0A CN115019226A (en) 2022-05-13 2022-05-13 Tea leaf picking and identifying method based on improved YoloV4 model


Publications (1)

Publication Number Publication Date
CN115019226A true CN115019226A (en) 2022-09-06

Family

ID=83069223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210523294.0A Pending CN115019226A (en) 2022-05-13 2022-05-13 Tea leaf picking and identifying method based on improved YoloV4 model

Country Status (1)

Country Link
CN (1) CN115019226A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001339A (en) * 2020-08-27 2020-11-27 杭州电子科技大学 Pedestrian social distance real-time monitoring method based on YOLO v4
US20220114759A1 (en) * 2020-12-25 2022-04-14 Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. Target detection method, electronic device and medium
CN113269132A (en) * 2021-06-15 2021-08-17 成都恒创新星科技有限公司 Vehicle detection method and system based on YOLOV4 optimization algorithm
CN113674226A (en) * 2021-07-31 2021-11-19 河海大学 Tea leaf picking machine tea leaf bud tip detection method based on deep learning
CN113887395A (en) * 2021-09-29 2022-01-04 浙江工业大学 Depth separable convolution YOLOv4 model-based filter bag opening position detection method
CN114387520A (en) * 2022-01-14 2022-04-22 华南农业大学 Precision detection method and system for intensive plums picked by robot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xinting Liao, Shengping Lv: "YOLOv4-MN3 for PCB Surface Defect Detection", Applied Sciences *
Chen Long (陈龙): "Research on visual recognition and picking technology of tea leaf buds" (茶叶嫩芽视觉识别与采摘技术研究), China Master's Theses Full-text Database, Agricultural Science and Technology *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220906)