CN115019226A - Tea leaf picking and identifying method based on improved YoloV4 model - Google Patents
- Publication number
- CN115019226A (application number CN202210523294.0A)
- Authority
- CN
- China
- Prior art keywords
- feature
- model
- tea
- layer
- picking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06N3/02, G06N3/08—Neural networks; learning methods
- G06V10/40—Extraction of image or video features
- G06V10/764—Recognition using machine-learning classification, e.g. of video objects
- G06V10/766—Recognition using machine-learning regression, e.g. by projecting features on hyperplanes
- G06V10/774—Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/806—Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
- G06V10/82—Recognition using neural networks
- G06V2201/07—Target detection
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a tea leaf picking and identifying method based on an improved YoloV4 model, and belongs to the technical field of image target detection. The method optimizes the feature extraction of the traditional model, reduces the model calculation amount, realizes image identification on a small control board, reduces the size of the picking equipment's identification module, and is convenient to apply to the classified picking of high-grade tea leaves.
Description
Technical Field
The invention belongs to the technical field of image target detection, and particularly relates to a tea leaf picking and identifying method based on an improved Yolov4 model.
Background
At present, with the continuously increasing demand for tea, large-scale tea plantations are gradually adopting automatic tea picking equipment. Existing tea picking equipment uses a compound cutting blade, with manual assistance, to cut tea leaves growing close together; its picking efficiency is high, and it greatly relieves the labor intensity of tea picking. High-quality tea leaves are divided into multiple picking grades, such as one bud and one leaf, one bud and two leaves, and one bud and three leaves, but existing tea picking equipment generally has no recognition function: its picking mode is undifferentiated, the quality of the picked tea leaves is not selected, and high-quality famous teas still need to be picked manually. To improve the identification accuracy of picking equipment, some equipment applies a traditional visual identification algorithm to grade the picked tea images, but the traditional algorithm has a large model, a large amount of operation data and high requirements on the identification and processing equipment: an industrial personal computer must be configured for image identification, which makes the picking equipment too bulky to work in a tea garden with high planting density. Grading the tea leaves to be picked by identification and judgment on a cloud server places high requirements on the network, and the network infrastructure of some mountain tea gardens is poor, which affects the normal work of the picking equipment.
Disclosure of Invention
In order to overcome the problems in the background art, the invention provides a tea leaf picking and identifying method based on an improved YoloV4 model, which optimizes the feature extraction of the traditional model, reduces the model calculation amount, can realize image identification on a small control board, reduces the size of the picking equipment's identification module, and is convenient to apply to the classified picking of high-grade tea leaves.
In order to achieve the purpose, the invention is realized by the following technical scheme: a tea leaf picking and identifying method based on an improved YoloV4 model comprises the following steps: step 1: collecting tea picture samples, and manually labeling the tea pictures to complete the production of the data set;
step 2: dividing an initial picture sample set and a marked picture sample set into a training set, a verification set and a test set;
step 3: constructing a tea leaf picking grade target recognition model, wherein the tea leaf picking grade target recognition model is an improved YoloV4 target recognition model, and a MobilenetV3 feature extraction network is used for replacing the CSPDarkNet53 feature capture network;
step 4: importing the feature values captured by the feature capture network MobilenetV3 into the feature layer for three convolution operations, importing the feature layer into a spatial pyramid pooling layer, and pooling the feature layer by using maximum pooling layers of different sizes;
step 5: stacking the pooled results and performing convolution three times again, up-sampling the convolved feature layer and stacking it with feature layer 1 and feature layer 2 in the trunk feature extraction network to realize feature fusion, and performing the second-stage down-sampling after the construction of the feature pyramid is completed;
step 6: setting a loss function, adding a cosine annealing attenuation function, and performing iterative training on the tea picking grade target identification model by using a training set until the loss function is converged to obtain a trained tea picking grade target identification model;
step 7: performing performance evaluation on the trained tea picking grade target identification model by using the verification set, and testing again by using the test set after the evaluation reaches the standard;
step 8: importing the evaluated tea leaf picking grade target identification model into a controller, and performing real-time video prediction on the picked tea leaves.
Further, the step 3 comprises the following steps:
step 3.1: setting the convolution blocks in the YoloV4 trunk feature network to depthwise separable convolutions, adopting the Bneck structure, and setting the activation function to H-swish;
step 3.2: resizing the input layer pictures to a uniform size before inputting them into the feature capture network;
step 3.3: changing the original picture into 224 × 224 × 3 as a first feature layer by using the convolution network conv2d structure in MobilenetV3;
step 3.4: changing the first feature layer into 112 × 112 × 16 as a second feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
step 3.5: changing the second feature layer into 56 × 56 × 24 as a third feature layer by using the residual network bneck5 × 5 structure in MobilenetV3;
step 3.6: changing the third feature layer into 28 × 28 × 40 as a fourth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
step 3.7: changing the fourth feature layer into 14 × 14 × 112 as a fifth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
step 3.8: changing the fifth feature layer into 1 × 1 × 1280 as the feature output layer by using the pooling structure and the convolution network conv2d, NBN structure in MobilenetV3.
Further, the step 6 comprises the following steps:
step 6.1: setting a loss function according to the data set;
step 6.2: setting the number of iterations to 10000;
step 6.3: the training is divided into two stages, namely a freezing stage and a thawing stage, wherein the first 5000 iterations are set as the freezing stage, and the second 5000 iterations are set as the thawing stage;
step 6.4: and finishing setting and starting training, storing the trained model after 10000 times of iteration, drawing a loss function curve in the iteration process, and selecting the optimal model as a tea leaf picking grade target identification model according to the loss function curve.
Further, the step 6.1 comprises the following steps:
step 6.1.1: using y_true, take out the positions of the points in the feature layer where targets really exist and the corresponding classes of those points;
step 6.1.2: according to the objects assigned to each cluster, repeatedly calculate and update the cluster centers;
step 6.1.3: for each image, calculate the IoU between all real boxes and all predicted boxes;
step 6.1.4: calculate LOSS_CIOU as the regression loss function.
Further, the calculation formula of step 6.1.4 is:

LOSS_CIOU = 1 − IoU + ρ²(b, b_gt)/c² + αv

where ρ²(b, b_gt) represents the Euclidean distance between the center points of the prediction frame b and the real frame b_gt, c represents the diagonal distance of the minimum closure area which can simultaneously contain the prediction frame and the real frame, and α and v are penalty terms for the length-width ratio. The formula of α is as follows:

α = v / ((1 − IoU) + v)

The formula for v is as follows:

v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²

where w_gt and h_gt are the width and height of the real box, respectively, and w and h are the width and height of the predicted box, respectively.
The invention has the beneficial effects that: the tea leaf picking and identifying method based on the improved YoloV4 model optimizes the feature extraction of the traditional model, reduces the model calculation amount, can realize image identification on a small control board, reduces the size of the picking equipment's identification module, and is convenient to apply to the classified picking of high-grade tea leaves.
Drawings
FIG. 1 is a flow chart of the steps of the present invention;
FIG. 2 is a diagram of the original YoloV4 framework;
FIG. 3 is a schematic structural diagram of MobilenetV3 according to the present invention;
FIG. 4 is a block diagram of the improved YOLOV4 of the present invention;
FIG. 5 is a graph of an example loss function;
FIG. 6 is a graph comparing model performance.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings to facilitate understanding of the skilled person.
The invention discloses a tea leaf picking and identifying method based on an improved YOLOV4 model, which comprises the following steps:
step 1: collecting a tea picture sample, and manually labeling the tea picture to complete data set manufacturing;
In the embodiment, tea picture samples are collected, and the images of tea at the different picking grades (one bud and one leaf, one bud and two leaves, and one bud and three leaves) are manually marked with LabelImg, ensuring that the tea in each image is located at the center of its marking frame. The generated XML file corresponding to each picture is stored in a label folder, completing the production of the data set;
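As an illustration of the annotation format, LabelImg writes Pascal VOC-style XML files; the sketch below shows how such a file could be parsed. The file name, class label and coordinates are invented for the example and are not taken from the patent.

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_text):
    """Parse a LabelImg (Pascal VOC) annotation string into (class, box) pairs."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        # Box corners are stored as four child elements of <bndbox>
        coords = tuple(int(box.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects

# A minimal hypothetical annotation for one tea sample picture
sample = """<annotation>
  <filename>tea_0001.jpg</filename>
  <object><name>one_bud_one_leaf</name>
    <bndbox><xmin>34</xmin><ymin>58</ymin><xmax>120</xmax><ymax>190</ymax></bndbox>
  </object>
</annotation>"""
```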
step 2: dividing the initial picture sample set and the labeled picture sample set into a training set, a verification set and a test set in an appropriate proportion;
In the embodiment, the data set is randomly divided into a training set, a verification set and a test set in the ratio 6:2:2, with the three sets mutually independent. In specific identification, the training set is used for training the model, the verification set is used for verifying the performance of the model after training is completed, and the test set is used for drawing the loss function curve.
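The 6:2:2 random split described above can be sketched in a few lines of pure Python; the function name and seed are arbitrary choices for the illustration.

```python
import random

def split_dataset(samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Randomly split samples into disjoint train / verification / test subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

train, val, test = split_dataset(range(100))
```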
Step 3: a tea leaf picking grade target recognition model is constructed; the tea leaf picking grade target recognition model is an improved YoloV4 target recognition model in which a MobilenetV3 feature extraction network replaces the CSPDarkNet53 feature capture network. A schematic structural diagram of the MobilenetV3 adopted in this embodiment is shown in fig. 3.
In the traditional YoloV4 target recognition model, CSPDarkNet53 is selected as the feature extraction network; the structure of the original YoloV4 target recognition model is shown in FIG. 2. The CSPDarkNet53 feature extraction network model is large, poorly targeted and demanding on processing and analysis equipment, so the CSPDarkNet53 feature capture network is replaced by the MobilenetV3 feature extraction network. This optimizes the feature extraction of the traditional model, reduces the model operation amount, allows image recognition to be realized on a small control board, reduces the size of the picking equipment's recognition module, and makes the model convenient to apply to the classified picking of high-grade tea leaves.
Step 3.1: the convolution blocks in the YoloV4 backbone feature network are set to depthwise separable convolutions, the Bneck structure is adopted, and the activation function is set to H-swish. In the trunk feature extraction network CSPDarknet53 of YoloV4, the convolution block is DarknetConv2D_BN_Mish with the Mish activation function; this convolution block is replaced by a depthwise separable convolution adopting the Bneck structure, and the H-swish activation function replaces the Mish activation function of CSPDarknet53. The structure of the MobilenetV3 of the invention is shown schematically in FIG. 3, and the framework of the improved YoloV4 of the invention is shown in FIG. 4, in which the larger CSPDarknet53 feature capture network of YoloV4 is replaced by MobilenetV3.
Step 3.2: resizing the input layer pictures to a uniform size before inputting them into the feature capture network;
step 3.3: changing the original picture into 224 × 224 × 3 as a first feature layer by using the convolution network conv2d structure in MobilenetV3;
step 3.4: changing the first feature layer into 112 × 112 × 16 as a second feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
step 3.5: changing the second feature layer into 56 × 56 × 24 as a third feature layer by using the residual network bneck5 × 5 structure in MobilenetV3;
step 3.6: changing the third feature layer into 28 × 28 × 40 as a fourth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
step 3.7: changing the fourth feature layer into 14 × 14 × 112 as a fifth feature layer by using the residual network bneck3 × 3 structure in MobilenetV3;
step 3.8: changing the fifth feature layer into 1 × 1 × 1280 as the feature output layer by using the pooling structure and the convolution network conv2d, NBN structure in MobilenetV3.
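Step 3.1 above replaces the Mish activation with H-swish. As a small numpy sketch (illustrative only, not the patent's code), the hard-swish activation used throughout MobilenetV3 can be written as:

```python
import numpy as np

def relu6(x):
    """ReLU clipped at 6, the building block of hard activations."""
    return np.minimum(np.maximum(x, 0.0), 6.0)

def h_swish(x):
    """Hard swish: x * ReLU6(x + 3) / 6, a cheap approximation of swish."""
    return x * relu6(x + 3.0) / 6.0
```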
Step 4: the feature values captured by the feature capture network MobilenetV3 are imported into the feature layer for three convolution operations; the feature layer is then imported into a spatial pyramid pooling layer and pooled by maximum pooling layers of different sizes;
In this embodiment, the feature value captured by the feature capture network MobilenetV3 is introduced into the feature layer for three convolution operations, the feature layer is introduced into a spatial pyramid pooling (SPP) layer, and the feature layer is pooled by maximum pooling layers of different kernel sizes (5 × 5, 9 × 9 and 13 × 13).
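The SPP step above amounts to stride-1 max pooling at the three kernel sizes, concatenated with the input along the channel axis. The following is a slow but readable numpy illustration under that reading; the function names are invented.

```python
import numpy as np

def max_pool_same(x, k):
    """Stride-1 max pooling with 'same' padding on an (H, W, C) feature map."""
    pad = k // 2
    # -inf padding so padded cells never win the max
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), constant_values=-np.inf)
    h, w, c = x.shape
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].max(axis=(0, 1))
    return out

def spp_block(x, kernels=(5, 9, 13)):
    """SPP: concatenate the input with its pooled copies along the channel axis."""
    return np.concatenate([x] + [max_pool_same(x, k) for k in kernels], axis=-1)

feat = np.zeros((13, 13, 2))   # toy feature map
pooled = spp_block(feat)       # channels grow 4x: identity + three pool sizes
```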
Step 5: the pooled results are stacked and convolved three times again; the convolved feature layer is up-sampled and stacked with feature layer 1 and feature layer 2 in the trunk feature extraction network to realize feature fusion, and the second-stage down-sampling is performed after the construction of the feature pyramid is completed. The purpose of the repeated up-sampling and down-sampling is to stack the feature layers so as to obtain better features.
Step 6: setting a loss function, adding a cosine annealing attenuation function, and performing iterative training on the tea picking grade target identification model by using a training set until the loss function is converged to obtain a trained tea picking grade target identification model;
In this example, step 6.1: a loss function is set according to the data set.
Step 6.1.1: using y_true, the positions of the points in the feature layer where targets really exist, and the corresponding classes of those points, are taken out.
Step 6.1.2: according to the class cluster objects distributed to the class clusters, repeatedly calculating and updating the class cluster clustering center;
step 6.1.3: when for each graph, IoU for all real and predicted blocks are calculated;
step 6.1.4: calculating LOSS CIOU As a function of the loss.
The IoU parameter calculated in step 6.1.3 represents the cross-over ratio, which is the most common index in target detection, IoU can be used to determine the positive sample and the negative sample, and can also be used to evaluate the distance between the prediction frame and the real frame, but IoU cannot accurately reflect the overlap ratio of the real frame and the prediction frame, LOSS CIOU And considering the distance, the overlapping rate, the scale and the penalty term between the prediction box and the real box, so that the prediction box regression becomes more stable.
The calculation formula of step 6.1.4 is:

LOSS_CIOU = 1 − IoU + ρ²(b, b_gt)/c² + αv

where ρ²(b, b_gt) represents the Euclidean distance between the center points of the prediction frame b and the real frame b_gt, c represents the diagonal distance of the minimum closure area which can simultaneously contain the prediction frame and the real frame, and α and v are penalty terms for the length-width ratio. The formula of α is as follows:

α = v / ((1 − IoU) + v)

The formula for v is as follows:

v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²

where w_gt and h_gt are the width and height of the real box, respectively, and w and h are the width and height of the predicted box, respectively.
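The CIoU loss described here can be written out directly for a pair of axis-aligned boxes. This is a plain-Python sketch following the standard CIoU definition, not code from the patent.

```python
import math

def ciou_loss(box_p, box_g):
    """CIoU loss for boxes given as (x1, y1, x2, y2): prediction vs ground truth."""
    # Intersection and union -> IoU
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area(box_p) + area(box_g) - inter)
    # Squared center distance rho^2 and enclosing-box diagonal c^2
    cx_p, cy_p = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cx_g, cy_g = (box_g[0] + box_g[2]) / 2, (box_g[1] + box_g[3]) / 2
    rho2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    ex1, ey1 = min(box_p[0], box_g[0]), min(box_p[1], box_g[1])
    ex2, ey2 = max(box_p[2], box_g[2]), max(box_p[3], box_g[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # Aspect-ratio penalty v and its weight alpha
    w_p, h_p = box_p[2] - box_p[0], box_p[3] - box_p[1]
    w_g, h_g = box_g[2] - box_g[0], box_g[3] - box_g[1]
    v = (4 / math.pi ** 2) * (math.atan(w_g / h_g) - math.atan(w_p / h_p)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v
```

For identical boxes the loss is zero, and it grows as the boxes drift apart in position, scale or aspect ratio.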
Step 6.2: setting the number of iterations to 10000;
step 6.3: the training is divided into two stages, a freezing stage and a thawing stage; the first 5000 iterations are set as the freezing stage and the last 5000 iterations as the thawing stage. During the freezing stage the feature extraction network is not changed, the occupied video memory is small, and the network is only fine-tuned; during the thawing stage the backbone of the model is no longer frozen, all parameters of the network are updated, and the video memory occupied by the network increases. This iteration mode reduces the amount of data to be processed during training and lowers the model's requirement on the processor.
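The freeze/thaw schedule can be illustrated with a toy parameter list. In a real framework one would flip requires_grad or layer.trainable; the dict layout and layer names here are invented for the illustration.

```python
def set_backbone_trainable(params, trainable):
    """Toggle the 'trainable' flag on every backbone parameter entry."""
    for p in params:
        if p["layer"].startswith("backbone"):
            p["trainable"] = trainable
    return params

model = [{"layer": "backbone.conv1", "trainable": True},
         {"layer": "head.conv", "trainable": True}]

# Freezing stage (first 5000 iterations): backbone fixed, only the head fine-tuned
set_backbone_trainable(model, False)
# Thawing stage (last 5000 iterations) would call set_backbone_trainable(model, True)
```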
Step 6.4: and finishing setting and starting training, storing the trained model after 10000 times of iteration, drawing a loss function curve in the iteration process, and selecting the optimal model as a tea leaf picking grade target identification model according to the loss function curve.
In this embodiment, the loss function curve is shown in fig. 5, and the loss function curve is divided into a Train loss function curve and a val loss function curve, where the Train loss function curve represents the loss value of the whole training set; the val loss function curve represents the loss value for the entire test set. When the model is trained, the calculated loss function curve has approximately the following relationship:
when Train loss decreases, val loss stabilizes: network overfitting;
when Train loss stabilizes, val loss decreases: if the data set has serious problems, whether the label file has annotation errors or the data set is poor in quality can be checked, and the tea sample picture is reselected for annotation;
when Train loss decreases, val _ loss decreases: training is normal, and the model can be selected as a tea picking grade target identification model.
And 7: performing performance evaluation on the trained tea picking grade target identification model by using a verification set, and testing again by using a test set after the evaluation reaches the standard;
In this embodiment, the verification set is used to evaluate the performance of the trained tea picking grade target identification model; the comparison between the improved algorithm and the original YoloV4 is shown in Table 1.
TABLE 1 evaluation table comparing improved algorithm with original YoloV4 test result
According to the experimental results, the redesigned tea picking grade target identification model is improved in both detection speed and accuracy: compared with the original model, the accuracy is improved by 6.89% and the detection speed by 6.4 times. The improved YoloV4 can therefore detect the tea leaf picking grade more accurately and effectively in different detection scenes.
Step 8: the evaluated tea leaf picking grade target identification model is imported into a controller, and real-time video prediction is performed on the picked tea leaves.
In this example, a Raspberry Pi is adopted as the controller. The Raspberry Pi is a small ARM-based microcomputer board, convenient to install in small tea picking equipment for automatic picking grade identification of high-quality tea leaves. After the internal system of the Raspberry Pi is installed, the identification system adopts the officially provided Raspbian OS; development is carried out with the Python libraries in Raspbian OS, a camera detection algorithm is written and called on the basis of the model calling algorithm, and real-time video detection and identification are performed by connecting a camera sensor to the Raspberry Pi's USB port.
The model generated in step 6.4 is loaded into a specified folder on the Raspberry Pi for model calling. The generated model is imported into the Raspberry Pi, the Raspberry Pi is connected to a camera through its USB interface, and the camera and the previously generated model program are called in the Raspberry Pi environment; all operations of real-time video prediction are realized through the pictures captured by the camera in real time and calls to the previously generated model.
Finally, it is noted that the above preferred embodiments are merely illustrative of the technical solutions of the invention and not restrictive, and that, although the invention has been described in detail with reference to the above preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the scope of the invention.
Claims (5)
1. A tea leaf picking and identifying method based on an improved YoloV4 model is characterized by comprising the following steps:
Step 1: collecting tea picture samples, and manually labeling the tea pictures to complete production of the data set;
Step 2: dividing the initial picture sample set and the labeled picture sample set into a training set, a verification set and a test set;
Step 3: constructing a tea leaf picking grade target recognition model, wherein the tea leaf picking grade target recognition model is an improved YoloV4 target recognition model in which the MobilenetV3 feature extraction network replaces the CSPDarkNet53 feature extraction network;
Step 4: importing the feature values extracted by the MobilenetV3 feature extraction network into a feature layer, performing 3 convolution operations, importing the result into a spatial pyramid pooling layer, and pooling the feature layer with maximum pooling layers of different sizes;
Step 5: stacking the pooled results and performing 3 further convolutions, up-sampling the convolved feature layer and stacking it with feature layer 1 and feature layer 2 of the trunk feature extraction network to realize feature fusion, and performing the second-stage down-sampling after the construction of the feature pyramid is completed;
Step 6: setting a loss function, adding a cosine annealing decay function, and iteratively training the tea leaf picking grade target recognition model with the training set until the loss function converges, to obtain a trained tea leaf picking grade target recognition model;
Step 7: evaluating the performance of the trained tea leaf picking grade target recognition model with the verification set, and, after the evaluation reaches the standard, testing again with the test set;
Step 8: importing the evaluated tea leaf picking grade target recognition model into a controller, and performing real-time video prediction on the tea leaves to be picked.
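Steps 4 and 5 describe YoloV4's spatial pyramid pooling (SPP) neck: pooling one feature layer with several maximum pooling kernels and stacking the results. A minimal NumPy sketch of that idea follows; the kernel sizes 5, 9, 13 are the usual YoloV4 defaults and are an assumption here, since the claim only specifies "maximum pooling layers with different sizes".

```python
import numpy as np

def max_pool_same(x, k):
    """Max-pool a (H, W, C) feature map with odd kernel k,
    stride 1 and 'same' padding, so the spatial size is kept."""
    pad = k // 2
    h, w, _ = x.shape
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)),
                constant_values=-np.inf)
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].max(axis=(0, 1))
    return out

def spp(x, kernels=(5, 9, 13)):
    """Spatial pyramid pooling: pool with several kernel sizes and
    stack the results (plus the input itself) along the channel axis."""
    pooled = [max_pool_same(x, k) for k in kernels]
    return np.concatenate(pooled + [x], axis=-1)

feat = np.random.rand(13, 13, 8)   # small stand-in feature layer
out = spp(feat)
print(out.shape)                   # channels are multiplied by 4
```

Because every pooling branch keeps the spatial size, only the channel dimension grows, which is what allows the stacked result to be convolved again in step 5.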
2. The tea leaf picking recognition method based on the improved YoloV4 model as claimed in claim 1, wherein step 3 comprises the following steps:
Step 3.1: setting the convolution blocks in the YoloV4 backbone feature network as depthwise separable convolutions, adopting the Bneck structure, and setting the activation function as H-swish;
Step 3.2: resizing the input pictures to a uniform size before feeding them into the feature extraction network;
Step 3.3: changing the original picture into 224 × 224 × 3 as the first feature layer, using the convolution network conv2d structure in MobilenetV3;
Step 3.4: changing the first feature layer into 112 × 112 × 16 as the second feature layer, using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.5: changing the second feature layer into 56 × 56 × 24 as the third feature layer, using the residual network bneck5 × 5 structure in MobilenetV3;
Step 3.6: changing the third feature layer into 28 × 28 × 40 as the fourth feature layer, using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.7: changing the fourth feature layer into 14 × 14 × 112 as the fifth feature layer, using the residual network bneck3 × 3 structure in MobilenetV3;
Step 3.8: changing the fifth feature layer into 1 × 1 × 1280 as the feature output layer, using pooling and the convolution networks conv2d and NBN in MobilenetV3.
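The substitution in step 3.1 is the main source of the model's weight reduction. A small sketch (illustrative only, not the patented code) shows the H-swish activation of MobilenetV3 and the parameter saving of a depthwise separable convolution versus a standard 3 × 3 convolution at the 112-channel stage:

```python
import numpy as np

def h_swish(x):
    """H-swish activation used in MobilenetV3: x * ReLU6(x + 3) / 6."""
    return x * np.clip(x + 3, 0, 6) / 6

def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def dw_separable_params(c_in, c_out, k):
    """Depthwise (k x k per input channel) + pointwise (1 x 1) parameters."""
    return k * k * c_in + c_in * c_out

standard = conv_params(112, 112, 3)       # 112896 weights
separable = dw_separable_params(112, 112, 3)  # 13552 weights
print(standard, separable, round(standard / separable, 1))
```

At this stage the separable form needs roughly an eighth of the weights, which is why swapping the CSPDarkNet53 backbone for MobilenetV3 shrinks the model without changing the detection head.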
3. The tea leaf picking recognition method based on the improved YoloV4 model as claimed in claim 1, wherein step 6 comprises the following steps:
Step 6.1: setting a loss function according to the data set;
Step 6.2: setting the number of iterations to 10000;
Step 6.3: dividing training into two stages, a freezing stage and a thawing stage, with the first 5000 iterations as the freezing stage and the last 5000 iterations as the thawing stage;
Step 6.4: completing the settings and starting training; after 10000 iterations, storing the trained model, drawing the loss function curve over the iteration process, and selecting the optimal model as the tea leaf picking grade target recognition model according to the loss function curve.
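The training schedule of steps 6.2–6.3, combined with the cosine annealing decay of claim 1, can be sketched as follows. The learning-rate bounds `lr_max` and `lr_min` are assumptions for illustration; the claims do not specify them.

```python
import math

FREEZE_STEPS, TOTAL_STEPS = 5000, 10000

def cosine_annealed_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Cosine annealing: decay the learning rate from lr_max to lr_min
    along a half cosine over total_steps iterations."""
    t = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

def backbone_frozen(step):
    """First 5000 iterations: backbone frozen; last 5000: thawed."""
    return step < FREEZE_STEPS

for step in (0, 4999, 5000, 9999):
    print(step, backbone_frozen(step),
          f"{cosine_annealed_lr(step, TOTAL_STEPS):.2e}")
```

Freezing the backbone at first lets the new detection head stabilize on the pretrained MobilenetV3 features; thawing then fine-tunes the whole network at an already-reduced learning rate.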
4. The tea leaf picking recognition method based on the improved YoloV4 model as claimed in claim 3, wherein step 6.1 comprises the following steps:
Step 6.1.1: using y_true to extract, from the feature layer, the positions of the points where targets truly exist and the classes corresponding to those points;
Step 6.1.2: according to the objects assigned to each class cluster, iteratively recalculating and updating the cluster centers;
Step 6.1.3: for each picture, calculating the IoU between all ground-truth boxes and predicted boxes;
Step 6.1.4: calculating LOSS_CIoU as the regression loss function.
5. The tea leaf picking recognition method based on the improved YoloV4 model as claimed in claim 4, wherein the calculation formula of step 6.1.4 is:
LOSS_CIoU = 1 - IoU + ρ²(b, b^gt)/c² + αv
where ρ(b, b^gt) represents the Euclidean distance between the center points of the prediction box and the ground-truth box, c represents the diagonal distance of the smallest closure region that can contain both the prediction box and the ground-truth box, and α and v are penalty terms for the aspect ratio; the formula for α is:
α = v / ((1 - IoU) + v)
The formula for v is:
v = (4/π²) · (arctan(w^gt/h^gt) - arctan(w/h))²
where w^gt and h^gt are respectively the width and height of the ground-truth box, and w and h are respectively the width and height of the prediction box.
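The CIoU regression loss of claim 5 can be sketched in plain Python for boxes in (x1, y1, x2, y2) corner format; this is an illustrative re-implementation of the standard CIoU definition, not the patented code.

```python
import math

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def ciou_loss(pred, gt):
    """CIoU loss: 1 - IoU + center-distance term + aspect-ratio term."""
    u = iou(pred, gt)
    # squared distance between box centers (rho^2)
    cxp, cyp = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cxg, cyg = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cxp - cxg) ** 2 + (cyp - cyg) ** 2
    # squared diagonal of the smallest enclosing box (c^2)
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # aspect-ratio penalty v and its weight alpha
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / ((1 - u) + v) if v > 0 else 0.0
    return 1 - u + rho2 / c2 + alpha * v

print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))   # identical boxes -> 0.0
```

Identical boxes give a loss of exactly zero; shifting or reshaping either box increases the loss through the center-distance and aspect-ratio terms even when the IoU stays unchanged, which is the motivation for using CIoU rather than plain IoU as the regression loss.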
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210523294.0A CN115019226A (en) | 2022-05-13 | 2022-05-13 | Tea leaf picking and identifying method based on improved YoloV4 model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115019226A true CN115019226A (en) | 2022-09-06 |
Family
ID=83069223
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210523294.0A Pending CN115019226A (en) | 2022-05-13 | 2022-05-13 | Tea leaf picking and identifying method based on improved YoloV4 model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115019226A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112001339A (en) * | 2020-08-27 | 2020-11-27 | 杭州电子科技大学 | Pedestrian social distance real-time monitoring method based on YOLO v4 |
CN113269132A (en) * | 2021-06-15 | 2021-08-17 | 成都恒创新星科技有限公司 | Vehicle detection method and system based on YOLOV4 optimization algorithm |
CN113674226A (en) * | 2021-07-31 | 2021-11-19 | 河海大学 | Tea leaf picking machine tea leaf bud tip detection method based on deep learning |
CN113887395A (en) * | 2021-09-29 | 2022-01-04 | 浙江工业大学 | Depth separable convolution YOLOv4 model-based filter bag opening position detection method |
US20220114759A1 (en) * | 2020-12-25 | 2022-04-14 | Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. | Target detection method, electronic device and medium |
CN114387520A (en) * | 2022-01-14 | 2022-04-22 | 华南农业大学 | Precision detection method and system for intensive plums picked by robot |
Non-Patent Citations (2)
Title |
---|
XINTING LIAO, SHENGPING LV: "YOLOv4-MN3 for PCB Surface Defect Detection", Applied Sciences *
CHEN Long: "Research on Visual Recognition and Picking Technology of Tea Buds", China Master's Theses Full-text Database, Agricultural Science and Technology Section *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110619385B (en) | Structured network model compression acceleration method based on multi-stage pruning | |
CN104063686B (en) | Crop leaf diseases image interactive diagnostic system and method | |
CN109284760B (en) | Furniture detection method and device based on deep convolutional neural network | |
CN112949704B (en) | Tobacco leaf maturity state identification method and device based on image analysis | |
CN115171165A (en) | Pedestrian re-identification method and device with global features and step-type local features fused | |
CN111382808A (en) | Vehicle detection processing method and device | |
CN110599459A (en) | Underground pipe network risk assessment cloud system based on deep learning | |
CN113850136A (en) | Yolov5 and BCNN-based vehicle orientation identification method and system | |
CN113901928A (en) | Target detection method based on dynamic super-resolution, and power transmission line component detection method and system | |
CN115019226A (en) | Tea leaf picking and identifying method based on improved YoloV4 model | |
CN111881803A (en) | Livestock face recognition method based on improved YOLOv3 | |
CN117132802A (en) | Method, device and storage medium for identifying field wheat diseases and insect pests | |
CN114359359B (en) | Multitask optical and SAR remote sensing image registration method, equipment and medium | |
CN110852398A (en) | Cotton aphid identification method based on convolutional neural network | |
CN115761356A (en) | Image recognition method and device, electronic equipment and storage medium | |
CN112767427A (en) | Low-resolution image recognition algorithm for compensating edge information | |
CN111093140A (en) | Method, device, equipment and storage medium for detecting defects of microphone and earphone dust screen | |
CN112257706A (en) | Flower identification method based on local characteristics of pistils | |
CN114120023A (en) | Method and device for identifying copied image and computer readable storage medium | |
CN206236111U (en) | A kind of leaf image plant automatic identification equipment based on interactive voice | |
CN117372787B (en) | Image multi-category identification method and device | |
CN116188834B (en) | Full-slice image classification method and device based on self-adaptive training model | |
CN114049254B (en) | Low-pixel ox-head image reconstruction and identification method, system, equipment and storage medium | |
CN117392440B (en) | Textile fabric retrieval method and system based on tissue structure and color classification | |
CN112184056B (en) | Data feature extraction method and system based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20220906 |