CN115995017A - Fruit identification and positioning method, device and medium - Google Patents
- Publication number: CN115995017A
- Application number: CN202211553660.3A
- Authority: CN (China)
- Prior art keywords: fruit, fruits, target detection, image, training
- Prior art date: 2022-12-06
- Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a fruit identification and positioning method comprising the following steps: shooting fruits under different illumination conditions and classifying the shooting results to obtain a training image dataset; labeling the images in the training image dataset and setting labels for the labeling results; training a fruit target detection model using the training image dataset and the labeling results; and acquiring images of a plurality of fruits to be detected and identifying and positioning the fruits in the images with the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected. The invention effectively solves the problems of low accuracy, lack of generality, and high data acquisition cost in the prior art.
Description
Technical Field
The invention relates to the technical field of fruit identification and positioning, and in particular to a fruit identification and positioning method, device, and medium.
Background
The identification and positioning of fruits are the precondition and basis for automatic picking. Existing fruit identification and positioning methods, such as the fruit positioning method and device proposed in patent document CN111126296A and the mature-pomegranate positioning method based on Mask R-CNN and 3-dimensional sphere fitting proposed in patent document CN112529948A, adopt threshold segmentation or instance segmentation to identify fruit targets in an image. These methods involve complex algorithms, are easily disturbed by the environment, must process large amounts of data, and cannot guarantee real-time performance.
Existing target recognition technology uses the color, shape, texture, and other information of fruits in a color image to separate targets from the background and thereby identify the fruits in the image. This approach places strict requirements on the environment, is easily disturbed, suffers from missed and erroneous detections, and cannot meet the fruit recognition requirements of an orchard. For example, light conditions in an orchard differ greatly between weather conditions and between times of day; moreover, fruits grow on trees where they may be close to one another and occluded by leaves and branches, so the background of fruit images collected in an orchard is very complex. The prior art cannot adequately avoid the interference caused by these factors, so its recognition accuracy in the orchard environment is low and it lacks generality. When instance segmentation is used to annotate the dataset, the contour points of each target must be marked, which requires heavy workload and is inefficient. Both recognition methods must process very large amounts of data, and their slow processing cannot guarantee real-time performance. Positioning by acquiring point clouds is also used, but the required point cloud data is difficult to acquire and costly.
Disclosure of Invention
The embodiments of the invention provide a fruit identification and positioning method, device, and medium, which effectively solve the problems of low accuracy, lack of generality, and high data acquisition cost in the prior art.
An embodiment of the invention provides a fruit identification and positioning method, which comprises the following steps:
shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
labeling the images in the training image data set, and setting labels for labeling results;
training the fruit target detection model by utilizing the training image data set and the labeling result;
and acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected.
Compared with the prior art, the fruit identification and positioning method disclosed in this embodiment of the invention shoots under different weather conditions to ensure diversity in the environmental conditions of the images in the dataset, so that when the orchard litchi fruit target detection model is trained it can learn the characteristics of litchi fruit targets under various conditions, overcoming the difficulty caused by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the method only needs a depth sensor for shooting; compared with positioning using point cloud data, it is low in cost and simple in data acquisition.
Further, shooting fruits under different illumination conditions and classifying the shooting results to obtain a training image dataset specifically comprises:
and respectively shooting a fixed number of fruit images under various illumination conditions, classifying all shot fruit images according to the illumination conditions, and combining the fruit images into a training image data set.
When the litchi fruit image dataset is made, shooting is carried out under different weather conditions to ensure diversity in the environmental conditions of image acquisition, so that when the orchard litchi fruit target detection model is trained it can learn the characteristics of litchi fruit targets under various conditions, overcoming the difficulty caused by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions.
Further, labeling the images in the training image dataset and setting labels for the labeling results specifically includes:
labeling fruits in the image dataset through a labeling tool, framing fruit areas in the image with geometric figure frames, and setting labels according to the maturity of the fruits in the image, wherein the label types comprise ripe and unripe.
When labeling the data, only a geometric figure frame surrounding the target needs to be set; the contour of the target does not need to be traced, so the workload of the labeling process is smaller.
Further, the training of the fruit target detection model by using the training image dataset and the labeling result specifically includes:
loading the image dataset, inputting the training image dataset and the labeling results into the fruit target detection model, obtaining initial model parameters and computing an initial loss after a forward pass of the model, then continuously updating the model parameters and recomputing the loss through back-propagation iterations, and ending training when the model performance meets the requirement, yielding the final trained fruit target detection model;
wherein the fruit target detection model comprises a feature extraction network, a neck, and a detection part; the feature extraction network is composed of a convolutional neural network and attention functions, the attention functions being multi-head attention functions formed by computing the scaled dot-product attention function several times in parallel and concatenating the results; the neck adopts two structures, a feature pyramid structure and a path aggregation network, the feature pyramid structure superimposing high-level feature maps onto low-level feature maps through up-sampling and the path aggregation network transmitting positioning information from shallow layers to deep layers; the detection part outputs a target detection output frame according to the feature map generated by the feature extraction network and the neck, the output comprising a plurality of prior frames and a prediction frame, the prior frames being distributed at each pixel of the feature map with different sizes, and the prediction frame being calculated from the prior frames and the feature map.
As a preferred embodiment, training ends when the model performance meets the requirement, which specifically includes:
the model performance meeting the requirement specifically means that the loss is less than a preset error value;
the loss is the sum of the positioning loss, the confidence loss, and the classification loss, is used to judge the error between the model prediction under the current parameters and the real situation, and training ends when the loss is smaller than the preset error value.
Further, collecting images of a plurality of fruits to be detected and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the position information of the fruits to be detected specifically includes:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
Compared with a method for positioning by utilizing point cloud data, the method only needs to utilize a depth sensor to shoot, and has low cost and simple data acquisition method.
Another embodiment of the present invention correspondingly provides a fruit identifying and positioning device, including: the device comprises an image acquisition and labeling module, a model training module and a fruit identification and positioning module;
the image acquisition and labeling module is used for shooting fruits under different illumination conditions, classifying shooting results to obtain a training image data set, labeling images in the training image data set, and setting labels for labeling results;
the model training module is used for training a fruit target detection model by utilizing the training image data set and the labeling result;
the fruit recognition and positioning module is used for collecting images of a plurality of fruits to be detected, recognizing and positioning the fruits in the images through the trained fruit target detection model, and obtaining maturity and position information of the fruits to be detected.
Compared with the prior art, the fruit identification and positioning device disclosed in this embodiment of the invention shoots under different weather conditions to ensure diversity in the environmental conditions of the images in the dataset, so that when the orchard litchi fruit target detection model is trained it can learn the characteristics of litchi fruit targets under various conditions, overcoming the difficulty caused by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the device only needs a depth sensor for shooting; compared with positioning using point cloud data, it is low in cost and simple in data acquisition.
Further, the fruit identifying and positioning module is used for collecting images of a plurality of fruits to be detected, identifying and positioning the fruits in the images through a trained fruit target detection model to obtain maturity and position information of the fruits to be detected, and specifically comprises the following steps:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
Another embodiment of the present invention provides a fruit identifying and positioning device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor executes the computer program to implement the fruit identifying and positioning method according to the embodiment of the present invention.
Another embodiment of the present invention provides a storage medium, where the computer readable storage medium includes a stored computer program, where when the computer program runs, a device where the computer readable storage medium is located is controlled to execute the fruit identifying and positioning method according to the foregoing embodiment of the present invention.
Drawings
Fig. 1 is a flow chart of a fruit identifying and positioning method according to an embodiment of the invention.
Fig. 2 is a schematic diagram of a network structure of a fruit target detection model according to an embodiment of the present invention.
Fig. 3 is a schematic structural view of a fruit identifying and positioning device according to an embodiment of the present invention.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to fig. 1, a flow chart of a fruit identifying and positioning method according to an embodiment of the invention includes:
S101: shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
S102: labeling the images in the training image dataset, and setting labels for labeling results;
S103: training the fruit target detection model by utilizing the training image dataset and the labeling result;
S104: acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected.
According to the fruit identification and positioning method provided by this embodiment of the invention, shooting is performed under different weather conditions to ensure diversity in the environmental conditions of image acquisition in the dataset, so that when the orchard litchi fruit target detection model is trained it can learn the characteristics of litchi fruit targets under various conditions, overcoming the difficulty caused by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the method only needs a depth sensor for shooting; compared with positioning using point cloud data, it is low in cost and simple in data acquisition.
For step S101, specifically: a fixed number of fruit images are photographed under each of the various illumination conditions, and all photographed fruit images are classified according to illumination condition and combined into a training image dataset.
In a preferred embodiment, direct-sunlight images are obtained by shooting with front lighting on sunny days, and side-light images by shooting with side lighting; low-brightness images are shot in the evening; and images under scattered light conditions are shot on cloudy days. When making the dataset, the numbers of training images shot under the four conditions (direct sunlight, side light, low brightness, and scattered light) are kept equal.
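By way of illustration, a balanced training set under the four illumination conditions could be assembled as in the following Python sketch; the directory names, file extension, and counts are hypothetical, not values fixed by this embodiment:

```python
import random
from pathlib import Path

# Hypothetical per-condition directories; names are illustrative only.
CONDITIONS = ["direct_sunlight", "side_light", "low_brightness", "scattered_light"]

def build_balanced_dataset(root: str, per_condition: int, seed: int = 0):
    """Sample an equal number of images from each illumination condition."""
    random.seed(seed)
    dataset = []
    for cond in CONDITIONS:
        images = sorted((Path(root) / cond).glob("*.jpg"))
        if len(images) < per_condition:
            raise ValueError(f"{cond}: {len(images)} images, need {per_condition}")
        dataset.extend(random.sample(images, per_condition))
    return dataset

# e.g. 500 images per condition gives a 2000-image training set:
# train_images = build_balanced_dataset("litchi_dataset", per_condition=500)
```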
Specifically, in step S102, the fruits in the image dataset are labeled with a labeling tool, the fruit regions in the images are framed with geometric figure frames, and the resulting frames are given labels according to the maturity of the fruits in the image, wherein the label types comprise ripe and unripe.
In a preferred embodiment, the fruits in the training images are labeled manually; only a rectangular frame surrounding each target needs to be set, without tracing the target's contour point by point. Fruit regions in the images are marked with rectangular frames using a labeling tool to obtain real frames, and corresponding labels are set; the label types comprise ripe and unripe, to distinguish ripe fruits from unripe ones.
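For instance, a rectangular real frame and its maturity label might be serialized in the normalized YOLO text format, as in this sketch; the file format and the class-index mapping (0 = ripe, 1 = unripe) are assumptions, since the patent does not fix a label format:

```python
def to_yolo_label(cls_id: int, box: tuple, img_w: int, img_h: int) -> str:
    """Convert a pixel-space rectangle (x_min, y_min, x_max, y_max) into a
    normalized label line: 'cls x_center y_center width height'."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Assumed class mapping: 0 = ripe, 1 = unripe.
# to_yolo_label(0, (120, 88, 176, 140), img_w=640, img_h=480)
# -> "0 0.231250 0.237500 0.087500 0.108333"
```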
For step S103, specifically: loading the image dataset, inputting the training image dataset and the labeling results into the fruit target detection model, obtaining initial model parameters and computing the initial loss after a forward pass of the model, then continuously updating the model parameters and recomputing the loss through back-propagation iterations, and ending training once the model performance meets the requirement, yielding the final trained fruit target detection model;
wherein the fruit target detection model comprises a feature extraction network, a neck, and a detection part; the feature extraction network is composed of a convolutional neural network and attention functions, the attention functions being multi-head attention functions formed by computing the scaled dot-product attention function several times in parallel and concatenating the results; the neck adopts two structures, a feature pyramid structure and a path aggregation network, the feature pyramid structure superimposing high-level feature maps onto low-level feature maps through up-sampling and the path aggregation network transmitting positioning information from shallow layers to deep layers; the detection part outputs a target detection output frame according to the feature map generated by the feature extraction network and the neck, the output comprising a plurality of prior frames and a prediction frame, the prior frames being distributed at each pixel of the feature map with different sizes, and the prediction frame being calculated from the prior frames and the feature map.
In a preferred embodiment, the model is trained by back-propagation iterations to obtain model parameters suitable for litchi target detection in the orchard. The training steps comprise loading the data, building the model, updating the model parameters, computing the loss, evaluating the model, judging the condition for ending training, and saving the model parameters. Training ends when the model performance meets the requirement or the number of training iterations exceeds a set value; the performance requirement is that the change in the loss function is smaller than a set value.
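As a rough illustration of the training procedure just described, the following PyTorch-style loop stops when the change in the epoch loss falls below a set value or the iteration count exceeds a maximum; all names are hypothetical, a sketch rather than the patented implementation:

```python
def train(model, loader, loss_fn, optimizer, max_epochs: int, tol: float):
    """Back-propagation training loop that stops when the change in the
    epoch loss falls below `tol` or after `max_epochs` epochs."""
    prev_loss = float("inf")
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)  # Loss_cls + Loss_obj + Loss_box
            loss.backward()               # back-propagate gradients
            optimizer.step()              # update model parameters
            epoch_loss += loss.item()
        epoch_loss /= len(loader)
        if abs(prev_loss - epoch_loss) < tol:   # performance requirement met
            return model
        prev_loss = epoch_loss
    return model                          # maximum iteration count reached
```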
In particular, the loss is computed with an improved target detection loss function comprising a positioning loss, a confidence loss, and a classification loss, which together reflect the error between the model prediction under the current parameters and the real situation. It is calculated as:

$$Loss = Loss_{cls} + Loss_{obj} + Loss_{box}$$

The classification loss and the confidence loss adopt the binary cross-entropy loss function, expressed as:

$$L = -\frac{1}{n} \sum_{x} \left[ y \ln p + (1 - y) \ln(1 - p) \right]$$

where p represents the predicted value, x a sample, y the target value, n the total number of samples, and L the final binary cross-entropy loss.
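By way of example, the binary cross-entropy above can be written directly in PyTorch; this is a sketch, and in practice it matches the library routine noted in the final comment:

```python
import torch

def binary_cross_entropy(p: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """L = -(1/n) * sum_x [ y*ln(p) + (1-y)*ln(1-p) ], as in the formula above."""
    eps = 1e-7                        # guard against ln(0)
    p = p.clamp(eps, 1.0 - eps)
    return -(y * p.log() + (1.0 - y) * (1.0 - p).log()).mean()

# In practice this matches torch.nn.functional.binary_cross_entropy(p, y).
```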
The positioning loss adopts the α-CIoU loss $Loss_{\alpha\text{-}CIoU}$, calculated as:

$$Loss_{\alpha\text{-}CIoU} = 1 - IoU^{\alpha} + \frac{\rho^{2\alpha}(b, b^{gt})}{c^{2\alpha}} + (\beta v)^{\alpha}, \qquad IoU = \frac{|A \cap B|}{|A \cup B|}$$

In the formula, A and B represent the output frame and the real frame respectively, |A∩B| is the area of their intersection, |A∪B| the area of their union, and C the area of the smallest rectangle enclosing A and B. α is an adjustable parameter whose value is determined by comparing detection results under different settings, which improves the flexibility of the target detection model. b and $b^{gt}$ are the center points of the output frame and the real frame respectively, ρ(·) is the Euclidean distance, and c is the diagonal length of the smallest bounding box enclosing the two frames. β is a positive trade-off parameter and v measures the consistency of aspect ratio. β and v are calculated as:

$$\beta = \frac{v}{(1 - IoU) + v}, \qquad v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2}$$

where $w^{gt}$ and $h^{gt}$ are the width and height of the real frame, and w and h are the width and height of the output frame.
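A sketch of this α-CIoU positioning loss for a single box pair follows; the default α = 3 comes from the α-IoU literature and is an assumption, since the patent leaves α adjustable:

```python
import math
import torch

def alpha_ciou_loss(pred: torch.Tensor, gt: torch.Tensor, alpha: float = 3.0):
    """alpha-CIoU loss for one box pair given as (x_min, y_min, x_max, y_max).
    alpha = 3 follows the alpha-IoU literature; the patent leaves it adjustable."""
    eps = 1e-9
    # IoU = |A ∩ B| / |A ∪ B|
    xi1, yi1 = torch.max(pred[0], gt[0]), torch.max(pred[1], gt[1])
    xi2, yi2 = torch.min(pred[2], gt[2]), torch.min(pred[3], gt[3])
    inter = (xi2 - xi1).clamp(min=0) * (yi2 - yi1).clamp(min=0)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter + eps)
    # rho^2: squared distance between the center points b and b^gt
    rho2 = ((pred[0] + pred[2]) - (gt[0] + gt[2])) ** 2 / 4 \
         + ((pred[1] + pred[3]) - (gt[1] + gt[3])) ** 2 / 4
    # c^2: squared diagonal of the smallest box enclosing both frames
    cw = torch.max(pred[2], gt[2]) - torch.min(pred[0], gt[0])
    ch = torch.max(pred[3], gt[3]) - torch.min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # v: aspect-ratio consistency; beta: positive trade-off parameter
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (torch.atan(w_gt / h_gt) - torch.atan(w / h)) ** 2
    beta = v / (1 - iou + v + eps)
    return 1 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```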
In a preferred embodiment, the fruit target detection model is an orchard target detection model based on modified YOLOv5, and comprises a feature extraction network, a neck part and a detection part, wherein the specific network structure is shown in fig. 2.
In particular, the feature extraction network is a 'convolution-attention' structure composed of a convolutional neural network and an attention function. The convolutional neural network forms the three structures of the feature extraction network: Conv (convolution), SPP (Spatial Pyramid Pooling), and the CSP bottleneck layer. Conv comprises a convolution layer, a batch normalization layer, and an activation function; the activation function is the leaky rectified linear unit. The CSP bottleneck layer comprises convolution layers, batch normalization, activation functions, and a residual network structure. The attention function uses scaled dot-product attention, whose calculation is expressed as:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

where Q, K, and V are input feature maps, $K^{T}$ is the transpose of K, and $\sqrt{d_k}$ is the scale factor.
The scaled dot-product attention function is computed several times in parallel and the results are concatenated to form the multi-head attention function, expressed as:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(head_1, \ldots, head_h)W^{O}$$

where $head_i$ is the output of the i-th scaled dot-product attention function and $W^{O}$ is the output projection matrix.
The multi-head attention function and the convolutional neural network are combined to form a feature extraction network.
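For concreteness, a minimal PyTorch sketch of the multi-head attention described above is given below; the head count, model width, and class name are illustrative assumptions rather than values fixed by the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """Scaled dot-product attention computed h times in parallel and
    concatenated, as described above; dimensions are assumptions."""
    def __init__(self, d_model: int = 256, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0
        self.d_k = d_model // num_heads
        self.num_heads = num_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)   # the W^O projection

    def forward(self, q, k, v):
        b, n, _ = q.shape
        # Split into heads: (batch, heads, tokens, d_k)
        def split(x):
            return x.view(b, n, self.num_heads, self.d_k).transpose(1, 2)
        q, k, v = split(self.w_q(q)), split(self.w_k(k)), split(self.w_v(v))
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
        scores = q @ k.transpose(-2, -1) / self.d_k ** 0.5
        heads = F.softmax(scores, dim=-1) @ v
        # Concat(head_1, ..., head_h) W^O
        heads = heads.transpose(1, 2).reshape(b, n, -1)
        return self.w_o(heads)
```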
The neck adopts two structures: a feature pyramid and a path aggregation network. The feature pyramid works top-down, superimposing high-level feature maps onto low-level feature maps through up-sampling. The path aggregation network works bottom-up, transmitting positioning information from shallow layers to deep layers.
The detection section outputs a target detection output frame based on the feature map generated by the feature extraction network and the neck. A plurality of frames of different sizes, called prior frames, are generated in each pixel of the feature map; the prior frame sizes are 10×13, 16×30, 33×23, 30×61, 62×45, 59×119, 116×90, 156×198, and 373×326. The prediction frame is obtained from the prior frame and the feature map as follows:

$$b_x = \sigma(t_x) + c_x, \qquad b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}, \qquad b_h = p_h e^{t_h}$$

where σ(t_x) and σ(t_y) are offsets relative to the top-left corner coordinates (c_x, c_y) of the grid cell containing the center point, σ is the sigmoid function, p_w and p_h are the width and height of the prior frame, and b_x, b_y, b_w, and b_h are respectively the abscissa and ordinate of the prediction frame's center point and its width and height.
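As an illustration, this decoding step can be sketched in PyTorch as follows; the exponential width/height form is an assumption consistent with YOLO-style decoding and with the symbols defined above:

```python
import torch

def decode_boxes(t, priors, grid_xy):
    """Decode raw predictions t = (t_x, t_y, t_w, t_h) into prediction-frame
    coordinates (b_x, b_y, b_w, b_h) using the formulas above."""
    sigma = torch.sigmoid
    b_x = sigma(t[..., 0]) + grid_xy[..., 0]     # c_x: grid-cell left coordinate
    b_y = sigma(t[..., 1]) + grid_xy[..., 1]     # c_y: grid-cell top coordinate
    b_w = priors[..., 0] * torch.exp(t[..., 2])  # p_w: prior-frame width
    b_h = priors[..., 1] * torch.exp(t[..., 3])  # p_h: prior-frame height
    return torch.stack((b_x, b_y, b_w, b_h), dim=-1)
```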
For step S104, specifically, collecting images of a plurality of fruits to be detected, identifying and positioning the fruits in the images through the trained fruit target detection model, and obtaining the position information of the fruits to be detected includes:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
In a preferred embodiment, an Intel RealSense D435 depth sensor is used as the camera; its parameters are initialized, and the resolution of the acquired color and depth images is set to 640×480. The depth sensor collects a color image and a depth image of the scene in front of the fruits, the target detection model detects the fruits in the color image to obtain target detection output frames, and the coordinates (x, z) of each output frame's center point in the color image are recorded. The depth value at point (x, z) in the depth image is then acquired as the distance y between the fruit and the shooting point, and (x, y, z) represents the position of the fruit in the spatial coordinate system.
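For reference, a minimal sketch of this acquisition-and-positioning loop using the pyrealsense2 library is shown below; the `detect` helper is a hypothetical stand-in for the trained fruit target detection model, not part of the library:

```python
import numpy as np
import pyrealsense2 as rs

def detect(color_image):
    """Hypothetical stand-in for the trained fruit target detection model;
    should return (label, x, z) per output frame, in pixel coordinates."""
    raise NotImplementedError("plug in the trained detection model here")

# Configure the Intel RealSense D435 for 640x480 color and depth streams.
pipeline, config = rs.pipeline(), rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)       # align depth pixels to the color image

try:
    frames = align.process(pipeline.wait_for_frames())
    color = np.asanyarray(frames.get_color_frame().get_data())
    depth_frame = frames.get_depth_frame()
    for label, x, z in detect(color):
        y = depth_frame.get_distance(int(x), int(z))   # distance in meters
        print(f"{label}: spatial position (x, y, z) = ({x}, {y:.3f}, {z})")
finally:
    pipeline.stop()
```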
Referring to fig. 3, a schematic structural diagram of a fruit identifying and positioning device according to an embodiment of the present invention includes: an image acquisition and labeling module 201, a model training module 202 and a fruit identification and positioning module 203;
the image acquisition and labeling module 201 is configured to shoot fruits under different illumination conditions, classify shooting results to obtain a training image dataset, label images in the training image dataset, and set labels for labeling results;
the model training module 202 is configured to train a fruit target detection model by using the training image dataset and the labeling result;
the fruit recognition and positioning module 203 is configured to collect images of a plurality of fruits to be detected, and recognize and position the fruits in the images through a trained fruit target detection model to obtain maturity and position information of the fruits to be detected.
According to the fruit identification and positioning device provided by this embodiment of the invention, shooting is performed under different weather conditions to ensure diversity in the environmental conditions of image acquisition in the dataset, so that when the orchard litchi fruit target detection model is trained it can learn the characteristics of litchi fruit targets under various conditions, overcoming the difficulty caused by changing light and enabling the target detection model to accurately identify litchi fruit targets under different environmental conditions. By combining the target detection result with the depth image to locate the target, the device only needs a depth sensor for shooting; compared with positioning using point cloud data, it is low in cost and simple in data acquisition.
Further, the fruit identifying and positioning module 203 is configured to collect images of a plurality of fruits to be detected, identify and position the fruits in the images through a trained fruit target detection model, and obtain maturity and position information of the fruits to be detected, and specifically includes:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
The embodiment of the invention also provides a fruit identification and positioning device. The fruit identification and positioning device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The steps of the above embodiments of the fruit identification and positioning method are implemented by the processor when executing the computer program, for example, step S101 shown in fig. 1. Alternatively, the processor, when executing the computer program, performs the functions of the modules in the device embodiments described above, such as the fruit identification and positioning module 203.
The computer program may be divided into one or more modules, which are stored in the memory and executed by the processor to implement the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, used to describe the execution of the computer program in the fruit identification and positioning device. For example, the computer program may be divided into the image acquisition and labeling module 201, the model training module 202, and the fruit recognition and positioning module 203, whose specific functions are as follows:
the image acquisition and labeling module 201 is configured to shoot fruits under different illumination conditions, classify shooting results to obtain a training image dataset, label images in the training image dataset, and set labels for labeling results;
the model training module 202 is configured to train a fruit target detection model by using the training image dataset and the labeling result;
the fruit recognition and positioning module 203 is configured to collect images of a plurality of fruits to be detected, and recognize and position the fruits in the images through a trained fruit target detection model to obtain maturity and position information of the fruits to be detected.
The fruit identification and positioning device can be a desktop computer, a notebook computer, a palm computer, a cloud server, or other computing equipment, and may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a fruit identification and positioning device and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input and output devices, network access devices, buses, etc.
The processor may be a central processing unit (Central Processing Unit, CPU), other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, which is the control center of the fruit identification and positioning device, with various interfaces and lines connecting the various parts of the overall fruit identification and positioning device.
The memory may be used to store the computer program or modules, and the processor implements the various functions of the fruit identification and positioning device by running or executing the computer program or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the data storage area may store data created according to the use of the device (such as audio data, a phonebook, etc.). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, internal memory, plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, Flash Card, at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
If the modules integrated in the fruit identification and positioning device are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the flow of the method of the above embodiments through a computer program instructing related hardware; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, it implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium can be appropriately adjusted according to the requirements of legislation and patent practice in each jurisdiction; for example, in certain jurisdictions, computer-readable media do not include electrical carrier signals and telecommunication signals.
It should be noted that the above-described apparatus embodiments are merely illustrative; the units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the invention, the connections between modules indicate communication connections between them, which may specifically be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.
Claims (10)
1. A fruit identification and positioning method, characterized by comprising the following steps:
shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
labeling the images in the training image data set, and setting labels for labeling results;
training the fruit target detection model by utilizing the training image data set and the labeling result;
and acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through a fruit target detection model after training to obtain the maturity and position information of the fruits to be detected.
2. The method for identifying and positioning fruits according to claim 1, wherein the steps of photographing fruits under different illumination conditions, classifying photographing results, and obtaining a training image dataset comprise:
and respectively shooting a fixed number of fruit images under various illumination conditions, classifying all shot fruit images according to the illumination conditions, and combining the fruit images into a training image data set.
3. The method for identifying and positioning fruits according to claim 1, wherein labeling the images in the training image dataset and setting labels for the labeling results specifically comprises:
labeling fruits in the image dataset through a labeling tool, framing fruit areas in the image with geometric figure frames, and setting labels according to the maturity of the fruits in the image, wherein the label types comprise ripe and unripe.
4. The method for identifying and locating fruits according to claim 1, wherein training the fruit target detection model by using the training image dataset and the labeling result comprises:
loading the image dataset, inputting the training image dataset and the labeling results into the fruit target detection model, obtaining initial model parameters and computing an initial loss after a forward pass of the model, then continuously updating the model parameters and recomputing the loss through back-propagation iterations, and ending training when the model performance meets the requirement, yielding the final trained fruit target detection model;
wherein the fruit target detection model comprises a feature extraction network, a neck, and a detection part; the feature extraction network is composed of a convolutional neural network and attention functions, the attention functions being multi-head attention functions formed by computing the scaled dot-product attention function several times in parallel and concatenating the results; the neck adopts two structures, a feature pyramid structure and a path aggregation network, the feature pyramid structure superimposing high-level feature maps onto low-level feature maps through up-sampling and the path aggregation network transmitting positioning information from shallow layers to deep layers; the detection part outputs a target detection output frame according to the feature map generated by the feature extraction network and the neck, the output comprising a plurality of prior frames and a prediction frame, the prior frames being distributed at each pixel of the feature map with different sizes, and the prediction frame being calculated from the prior frames and the feature map.
5. The method for identifying and locating fruits according to claim 4, wherein said training is finished when the model performance meets the requirement, specifically comprising:
the model performance meeting the requirement specifically means that the loss is less than a preset error value;
the loss is the sum of the positioning loss, the confidence loss, and the classification loss, is used to judge the error between the model prediction under the current parameters and the real situation, and training ends when the loss is smaller than the preset error value.
6. The method for identifying and positioning fruits according to claim 1, wherein the collecting a plurality of images of fruits to be detected, identifying and positioning fruits in the images by a trained fruit target detection model, and obtaining position information of the fruits to be detected specifically comprises:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
7. A fruit identification and positioning device, comprising: the device comprises an image acquisition and labeling module, a model training module and a fruit identification and positioning module;
the image acquisition and labeling module is used for shooting fruits under different illumination conditions, classifying shooting results to obtain a training image data set, labeling images in the training image data set, and setting labels for labeling results;
the model training module is used for training a fruit target detection model by utilizing the training image data set and the labeling result;
the fruit recognition and positioning module is used for collecting images of a plurality of fruits to be detected, recognizing and positioning the fruits in the images through the trained fruit target detection model, and obtaining maturity and position information of the fruits to be detected.
8. The fruit identification and positioning device according to claim 7, wherein the fruit identification and positioning module is configured to collect images of a plurality of fruits to be detected, and identify and position the fruits in the images through a trained fruit target detection model, so as to obtain maturity and position information of the fruits to be detected, and the fruit identification and positioning device specifically comprises:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
9. A fruit identification and locating device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the fruit identification and locating method according to any one of claims 1 to 6 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the fruit identification and localization method according to any one of claims 1 to 6.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202211553660.3A | 2022-12-06 | 2022-12-06 | Fruit identification and positioning method, device and medium
Publications (1)

Publication Number | Publication Date
---|---
CN115995017A | 2023-04-21
Family ID: 85989721
Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN116467596A | 2023-04-11 | 2023-07-21 | 广州国家现代农业产业科技创新中心 | Training method of rice grain length prediction model, morphology prediction method and apparatus
CN116467596B | 2023-04-11 | 2024-03-26 | 广州国家现代农业产业科技创新中心 | Training method of rice grain length prediction model, morphology prediction method and apparatus
Similar Documents
Publication | Title
---|---
Wu et al. | Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments
Lee et al. | Simultaneous traffic sign detection and boundary estimation using convolutional neural network
US11429818B2 | Method, system and device for multi-label object detection based on an object detection network
CN109934121B | Orchard pedestrian detection method based on YOLOv3 algorithm
CN110298266B | Deep neural network target detection method based on multiscale receptive field feature fusion
CN109934115B | Face recognition model construction method, face recognition method and electronic equipment
CN111160269A | Face key point detection method and device
CN111080693A | Robot autonomous classification grabbing method based on YOLOv3
CN109165538B | Bar code detection method and device based on deep neural network
CN109919930A | The statistical method of fruit number on tree based on convolutional neural networks YOLO V3
Masuda | Leaf area estimation by semantic segmentation of point cloud of tomato plants
CN115187803B | Positioning method for picking process of famous tea tender shoots
CN112784869A | Fine-grained image identification method based on attention perception and counterstudy
CN111709377B | Feature extraction method, target re-identification method and device and electronic equipment
CN112883915A | Automatic wheat ear identification method and system based on transfer learning
CN112164030A | Method and device for quickly detecting rice panicle grains, computer equipment and storage medium
CN111950391A | Fruit tree bud recognition method and device
Hou et al. | Detection and localization of citrus fruit based on improved You Only Look Once v5s and binocular vision in the orchard
CN116740528A | Shadow feature-based side-scan sonar image target detection method and system
CN116188943A | Solar radio spectrum burst information detection method and device
CN115995017A | Fruit identification and positioning method, device and medium
CN109657540A | Withered tree localization method and system
CN117079125A | Kiwi fruit pollination flower identification method based on improved YOLOv5
CN116863463A | Egg assembly line rapid identification and counting method
CN116935296A | Orchard environment scene detection method and terminal based on multitask deep learning
Legal Events
Code | Title | Description
---|---|---
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20230421