CN115995017A - A method, device and medium for fruit identification and positioning - Google Patents

A method, device and medium for fruit identification and positioning

Info

Publication number
CN115995017A
Authority
CN
China
Prior art keywords
fruit
fruits
target detection
training
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211553660.3A
Other languages
Chinese (zh)
Inventor
毛亮
梁志尚
吴惠粦
田鑫裕
张兴龙
朱文铭
刘昌乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center
Shenzhen Polytechnic
Original Assignee
Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center
Shenzhen Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou National Modern Agricultural Industry Science And Technology Innovation Center and Shenzhen Polytechnic
Priority application: CN202211553660.3A
Publication: CN115995017A
Legal status: Withdrawn


Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a fruit identification and positioning method comprising the following steps: photographing fruits under different lighting conditions and classifying the results to obtain a training image dataset; annotating the images in the training image dataset and assigning labels to the annotation results; training a fruit target detection model with the training image dataset and the annotation results; and collecting several images of fruits to be detected, then identifying and locating the fruits in those images with the trained fruit target detection model to obtain the maturity and position information of the fruits to be detected. The invention effectively addresses the low accuracy, poor generality, and high data-acquisition cost of the prior art.

Description

A fruit identification and positioning method, device and medium

Technical Field

The present invention relates to the technical field of fruit identification and positioning, and in particular to a fruit identification and positioning method, device and medium.

Background Art

Fruit identification and positioning are the premise and basis of automated harvesting. Existing fruit identification and positioning methods, such as the fruit positioning method and device proposed in patent document CN111126296A and the ripe-pomegranate positioning method based on Mask R-CNN and 3D sphere fitting proposed in patent document CN112529948A, identify fruit targets in images by threshold segmentation or instance segmentation. These methods rely on complex algorithms, are easily disturbed by the environment, and must process large amounts of data, so real-time performance cannot be guaranteed.

Existing target recognition techniques use the color, shape, texture and other information of fruits in color images to separate targets from the background and thereby recognize the fruits in an image. This approach imposes strict requirements on the environment and is easily disturbed, leading to omissions and recognition errors, so it cannot meet the requirements of fruit recognition in orchards. For example, lighting conditions in an orchard differ greatly across weather conditions and times of day; moreover, orchard fruits grow on trees, close to and often occluded by leaves and branches, so the backgrounds of fruit images collected in orchards are very complex. Existing techniques cannot adequately avoid the interference caused by these factors, and their recognition accuracy in the orchard environment is low and not general. Instance segmentation requires marking the contour points of each target when labeling a dataset, which is labor-intensive and inefficient. Both of the above recognition methods must process very large amounts of data, making them slow and unable to guarantee real-time performance. Positioning by acquiring point clouds requires point cloud data that is difficult and costly to obtain.

Summary of the Invention

The embodiments of the present invention provide a fruit identification and positioning method, device and medium, which effectively address the low accuracy, poor generality, and high data-acquisition cost of the prior art.

An embodiment of the present invention provides a fruit identification and positioning method, comprising the following steps:

photographing fruits under different lighting conditions and classifying the results to obtain a training image dataset;

annotating the images in the training image dataset and assigning labels to the annotation results;

training a fruit target detection model with the training image dataset and the annotation results;

collecting several images of fruits to be detected, identifying and locating the fruits in the images with the trained fruit target detection model, and obtaining the maturity and position information of the fruits to be detected.

Compared with the prior art, the fruit identification and positioning method disclosed in the embodiments of the present invention photographs under different weather conditions to ensure the diversity of the environmental conditions under which the dataset images are acquired, so that the characteristics of litchi fruit targets under various conditions can be learned when training the orchard litchi target detection model, overcoming the difficulties caused by light changes and ensuring that the target detection model can accurately identify litchi fruit targets under different environmental conditions. The target is located by combining the target detection results with a depth image; compared with positioning from point cloud data, the invention only needs a depth sensor to capture images, which is low-cost and simple in data acquisition.

Further, photographing the fruits under different lighting conditions and classifying the results to obtain a training image dataset specifically includes:

photographing a fixed number of fruit images under each of several lighting conditions, then classifying all captured fruit images by lighting condition and combining them into the training image dataset.

When building the litchi fruit image dataset, images are captured under different weather conditions to ensure the diversity of the environmental conditions under which the dataset images are acquired, so that the characteristics of litchi fruit targets under various conditions can be learned when training the orchard litchi target detection model, overcoming the difficulties caused by light changes and ensuring that the target detection model can accurately identify litchi fruit targets under different environmental conditions.

Further, annotating the images in the training image dataset and assigning labels to the annotation results specifically includes:

annotating the fruits in the image dataset with an annotation tool, framing each fruit region in the image with a geometric bounding box, and labeling each resulting box according to the maturity of the fruit in the image, the label types including mature and unripe.

When labeling the data, only a geometric box enclosing the target needs to be set; there is no need to trace the target's contour, so the labeling workload is smaller.

Further, training the fruit target detection model with the training image dataset and the annotation results specifically includes:

loading the image dataset, inputting the training image dataset and the annotation results into the fruit target detection model, obtaining initial model parameters and computing an initial loss after a forward pass of the model, then continuously updating the model parameters and computing the loss through back-propagation iterations, and ending training when the model performance meets the requirement, yielding the trained fruit target detection model;

wherein the fruit target detection model comprises a feature extraction network, a neck, and a detection part. The feature extraction network is composed of a convolutional neural network and an attention function; the attention function is a multi-head attention function formed by computing the scaled dot-product attention function several times in parallel and concatenating the results. The neck adopts two structures, a feature pyramid and a path aggregation network: the feature pyramid overlaps high-level feature maps with low-level feature maps through upsampling, and the path aggregation network transfers localization information from shallow layers to deep layers. The detection part outputs target detection output boxes from the feature maps generated by the feature extraction network and the neck; the output boxes include several prior boxes and prediction boxes, the prior boxes are distributed at every pixel of the feature map with different sizes, and the prediction boxes are computed from the prior boxes and the feature map.

As a preferred embodiment, ending training when the model performance meets the requirement specifically includes:

the model performance meets the requirement when the loss is less than a preset error value;

wherein the loss is the sum of a localization loss, a confidence loss, and a classification loss, and measures the error between the model's predictions under the current parameters and the ground truth; when the loss is less than the preset error value, training ends.

Further, collecting several images of the fruits to be detected, identifying and locating the fruits in the images with the trained fruit target detection model, and obtaining the position information of the fruits to be detected specifically includes:

loading the trained fruit target detection model, initializing the shooting parameters of the capture device, and setting the resolution of the captured images;

collecting several images of the fruits to be detected with the capture device, wherein the images include color images and depth images, and the capture device is specifically a depth sensor;

detecting the fruits in the color image with the fruit target detection model to obtain several target detection output boxes, and recording the horizontal and vertical coordinates of the center point of each output box in the color image, wherein each target detection output box also carries a label, divided into mature and unripe, used to identify the maturity of the fruit;

obtaining the depth values of the center points of the output boxes from the depth image;

combining the horizontal and vertical coordinates with the depth values to obtain the position information of the fruits in a spatial coordinate system.

The target is located by combining the target detection results with the depth image; compared with positioning from point cloud data, this method only needs a depth sensor to capture images, which is low-cost and simple in data acquisition.
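The last three steps above can be sketched in Python; the detection-tuple layout and function name are illustrative assumptions, not the patent's implementation, and the depth image is assumed to be pixel-aligned with the color image:

```python
import numpy as np

def locate_fruits(detections, depth_image):
    """Combine 2D detection output boxes with an aligned depth image to get
    each fruit's position in the spatial coordinate system.

    detections: list of (x1, y1, x2, y2, label) boxes in pixel coordinates,
                label being "mature" or "unripe" (layout is illustrative).
    depth_image: HxW array of depth values aligned with the color image.
    """
    positions = []
    for x1, y1, x2, y2, label in detections:
        # Horizontal and vertical coordinates of the box center in the color image.
        cx = int(round((x1 + x2) / 2))
        cy = int(round((y1 + y2) / 2))
        # Depth value of the same center point in the depth image.
        z = float(depth_image[cy, cx])
        # Combine (cx, cy) with the depth value to form the 3D position.
        positions.append((cx, cy, z, label))
    return positions
```

A projection from pixel coordinates to metric camera coordinates would additionally need the depth sensor's intrinsics, which the patent does not specify.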

Another embodiment of the present invention correspondingly provides a fruit identification and positioning device, comprising an image acquisition and annotation module, a model training module, and a fruit identification and positioning module;

the image acquisition and annotation module is configured to photograph fruits under different lighting conditions, classify the results to obtain a training image dataset, annotate the images in the training image dataset, and assign labels to the annotation results;

the model training module is configured to train the fruit target detection model with the training image dataset and the annotation results;

the fruit identification and positioning module is configured to collect several images of fruits to be detected, identify and locate the fruits in the images with the trained fruit target detection model, and obtain the maturity and position information of the fruits to be detected.

Compared with the prior art, the fruit identification and positioning device disclosed in the embodiments of the present invention photographs under different weather conditions to ensure the diversity of the environmental conditions under which the dataset images are acquired, so that the characteristics of litchi fruit targets under various conditions can be learned when training the orchard litchi target detection model, overcoming the difficulties caused by light changes and ensuring that the target detection model can accurately identify litchi fruit targets under different environmental conditions. The target is located by combining the target detection results with a depth image; compared with positioning from point cloud data, this device only needs a depth sensor to capture images, which is low-cost and simple in data acquisition.

Further, the fruit identification and positioning module being configured to collect several images of fruits to be detected, identify and locate the fruits in the images with the trained fruit target detection model, and obtain the maturity and position information of the fruits to be detected specifically includes:

loading the trained fruit target detection model, initializing the shooting parameters of the capture device, and setting the resolution of the captured images;

collecting several images of the fruits to be detected with the capture device, wherein the images include color images and depth images, and the capture device is specifically a depth sensor;

detecting the fruits in the color image with the fruit target detection model to obtain several target detection output boxes, and recording the horizontal and vertical coordinates of the center point of each output box in the color image, wherein each target detection output box also carries a label, divided into mature and unripe, used to identify the maturity of the fruit;

obtaining the depth values of the center points of the output boxes from the depth image;

combining the horizontal and vertical coordinates with the depth values to obtain the position information of the fruits in a spatial coordinate system.

Another embodiment of the present invention provides a fruit identification and positioning device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the fruit identification and positioning method described in the above embodiment of the invention.

Another embodiment of the present invention provides a storage medium. The computer-readable storage medium includes a stored computer program, wherein, when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute the fruit identification and positioning method described in the above embodiment of the invention.

Brief Description of the Drawings

FIG. 1 is a schematic flow chart of a fruit identification and positioning method provided by an embodiment of the present invention.

FIG. 2 is a schematic diagram of the network structure of a fruit target detection model provided by an embodiment of the present invention.

FIG. 3 is a schematic structural diagram of a fruit identification and positioning device provided by an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Referring to FIG. 1, a schematic flow chart of a fruit identification and positioning method provided by an embodiment of the present invention, the method comprises:

S101: photographing fruits under different lighting conditions and classifying the results to obtain a training image dataset;

S102: annotating the images in the training image dataset and assigning labels to the annotation results;

S103: training a fruit target detection model with the training image dataset and the annotation results;

S104: collecting several images of fruits to be detected, identifying and locating the fruits in the images with the trained fruit target detection model, and obtaining the maturity and position information of the fruits to be detected.

The fruit identification and positioning method provided by this embodiment of the present invention photographs under different weather conditions to ensure the diversity of the environmental conditions under which the dataset images are acquired, so that the characteristics of litchi fruit targets under various conditions can be learned when training the orchard litchi target detection model, overcoming the difficulties caused by light changes and ensuring that the target detection model can accurately identify litchi fruit targets under different environmental conditions. The target is located by combining the target detection results with a depth image; compared with positioning from point cloud data, the invention only needs a depth sensor to capture images, which is low-cost and simple in data acquisition.

For step S101, specifically, a fixed number of fruit images are photographed under each of several lighting conditions, and all captured fruit images are classified by lighting condition and combined into the training image dataset.

In a preferred embodiment, on a sunny day, images in direct sunlight are obtained by shooting with the light and side-lit images by shooting across the light; low-brightness images are obtained by shooting in the evening; and images under diffuse light are obtained on a cloudy day. When building the dataset, the numbers of training images captured under the four conditions of direct sunlight, side light, low brightness, and diffuse light are kept equal.

For step S102, specifically, the fruits in the image dataset are annotated with an annotation tool: each fruit region in the image is framed with a geometric bounding box, and each resulting box is labeled according to the maturity of the fruit in the image, the label types including mature and unripe.

In a preferred embodiment, the fruits in the training images are annotated manually; only a rectangular box enclosing the target needs to be set, with no need to trace the target's contour. The annotation tool marks the fruit region in the image with a rectangular box to obtain the ground-truth box, and a corresponding label is set, the label types being mature and unripe, to distinguish ripe fruits from unripe ones.
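The box-plus-label annotation described above can be represented minimally as follows; the record layout and field names are illustrative, not any specific labeling tool's format:

```python
def make_annotation(image_name, boxes):
    """Build one annotation record: each fruit is a rectangular ground-truth
    box (x_min, y_min, x_max, y_max) plus a maturity label.

    boxes: list of (x_min, y_min, x_max, y_max, label),
           label in {"mature", "unripe"}.
    """
    for x1, y1, x2, y2, label in boxes:
        assert x2 > x1 and y2 > y1, "box must have positive area"
        assert label in ("mature", "unripe"), "unknown label type"
    return {"image": image_name, "boxes": boxes}
```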

For step S103, specifically, the image dataset is loaded, and the training image dataset and the annotation results are input into the fruit target detection model. After a forward pass of the model, initial model parameters are obtained and an initial loss is computed; the model parameters are then continuously updated and the loss recomputed through back-propagation iterations, and training ends when the model performance meets the requirement, yielding the trained fruit target detection model.

The fruit target detection model comprises a feature extraction network, a neck, and a detection part. The feature extraction network is composed of a convolutional neural network and an attention function; the attention function is a multi-head attention function formed by computing the scaled dot-product attention function several times in parallel and concatenating the results. The neck adopts two structures, a feature pyramid and a path aggregation network: the feature pyramid overlaps high-level feature maps with low-level feature maps through upsampling, and the path aggregation network transfers localization information from shallow layers to deep layers. The detection part outputs target detection output boxes from the feature maps generated by the feature extraction network and the neck; the output boxes include several prior boxes and prediction boxes, the prior boxes are distributed at every pixel of the feature map with different sizes, and the prediction boxes are computed from the prior boxes and the feature map.

In a preferred embodiment, the model is trained by back-propagation iterations to obtain model parameters suitable for orchard litchi target detection. The training steps include loading the data, building the model, updating the model parameters, computing the loss, evaluating the model, checking the condition for ending training, and saving the model parameters. The condition for ending training is that "the model performance meets the requirement or the number of training iterations exceeds a set value", where the requirement is that "the change in the loss function is smaller than a set value".
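The stopping rule above ("performance meets the requirement, or the iteration count exceeds a set value", the requirement being a sufficiently small change in loss) can be sketched generically; `step_fn` and all names here are illustrative placeholders, not the patent's training code:

```python
def train_until_converged(step_fn, max_steps=1000, tol=1e-4):
    """Run back-propagation iterations via step_fn (which performs one
    parameter update and returns the new loss) until the change in loss
    drops below tol or the step budget max_steps is exhausted."""
    prev_loss = float("inf")
    loss = prev_loss
    step = 0
    for step in range(max_steps):
        loss = step_fn()
        if abs(prev_loss - loss) < tol:  # loss-change requirement met
            break
        prev_loss = loss
    return loss, step
```

With a real detector, `step_fn` would wrap one epoch of forward pass, loss computation, and optimizer update; here any callable returning a scalar loss works.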

In particular, the loss is computed with an improved target detection loss function comprising a localization loss, a confidence loss, and a classification loss, reflecting the error between the predictions of the model under the current parameters and the ground truth:

Loss = Loss_cls + Loss_obj + Loss_box

The classification loss and the confidence loss both use the binary cross-entropy loss function, computed as:

L = -(1/n) · Σ_x [ y·log p(x) + (1 - y)·log(1 - p(x)) ]

where p denotes the predicted value, x a sample, y the target value, n the total number of samples, and L the resulting binary cross-entropy loss.

The localization loss uses the α-CIoU loss Loss_α-CIoU, computed as:

IoU = |A∩B| / |A∪B|

Loss_α-CIoU = 1 - IoU^α + ρ^2α(b, b_gt)/c^2α + (β·v)^α

where A and B denote the output box and the ground-truth box respectively, |A∩B| is the area of their intersection, |A∪B| is the area of their union, and C is the area of the smallest rectangle enclosing A and B. α is a tunable parameter; its value is chosen by comparing detection results under different settings, which improves the flexibility of tuning the target detection model. b and b_gt are the center points of the output box and the ground-truth box, ρ(·) is the Euclidean distance, and c is the diagonal length of the smallest box enclosing the two boxes. β is a positive trade-off parameter, and v measures the consistency of the aspect ratios. β and v are computed as:

v = (4/π²) · (arctan(w_gt/h_gt) - arctan(w/h))²

β = v / ((1 - IoU) + v)

where w_gt and h_gt are the width and height of the ground-truth box, and w and h are the width and height of the output box.
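A direct sketch of the localization loss above; the function name is illustrative, and the `alpha=3.0` default is the value commonly reported for α-IoU losses, not one specified by the patent:

```python
import math

def alpha_ciou_loss(box_a, box_b, alpha=3.0):
    """Alpha-CIoU loss between an output box A and a ground-truth box B,
    each given as (x1, y1, x2, y2), following the formula above:
    1 - IoU^a + (rho^2(b, b_gt)/c^2)^a + (beta*v)^a."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Squared distance between center points (rho^2 term).
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
           ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box (c^2 term).
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency v and positive trade-off weight beta.
    v = (4 / math.pi ** 2) * (math.atan((bx2 - bx1) / (by2 - by1))
                              - math.atan((ax2 - ax1) / (ay2 - ay1))) ** 2
    beta = v / ((1 - iou) + v + 1e-9)
    return 1 - iou ** alpha + (rho2 / c2) ** alpha + (beta * v) ** alpha
```

For two identical boxes the loss is 0, and it grows as the boxes drift apart in overlap, center distance, or aspect ratio.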

In a preferred embodiment, the fruit target detection model is an orchard target detection model based on an improved YOLOv5, comprising a feature extraction network, a neck, and a detection part; the network structure is shown in FIG. 2.

In particular, the feature extraction network is a "convolution-attention" structure composed of a convolutional neural network and an attention function. The convolutional neural network forms three structures of the feature extraction network: Conv (convolution), SPP (spatial pyramid pooling), and the CSP bottleneck layer. Conv comprises a convolution layer, a batch normalization layer, and an activation function; the activation function is a leaky rectified linear unit. The CSP bottleneck layer comprises convolution layers, batch normalization, an activation function, and a residual network structure. The attention function is the scaled dot-product attention function:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

where Q, K, and V are the input feature maps, K^T is the transpose of K, and √d_k is the scale factor.

The scaled dot-product attention function is computed several times in parallel and the results are concatenated, forming the multi-head attention function, computed as:

MultiHead(Q, K, V) = Concat(head_1, …, head_h) W^O

where head_i is the output of the i-th parallel scaled dot-product attention function and W^O is the output projection matrix.

The multi-head attention function and the convolutional neural network are combined to form the feature extraction network.
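As a minimal numerical sketch (assuming NumPy arrays, and omitting the per-head projection matrices that a full implementation would learn), the scaled dot-product attention and its multi-head combination described above can be written as:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def multi_head(Q, K, V, num_heads, W_o):
    """Split the feature dimension into heads, run attention in parallel,
    concatenate, and project with W_O (learned per-head input projections
    are omitted here as a simplifying assumption)."""
    heads = [scaled_dot_product_attention(q, k, v)
             for q, k, v in zip(np.split(Q, num_heads, axis=-1),
                                np.split(K, num_heads, axis=-1),
                                np.split(V, num_heads, axis=-1))]
    return np.concatenate(heads, axis=-1) @ W_o
```

Because each softmax row sums to one, attending over a constant V returns that constant, which gives a quick sanity check of the weighting.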

The neck uses two structures: a feature pyramid and a path aggregation network. The feature pyramid works top-down, aligning high-level feature maps with low-level feature maps through upsampling. The path aggregation network works bottom-up, passing localization information from shallow layers to deep layers.
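To illustrate the top-down pyramid step only (a sketch under simplifying assumptions: an actual YOLOv5 neck uses learned convolutions and channel concatenation, whereas nearest-neighbour upsampling and elementwise addition are assumed here):

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a 2D feature map,
    as used in the top-down pyramid path."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def top_down_merge(high, low):
    """One feature-pyramid step: bring the coarse high-level map to the
    resolution of the low-level map and merge the two."""
    return upsample2x(high) + low
```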

The detection part outputs target detection output boxes from the feature maps produced by the feature extraction network and the neck. Several boxes of different sizes, called prior boxes, are generated at each pixel of the feature map; their sizes are 10×13, 16×30, 33×23, 30×61, 62×45, 59×119, 116×90, 156×198, and 373×326. The predicted box is computed from the prior box and the feature map as follows:

bx = σ(tx) + cx

by = σ(ty) + cy

bw = pw · e^tw

bh = ph · e^th

where σ(tx) and σ(ty) are offsets from the top-left corner of the grid cell, σ is the sigmoid function, cx and cy are the coordinates of that corner, pw and ph are the width and height of the prior box, and bx, by, bw, and bh are the center x-coordinate, center y-coordinate, width, and height of the predicted box, respectively.
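A minimal sketch of the four decoding formulas above (the function name is an assumption of this sketch; coordinates are in grid units, with (cx, cy) the top-left corner of the responsible grid cell):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw network outputs (tx, ty, tw, th) into a predicted box:
    sigmoid offsets within the grid cell plus exponential scaling of the
    prior (anchor) size."""
    sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
    bx = sigmoid(tx) + cx   # center x, in grid units
    by = sigmoid(ty) + cy   # center y
    bw = pw * math.exp(tw)  # width scales the prior width
    bh = ph * math.exp(th)  # height scales the prior height
    return bx, by, bw, bh
```

With all raw outputs at zero, the box sits at the center of the cell with exactly the prior's size, which matches the role of the prior boxes listed above.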

Regarding step S104, specifically, the collecting of several images of fruits to be detected, identifying and locating the fruits in the images with the trained fruit target detection model, and obtaining the position information of the fruits to be detected comprises:

loading the trained fruit target detection model, initializing the shooting parameters of the capture device, and setting the resolution of the captured images;

collecting several images of the fruits to be detected with the capture device, wherein the images comprise color images and depth images and the capture device is specifically a depth sensor;

detecting the fruits in the color image with the fruit target detection model to obtain several target detection output boxes, and recording the horizontal and vertical coordinates of the center point of each output box in the color image; each output box also carries a label, either ripe or unripe, identifying the maturity of the fruit;

obtaining the depth values at the center points of the output boxes from the depth image;

combining the horizontal and vertical coordinates with the depth values to obtain the position of the fruit in the spatial coordinate system.

In a preferred embodiment, an Intel RealSense D435 depth sensor is used as the camera; the camera parameters are initialized and the resolution of the acquired color and depth images is set to 640×480. The sensor captures a color image and a depth image in front of the fruit, the target detection model detects the fruit in the color image to obtain a target detection output box, and the coordinates (x, z) of the center of the output box in the color image are recorded. The depth value at pixel (x, z) of the depth image is then taken as the distance y between the fruit and the shooting point, so that (x, y, z) gives the position of the fruit in the spatial coordinate system.
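The coordinate combination in this embodiment can be sketched as follows (an illustration only; it assumes the depth frame has already been aligned to the color frame, which in practice the sensor's SDK must do before indexing):

```python
def fruit_position(box_center, depth_image):
    """Combine a detection's center pixel (x, z) in the color image with
    the depth value at the same pixel to obtain (x, y, z).
    depth_image is a 2D array indexed [row][column], i.e. [z][x]."""
    x, z = box_center
    y = depth_image[z][x]  # distance from the camera to the fruit
    return (x, y, z)
```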

Referring to Figure 3, a schematic structural diagram of a fruit identification and positioning device provided by an embodiment of the present invention, the device comprises an image acquisition and annotation module 201, a model training module 202, and a fruit identification and positioning module 203.

The image acquisition and annotation module 201 is used to photograph fruits under different lighting conditions, classify the results to obtain a training image dataset, annotate the images in the dataset, and set labels for the annotations.

The model training module 202 is used to train the fruit target detection model with the training image dataset and the annotations.

The fruit identification and positioning module 203 is used to collect several images of fruits to be detected, and to identify and locate the fruits in the images with the trained fruit target detection model, obtaining the maturity and position information of the fruits to be detected.

The fruit identification and positioning device provided by this embodiment photographs under different weather conditions to guarantee diverse environmental conditions in the dataset, so that training the orchard litchi target detection model exposes it to the features of litchi fruit targets under many conditions, overcomes the difficulties caused by changing light, and ensures accurate identification of litchi fruit targets under different environmental conditions. By locating targets from the detection results combined with the depth image, rather than from point-cloud data, the device only needs a depth sensor for capture, so the cost is low and data acquisition is simple.

Further, the fruit identification and positioning module 203 collecting several images of fruits to be detected, identifying and locating the fruits in the images with the trained fruit target detection model, and obtaining the maturity and position information of the fruits to be detected specifically comprises:

loading the trained fruit target detection model, initializing the shooting parameters of the capture device, and setting the resolution of the captured images;

collecting several images of the fruits to be detected with the capture device, wherein the images comprise color images and depth images and the capture device is specifically a depth sensor;

detecting the fruits in the color image with the fruit target detection model to obtain several target detection output boxes, and recording the horizontal and vertical coordinates of the center point of each output box in the color image; each output box also carries a label, either ripe or unripe, identifying the maturity of the fruit;

obtaining the depth values at the center points of the output boxes from the depth image;

combining the horizontal and vertical coordinates with the depth values to obtain the position of the fruit in the spatial coordinate system.

An embodiment of the present invention further provides a fruit identification and positioning device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the fruit identification and positioning method embodiments above are implemented, for example step S101 shown in Figure 1. Alternatively, when executing the computer program, the processor implements the functions of the modules in the device embodiments above, for example the fruit identification and positioning module 203.

Exemplarily, the computer program may be divided into one or more modules, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules may be a series of computer program instruction segments capable of completing specific functions, the segments describing the execution of the computer program in the fruit identification and positioning device. For example, the computer program may be divided into the image acquisition and annotation module 201, the model training module 202, and the fruit identification and positioning module 203, whose specific functions are as follows:

The image acquisition and annotation module 201 is used to photograph fruits under different lighting conditions, classify the results to obtain a training image dataset, annotate the images in the dataset, and set labels for the annotations.

The model training module 202 is used to train the fruit target detection model with the training image dataset and the annotations.

The fruit identification and positioning module 203 is used to collect several images of fruits to be detected, and to identify and locate the fruits in the images with the trained fruit target detection model, obtaining the maturity and position information of the fruits to be detected.

The fruit identification and positioning device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server, and may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the schematic diagram is merely an example of the device and does not limit it; the device may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input/output devices, network access devices, buses, and so on.

The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the fruit identification and positioning device, connecting all parts of the device through various interfaces and lines.

The memory may be used to store the computer program or modules; the processor implements the various functions of the fruit identification and positioning device by running or executing the computer program or modules stored in the memory and calling the data stored in the memory. The memory may mainly comprise a program storage area and a data storage area, where the program storage area may store an operating system and the applications required by at least one function (such as sound playback or image playback), and the data storage area may store data created through use of the device (such as audio data or a phone book). In addition, the memory may include high-speed random-access memory, and may also include non-volatile memory such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other solid-state storage devices.

If the modules integrated in the fruit identification and positioning device are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. On this understanding, the present invention implements all or part of the processes of the method embodiments above, which may also be completed by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments above. The computer program comprises computer program code, which may be in source form, object form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random-access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content of the computer-readable medium may be added to or removed from as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media exclude electrical carrier signals and telecommunication signals.

It should be noted that the device embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected as actually needed to achieve the purpose of this embodiment. In addition, in the drawings of the device embodiments provided by the present invention, the connections between modules indicate communication connections between them, which may be implemented as one or more communication buses or signal lines. A person of ordinary skill in the art can understand and implement this without creative effort.

The above are preferred embodiments of the present invention. It should be pointed out that a person of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements are also regarded as falling within the scope of protection of the present invention.

Claims (10)

1. The fruit identifying and positioning method is characterized by comprising the following steps:
shooting fruits under different illumination conditions, and classifying shooting results to obtain a training image dataset;
labeling the images in the training image data set, and setting labels for labeling results;
training the fruit target detection model by utilizing the training image data set and the labeling result;
and acquiring images of a plurality of fruits to be detected, and identifying and positioning the fruits in the images through a fruit target detection model after training to obtain the maturity and position information of the fruits to be detected.
2. The method for identifying and positioning fruits according to claim 1, wherein the steps of photographing fruits under different illumination conditions, classifying photographing results, and obtaining a training image dataset comprise:
and respectively shooting a fixed number of fruit images under various illumination conditions, classifying all shot fruit images according to the illumination conditions, and combining the fruit images into a training image data set.
3. The method for identifying and positioning fruits according to claim 1, wherein the steps of labeling the images in the training image dataset and labeling the labeling result comprise:
labeling fruits in the image dataset with a labeling tool, framing the fruit areas in the images with geometric boxes, and setting labels according to the maturity of the fruits in the images, wherein the label types comprise ripe and unripe.
4. The method for identifying and locating fruits according to claim 1, wherein training the fruit target detection model by using the training image dataset and the labeling result comprises:
loading an image dataset, inputting the training image dataset and a labeling result into a fruit target detection model, obtaining initial model parameters and calculating initial loss after model operation, continuously updating the model parameters and calculating the loss by using a back propagation iteration mode, and ending training when the model performance meets the requirement to obtain a fruit target detection model after final training is completed;
wherein the fruit target detection model comprises: a feature extraction network, a neck, and a detection part; the feature extraction network is composed of a convolutional neural network and attention functions, wherein the attention function is a multi-head attention function formed by computing the scaled dot-product attention function several times in parallel and concatenating the results; the neck adopts two structures, a feature pyramid and a path aggregation network, the feature pyramid being used to align high-level feature maps with low-level feature maps through upsampling, and the path aggregation network being used to pass localization information from shallow layers to deep layers; the detection part outputs target detection output boxes from the feature maps generated by the feature extraction network and the neck, the output boxes comprising several prior boxes and a predicted box, the prior boxes of different sizes being distributed at each pixel of the feature map, and the predicted box being computed from the prior boxes and the feature map.
5. The method for identifying and locating fruits according to claim 4, wherein said training is finished when the model performance meets the requirement, specifically comprising:
the model performance meets the requirements specifically as follows: the loss is less than a preset error value;
the loss, obtained by adding the localization loss, the confidence loss, and the classification loss, is used to judge the error between the model's prediction under the current parameters and the ground truth; training ends when the loss is smaller than the preset error value.
6. The method for identifying and positioning fruits according to claim 1, wherein the collecting a plurality of images of fruits to be detected, identifying and positioning fruits in the images by a trained fruit target detection model, and obtaining position information of the fruits to be detected specifically comprises:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
7. A fruit identification and positioning device, comprising: the device comprises an image acquisition and labeling module, a model training module and a fruit identification and positioning module;
the image acquisition and labeling module is used for shooting fruits under different illumination conditions, classifying shooting results to obtain a training image data set, labeling images in the training image data set, and setting labels for labeling results;
the model training module is used for training a fruit target detection model by utilizing the training image data set and the labeling result;
the fruit recognition and positioning module is used for collecting images of a plurality of fruits to be detected, recognizing and positioning the fruits in the images through the trained fruit target detection model, and obtaining maturity and position information of the fruits to be detected.
8. The fruit identification and positioning device according to claim 7, wherein the fruit identification and positioning module is configured to collect images of a plurality of fruits to be detected, and identify and position the fruits in the images through a trained fruit target detection model, so as to obtain maturity and position information of the fruits to be detected, and the fruit identification and positioning device specifically comprises:
loading a trained fruit target detection model, initializing shooting parameters of shooting equipment, and setting resolution of a shot image;
collecting a plurality of images of fruits to be detected through the shooting equipment; the image comprises a color image and a depth image, and the shooting equipment is specifically a depth sensor;
detecting fruits in the color image by using a fruit target detection model to obtain a plurality of target detection output frames, and respectively recording the horizontal coordinates and the vertical coordinates of the central points of the plurality of output frames in the color image; the target detection output frame further comprises labels, wherein the labels are divided into ripe and unripe labels and are used for identifying the ripeness of fruits;
obtaining depth values of central points of the plurality of output frames in the depth image;
and combining the abscissa and the ordinate with the depth value to obtain the position information of the fruit in the space coordinate system.
9. A fruit identification and locating device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the fruit identification and locating method according to any one of claims 1 to 6 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run, controls a device in which the computer readable storage medium is located to perform the fruit identification and localization method according to any one of claims 1 to 6.
CN202211553660.3A 2022-12-06 2022-12-06 A method, device and medium for fruit identification and positioning Withdrawn CN115995017A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211553660.3A CN115995017A (en) 2022-12-06 2022-12-06 A method, device and medium for fruit identification and positioning


Publications (1)

Publication Number Publication Date
CN115995017A true CN115995017A (en) 2023-04-21

Family

ID=85989721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211553660.3A Withdrawn CN115995017A (en) 2022-12-06 2022-12-06 A method, device and medium for fruit identification and positioning

Country Status (1)

Country Link
CN (1) CN115995017A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116467596A (en) * 2023-04-11 2023-07-21 广州国家现代农业产业科技创新中心 Training method of rice grain length prediction model, morphology prediction method and apparatus
CN116467596B (en) * 2023-04-11 2024-03-26 广州国家现代农业产业科技创新中心 Training method of rice grain length prediction model, morphology prediction method and apparatus

Similar Documents

Publication Publication Date Title
CN111310861B (en) A license plate recognition and location method based on deep neural network
CN109344701B (en) Kinect-based dynamic gesture recognition method
Wang et al. An improved light-weight traffic sign recognition algorithm based on YOLOv4-tiny
WO2022170844A1 (en) Video annotation method, apparatus and device, and computer readable storage medium
CN109902806B (en) Determination method of target bounding box of noisy image based on convolutional neural network
WO2020177432A1 (en) Multi-tag object detection method and system based on target detection network, and apparatuses
CN109766856B (en) A dual-stream RGB-D Faster R-CNN method for recognizing the posture of lactating sows
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
CN111160269A (en) A method and device for detecting facial key points
CN108805064A (en) A kind of fish detection and localization and recognition methods and system based on deep learning
CN112784869B (en) A fine-grained image recognition method based on attention perception and adversarial learning
CN111553949A (en) Positioning and grabbing method for irregular workpiece based on single-frame RGB-D image deep learning
CN110992325A (en) Deep Learning-Based Object Inventory Method, Apparatus and Equipment
CN110287798B (en) Vector network pedestrian detection method based on feature modularization and context fusion
WO2024077781A1 (en) Convolutional neural network model-based image recognition method and apparatus, and terminal device
CN111860537B (en) Green citrus identification method, equipment and device based on deep learning
CN114821102A (en) Intensive citrus quantity detection method, equipment, storage medium and device
CN105825168A (en) Golden snub-nosed monkey face detection and tracking algorithm based on S-TLD
CN116740528A (en) A method and system for target detection in side scan sonar images based on shadow features
CN111709377B (en) Feature extraction method, target re-identification method and device and electronic equipment
CN115995017A (en) A method, device and medium for fruit identification and positioning
CN115953744A (en) A vehicle recognition and tracking method based on deep learning
CN115240188A (en) A real-time detection method of orange picking robot target based on deep learning
CN113505629A (en) Intelligent storage article recognition device based on light weight network
CN108764365A (en) A kind of device signboard detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230421