CN116071560A - A method of fruit recognition based on convolutional neural network - Google Patents

A method of fruit recognition based on convolutional neural network

Info

Publication number
CN116071560A
CN116071560A CN202211452782.3A CN202211452782A
Authority
CN
China
Prior art keywords
image
convolution
neural network
convolutional neural
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211452782.3A
Other languages
Chinese (zh)
Inventor
李鲁群
陶霜霜
胡天乐
张慎文
许崇海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Normal University
Original Assignee
Shanghai Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Normal University filed Critical Shanghai Normal University
Priority to CN202211452782.3A priority Critical patent/CN116071560A/en
Publication of CN116071560A publication Critical patent/CN116071560A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fruit identification method based on a convolutional neural network, which comprises the following steps: S1: fruit image data are acquired with a camera and lens; S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image; S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing result through convolutional neural network training; S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that a robotic arm sorts the fruit. The method improves the K-means fruit segmentation algorithm, uses morphological processing and connected-region feature analysis to select and locate the target region, and exploits the convolutional neural network's ability to implicitly extract features while learning features from the samples, thereby realizing fruit variety identification and classification and improving the recognition accuracy for fruit varieties.

Description

A fruit recognition method based on a convolutional neural network

Technical Field

The invention belongs to the technical field of image recognition, and in particular relates to a fruit recognition method based on a convolutional neural network.

Background Art

At present, the level of mechanization in China's fruit industry is low, and most production links, especially fruit picking, still rely on time-consuming and labor-intensive manual work. Fruit production covers picking, storage, transportation, processing and sales, so the research and development of agricultural fruit-production robots is an inevitable trend for improving production efficiency and saving labor costs. Whether for picking and sorting robots or for fruit quality and variety inspection systems, normal operation depends on the image-processing module correctly recognizing the fruit. For example, a picking robot can provide motion parameters to its robotic arm and complete the picking operation only after it has recognized the fruit on the tree and obtained its exact position.

In recent years, with the maturing of machine vision and computer technology, deep learning has developed rapidly. It performs well on a wide range of computer vision tasks: after training on large amounts of data, it can automatically learn the characteristic information of different objects, capture the differences between categories, and convert raw data into more abstract, higher-level representations, thereby completing tasks such as image classification and detection. It is therefore increasingly used in crop growth monitoring. However, the accuracy of current fruit variety recognition methods based on deep learning from images is relatively low and cannot fully meet the requirements of practical applications.

Summary of the Invention

The object of the present invention is to provide a fruit recognition method based on a convolutional neural network, so as to solve the problem of low accuracy of the fruit variety recognition methods mentioned in the background art above.

To achieve the above object, the present invention provides the following technical solution: a fruit recognition method based on a convolutional neural network, comprising the following steps:

S1: fruit image data acquisition, in which fruit image data are acquired with a camera and lens;

S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image;

S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing results through convolutional neural network training;

S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that the robotic arm sorts the fruit.

Preferably, in step S1 the camera is responsible for acquiring the fruit image information and is selected mainly according to the required detection accuracy and field-of-view size; the main lens parameter is the focal length, which is chosen mainly according to the working distance.

Preferably, the image preprocessing in S2 extracts the R-space color feature component of the image according to the characteristics of fruit images and applies Gaussian filtering. The Gaussian template is a discretization of the two-dimensional Gaussian function; for a matrix M of size (2u+1)×(2u+1), the element at position (a, b) is given by:

M(a, b) = (1/(2πα²)) · exp(−((a − u − 1)² + (b − u − 1)²) / (2α²))

where α is the standard deviation of the Gaussian function, and the Gaussian filter template size is 5×5.

Preferably, the image segmentation in S2 uses an improved K-means clustering segmentation algorithm to segment the image and form segmented regions. The specific clustering segmentation algorithm is as follows:

Step 1: first convert the grayscale values of the sample image into a one-dimensional sample data set C of size N, where N is the number of pixels of the sample image. Let Zj(I) denote the cluster center of cluster j at iteration I. Randomly select one sample object from data set C as the initial cluster center Z1(1);

Calculate for each sample Xm the shortest distance d(Xm) to the existing cluster centers, m = 1, 2, ..., N, where Xm is the m-th sample in the data set; then calculate the probability P(Xm) that each sample object is selected as the next cluster center:

P(Xm) = d(Xm)² / Σ_{Xi∈C} d(Xi)²

Select the next cluster center by the roulette-wheel method and repeat these steps until k objects have been chosen as the initial cluster centers Zj(1), j = 1, 2, 3, ..., k; specifically, the number of cluster centers is k = 4;

Step 2: iterative calculation: according to the similarity criterion, calculate the distance D(Xm; Zj(I)) between each sample Xm in data set C and each cluster center, m = 1, 2, 3, ..., N, j = 1, 2, 3, ..., k, as shown in formula (5), and assign each data object to the cluster Sj whose center is nearest, so that Xm ∈ Sj:

D(Xm; Zj(I)) = |Xm − Zj(I)|     (5)

Step 3: cluster center update: according to the cluster-center update formula, compute the mean of each cluster set and take it as the new center of that set. Let the elements of cluster j be the samples X ∈ Sj, with nj elements in cluster j; the cluster-center update formula is given by formula (6):

Zj(I+1) = (1/nj) · Σ_{X∈Sj} X     (6)

Step 4: termination condition: update the cluster centers cyclically until none of them changes any more or the sum of squared errors reaches a local minimum. The clustering criterion function J is computed as:

J(I) = Σ_{j=1..k} Σ_{X∈Sj} |X − Zj(I)|²

Let the precision tolerance be ξ. If |J(I+1) − J(I)| < ξ, the algorithm ends and the iteration is terminated; otherwise the iterative assignment and cluster-center update are repeated until the termination condition is met.

Preferably, the morphological processing in S2 addresses the fact that the regions obtained by image segmentation still contain a great deal of noise. To remove the noise regions, the segmented regions are processed with a morphological algorithm. Let A denote the image matrix and B the structuring element; the morphological operation is:

A ∘ B = (A ⊖ B) ⊕ B

An opening operation is applied to the clustering segmentation regions with a rectangular structuring element of size 10×10.

Preferably, the connected-region feature analysis in S2 is performed after the morphological processing, once the small noise regions have been removed; considering the actual size of the fruit, non-target regions need to be removed further. Pixels of identical gray level that are 8-connected are judged to belong to the same region, and noise regions are filtered out by the area feature of the connected regions. The regions X whose pixel area lies in (SMin, SMax) are extracted according to the following formula:

X = { Xi | SMin < S(Xi) < SMax }

where SMin and SMax are the lower and upper limits of the region area in pixels, and S(Xi) denotes the pixel area of connected region Xi.

Preferably, the convolutional neural network model constructed in S3 is as follows:

1. Convolutional layer

In a convolutional neural network, the first convolutional layer directly receives the pixel-level input of the image. Each convolution operation processes only a small patch of the image and, after the convolution, passes the result on to the subsequent layers; each convolutional layer extracts the most effective features of the data. The convolutional layer is the core layer of a convolutional neural network, and convolution is the core of the convolutional layer. A simple convolution is a simple two-dimensional spatial convolution operation: each element of the input matrix is multiplied by the corresponding element of the kernel matrix and the products are summed; this is the convolution operation;

The matrix in the middle is called the convolution kernel, also known as the weight filter or simply the filter; it is the core of the convolution process and can, for example, detect horizontal edges and vertical edges of the image or increase the weight of the central region of the picture. When convolution is generalized to higher dimensions, the kernel is treated as a window sliding over the input matrix, and the corresponding convolution operation is carried out. The CNN mines the spatially local correlation of natural images by enforcing a local connection pattern between nodes of adjacent layers: the input of a node in layer m is a subset of the nodes in layer m−1, and these nodes have spatially adjacent receptive fields. The number of weights of each neuron in the convolutional neural network equals the size of the convolution kernel, i.e. each neuron is connected to only part of the pixels of the image. Two convolutional layers (Conv1, Conv2) with 3×3 kernels are used; the learned kernels convolve the original image, and the convolution result plus a bias is passed through the activation function to form the feature map output by this layer, computed as:

x_j^l = f( Σ_{i∈Mj} x_i^(l−1) * k_ij^l + bias_j^l )

where l is the layer index, k is the convolution kernel index, Mj denotes the j-th of the M feature maps, f is the activation function and bias is the threshold. The first convolutional layer (Conv1) convolves the picture with 64 randomly initialized 3×3×3 kernels with a stride of 1, and the size of the feature map after convolution is kept the same as the original size; Conv2 uses 16 kernels of size 3×3×64 with a stride of 1;

2. Pooling layer

The purpose of the pooling layer is to further reduce the number of network training parameters and the degree of overfitting of the model. There are usually three kinds of pooling: ① max pooling, which takes the maximum value in the pooling window as the sampled value; ② mean pooling, which averages all values in the pooling window and takes the average as the sampled value; ③ stochastic pooling, which selects a value probabilistically. The pooling layer reduces the size of the image, speeds up the whole algorithm and reduces the influence of noise. The pooling layer operates on the feature maps produced by the convolution: in a convolution-like manner it computes the mean or maximum over a local region of the image to represent the sampled region. This is done so that, for large images, features at different positions are aggregated statistically; this mean or maximum is the aggregation statistic, i.e. pooling. Both down-sampling layers use max pooling with a 3×3 window and a stride of 2; the operation can be expressed as:

x_j^l = f( β_j^l · g(x_j^(l−1)) + b_j^l )

where g(·) is the down-sampling function and f is the activation function;

3. Activation function

The activation function guarantees the nonlinearity of the result: after the convolution operation, the output value plus an offset is fed into the activation function and serves as the input of the next layer. Commonly used activation functions are Sigmoid, Tanh and ReLU. ReLU is chosen as the activation function: for x < 0 it is hard-saturated, and for x > 0 its derivative is 1, so the gradient does not decay, which alleviates the vanishing-gradient problem and allows faster convergence. Its expression is:

f(x) = max(0, x)

4. Fully connected layer

The fully connected layers play the role of a "classifier": they map the "distributed feature representation" learned by the convolutional layers, activation functions and pooling layers to the sample label space. The CNN model uses two fully connected layers with 128 neurons each, and the final output uses a softmax classifier; the softmax function is a multinomial regression function and is suitable for multi-class image recognition. Dropout is used in the fully connected layers to randomly ignore a fraction of the neurons, which avoids overfitting and makes the model more robust.

Compared with the prior art, the present invention provides a fruit recognition method based on a convolutional neural network, which has the following beneficial effects:

The present invention improves the K-means fruit segmentation algorithm and uses morphological processing and connected-region feature analysis to select and locate the target region, thereby detecting and recognizing the fruit and determining the fruit region. The image is scaled about this region as the center point and adjusted to a suitable size. At the same time, the convolutional neural network's characteristic ability to implicitly extract features while simultaneously learning features from the samples is exploited, and the preprocessing work of the program is reduced; fruit features are trained and extracted, verification tests are carried out on fruit images from different test sets, fruit variety recognition and classification are realized, and the recognition accuracy for fruit varieties is improved.

Brief Description of the Drawings

The accompanying drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention they serve to explain the present invention and do not constitute a limitation of the present invention. In the drawings:

FIG. 1 is a flow chart of the fruit recognition method based on a convolutional neural network proposed by the present invention;

FIG. 2 is a flow chart of the detection algorithm of the present invention;

FIG. 3 is a diagram of the convolutional neural network model of the present invention;

FIG. 4 is a schematic diagram of the R-space color component;

FIG. 5 is a schematic diagram of the improved K-means segmentation result;

FIG. 6 is a schematic diagram of the morphological processing result;

FIG. 7 is a schematic diagram of the connected-region feature analysis.

Detailed Description

The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative work fall within the scope of protection of the present invention.

Referring to FIGS. 1-3, the present invention provides a technical solution: a fruit recognition method based on a convolutional neural network, comprising the following steps:

S1: fruit image data acquisition, in which fruit image data are acquired with a camera and lens;

S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image;

Image acquisition is the premise of image processing. After the fruit images have been acquired, in order to improve the image-processing speed and the adaptability of the algorithm, the R-space color feature component of the image is extracted according to the characteristics of fruit images and Gaussian filtering is applied. The Gaussian template is a discretization of the two-dimensional Gaussian function; for a matrix M of size (2u+1)×(2u+1), the element at position (a, b) is given by:

M(a, b) = (1/(2πα²)) · exp(−((a − u − 1)² + (b − u − 1)²) / (2α²))

where α is the standard deviation of the Gaussian function and the Gaussian filter template size is 5×5; the result is shown in FIG. 4;
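As an illustration of this preprocessing step, the following is a minimal Python sketch, assuming OpenCV and NumPy are available; the 5×5 template size follows the description above, while the file name and the choice α = 1.0 are assumptions made only for the example:

```python
import cv2
import numpy as np

def preprocess_fruit_image(path, alpha=1.0):
    """R-channel extraction followed by 5x5 Gaussian filtering (sketch)."""
    bgr = cv2.imread(path)                       # OpenCV loads images as BGR
    r_channel = bgr[:, :, 2]                     # R-space color component

    # Build the (2u+1)x(2u+1) Gaussian template with u = 2, i.e. a 5x5 template
    u = 2
    a, b = np.mgrid[1:2*u+2, 1:2*u+2]            # 1-based indices as in the formula
    m = np.exp(-((a - u - 1)**2 + (b - u - 1)**2) / (2 * alpha**2)) / (2 * np.pi * alpha**2)
    m /= m.sum()                                 # normalize so the weights sum to 1

    return cv2.filter2D(r_channel, -1, m)        # convolve the R channel with the template

smoothed = preprocess_fruit_image("fruit.jpg")   # hypothetical input file
```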

In view of the characteristics of the image, the improved K-means clustering segmentation algorithm is used to segment the image and form segmented regions. The specific clustering segmentation algorithm is as follows:

Step 1: first convert the grayscale values of the sample image into a one-dimensional sample data set C of size N, where N is the number of pixels of the sample image. Let Zj(I) denote the cluster center of cluster j at iteration I. Randomly select one sample object from data set C as the initial cluster center Z1(1);

Calculate for each sample Xm the shortest distance d(Xm) to the existing cluster centers, m = 1, 2, ..., N, where Xm is the m-th sample in the data set; then calculate the probability P(Xm) that each sample object is selected as the next cluster center:

P(Xm) = d(Xm)² / Σ_{Xi∈C} d(Xi)²

Select the next cluster center by the roulette-wheel method and repeat these steps until k objects have been chosen as the initial cluster centers Zj(1), j = 1, 2, 3, ..., k; specifically, the number of cluster centers is k = 4;

Step 2: iterative calculation: according to the similarity criterion, calculate the distance D(Xm; Zj(I)) between each sample Xm in data set C and each cluster center, m = 1, 2, 3, ..., N, j = 1, 2, 3, ..., k, as shown in formula (5), and assign each data object to the cluster Sj whose center is nearest, so that Xm ∈ Sj:

D(Xm; Zj(I)) = |Xm − Zj(I)|     (5)

Step 3: cluster center update: according to the cluster-center update formula, compute the mean of each cluster set and take it as the new center of that set. Let the elements of cluster j be the samples X ∈ Sj, with nj elements in cluster j; the cluster-center update formula is given by formula (6):

Zj(I+1) = (1/nj) · Σ_{X∈Sj} X     (6)

Step 4: termination condition: update the cluster centers cyclically until none of them changes any more or the sum of squared errors reaches a local minimum. The clustering criterion function J is computed as:

J(I) = Σ_{j=1..k} Σ_{X∈Sj} |X − Zj(I)|²

Let the precision tolerance be ξ. If |J(I+1) − J(I)| < ξ, the algorithm ends and the iteration is terminated; otherwise the iterative assignment and cluster-center update are repeated until the termination condition is met. The image segmentation result is shown in FIG. 5.
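To make the improved seeding and iteration concrete, here is a minimal Python sketch under the settings stated above (k = 4, roulette-wheel seeding on squared distances, mean update, termination on the change of J); the tolerance value, iteration cap and random seed are assumptions for the example:

```python
import numpy as np

def improved_kmeans(gray_image, k=4, xi=1e-3, max_iter=100, rng=np.random.default_rng(0)):
    """Improved K-means segmentation of a grayscale image (sketch)."""
    c = gray_image.reshape(-1).astype(np.float64)           # 1-D sample set C of size N

    # Step 1: roulette-wheel (k-means++-style) selection of the k initial centers
    centers = [c[rng.integers(len(c))]]
    while len(centers) < k:
        d2 = np.min((c[:, None] - np.array(centers)[None, :])**2, axis=1)
        p = d2 / d2.sum()                                    # P(Xm) = d(Xm)^2 / sum of d^2
        centers.append(c[rng.choice(len(c), p=p)])
    centers = np.array(centers)

    prev_j = np.inf
    for _ in range(max_iter):
        # Step 2: assign each sample to the nearest center, D = |Xm - Zj|
        labels = np.argmin(np.abs(c[:, None] - centers[None, :]), axis=1)
        # Step 3: update each center with the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = c[labels == j].mean()
        # Step 4: stop when the criterion function J changes by less than xi
        j_val = np.sum((c - centers[labels])**2)
        if abs(prev_j - j_val) < xi:
            break
        prev_j = j_val

    return labels.reshape(gray_image.shape), centers
```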

In the morphological processing, the regions obtained by image segmentation still contain a great deal of noise. To remove the noise regions, the segmented regions are processed with a morphological algorithm. Let A denote the image matrix and B the structuring element; the morphological operation is:

A ∘ B = (A ⊖ B) ⊕ B

An opening operation is applied to the clustering segmentation regions with a rectangular structuring element of size 10×10; the image-processing result is shown in FIG. 6;
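A minimal sketch of this opening step, assuming OpenCV; the binary mask is assumed to come from the segmentation sketch above, with the cluster of interest taken as label 0 purely for illustration:

```python
import cv2
import numpy as np

# Binary mask from the segmentation step (assumed: the fruit cluster has label 0)
mask = (labels == 0).astype(np.uint8) * 255          # `labels` from improved_kmeans above

# 10x10 rectangular structuring element, as specified in the description
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10, 10))

# Opening = erosion followed by dilation; removes small noise regions
opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```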

In the connected-region feature analysis, the small noise regions have already been removed by the morphological processing; considering the actual size of the fruit, non-target regions need to be removed further. Pixels of identical gray level that are 8-connected are judged to belong to the same region, and noise regions are filtered out by the area feature of the connected regions. The regions X whose pixel area lies in (SMin, SMax) are extracted according to the following formula:

X = { Xi | SMin < S(Xi) < SMax }

where SMin and SMax are the lower and upper limits of the region area in pixels, and S(Xi) denotes the pixel area of connected region Xi; the image-processing result is shown in FIG. 7;
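The area filtering with 8-connectivity could be sketched as follows, again assuming OpenCV; the numeric limits SMin and SMax are placeholders to be chosen for the actual fruit size:

```python
import cv2
import numpy as np

S_MIN, S_MAX = 2000, 200000        # assumed pixel-area limits for a fruit region

# 8-connected component labelling with per-component statistics
num, lab, stats, centroids = cv2.connectedComponentsWithStats(opened, connectivity=8)

target = np.zeros_like(opened)
for i in range(1, num):                          # label 0 is the background
    area = stats[i, cv2.CC_STAT_AREA]
    if S_MIN < area < S_MAX:                     # keep regions with SMin < S(Xi) < SMax
        target[lab == i] = 255
```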

S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing results through convolutional neural network training;

S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that the robotic arm sorts the fruit.
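As an illustration of how the classification result might be handed to the chip over the serial port, here is a minimal sketch using pyserial; the port name, baud rate and one-byte message format are all assumptions, since the description does not specify the protocol:

```python
import serial

def send_result(class_id, port="/dev/ttyUSB0", baudrate=115200):
    """Send the predicted fruit class to the control chip over a serial port (sketch)."""
    with serial.Serial(port, baudrate, timeout=1) as ser:
        ser.write(bytes([class_id]))      # one byte per classification result (assumed format)

send_result(2)    # e.g. class 2 tells the chip which bin the robotic arm should sort into
```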

Preferably, in step S1 the camera is responsible for acquiring the fruit image information and is selected mainly according to the required detection accuracy and field-of-view size; the main lens parameter is the focal length, which is chosen mainly according to the working distance.

As shown in FIG. 3, the convolutional neural network model constructed in S3 is as follows:

1. Convolutional layer

In a convolutional neural network, the first convolutional layer directly receives the pixel-level input of the image. Each convolution operation processes only a small patch of the image and, after the convolution, passes the result on to the subsequent layers; each convolutional layer extracts the most effective features of the data. The convolutional layer is the core layer of a convolutional neural network, and convolution is the core of the convolutional layer. A simple convolution is a simple two-dimensional spatial convolution operation: each element of the input matrix is multiplied by the corresponding element of the kernel matrix and the products are summed; this is the convolution operation;

The matrix in the middle is called the convolution kernel, also known as the weight filter or simply the filter; it is the core of the convolution process and can, for example, detect horizontal edges and vertical edges of the image or increase the weight of the central region of the picture. When convolution is generalized to higher dimensions, the kernel is treated as a window sliding over the input matrix, and the corresponding convolution operation is carried out. The CNN mines the spatially local correlation of natural images by enforcing a local connection pattern between nodes of adjacent layers: the input of a node in layer m is a subset of the nodes in layer m−1, and these nodes have spatially adjacent receptive fields. The number of weights of each neuron in the convolutional neural network equals the size of the convolution kernel, i.e. each neuron is connected to only part of the pixels of the image. Two convolutional layers (Conv1, Conv2) with 3×3 kernels are used; the learned kernels convolve the original image, and the convolution result plus a bias is passed through the activation function to form the feature map output by this layer, computed as:

x_j^l = f( Σ_{i∈Mj} x_i^(l−1) * k_ij^l + bias_j^l )

where l is the layer index, k is the convolution kernel index, Mj denotes the j-th of the M feature maps, f is the activation function and bias is the threshold. The first convolutional layer (Conv1) convolves the picture with 64 randomly initialized 3×3×3 kernels with a stride of 1, and the size of the feature map after convolution is kept the same as the original size; Conv2 uses 16 kernels of size 3×3×64 with a stride of 1;
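To illustrate the convolution-plus-activation computation in the formula above, here is a small NumPy sketch of a single-channel 3×3 convolution with stride 1 and zero padding (so the output keeps the original size), followed by a bias and ReLU; the kernel and bias values are arbitrary examples, not the learned weights:

```python
import numpy as np

def conv2d_single(image, kernel, bias=0.0):
    """Stride-1, zero-padded 2-D convolution plus ReLU (sketch of one feature map)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            patch = padded[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel) + bias   # multiply element-wise and sum
    return np.maximum(out, 0.0)                          # ReLU activation

img = np.random.rand(8, 8)                # toy single-channel input
k = np.random.randn(3, 3)                 # one randomly initialized 3x3 kernel
feature_map = conv2d_single(img, k)       # same spatial size as the input
```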

2. Pooling layer

The purpose of the pooling layer is to further reduce the number of network training parameters and the degree of overfitting of the model. There are usually three kinds of pooling: ① max pooling, which takes the maximum value in the pooling window as the sampled value; ② mean pooling, which averages all values in the pooling window and takes the average as the sampled value; ③ stochastic pooling, which selects a value probabilistically. The pooling layer reduces the size of the image, speeds up the whole algorithm and reduces the influence of noise. The pooling layer operates on the feature maps produced by the convolution: in a convolution-like manner it computes the mean or maximum over a local region of the image to represent the sampled region. This is done so that, for large images, features at different positions are aggregated statistically; this mean or maximum is the aggregation statistic, i.e. pooling. Both down-sampling layers use max pooling with a 3×3 window and a stride of 2; the operation can be expressed as:

x_j^l = f( β_j^l · g(x_j^(l−1)) + b_j^l )

where g(·) is the down-sampling function and f is the activation function;
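To make the pooling operation concrete, the following is a small NumPy sketch of max pooling with a 3×3 window and stride 2 on a toy feature map; mean pooling and stochastic pooling would replace the maximum with an average or a probabilistic choice:

```python
import numpy as np

def max_pool(feature_map, k=3, stride=2):
    """Max pooling: slide a k x k window with the given stride and keep the maximum."""
    h, w = feature_map.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = np.zeros((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = feature_map[i*stride:i*stride+k, j*stride:j*stride+k].max()
    return out

fmap = np.arange(49).reshape(7, 7)      # toy 7x7 feature map
print(max_pool(fmap))                   # 3x3 output, each value is a window maximum
```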

3. Activation function

The activation function guarantees the nonlinearity of the result: after the convolution operation, the output value plus an offset is fed into the activation function and serves as the input of the next layer. Commonly used activation functions are Sigmoid, Tanh and ReLU. ReLU is chosen as the activation function: for x < 0 it is hard-saturated, and for x > 0 its derivative is 1, so the gradient does not decay, which alleviates the vanishing-gradient problem and allows faster convergence. Its expression is:

f(x) = max(0, x)

4. Fully connected layer

The fully connected layers play the role of a "classifier": they map the "distributed feature representation" learned by the convolutional layers, activation functions and pooling layers to the sample label space. The CNN model uses two fully connected layers with 128 neurons each, and the final output uses a softmax classifier; the softmax function is a multinomial regression function and is suitable for multi-class image recognition. Dropout is used in the fully connected layers to randomly ignore a fraction of the neurons, which avoids overfitting and makes the model more robust.
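Putting the pieces together, the following is a hedged sketch of a complete model consistent with the description: two 3×3 convolutional layers (64 and 16 kernels, stride 1, size-preserving), two 3×3 stride-2 max-pooling layers, ReLU activations, two fully connected layers of 128 neurons with Dropout, and a softmax output. The Keras framework, the 100×100 input size, the placement of each pooling layer after a convolution, the dropout rate and the number of fruit classes are assumptions not fixed by the description:

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 5          # assumed number of fruit varieties

model = keras.Sequential([
    keras.Input(shape=(100, 100, 3)),                                     # assumed input size
    layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),   # Conv1: 64 x 3x3x3
    layers.MaxPooling2D(pool_size=3, strides=2),                          # down-sampling 1
    layers.Conv2D(16, 3, strides=1, padding="same", activation="relu"),   # Conv2: 16 x 3x3x64
    layers.MaxPooling2D(pool_size=3, strides=2),                          # down-sampling 2
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                                 # fully connected 1
    layers.Dropout(0.5),                                                  # assumed dropout rate
    layers.Dense(128, activation="relu"),                                 # fully connected 2
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),                      # softmax classifier
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

Training would then proceed on the fruit regions produced by the preprocessing and segmentation of steps S1 and S2.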

Although embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the appended claims and their equivalents.

Claims (7)

1. A fruit recognition method based on a convolutional neural network, characterized by comprising the following steps: S1: fruit image data acquisition, in which fruit image data are acquired with a camera and lens; S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image; S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing results through convolutional neural network training; S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that the robotic arm sorts the fruit.

2. The fruit recognition method based on a convolutional neural network according to claim 1, characterized in that: in step S1 the camera is responsible for acquiring the fruit image information and is selected mainly according to the required detection accuracy and field-of-view size; the main lens parameter is the focal length, which is chosen mainly according to the working distance.

3. The fruit recognition method based on a convolutional neural network according to claim 1, characterized in that: the image preprocessing in S2 extracts the R-space color feature component of the image according to the characteristics of fruit images and applies Gaussian filtering; the Gaussian template is a discretization of the two-dimensional Gaussian function, and for a matrix M of size (2u+1)×(2u+1) the element at position (a, b) is given by:
M(a, b) = (1/(2πα²)) · exp(−((a − u − 1)² + (b − u − 1)²) / (2α²))

where α is the standard deviation of the Gaussian function, and the Gaussian filter template size is 5×5.
4. The fruit recognition method based on a convolutional neural network according to claim 3, characterized in that: the image segmentation in S2 uses an improved K-means clustering segmentation algorithm to segment the image and form segmented regions, the specific clustering segmentation algorithm being as follows: Step 1: first convert the grayscale values of the sample image into a one-dimensional sample data set C of size N, where N is the number of pixels of the sample image; let Zj(I) denote the cluster center of cluster j at iteration I, and randomly select one sample object from data set C as the initial cluster center Z1(1); calculate for each sample Xm the shortest distance d(Xm) to the existing cluster centers, m = 1, 2, ..., N, where Xm is the m-th sample in the data set; then calculate the probability P(Xm) that each sample object is selected as the next cluster center:
P(Xm) = d(Xm)² / Σ_{Xi∈C} d(Xi)²
Select the next cluster center by the roulette-wheel method and repeat these steps until k objects have been chosen as the initial cluster centers Zj(1), j = 1, 2, 3, ..., k; specifically, the number of cluster centers is k = 4; Step 2: iterative calculation: according to the similarity criterion, calculate the distance D(Xm; Zj(I)) between each sample Xm in data set C and each cluster center, m = 1, 2, 3, ..., N, j = 1, 2, 3, ..., k, as shown in formula (5), and assign each data object to the cluster Sj whose center is nearest, so that Xm ∈ Sj:
D(Xm; Zj(I)) = |Xm − Zj(I)|     (5)
Step 3: cluster center update: according to the cluster-center update formula, compute the mean of each cluster set and take it as the new center of that set; let the elements of cluster j be the samples X ∈ Sj, with nj elements in cluster j; the cluster-center update formula is given by formula (6):
Zj(I+1) = (1/nj) · Σ_{X∈Sj} X     (6)
Step 4: termination condition: update the cluster centers cyclically until none of them changes any more or the sum of squared errors reaches a local minimum; the clustering criterion function J is computed as:
J(I) = Σ_{j=1..k} Σ_{X∈Sj} |X − Zj(I)|²
Let the precision tolerance be ξ; if |J(I+1) − J(I)| < ξ, the algorithm ends and the iteration is terminated; otherwise the iterative assignment and cluster-center update are repeated until the termination condition is met.
5. The fruit recognition method based on a convolutional neural network according to claim 4, characterized in that: in the morphological processing in S2, the regions obtained by image segmentation still contain a great deal of noise; to remove the noise regions, the segmented regions are processed with a morphological algorithm; let A denote the image matrix and B the structuring element; the morphological operation is:
A ∘ B = (A ⊖ B) ⊕ B
An opening operation is applied to the clustering segmentation regions with a rectangular structuring element of size 10×10.
6. The fruit recognition method based on a convolutional neural network according to claim 5, characterized in that: the connected-region feature analysis in S2 is performed after the morphological processing, once the small noise regions have been removed; considering the actual size of the fruit, non-target regions need to be removed further; pixels of identical gray level that are 8-connected are judged to belong to the same region, and noise regions are filtered out by the area feature of the connected regions; the regions X whose pixel area lies in (SMin, SMax) are extracted according to the following formula:
X = { Xi | SMin < S(Xi) < SMax }
where SMin and SMax are the lower and upper limits of the region area in pixels.
7. The fruit recognition method based on a convolutional neural network according to claim 1, characterized in that: the convolutional neural network model constructed in S3 is as follows:

1. Convolutional layer

In a convolutional neural network, the first convolutional layer directly receives the pixel-level input of the image; each convolution operation processes only a small patch of the image and, after the convolution, passes the result on to the subsequent layers; each convolutional layer extracts the most effective features of the data; the convolutional layer is the core layer of a convolutional neural network, and convolution is the core of the convolutional layer; a simple convolution is a simple two-dimensional spatial convolution operation in which each element of the input matrix is multiplied by the corresponding element of the kernel matrix and the products are summed; this is the convolution operation;

the matrix in the middle is called the convolution kernel, also known as the weight filter or simply the filter; it is the core of the convolution process and can, for example, detect horizontal edges and vertical edges of the image or increase the weight of the central region of the picture; when convolution is generalized to higher dimensions, the kernel is treated as a window sliding over the input matrix, and the corresponding convolution operation is carried out; the CNN mines the spatially local correlation of natural images by enforcing a local connection pattern between nodes of adjacent layers: the input of a node in layer m is a subset of the nodes in layer m−1, and these nodes have spatially adjacent receptive fields; the number of weights of each neuron in the convolutional neural network equals the size of the convolution kernel, i.e. each neuron is connected to only part of the pixels of the image; two convolutional layers (Conv1, Conv2) with 3×3 kernels are used; the learned kernels convolve the original image, and the convolution result plus a bias is passed through the activation function to form the feature map output by this layer, computed as:
x_j^l = f( Σ_{i∈Mj} x_i^(l−1) * k_ij^l + bias_j^l )
where l is the layer index, k is the convolution kernel index, Mj denotes the j-th of the M feature maps, f is the activation function and bias is the threshold; the first convolutional layer (Conv1) convolves the picture with 64 randomly initialized 3×3×3 kernels with a stride of 1, and the size of the feature map after convolution is kept the same as the original size; Conv2 uses 16 kernels of size 3×3×64 with a stride of 1;

2. Pooling layer

The purpose of the pooling layer is to further reduce the number of network training parameters and the degree of overfitting of the model; there are usually three kinds of pooling: ① max pooling, which takes the maximum value in the pooling window as the sampled value; ② mean pooling, which averages all values in the pooling window and takes the average as the sampled value; ③ stochastic pooling, which selects a value probabilistically; the pooling layer reduces the size of the image, speeds up the whole algorithm and reduces the influence of noise; the pooling layer operates on the feature maps produced by the convolution: in a convolution-like manner it computes the mean or maximum over a local region of the image to represent the sampled region; this is done so that, for large images, features at different positions are aggregated statistically, and this mean or maximum is the aggregation statistic, i.e. pooling; both down-sampling layers use max pooling with a 3×3 window and a stride of 2; the operation can be expressed as:
x_j^l = f( β_j^l · g(x_j^(l−1)) + b_j^l )
where g(·) is the down-sampling function and f is the activation function;

3. Activation function

The activation function guarantees the nonlinearity of the result: after the convolution operation, the output value plus an offset is fed into the activation function and serves as the input of the next layer; commonly used activation functions are Sigmoid, Tanh and ReLU; ReLU is chosen as the activation function: for x < 0 it is hard-saturated, and for x > 0 its derivative is 1, so the gradient does not decay, which alleviates the vanishing-gradient problem and allows faster convergence; its expression is:
f(x) = max(0, x)
4. Fully connected layer

The fully connected layers play the role of a "classifier": they map the "distributed feature representation" learned by the convolutional layers, activation functions and pooling layers to the sample label space; the CNN model uses two fully connected layers with 128 neurons each, and the final output uses a softmax classifier; the softmax function is a multinomial regression function and is suitable for multi-class image recognition; Dropout is used in the fully connected layers to randomly ignore a fraction of the neurons, which avoids overfitting and makes the model more robust.
CN202211452782.3A 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network Pending CN116071560A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211452782.3A CN116071560A (en) 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211452782.3A CN116071560A (en) 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN116071560A true CN116071560A (en) 2023-05-05

Family

ID=86170659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211452782.3A Pending CN116071560A (en) 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN116071560A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315596A (en) * 2023-09-07 2023-12-29 重庆云网科技股份有限公司 Deep learning-based motor vehicle black smoke detection and identification method
CN117636055A (en) * 2023-12-12 2024-03-01 北京易恒盈通科技有限公司 Cloud storage method and system for digital information
CN117422717A (en) * 2023-12-19 2024-01-19 长沙韶光芯材科技有限公司 Intelligent mask stain positioning method and system
CN117422717B (en) * 2023-12-19 2024-02-23 长沙韶光芯材科技有限公司 Intelligent mask stain positioning method and system
CN117746272A (en) * 2024-02-21 2024-03-22 西安迈远科技有限公司 A UAV-based water resource data collection and processing method and system
CN118334633A (en) * 2024-06-13 2024-07-12 内蒙古向新而行农业科技有限公司 Insect condition forecasting method and system
CN118334633B (en) * 2024-06-13 2024-09-03 内蒙古向新而行农业科技有限公司 Insect condition forecasting method and system
CN118736566A (en) * 2024-07-09 2024-10-01 重庆市奉节职业教育中心(重庆市奉节师范学校) Navel orange quality identification system and method based on artificial intelligence
CN118972625A (en) * 2024-10-18 2024-11-15 成都艺馨达科技有限公司 A method, device and computer equipment for video image processing based on compressed neural network

Similar Documents

Publication Publication Date Title
CN116071560A (en) A method of fruit recognition based on convolutional neural network
CN108982508B (en) A defect detection method for plastic package IC chips based on feature template matching and deep learning
CN111898736B (en) An Efficient Pedestrian Re-identification Method Based on Attribute Awareness
CN104850845B (en) A kind of traffic sign recognition method based on asymmetric convolutional neural networks
CN105488534B (en) Traffic scene deep analysis method, apparatus and system
CN111310861A (en) A license plate recognition and localization method based on deep neural network
CN111310862A (en) Deep neural network license plate positioning method based on image enhancement in complex environment
CN111428550A Vehicle detection method based on improved YOLOv3
CN105184265A (en) Self-learning-based handwritten form numeric character string rapid recognition method
CN111079847A (en) Remote sensing image automatic labeling method based on deep learning
CN111178177A (en) Cucumber disease identification method based on convolutional neural network
CN113610035B (en) A method for segmentation and identification of weeds in rice tillering stage based on improved encoding and decoding network
CN111738367B (en) Part classification method based on image recognition
Bonifacio et al. Determination of common Maize (Zea mays) disease detection using Gray-Level Segmentation and edge-detection technique
CN109033978A (en) A kind of CNN-SVM mixed model gesture identification method based on error correction strategies
CN111582401B (en) A Sunflower Seed Sorting Method Based on Double-branch Convolutional Neural Network
CN111767860A (en) A method and terminal for realizing image recognition through convolutional neural network
CN117036243A (en) Method, device, equipment and storage medium for detecting surface defects of shaving board
CN107133579A Face identification method based on CSGF(2D)²PCANet convolutional networks
CN117894023A (en) Code spraying defect detection method based on image skeleton super-pixel GCN classification
CN111563542A (en) Automatic plant classification method based on convolutional neural network
CN115171074A (en) Vehicle target identification method based on multi-scale yolo algorithm
CN119169258A (en) A method and system for multi-granularity low-quality image target recognition
CN118736436A (en) A crop recognition method based on multispectral satellite images
CN118486052A (en) Disease medium biological intelligent detection method and system based on multiple vision distances

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20230505

WD01 Invention patent application deemed withdrawn after publication