CN116071560A - A method of fruit recognition based on convolutional neural network - Google Patents

A method of fruit recognition based on convolutional neural network

Info

Publication number
CN116071560A
CN116071560A CN202211452782.3A CN202211452782A
Authority
CN
China
Prior art keywords
image
convolution
neural network
convolutional neural
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211452782.3A
Other languages
Chinese (zh)
Inventor
李鲁群
陶霜霜
胡天乐
张慎文
许崇海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Normal University
Original Assignee
Shanghai Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Normal University filed Critical Shanghai Normal University
Priority to CN202211452782.3A priority Critical patent/CN116071560A/en
Publication of CN116071560A publication Critical patent/CN116071560A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fruit identification method based on a convolutional neural network, which comprises the following steps: S1: fruit image data are acquired with a camera and lens; S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image; S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing result through convolutional neural network training; S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that a robotic arm sorts the fruit. The method improves the K-means fruit segmentation algorithm, uses morphological processing and connected-region feature analysis to select and locate the target region, and exploits the convolutional neural network's ability to implicitly extract features while learning features from the samples, thereby realizing fruit variety identification and classification and improving the recognition accuracy for fruit varieties.

Description

A fruit recognition method based on a convolutional neural network

Technical Field

The invention belongs to the technical field of image recognition, and in particular relates to a fruit recognition method based on a convolutional neural network.

Background Art

At present, the level of mechanization in China's fruit industry is low, and most production links, especially fruit picking, still rely on time-consuming and labor-intensive manual work. Fruit production covers picking, storage, transportation, processing and sales, so the research and development of agricultural fruit-production robots is an inevitable trend for improving production efficiency and saving labor costs. Whether for picking and sorting robots or for fruit quality and variety inspection systems, normal operation depends on the image-processing module correctly recognizing the fruit. For example, a picking robot can provide motion parameters to its robotic arm and complete the picking operation only after it has recognized the fruit on the tree and obtained its exact position.

In recent years, with the maturing of machine vision and computer technology, deep learning has developed rapidly. It performs well on a wide range of computer vision tasks: after training on large amounts of data, it can automatically learn the characteristic information of different objects, capture the differences between categories, and convert raw data into more abstract, higher-level representations, thereby completing tasks such as image classification and detection. It is therefore increasingly used in crop growth monitoring. However, the accuracy of current fruit variety recognition methods based on deep learning from images is relatively low and cannot fully meet the requirements of practical applications.

Summary of the Invention

The object of the present invention is to provide a fruit recognition method based on a convolutional neural network, so as to solve the problem of low accuracy of the fruit variety recognition methods mentioned in the background art above.

To achieve the above object, the present invention provides the following technical solution: a fruit recognition method based on a convolutional neural network, comprising the following steps:

S1: fruit image data acquisition, in which fruit image data are acquired with a camera and lens;

S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image;

S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing results through convolutional neural network training;

S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that the robotic arm sorts the fruit.

Preferably, in step S1 the camera is responsible for acquiring the fruit image information and is selected mainly according to the required detection accuracy and field-of-view size; the main lens parameter is the focal length, which is chosen mainly according to the working distance.

Preferably, the image preprocessing in S2 extracts the R-space color feature component of the image according to the characteristics of fruit images and applies Gaussian filtering. The Gaussian template is a discretization of the two-dimensional Gaussian function; for a matrix M of size (2u+1)×(2u+1), the element at position (a, b) is given by:

M(a, b) = (1/(2πα²)) · exp(−((a − u − 1)² + (b − u − 1)²) / (2α²))

where α is the standard deviation of the Gaussian function, and the Gaussian filter template size is 5×5.

Preferably, the image segmentation in S2 uses an improved K-means clustering segmentation algorithm to segment the image and form segmented regions. The specific clustering segmentation algorithm is as follows:

Step 1: first convert the grayscale values of the sample image into a one-dimensional sample data set C of size N, where N is the number of pixels of the sample image. Let Zj(I) denote the cluster center of cluster j at iteration I. Randomly select one sample object from data set C as the initial cluster center Z1(1);

Calculate for each sample Xm the shortest distance d(Xm) to the existing cluster centers, m = 1, 2, ..., N, where Xm is the m-th sample in the data set; then calculate the probability P(Xm) that each sample object is selected as the next cluster center:

P(Xm) = d(Xm)² / Σ_{Xi∈C} d(Xi)²

Select the next cluster center by the roulette-wheel method and repeat these steps until k objects have been chosen as the initial cluster centers Zj(1), j = 1, 2, 3, ..., k; specifically, the number of cluster centers is k = 4;

Step 2: iterative calculation: according to the similarity criterion, calculate the distance D(Xm; Zj(I)) between each sample Xm in data set C and each cluster center, m = 1, 2, 3, ..., N, j = 1, 2, 3, ..., k, as shown in formula (5), and assign each data object to the cluster Sj whose center is nearest, so that Xm ∈ Sj:

D(Xm; Zj(I)) = |Xm − Zj(I)|     (5)

Step 3: cluster center update: according to the cluster-center update formula, compute the mean of each cluster set and take it as the new center of that set. Let the elements of cluster j be the samples X ∈ Sj, with nj elements in cluster j; the cluster-center update formula is given by formula (6):

Zj(I+1) = (1/nj) · Σ_{X∈Sj} X     (6)

Step 4: termination condition: update the cluster centers cyclically until none of them changes any more or the sum of squared errors reaches a local minimum. The clustering criterion function J is computed as:

J(I) = Σ_{j=1..k} Σ_{X∈Sj} |X − Zj(I)|²

Let the precision tolerance be ξ. If |J(I+1) − J(I)| < ξ, the algorithm ends and the iteration is terminated; otherwise the iterative assignment and cluster-center update are repeated until the termination condition is met.

Preferably, the morphological processing in S2 addresses the fact that the regions obtained by image segmentation still contain a great deal of noise. To remove the noise regions, the segmented regions are processed with a morphological algorithm. Let A denote the image matrix and B the structuring element; the morphological operation is:

A ∘ B = (A ⊖ B) ⊕ B

An opening operation is applied to the clustering segmentation regions with a rectangular structuring element of size 10×10.

Preferably, the connected-region feature analysis in S2 is performed after the morphological processing, once the small noise regions have been removed; considering the actual size of the fruit, non-target regions need to be removed further. Pixels of identical gray level that are 8-connected are judged to belong to the same region, and noise regions are filtered out by the area feature of the connected regions. The regions X whose pixel area lies in (SMin, SMax) are extracted according to the following formula:

X = { Xi | SMin < S(Xi) < SMax }

where SMin and SMax are the lower and upper limits of the region area in pixels, and S(Xi) denotes the pixel area of connected region Xi.

Preferably, the convolutional neural network model constructed in S3 is as follows:

1. Convolutional layer

In a convolutional neural network, the first convolutional layer directly receives the pixel-level input of the image. Each convolution operation processes only a small patch of the image and, after the convolution, passes the result on to the subsequent layers; each convolutional layer extracts the most effective features of the data. The convolutional layer is the core layer of a convolutional neural network, and convolution is the core of the convolutional layer. A simple convolution is a simple two-dimensional spatial convolution operation: each element of the input matrix is multiplied by the corresponding element of the kernel matrix and the products are summed; this is the convolution operation;

The matrix in the middle is called the convolution kernel, also known as the weight filter or simply the filter; it is the core of the convolution process and can, for example, detect horizontal edges and vertical edges of the image or increase the weight of the central region of the picture. When convolution is generalized to higher dimensions, the kernel is treated as a window sliding over the input matrix, and the corresponding convolution operation is carried out. The CNN mines the spatially local correlation of natural images by enforcing a local connection pattern between nodes of adjacent layers: the input of a node in layer m is a subset of the nodes in layer m−1, and these nodes have spatially adjacent receptive fields. The number of weights of each neuron in the convolutional neural network equals the size of the convolution kernel, i.e. each neuron is connected to only part of the pixels of the image. Two convolutional layers (Conv1, Conv2) with 3×3 kernels are used; the learned kernels convolve the original image, and the convolution result plus a bias is passed through the activation function to form the feature map output by this layer, computed as:

x_j^l = f( Σ_{i∈Mj} x_i^(l−1) * k_ij^l + bias_j^l )

where l is the layer index, k is the convolution kernel index, Mj denotes the j-th of the M feature maps, f is the activation function and bias is the threshold. The first convolutional layer (Conv1) convolves the picture with 64 randomly initialized 3×3×3 kernels with a stride of 1, and the size of the feature map after convolution is kept the same as the original size; Conv2 uses 16 kernels of size 3×3×64 with a stride of 1;

2. Pooling layer

The purpose of the pooling layer is to further reduce the number of network training parameters and the degree of overfitting of the model. There are usually three kinds of pooling: ① max pooling, which takes the maximum value in the pooling window as the sampled value; ② mean pooling, which averages all values in the pooling window and takes the average as the sampled value; ③ stochastic pooling, which selects a value probabilistically. The pooling layer reduces the size of the image, speeds up the whole algorithm and reduces the influence of noise. The pooling layer operates on the feature maps produced by the convolution: in a convolution-like manner it computes the mean or maximum over a local region of the image to represent the sampled region. This is done so that, for large images, features at different positions are aggregated statistically; this mean or maximum is the aggregation statistic, i.e. pooling. Both down-sampling layers use max pooling with a 3×3 window and a stride of 2; the operation can be expressed as:

x_j^l = f( β_j^l · g(x_j^(l−1)) + b_j^l )

where g(·) is the down-sampling function and f is the activation function;

3. Activation function

The activation function guarantees the nonlinearity of the result: after the convolution operation, the output value plus an offset is fed into the activation function and serves as the input of the next layer. Commonly used activation functions are Sigmoid, Tanh and ReLU. ReLU is chosen as the activation function: for x < 0 it is hard-saturated, and for x > 0 its derivative is 1, so the gradient does not decay, which alleviates the vanishing-gradient problem and allows faster convergence. Its expression is:

f(x) = max(0, x)

4. Fully connected layer

The fully connected layers play the role of a "classifier": they map the "distributed feature representation" learned by the convolutional layers, activation functions and pooling layers to the sample label space. The CNN model uses two fully connected layers with 128 neurons each, and the final output uses a softmax classifier; the softmax function is a multinomial regression function and is suitable for multi-class image recognition. Dropout is used in the fully connected layers to randomly ignore a fraction of the neurons, which avoids overfitting and makes the model more robust.

Compared with the prior art, the present invention provides a fruit recognition method based on a convolutional neural network, which has the following beneficial effects:

The present invention improves the K-means fruit segmentation algorithm and uses morphological processing and connected-region feature analysis to select and locate the target region, thereby detecting and recognizing the fruit and determining the fruit region. The image is scaled about this region as the center point and adjusted to a suitable size. At the same time, the convolutional neural network's characteristic ability to implicitly extract features while simultaneously learning features from the samples is exploited, and the preprocessing work of the program is reduced; fruit features are trained and extracted, verification tests are carried out on fruit images from different test sets, fruit variety recognition and classification are realized, and the recognition accuracy for fruit varieties is improved.

Brief Description of the Drawings

The accompanying drawings are provided for a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention they serve to explain the present invention and do not constitute a limitation of the present invention. In the drawings:

FIG. 1 is a flow chart of the fruit recognition method based on a convolutional neural network proposed by the present invention;

FIG. 2 is a flow chart of the detection algorithm of the present invention;

FIG. 3 is a diagram of the convolutional neural network model of the present invention;

FIG. 4 is a schematic diagram of the R-space color component;

FIG. 5 is a schematic diagram of the improved K-means segmentation result;

FIG. 6 is a schematic diagram of the morphological processing result;

FIG. 7 is a schematic diagram of the connected-region feature analysis.

Detailed Description

The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative work fall within the scope of protection of the present invention.

Referring to FIGS. 1-3, the present invention provides a technical solution: a fruit recognition method based on a convolutional neural network, comprising the following steps:

S1: fruit image data acquisition, in which fruit image data are acquired with a camera and lens;

S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image;

Image acquisition is the premise of image processing. After the fruit images have been acquired, in order to improve the image-processing speed and the adaptability of the algorithm, the R-space color feature component of the image is extracted according to the characteristics of fruit images and Gaussian filtering is applied. The Gaussian template is a discretization of the two-dimensional Gaussian function; for a matrix M of size (2u+1)×(2u+1), the element at position (a, b) is given by:

M(a, b) = (1/(2πα²)) · exp(−((a − u − 1)² + (b − u − 1)²) / (2α²))

where α is the standard deviation of the Gaussian function and the Gaussian filter template size is 5×5; the result is shown in FIG. 4;
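As an illustration of this preprocessing step, the following is a minimal Python sketch, assuming OpenCV and NumPy are available; the 5×5 template size follows the description above, while the file name and the choice α = 1.0 are assumptions made only for the example:

```python
import cv2
import numpy as np

def preprocess_fruit_image(path, alpha=1.0):
    """R-channel extraction followed by 5x5 Gaussian filtering (sketch)."""
    bgr = cv2.imread(path)                       # OpenCV loads images as BGR
    r_channel = bgr[:, :, 2]                     # R-space color component

    # Build the (2u+1)x(2u+1) Gaussian template with u = 2, i.e. a 5x5 template
    u = 2
    a, b = np.mgrid[1:2*u+2, 1:2*u+2]            # 1-based indices as in the formula
    m = np.exp(-((a - u - 1)**2 + (b - u - 1)**2) / (2 * alpha**2)) / (2 * np.pi * alpha**2)
    m /= m.sum()                                 # normalize so the weights sum to 1

    return cv2.filter2D(r_channel, -1, m)        # convolve the R channel with the template

smoothed = preprocess_fruit_image("fruit.jpg")   # hypothetical input file
```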

In view of the characteristics of the image, the improved K-means clustering segmentation algorithm is used to segment the image and form segmented regions. The specific clustering segmentation algorithm is as follows:

Step 1: first convert the grayscale values of the sample image into a one-dimensional sample data set C of size N, where N is the number of pixels of the sample image. Let Zj(I) denote the cluster center of cluster j at iteration I. Randomly select one sample object from data set C as the initial cluster center Z1(1);

Calculate for each sample Xm the shortest distance d(Xm) to the existing cluster centers, m = 1, 2, ..., N, where Xm is the m-th sample in the data set; then calculate the probability P(Xm) that each sample object is selected as the next cluster center:

P(Xm) = d(Xm)² / Σ_{Xi∈C} d(Xi)²

Select the next cluster center by the roulette-wheel method and repeat these steps until k objects have been chosen as the initial cluster centers Zj(1), j = 1, 2, 3, ..., k; specifically, the number of cluster centers is k = 4;

Step 2: iterative calculation: according to the similarity criterion, calculate the distance D(Xm; Zj(I)) between each sample Xm in data set C and each cluster center, m = 1, 2, 3, ..., N, j = 1, 2, 3, ..., k, as shown in formula (5), and assign each data object to the cluster Sj whose center is nearest, so that Xm ∈ Sj:

D(Xm; Zj(I)) = |Xm − Zj(I)|     (5)

Step 3: cluster center update: according to the cluster-center update formula, compute the mean of each cluster set and take it as the new center of that set. Let the elements of cluster j be the samples X ∈ Sj, with nj elements in cluster j; the cluster-center update formula is given by formula (6):

Zj(I+1) = (1/nj) · Σ_{X∈Sj} X     (6)

Step 4: termination condition: update the cluster centers cyclically until none of them changes any more or the sum of squared errors reaches a local minimum. The clustering criterion function J is computed as:

J(I) = Σ_{j=1..k} Σ_{X∈Sj} |X − Zj(I)|²

Let the precision tolerance be ξ. If |J(I+1) − J(I)| < ξ, the algorithm ends and the iteration is terminated; otherwise the iterative assignment and cluster-center update are repeated until the termination condition is met. The image segmentation result is shown in FIG. 5.
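To make the improved seeding and iteration concrete, here is a minimal Python sketch under the settings stated above (k = 4, roulette-wheel seeding on squared distances, mean update, termination on the change of J); the tolerance value, iteration cap and random seed are assumptions for the example:

```python
import numpy as np

def improved_kmeans(gray_image, k=4, xi=1e-3, max_iter=100, rng=np.random.default_rng(0)):
    """Improved K-means segmentation of a grayscale image (sketch)."""
    c = gray_image.reshape(-1).astype(np.float64)           # 1-D sample set C of size N

    # Step 1: roulette-wheel (k-means++-style) selection of the k initial centers
    centers = [c[rng.integers(len(c))]]
    while len(centers) < k:
        d2 = np.min((c[:, None] - np.array(centers)[None, :])**2, axis=1)
        p = d2 / d2.sum()                                    # P(Xm) = d(Xm)^2 / sum of d^2
        centers.append(c[rng.choice(len(c), p=p)])
    centers = np.array(centers)

    prev_j = np.inf
    for _ in range(max_iter):
        # Step 2: assign each sample to the nearest center, D = |Xm - Zj|
        labels = np.argmin(np.abs(c[:, None] - centers[None, :]), axis=1)
        # Step 3: update each center with the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = c[labels == j].mean()
        # Step 4: stop when the criterion function J changes by less than xi
        j_val = np.sum((c - centers[labels])**2)
        if abs(prev_j - j_val) < xi:
            break
        prev_j = j_val

    return labels.reshape(gray_image.shape), centers
```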

In the morphological processing, the regions obtained by image segmentation still contain a great deal of noise. To remove the noise regions, the segmented regions are processed with a morphological algorithm. Let A denote the image matrix and B the structuring element; the morphological operation is:

A ∘ B = (A ⊖ B) ⊕ B

An opening operation is applied to the clustering segmentation regions with a rectangular structuring element of size 10×10; the image-processing result is shown in FIG. 6;
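A minimal sketch of this opening step, assuming OpenCV; the binary mask is assumed to come from the segmentation sketch above, with the cluster of interest taken as label 0 purely for illustration:

```python
import cv2
import numpy as np

# Binary mask from the segmentation step (assumed: the fruit cluster has label 0)
mask = (labels == 0).astype(np.uint8) * 255          # `labels` from improved_kmeans above

# 10x10 rectangular structuring element, as specified in the description
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10, 10))

# Opening = erosion followed by dilation; removes small noise regions
opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```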

In the connected-region feature analysis, the small noise regions have already been removed by the morphological processing; considering the actual size of the fruit, non-target regions need to be removed further. Pixels of identical gray level that are 8-connected are judged to belong to the same region, and noise regions are filtered out by the area feature of the connected regions. The regions X whose pixel area lies in (SMin, SMax) are extracted according to the following formula:

X = { Xi | SMin < S(Xi) < SMax }

where SMin and SMax are the lower and upper limits of the region area in pixels, and S(Xi) denotes the pixel area of connected region Xi; the image-processing result is shown in FIG. 7;
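The area filtering with 8-connectivity could be sketched as follows, again assuming OpenCV; the numeric limits SMin and SMax are placeholders to be chosen for the actual fruit size:

```python
import cv2
import numpy as np

S_MIN, S_MAX = 2000, 200000        # assumed pixel-area limits for a fruit region

# 8-connected component labelling with per-component statistics
num, lab, stats, centroids = cv2.connectedComponentsWithStats(opened, connectivity=8)

target = np.zeros_like(opened)
for i in range(1, num):                          # label 0 is the background
    area = stats[i, cv2.CC_STAT_AREA]
    if S_MIN < area < S_MAX:                     # keep regions with SMin < S(Xi) < SMax
        target[lab == i] = 255
```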

S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing results through convolutional neural network training;

S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that the robotic arm sorts the fruit.
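As an illustration of how the classification result might be handed to the chip over the serial port, here is a minimal sketch using pyserial; the port name, baud rate and one-byte message format are all assumptions, since the description does not specify the protocol:

```python
import serial

def send_result(class_id, port="/dev/ttyUSB0", baudrate=115200):
    """Send the predicted fruit class to the control chip over a serial port (sketch)."""
    with serial.Serial(port, baudrate, timeout=1) as ser:
        ser.write(bytes([class_id]))      # one byte per classification result (assumed format)

send_result(2)    # e.g. class 2 tells the chip which bin the robotic arm should sort into
```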

Preferably, in step S1 the camera is responsible for acquiring the fruit image information and is selected mainly according to the required detection accuracy and field-of-view size; the main lens parameter is the focal length, which is chosen mainly according to the working distance.

As shown in FIG. 3, the convolutional neural network model constructed in S3 is as follows:

1. Convolutional layer

In a convolutional neural network, the first convolutional layer directly receives the pixel-level input of the image. Each convolution operation processes only a small patch of the image and, after the convolution, passes the result on to the subsequent layers; each convolutional layer extracts the most effective features of the data. The convolutional layer is the core layer of a convolutional neural network, and convolution is the core of the convolutional layer. A simple convolution is a simple two-dimensional spatial convolution operation: each element of the input matrix is multiplied by the corresponding element of the kernel matrix and the products are summed; this is the convolution operation;

The matrix in the middle is called the convolution kernel, also known as the weight filter or simply the filter; it is the core of the convolution process and can, for example, detect horizontal edges and vertical edges of the image or increase the weight of the central region of the picture. When convolution is generalized to higher dimensions, the kernel is treated as a window sliding over the input matrix, and the corresponding convolution operation is carried out. The CNN mines the spatially local correlation of natural images by enforcing a local connection pattern between nodes of adjacent layers: the input of a node in layer m is a subset of the nodes in layer m−1, and these nodes have spatially adjacent receptive fields. The number of weights of each neuron in the convolutional neural network equals the size of the convolution kernel, i.e. each neuron is connected to only part of the pixels of the image. Two convolutional layers (Conv1, Conv2) with 3×3 kernels are used; the learned kernels convolve the original image, and the convolution result plus a bias is passed through the activation function to form the feature map output by this layer, computed as:

x_j^l = f( Σ_{i∈Mj} x_i^(l−1) * k_ij^l + bias_j^l )

where l is the layer index, k is the convolution kernel index, Mj denotes the j-th of the M feature maps, f is the activation function and bias is the threshold. The first convolutional layer (Conv1) convolves the picture with 64 randomly initialized 3×3×3 kernels with a stride of 1, and the size of the feature map after convolution is kept the same as the original size; Conv2 uses 16 kernels of size 3×3×64 with a stride of 1;
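To illustrate the convolution-plus-activation computation in the formula above, here is a small NumPy sketch of a single-channel 3×3 convolution with stride 1 and zero padding (so the output keeps the original size), followed by a bias and ReLU; the kernel and bias values are arbitrary examples, not the learned weights:

```python
import numpy as np

def conv2d_single(image, kernel, bias=0.0):
    """Stride-1, zero-padded 2-D convolution plus ReLU (sketch of one feature map)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            patch = padded[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel) + bias   # multiply element-wise and sum
    return np.maximum(out, 0.0)                          # ReLU activation

img = np.random.rand(8, 8)                # toy single-channel input
k = np.random.randn(3, 3)                 # one randomly initialized 3x3 kernel
feature_map = conv2d_single(img, k)       # same spatial size as the input
```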

2. Pooling layer

The purpose of the pooling layer is to further reduce the number of network training parameters and the degree of overfitting of the model. There are usually three kinds of pooling: ① max pooling, which takes the maximum value in the pooling window as the sampled value; ② mean pooling, which averages all values in the pooling window and takes the average as the sampled value; ③ stochastic pooling, which selects a value probabilistically. The pooling layer reduces the size of the image, speeds up the whole algorithm and reduces the influence of noise. The pooling layer operates on the feature maps produced by the convolution: in a convolution-like manner it computes the mean or maximum over a local region of the image to represent the sampled region. This is done so that, for large images, features at different positions are aggregated statistically; this mean or maximum is the aggregation statistic, i.e. pooling. Both down-sampling layers use max pooling with a 3×3 window and a stride of 2; the operation can be expressed as:

x_j^l = f( β_j^l · g(x_j^(l−1)) + b_j^l )

where g(·) is the down-sampling function and f is the activation function;
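To make the pooling operation concrete, the following is a small NumPy sketch of max pooling with a 3×3 window and stride 2 on a toy feature map; mean pooling and stochastic pooling would replace the maximum with an average or a probabilistic choice:

```python
import numpy as np

def max_pool(feature_map, k=3, stride=2):
    """Max pooling: slide a k x k window with the given stride and keep the maximum."""
    h, w = feature_map.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = np.zeros((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = feature_map[i*stride:i*stride+k, j*stride:j*stride+k].max()
    return out

fmap = np.arange(49).reshape(7, 7)      # toy 7x7 feature map
print(max_pool(fmap))                   # 3x3 output, each value is a window maximum
```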

3. Activation function

The activation function guarantees the nonlinearity of the result: after the convolution operation, the output value plus an offset is fed into the activation function and serves as the input of the next layer. Commonly used activation functions are Sigmoid, Tanh and ReLU. ReLU is chosen as the activation function: for x < 0 it is hard-saturated, and for x > 0 its derivative is 1, so the gradient does not decay, which alleviates the vanishing-gradient problem and allows faster convergence. Its expression is:

f(x) = max(0, x)

4. Fully connected layer

The fully connected layers play the role of a "classifier": they map the "distributed feature representation" learned by the convolutional layers, activation functions and pooling layers to the sample label space. The CNN model uses two fully connected layers with 128 neurons each, and the final output uses a softmax classifier; the softmax function is a multinomial regression function and is suitable for multi-class image recognition. Dropout is used in the fully connected layers to randomly ignore a fraction of the neurons, which avoids overfitting and makes the model more robust.
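Putting the pieces together, the following is a hedged sketch of a complete model consistent with the description: two 3×3 convolutional layers (64 and 16 kernels, stride 1, size-preserving), two 3×3 stride-2 max-pooling layers, ReLU activations, two fully connected layers of 128 neurons with Dropout, and a softmax output. The Keras framework, the 100×100 input size, the placement of each pooling layer after a convolution, the dropout rate and the number of fruit classes are assumptions not fixed by the description:

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 5          # assumed number of fruit varieties

model = keras.Sequential([
    keras.Input(shape=(100, 100, 3)),                                     # assumed input size
    layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),   # Conv1: 64 x 3x3x3
    layers.MaxPooling2D(pool_size=3, strides=2),                          # down-sampling 1
    layers.Conv2D(16, 3, strides=1, padding="same", activation="relu"),   # Conv2: 16 x 3x3x64
    layers.MaxPooling2D(pool_size=3, strides=2),                          # down-sampling 2
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                                 # fully connected 1
    layers.Dropout(0.5),                                                  # assumed dropout rate
    layers.Dense(128, activation="relu"),                                 # fully connected 2
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),                      # softmax classifier
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

Training would then proceed on the fruit regions produced by the preprocessing and segmentation of steps S1 and S2.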

Although embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the appended claims and their equivalents.

Claims (7)

1. A fruit recognition method based on a convolutional neural network, characterized by comprising the following steps: S1: fruit image data acquisition, in which fruit image data are acquired with a camera and lens; S2: image processing, in which image preprocessing, image segmentation, morphological processing and connected-region feature analysis are performed according to the characteristics of the image; S3: a convolutional neural network model is constructed, and multi-scale features are extracted from the image-processing results through convolutional neural network training; S4: after the convolutional neural network training is completed, the result is sent to a chip through a serial port, and the chip controls a servo so that the robotic arm sorts the fruit.

2. The fruit recognition method based on a convolutional neural network according to claim 1, characterized in that: in step S1 the camera is responsible for acquiring the fruit image information and is selected mainly according to the required detection accuracy and field-of-view size; the main lens parameter is the focal length, which is chosen mainly according to the working distance.

3. The fruit recognition method based on a convolutional neural network according to claim 1, characterized in that: the image preprocessing in S2 extracts the R-space color feature component of the image according to the characteristics of fruit images and applies Gaussian filtering; the Gaussian template is a discretization of the two-dimensional Gaussian function, and for a matrix M of size (2u+1)×(2u+1) the element at position (a, b) is given by:
M(a, b) = (1/(2πα²)) · exp(−((a − u − 1)² + (b − u − 1)²) / (2α²))

where α is the standard deviation of the Gaussian function, and the Gaussian filter template size is 5×5.
4. The fruit recognition method based on a convolutional neural network according to claim 3, characterized in that: the image segmentation in S2 uses an improved K-means clustering segmentation algorithm to segment the image and form segmented regions, the specific clustering segmentation algorithm being as follows: Step 1: first convert the grayscale values of the sample image into a one-dimensional sample data set C of size N, where N is the number of pixels of the sample image; let Zj(I) denote the cluster center of cluster j at iteration I, and randomly select one sample object from data set C as the initial cluster center Z1(1); calculate for each sample Xm the shortest distance d(Xm) to the existing cluster centers, m = 1, 2, ..., N, where Xm is the m-th sample in the data set; then calculate the probability P(Xm) that each sample object is selected as the next cluster center:
P(Xm) = d(Xm)² / Σ_{Xi∈C} d(Xi)²
Select the next cluster center by the roulette-wheel method and repeat these steps until k objects have been chosen as the initial cluster centers Zj(1), j = 1, 2, 3, ..., k; specifically, the number of cluster centers is k = 4; Step 2: iterative calculation: according to the similarity criterion, calculate the distance D(Xm; Zj(I)) between each sample Xm in data set C and each cluster center, m = 1, 2, 3, ..., N, j = 1, 2, 3, ..., k, as shown in formula (5), and assign each data object to the cluster Sj whose center is nearest, so that Xm ∈ Sj:
D(Xm; Zj(I)) = |Xm − Zj(I)|     (5)
Step 3: cluster center update: according to the cluster-center update formula, compute the mean of each cluster set and take it as the new center of that set; let the elements of cluster j be the samples X ∈ Sj, with nj elements in cluster j; the cluster-center update formula is given by formula (6):
Zj(I+1) = (1/nj) · Σ_{X∈Sj} X     (6)
Step 4: termination condition: update the cluster centers cyclically until none of them changes any more or the sum of squared errors reaches a local minimum; the clustering criterion function J is computed as:
J(I) = Σ_{j=1..k} Σ_{X∈Sj} |X − Zj(I)|²
Let the precision tolerance be ξ; if |J(I+1) − J(I)| < ξ, the algorithm ends and the iteration is terminated; otherwise the iterative assignment and cluster-center update are repeated until the termination condition is met.
5. The fruit recognition method based on a convolutional neural network according to claim 4, characterized in that: in the morphological processing in S2, the regions obtained by image segmentation still contain a great deal of noise; to remove the noise regions, the segmented regions are processed with a morphological algorithm; let A denote the image matrix and B the structuring element; the morphological operation is:
A ∘ B = (A ⊖ B) ⊕ B
An opening operation is applied to the clustering segmentation regions with a rectangular structuring element of size 10×10.
6. The fruit recognition method based on a convolutional neural network according to claim 5, characterized in that: the connected-region feature analysis in S2 is performed after the morphological processing, once the small noise regions have been removed; considering the actual size of the fruit, non-target regions need to be removed further; pixels of identical gray level that are 8-connected are judged to belong to the same region, and noise regions are filtered out by the area feature of the connected regions; the regions X whose pixel area lies in (SMin, SMax) are extracted according to the following formula:
X = { Xi | SMin < S(Xi) < SMax }
where SMin and SMax are the lower and upper limits of the region area in pixels.
7. The fruit recognition method based on a convolutional neural network according to claim 1, characterized in that: the convolutional neural network model constructed in S3 is as follows:

1. Convolutional layer

In a convolutional neural network, the first convolutional layer directly receives the pixel-level input of the image; each convolution operation processes only a small patch of the image and, after the convolution, passes the result on to the subsequent layers; each convolutional layer extracts the most effective features of the data; the convolutional layer is the core layer of a convolutional neural network, and convolution is the core of the convolutional layer; a simple convolution is a simple two-dimensional spatial convolution operation in which each element of the input matrix is multiplied by the corresponding element of the kernel matrix and the products are summed; this is the convolution operation;

the matrix in the middle is called the convolution kernel, also known as the weight filter or simply the filter; it is the core of the convolution process and can, for example, detect horizontal edges and vertical edges of the image or increase the weight of the central region of the picture; when convolution is generalized to higher dimensions, the kernel is treated as a window sliding over the input matrix, and the corresponding convolution operation is carried out; the CNN mines the spatially local correlation of natural images by enforcing a local connection pattern between nodes of adjacent layers: the input of a node in layer m is a subset of the nodes in layer m−1, and these nodes have spatially adjacent receptive fields; the number of weights of each neuron in the convolutional neural network equals the size of the convolution kernel, i.e. each neuron is connected to only part of the pixels of the image; two convolutional layers (Conv1, Conv2) with 3×3 kernels are used; the learned kernels convolve the original image, and the convolution result plus a bias is passed through the activation function to form the feature map output by this layer, computed as:
x_j^l = f( Σ_{i∈Mj} x_i^(l−1) * k_ij^l + bias_j^l )
where l is the layer index, k is the convolution kernel index, Mj denotes the j-th of the M feature maps, f is the activation function and bias is the threshold; the first convolutional layer (Conv1) convolves the picture with 64 randomly initialized 3×3×3 kernels with a stride of 1, and the size of the feature map after convolution is kept the same as the original size; Conv2 uses 16 kernels of size 3×3×64 with a stride of 1;

2. Pooling layer

The purpose of the pooling layer is to further reduce the number of network training parameters and the degree of overfitting of the model; there are usually three kinds of pooling: ① max pooling, which takes the maximum value in the pooling window as the sampled value; ② mean pooling, which averages all values in the pooling window and takes the average as the sampled value; ③ stochastic pooling, which selects a value probabilistically; the pooling layer reduces the size of the image, speeds up the whole algorithm and reduces the influence of noise; the pooling layer operates on the feature maps produced by the convolution: in a convolution-like manner it computes the mean or maximum over a local region of the image to represent the sampled region; this is done so that, for large images, features at different positions are aggregated statistically, and this mean or maximum is the aggregation statistic, i.e. pooling; both down-sampling layers use max pooling with a 3×3 window and a stride of 2; the operation can be expressed as:
x_j^l = f( β_j^l · g(x_j^(l−1)) + b_j^l )
where g(·) is the down-sampling function and f is the activation function;

3. Activation function

The activation function guarantees the nonlinearity of the result: after the convolution operation, the output value plus an offset is fed into the activation function and serves as the input of the next layer; commonly used activation functions are Sigmoid, Tanh and ReLU; ReLU is chosen as the activation function: for x < 0 it is hard-saturated, and for x > 0 its derivative is 1, so the gradient does not decay, which alleviates the vanishing-gradient problem and allows faster convergence; its expression is:
f(x) = max(0, x)
4. Fully connected layer

The fully connected layers play the role of a "classifier": they map the "distributed feature representation" learned by the convolutional layers, activation functions and pooling layers to the sample label space; the CNN model uses two fully connected layers with 128 neurons each, and the final output uses a softmax classifier; the softmax function is a multinomial regression function and is suitable for multi-class image recognition; Dropout is used in the fully connected layers to randomly ignore a fraction of the neurons, which avoids overfitting and makes the model more robust.
CN202211452782.3A 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network Pending CN116071560A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211452782.3A CN116071560A (en) 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211452782.3A CN116071560A (en) 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN116071560A true CN116071560A (en) 2023-05-05

Family

ID=86170659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211452782.3A Pending CN116071560A (en) 2022-11-21 2022-11-21 A method of fruit recognition based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN116071560A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315596A (en) * 2023-09-07 2023-12-29 重庆云网科技股份有限公司 Deep learning-based motor vehicle black smoke detection and identification method
CN117636055A (en) * 2023-12-12 2024-03-01 北京易恒盈通科技有限公司 Cloud storage method and system for digital information
CN117422717A (en) * 2023-12-19 2024-01-19 长沙韶光芯材科技有限公司 Intelligent mask stain positioning method and system
CN117422717B (en) * 2023-12-19 2024-02-23 长沙韶光芯材科技有限公司 Intelligent mask stain positioning method and system
CN117746272A (en) * 2024-02-21 2024-03-22 西安迈远科技有限公司 A UAV-based water resource data collection and processing method and system
CN118334633A (en) * 2024-06-13 2024-07-12 内蒙古向新而行农业科技有限公司 Insect condition forecasting method and system
CN118334633B (en) * 2024-06-13 2024-09-03 内蒙古向新而行农业科技有限公司 Insect condition forecasting method and system
CN118736566A (en) * 2024-07-09 2024-10-01 重庆市奉节职业教育中心(重庆市奉节师范学校) Navel orange quality identification system and method based on artificial intelligence
CN118972625A (en) * 2024-10-18 2024-11-15 成都艺馨达科技有限公司 A method, device and computer equipment for video image processing based on compressed neural network

Similar Documents

Publication Publication Date Title
CN116071560A (en) A method of fruit recognition based on convolutional neural network
CN108982508B (en) A defect detection method for plastic package IC chips based on feature template matching and deep learning
CN111898736B (en) An Efficient Pedestrian Re-identification Method Based on Attribute Awareness
CN104850845B (en) A kind of traffic sign recognition method based on asymmetric convolutional neural networks
CN105488534B (en) Traffic scene deep analysis method, apparatus and system
CN111310861A (en) A license plate recognition and localization method based on deep neural network
CN111310862A (en) Deep neural network license plate positioning method based on image enhancement in complex environment
CN111428550A Vehicle detection method based on improved YOLOv3
CN105184265A (en) Self-learning-based handwritten form numeric character string rapid recognition method
CN111079847A (en) Remote sensing image automatic labeling method based on deep learning
CN111178177A (en) Cucumber disease identification method based on convolutional neural network
CN113610035B (en) A method for segmentation and identification of weeds in rice tillering stage based on improved encoding and decoding network
CN111738367B (en) Part classification method based on image recognition
Bonifacio et al. Determination of common Maize (Zea mays) disease detection using Gray-Level Segmentation and edge-detection technique
CN109033978A (en) A kind of CNN-SVM mixed model gesture identification method based on error correction strategies
CN111582401B (en) A Sunflower Seed Sorting Method Based on Double-branch Convolutional Neural Network
CN111767860A (en) A method and terminal for realizing image recognition through convolutional neural network
CN117036243A (en) Method, device, equipment and storage medium for detecting surface defects of shaving board
CN107133579A Face identification method based on CSGF(2D)²PCANet convolutional networks
CN117894023A (en) Code spraying defect detection method based on image skeleton super-pixel GCN classification
CN111563542A (en) Automatic plant classification method based on convolutional neural network
CN115171074A (en) Vehicle target identification method based on multi-scale yolo algorithm
CN119169258A (en) A method and system for multi-granularity low-quality image target recognition
CN118736436A (en) A crop recognition method based on multispectral satellite images
CN118486052A (en) Disease medium biological intelligent detection method and system based on multiple vision distances

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20230505

WD01 Invention patent application deemed withdrawn after publication