CN110516685A

CN110516685A - Detection method of lens turbidity based on convolutional neural network

Info

Publication number: CN110516685A
Application number: CN201910468518.0A
Authority: CN
Inventors: 刘振宇; 宋建聪
Original assignee: Shenyang University of Technology
Current assignee: Shenyang University of Technology
Priority date: 2019-05-31
Filing date: 2019-05-31
Publication date: 2019-11-29

Abstract

Lenticular opacities degree detecting method based on convolutional neural networks, the method steps are as follows: (1), to testing image using illumination enhancing method to carry out pretreatment and form preprocessing image data；(2), the preprocessing image data in (1) step is substituted into lenticular opacities degree detecting learning model and realizes detection, the Inception-V3 model and parameter that the present invention is crossed using ImageNet pre-training, and be trained to obtain disaggregated model using the thought of transfer learning, it can be achieved to pass through cell phone application real-time perfoming lenticular opacities investigation and comparison after the completion of the system.

Description

Detection method of lens turbidity based on convolutional neural network

技术领域technical field

本发明提供一种基于卷积神经网络的晶状体浑浊程度检测方法，属于图像处理领域。The invention provides a method for detecting the turbidity degree of a lens based on a convolutional neural network, which belongs to the field of image processing.

背景技术Background technique

晶状体浑浊程度在很多领域都可以作为一个中间参考数据，例如，通过该数据对白内障进行分析等，目前对于晶状体浑浊程度的自动分类方法已有很大进展，但仍存在一些问题。首先，现有方法大多是把分类问题作为研究重点，而对于特征提取方面，主要还是利用人工寻找特征，具有很大的主观性。而且，由于目前已公开且标注的数据集只供研究部分晶状体浑浊程度的情形，且数据量较少，导致无法使用深度学习自动提取特征，所训练的模型也无法达到普遍研究的效果。The degree of lens turbidity can be used as an intermediate reference data in many fields. For example, cataracts can be analyzed through this data. At present, the automatic classification method for the degree of lens turbidity has made great progress, but there are still some problems. First of all, most of the existing methods focus on classification problems, while for feature extraction, they mainly use artificial features to find features, which is very subjective. Moreover, since the currently public and labeled data sets are only used to study the degree of lens turbidity, and the amount of data is small, it is impossible to use deep learning to automatically extract features, and the trained model cannot achieve the effect of general research.

发明内容Contents of the invention

发明目的：Purpose of the invention:

本发明提供一种基于卷积神经网络的晶状体浑浊程度检测方法，其目的是解决以往所存在的问题。The invention provides a method for detecting the turbidity degree of the lens based on a convolutional neural network, and its purpose is to solve the existing problems in the past.

技术方案Technical solutions

基于卷积神经网络的晶状体浑浊程度检测方法，其特征在于：该方法步骤如下：The lens turbidity detection method based on convolutional neural network is characterized in that: the method steps are as follows:

(1)、对待测图像采用光照增强法进行预处理形成预处理图像数据；(1) The image to be tested is preprocessed by the illumination enhancement method to form preprocessed image data;

(2)、将(1)步骤中的预处理图像数据代入晶状体浑浊程度检测学习模型实现检测。(2) Substituting the preprocessed image data in step (1) into the lens turbidity detection learning model to realize the detection.

(2)步骤中的晶状体浑浊程度检测模型的构建方法如下：(2) The construction method of the lens turbidity detection model in the step is as follows:

(2.1)、构建MSLPP数据集；然后对数据集中的图像进行预处理；并将预处理的MSLPP数据集分为训练集、验证集和测试集；(2.1), construct the MSLPP data set; Then the image in the data set is preprocessed; And the preprocessed MSLPP data set is divided into training set, verification set and test set;

(2.2)、利用(2.1)步骤中的预处理之后的图像形成的数据集进行模型训练得晶状体浑浊程度检测学习模型。(2.2), using the data set formed from the preprocessed image in step (2.1) to perform model training to obtain a learning model for detecting the degree of lens turbidity.

(2.1)步骤中MSLPP数据集的构建方式如下：收集临床中由裂隙灯采集的眼部晶体图像，并将其分为正常、早期晶状体浑浊及晶状体浑浊三类。(2.1) The construction method of the MSLPP data set in the step is as follows: collect the ocular lens images collected by the slit lamp in clinical practice, and divide them into three categories: normal, early lens opacity and lens opacity.

MSLPP数据集样本采集环境复杂多样，样本包含种类多样。The sample collection environment of the MSLPP dataset is complex and diverse, and the samples contain a variety of types.

所述的环境复杂多样包括：亮环境、暗环境及反光情况；所述样本种类多样包括：样本针对核性晶状体浑浊、皮质性晶状体浑浊及后囊性晶状体浑浊。The complex and diverse environments include: bright environment, dark environment, and reflective conditions; the diverse types of samples include: samples for nuclear lens opacity, cortical lens opacity, and posterior capsular lens opacity.

(2.1)步骤中的预处理方法为：先采用光照增强法对数据集进行处理，再对光照增强法处理后的数据集利用数据扩增法处理。The preprocessing method in the step (2.1) is: first use the illumination enhancement method to process the data set, and then use the data amplification method to process the data set processed by the illumination enhancement method.

光照增强法采用如下方法：The light enhancement method uses the following methods:

将输入图像尺寸压缩为299×299像素，设图像任意一点A(x,y)的三通道像素其中R(x,y)、G(x,y)、B(x,y)分别代表点A(x,y)中红、绿、蓝三个通道的亮度值，三个通道的亮度值范围均为0-255,则图像的平均像素值表示为：Compress the input image size to 299×299 pixels, and set the three-channel pixel of any point A(x,y) in the image Among them, R(x,y), G(x,y), and B(x,y) respectively represent the brightness values of the red, green, and blue channels in point A(x,y), and the brightness value ranges of the three channels Both are 0-255, the average pixel value of the image Expressed as:

若则图像各点像素值变为若则图像各点像素值变为若则图像各点像素值不变。like Then the pixel value of each point in the image becomes like Then the pixel value of each point in the image becomes like Then the pixel value of each point in the image remains unchanged.

2.1)步骤的预处理中的数据扩增法有以下三种，三种方法中选择其中之一、之二或者全部：2.1) There are the following three data amplification methods in the preprocessing of the step, and one, two or all of the three methods are selected:

(1)平移：将光照增强后的图像上下左右分别平移5-20个像素点(该处平移5-20个像素点均可，但本发明中采用的数据为12，且与图4中生成的图像相对应)；使得图像平移扩增至原来的倍数，该倍数为平移的次数；(1) translation: the image after the illumination enhancement is translated up, down, left, and right respectively by 5-20 pixels (this place can be translated by 5-20 pixels, but the data adopted in the present invention is 12, and it is the same as that generated in Fig. 4 corresponding to the image); the image translation is amplified to the original multiple, and the multiple is the number of times of translation;

(2)旋转：将光照增强后的图像顺逆旋转(选择性的，可以只顺时针一次，也可以只逆时针一次，也可以顺时针和逆时针并列各一次)5°-20°(但本发明中采用的数据为15°，且与图4中生成的图像相对应)；使得图像旋转扩增至原来的倍数，该倍数为旋转的次数；(2) Rotation: Rotate the image after the illumination enhancement (selectively, it can be clockwise only once, or only counterclockwise once, or clockwise and counterclockwise can be paralleled once) 5°-20° (but The data adopted in the present invention is 15 °, and corresponds to the image generated in Fig. 4); Make the image rotate and amplify to the original multiple, and this multiple is the number of times of rotation;

(3)镜像：将光照增强后的图像上下左右各镜像一次，即上下翻转一次，左右翻转一次，使得图像镜像扩增至原来的倍数，该倍数为镜像的次数；(3) Mirror image: Mirror the image after illumination enhancement up, down, left, and right once, that is, flip it up and down once, and flip it left and right once, so that the image mirror image is amplified to the original multiple, which is the number of mirror images;

如果选择三种方法其中之二或者全部时，对图像扩增都只针对同一个光照增强后的图像进行，然后将增强后的图像一起使用。If two or all of the three methods are selected, the image augmentation is only performed on the same light-enhanced image, and then the enhanced images are used together.

(2.2)步骤模型训练采用卷积神经网络进行迁移学习后再利用预处理后的MSLPP数据集对迁移学习后的模型继续训练；(2.2) Step model training uses the convolutional neural network for migration learning and then uses the preprocessed MSLPP data set to continue training the model after migration learning;

卷积神经网络主要分为卷积层、池化层和全连接层三个部分；最终选用的卷积神经网络结构是谷歌基于GoogLeNet提出的Inception-V3模型；The convolutional neural network is mainly divided into three parts: the convolutional layer, the pooling layer and the fully connected layer; the final convolutional neural network structure is the Inception-V3 model proposed by Google based on GoogLeNet;

迁移学习及训练的步骤为：首先，基于ImageNet(ImageNet是一个计算机视觉系统识别项目名称，是目前世界上图像识别最大的数据库，该数据库已开源，可直接使用)图像标注的数据集，在Inception-V3模型上进行预训练，提取一个2048维的特征向量；这一阶段充分利用知识迁移，使用预训练权重进行特征提取，不对Inception-V3的权重参数进行训练；然后，将所述特征向量输入一个单层的全连接神经网络，使用一个包含Softmax分类器的单层全连接神经网络，再经过预处理后的MSLPP数据集(指的是经过眼科医生分类且标注后的图像)训练后即得到最终分类结果。The steps of transfer learning and training are as follows: First, based on ImageNet (ImageNet is the name of a computer vision system recognition project, it is currently the largest database of image recognition in the world, the database is open source and can be used directly) image annotation data set, in Inception -Pre-train on the V3 model to extract a 2048-dimensional feature vector; this stage makes full use of knowledge transfer, uses pre-trained weights for feature extraction, and does not train the weight parameters of Inception-V3; then, input the feature vector A single-layer fully-connected neural network, using a single-layer fully-connected neural network containing a Softmax classifier, and then trained on the preprocessed MSLPP dataset (referring to images classified and labeled by ophthalmologists) to obtain final classification result.

训练过程如下：(此处“训练”指的是预处理后所有的训练过程，而“再经过已分类的晶状体浑浊晶体图像训练后即得到最终分类结果”中的“训练”指的是迁移学习之后的微调参数部分的训练。步骤一二对应采用卷积神经网络进行迁移学习，步骤三四对应利用预处理后的训练集对迁移学习后的模型继续训练)The training process is as follows: (Here, "training" refers to all the training processes after preprocessing, and "training" in "the final classification result is obtained after the training of the classified lens cloudy crystal image" refers to transfer learning The subsequent training of the fine-tuning parameter part. Steps 1 and 2 correspond to using convolutional neural networks for transfer learning, and steps 3 and 4 correspond to using the preprocessed training set to continue training the model after transfer learning)

(一)加载已经去掉全连接层的Inception-V3模型以及用ImageNet数据集预训练得出的权重参数；(1) Load the Inception-V3 model with the fully connected layer removed and the weight parameters obtained from pre-training with the ImageNet dataset;

(二)由(一)中得到的初始化后的Inception-V3模型上添加全连接层结构，并在全连接层中加入Dropout策略(是防止模型过拟合的一种方法)，比率设置为0.75；提取一个2048维的特征向量；(2) Add a fully connected layer structure to the initialized Inception-V3 model obtained in (1), and add a Dropout strategy to the fully connected layer (a method to prevent the model from overfitting), and set the ratio to 0.75 ;Extract a 2048-dimensional feature vector;

(三)将除了全连接层以外的所有特征提取层冻结，然后将学习率设为0.001，利用预处理后的MSLPP数据训练集训练1个epoch，迭代550次；(3) Freeze all feature extraction layers except the fully connected layer, then set the learning rate to 0.001, use the preprocessed MSLPP data training set to train 1 epoch, and iterate 550 times;

(四)将所有层解冻，利用微调迁移学习，继续用MSLPP数据训练集进行训练，采用随机梯度下降的方法，初始学习率设为0.01，训练100个epoch，每个epoch迭代550次，每结束一个epoch，利用验证集测试模型准确率，若准确率较上次提高，保存此次训练参数，若准确率降低，则利用之前保存的参数继续训练。批样本数batch_size设置为32，动量momentum设置为0.9。(4) Unfreeze all layers, use fine-tuning transfer learning, continue to use the MSLPP data training set for training, use the stochastic gradient descent method, set the initial learning rate to 0.01, train 100 epochs, and each epoch iterates 550 times. One epoch, use the verification set to test the accuracy of the model. If the accuracy is higher than last time, save the training parameters. If the accuracy is lower, use the previously saved parameters to continue training. The number of batch samples batch_size is set to 32, and the momentum momentum is set to 0.9.

(批样本数batch_size是指一个bitch中所包含的样本数，是训练过程中需要设置的一个参数。)(The number of batch samples batch_size refers to the number of samples contained in a bit, which is a parameter that needs to be set during the training process.)

((2)(3)(4)中所提到的数值均为训练模型时所设置的具体参数，虽然可在一定范围之内调整，但调整训练参数需要一定的技巧(根据训练集和验证集的召回率和损失函数值进行调整)，若范围内任意调整无法保证训练的结果。)(The values mentioned in (2)(3)(4) are the specific parameters set when training the model. Although they can be adjusted within a certain range, adjusting the training parameters requires certain skills (according to the training set and verification Adjust the recall rate and loss function value of the set), if any adjustment within the range cannot guarantee the training results.)

(三)步骤中冻结特征提取层只选择全连接层进行训练，应在小范围内更新权值，以免破坏预训练好的特征。(3) In the step of freezing the feature extraction layer, only the fully connected layer is selected for training, and the weights should be updated in a small range to avoid destroying the pre-trained features.

上述小范围指的是在冻结其他层、只改变全连接层的情况下，训练时模型内部的权重系数变化不大。此句是对步骤三中做法的解释说明，不是需要手动调节的参数。The above-mentioned small range means that when other layers are frozen and only the fully connected layer is changed, the weight coefficients inside the model do not change much during training. This sentence is an explanation of the method in step 3, not a parameter that needs to be adjusted manually.

基于卷积神经网络的晶状体浑浊程度检测系统，其特征在于：该系统包括图像预处理模块和检测模块；A lens turbidity detection system based on a convolutional neural network, characterized in that: the system includes an image preprocessing module and a detection module;

图像预处理模块对待测图像进行预处理形成预处理图像数据；将图像预处理模块中的图像数据代入检测模块中的晶状体浑浊程度检测学习模型中实现检测；The image preprocessing module preprocesses the image to be tested to form preprocessed image data; the image data in the image preprocessing module is substituted into the lens turbidity detection learning model in the detection module to realize detection;

检测模块包括MSLPP数据集构件模块、图像预处理模块和模型训练模块；The detection module includes MSLPP dataset component module, image preprocessing module and model training module;

MSLPP数据集构件模块构建MSLPP数据集；然后通过图像预处理模块对数据集中的图像进行预处理；The MSLPP data set component module constructs the MSLPP data set; then the image in the data set is preprocessed by the image preprocessing module;

模型训练模块对图像预处理模块预处理之后的图像形成的数据集进行模型训练得学习模型。The model training module performs model training on the data set formed from the image preprocessed by the image preprocessing module to obtain a learning model.

优点效果：Advantages and effects:

本发明提供一种基于卷积神经网络的晶状体浑浊程度检测方法，本发明设计了一个基于卷积神经网络的晶状体浑浊程度特征自动学习模型，具体流程如图1所示。首先，为了解决当前晶状体浑浊程度数据集缺乏的问题，本发明收集了临床中由裂隙灯采集的眼部晶体图像，并由眼部采样技师将其分为正常、早期状体浑浊及晶状体浑浊三类，构建MSLPP数据集；然后对图像进行预处理，主要操作为光照增强和数据量的扩增；最后，为了解决自动提取深度特征的问题，本发明利用ImageNet预训练过的Inception-V3模型及参数，并采用迁移学习的思想进行训练从而得到分类模型，该系统完成后可实现通过手机APP实时进行晶状体浑浊研究对比。The present invention provides a lens turbidity detection method based on a convolutional neural network. The present invention designs an automatic learning model of lens turbidity characteristics based on a convolutional neural network. The specific process is shown in FIG. 1 . First of all, in order to solve the problem of the current lack of data sets of lens turbidity, the present invention collects eye lens images collected by slit lamps in clinical practice, and the eye sampling technician divides them into three categories: normal, early lens turbidity and lens turbidity. class, build the MSLPP data set; then the image is preprocessed, the main operations are illumination enhancement and data volume amplification; finally, in order to solve the problem of automatically extracting depth features, the present invention utilizes the Inception-V3 model and parameters, and use the idea of transfer learning to train to obtain a classification model. After the completion of the system, real-time research and comparison of lens opacity can be realized through the mobile phone APP.

附图说明Description of drawings

图1为基于卷积神经网络的晶状体浑浊特征自动学习模型流程图；Fig. 1 is the flow chart of automatic learning model of lens opacity feature based on convolutional neural network;

图2为MSLPP数据集部分样本图像；Figure 2 is a partial sample image of the MSLPP dataset;

图3为光照增强前后对比图；Figure 3 is a comparison diagram before and after illumination enhancement;

图4为数据扩增对比图；Figure 4 is a comparison chart of data amplification;

图5为迁移学习策略训练示意图；Fig. 5 is a schematic diagram of transfer learning strategy training;

图6为模型可视化特征图；Figure 6 is a model visualization feature map;

图7为平移的举例示意图。平移：举例：左图为原图，右图为图片向右平移x个像素，再向下平移y个像素之后的图像。Fig. 7 is a schematic diagram of an example of translation. Translation: For example: the left image is the original image, and the right image is the image after the image is shifted to the right by x pixels, and then moved down by y pixels.

具体实施方式Detailed ways

基于卷积神经网络的晶状体浑浊程度检测方法，该方法步骤如下：A method for detecting the degree of lens turbidity based on a convolutional neural network, the steps of which are as follows:

(1)平移：将光照增强后的图像上下左右分别平移5-20个像素点；使得图像平移扩增至原来的倍数，该倍数为平移的次数；(1) Translation: translate the image after illumination enhancement by 5-20 pixels up, down, left, and right; make the image translation amplified to the original multiple, which is the number of translations;

(2)旋转：将光照增强后的图像顺逆旋转5°-20°；使得图像旋转扩增至原来的倍数，该倍数为旋转的次数；(2) Rotation: Rotate the image after illumination enhancement by 5°-20° forward and backward; make the image rotation amplified to the original multiple, which is the number of rotations;

迁移学习及训练的步骤为：The steps of transfer learning and training are:

第一步，基于ImageNet图像标注的数据集(是ImageNet数据集，里面包含的是自然图像)，在Inception-V3模型上进行预训练，提取一个2048维的特征向量；The first step is to pre-train on the Inception-V3 model based on the ImageNet image annotation dataset (the ImageNet dataset, which contains natural images), and extract a 2048-dimensional feature vector;

第二步，将所述特征向量输入一个单层的全连接神经网络，使用一个包含Softmax分类器的单层全连接神经网络，再经过预处理后的MSLPP数据集训练后即得到最终分类结果。In the second step, the feature vector is input into a single-layer fully-connected neural network, and a single-layer fully-connected neural network including a Softmax classifier is used, and the final classification result is obtained after training on the preprocessed MSLPP data set.

第一步中在Inception-V3模型上进行预训练过程为：The pre-training process on the Inception-V3 model in the first step is:

(二)由(一)中得到的初始化后的Inception-V3模型上添加全连接层结构，并在全连接层中加入Dropout策略，比率设置为0.75；提取一个2048维的特征向量；ImageNet数据集是已经标注好的公开数据集，属于公知。(2) Add a fully connected layer structure to the initialized Inception-V3 model obtained in (1), and add a Dropout strategy to the fully connected layer, with the ratio set to 0.75; extract a 2048-dimensional feature vector; ImageNet dataset It is a public dataset that has been marked and belongs to public knowledge.

第二步中的使用一个包含Softmax分类器的单层全连接神经网络，再经过预处理后的MSLPP数据集训练后即得到最终分类结果的步骤如下：In the second step, a single-layer fully connected neural network including a Softmax classifier is used, and the final classification result is obtained after training with the preprocessed MSLPP data set as follows:

(1)将除了全连接层以外的所有特征提取层冻结(这里只有全连接层对应的就是上面所述的单层)，然后将学习率设为0.001，利用预处理后的MSLPP数据训练集训练1个epoch，迭代550次；(1) Freeze all feature extraction layers except the fully connected layer (here only the fully connected layer corresponds to the single layer mentioned above), then set the learning rate to 0.001, and use the preprocessed MSLPP data training set to train 1 epoch, 550 iterations;

(2)将所有层解冻，利用微调迁移学习，继续用MSLPP数据训练集进行训练，采用随机梯度下降的方法，初始学习率设为0.01，训练100个epoch，每个epoch迭代550次，每结束一个epoch，利用验证集测试模型准确率，若准确率较上次提高，保存此次训练参数，若准确率降低，则利用之前保存的参数继续训练。批样本数batch_size设置为32，动量momentum设置为0.9。(2) Unfreeze all layers, use fine-tuning transfer learning, continue to use the MSLPP data training set for training, use the stochastic gradient descent method, set the initial learning rate to 0.01, train 100 epochs, and each epoch iterates 550 times. One epoch, use the verification set to test the accuracy of the model. If the accuracy is higher than last time, save the training parameters. If the accuracy is lower, use the previously saved parameters to continue training. The number of batch samples batch_size is set to 32, and the momentum momentum is set to 0.9.

该系统包括图像预处理模块和检测模块；The system includes an image preprocessing module and a detection module;

下面对本发明进行进一步的详细说明：The present invention is further described in detail below:

本发明设计了一个基于卷积神经网络的晶状体浑浊程度检测方法，具体流程如图1所示。首先，为了解决当前晶状体浑浊数据集缺乏的问题，本发明收集了临床中由裂隙灯采集的眼部晶体图像，并由眼科医生将其分为正常、早期晶状体浑浊及晶状体浑浊三类，构建MSLPP数据集；然后对图像进行预处理，主要操作为光照增强和数据量的扩增；最后，为了解决自动提取深度特征的问题，本发明利用ImageNet预训练过的Inception-V3模型及参数，并采用迁移学习的思想进行训练从而得到分类模型，该系统完成后可实现通过手机APP实时进行晶状体浑浊研究。并可以利用该数据作为中间数据对皮质性晶状体浑浊、核性晶状体浑浊及后囊性晶状体浑浊普遍参考适用。The present invention designs a lens turbidity detection method based on a convolutional neural network, and the specific process is shown in FIG. 1 . First of all, in order to solve the problem of lack of current lens turbidity data sets, the present invention collects the ocular lens images collected by the slit lamp in clinic, and divides them into three categories by ophthalmologists: normal, early lens turbidity and lens turbidity, and constructs MSLPP data set; then the image is preprocessed, the main operations are illumination enhancement and data volume amplification; finally, in order to solve the problem of automatically extracting depth features, the present invention utilizes the Inception-V3 model and parameters pre-trained by ImageNet, and adopts The idea of transfer learning is trained to obtain a classification model. After the system is completed, it can realize real-time research on lens opacity through the mobile phone APP. And the data can be used as intermediate data for general reference of cortical lens opacity, nuclear lens opacity and posterior capsular lens opacity.

数据集与图像预处理Dataset and Image Preprocessing

MSLPP数据集MSLPP dataset

数据库是实现深度学习系统的重要组成部分，高质量的数据库可以增强系统筛查的准确性。但由于目前缺乏大型公开已标记的裂隙灯眼部晶体图像数据集，因此需要构建用于晶状体浑浊分类的数据集。本发明采用的数据集为与沈阳艾洛博智能科技有限公司和沈阳何氏眼科集团合作开发，并将其命名为MSLPP(Marked Slit Lamp Picture Project)的数据集。The database is an important part of implementing a deep learning system, and a high-quality database can enhance the accuracy of system screening. However, due to the lack of large publicly available datasets of labeled slit-lamp ocular lens images, a dataset for lens opacity classification needs to be constructed. The data set used in the present invention is developed in cooperation with Shenyang Ai Luobo Intelligent Technology Co., Ltd. and Shenyang He's Ophthalmology Group, and it is named MSLPP (Marked Slit Lamp Picture Project) data set.

MSLPP数据集共包含16239张图片，其中晶状体浑浊眼部样本图像5302张，早期晶状体浑浊眼部样本图像5400张，正常人的眼部样本图像5537张，图像采集于2015年到2018年，来自于2864个健康人和5532个晶状体浑浊者。The MSLPP data set contains a total of 16,239 images, including 5,302 eye sample images of lens turbidity, 5,400 eye sample images of early lens turbidity, and 5,537 eye sample images of normal people. The images were collected from 2015 to 2018, from 2864 healthy people and 5532 lens opacities.

该数据集中所收集的图像均为裂隙灯拍摄的眼部样本图像，所使用的裂隙灯主要为台式裂隙灯和手机裂隙灯，该数据库部分样本图像如图2所示。由图可以看出，将裂隙光聚焦到瞳孔区，内部呈透明或浅黄底色即是正常的晶体，如图2(a)所示；内部呈透明见黄底，光斑略暗即为早期晶状体浑浊，如图2(b)所示；若内部有明显浑浊，病灶位置可见即为晶状体浑浊，如图2(c)所示。The images collected in this data set are all eye sample images taken by slit lamps. The slit lamps used are mainly desktop slit lamps and mobile phone slit lamps. Some sample images of the database are shown in Figure 2. It can be seen from the figure that when the slit light is focused to the pupil area, the inside is transparent or light yellow, which is a normal lens, as shown in Figure 2(a); the inside is transparent and yellow, and the light spot is slightly dark, which is the early lens Turbidity, as shown in Figure 2(b); if there is obvious turbidity inside, and the lesion can be seen, it means that the lens is cloudy, as shown in Figure 2(c).

根据眼部晶状体混浊部位的不同又分为皮质性晶状体浑浊、核性晶状体浑浊和后囊性晶状体浑浊。核性晶状体浑浊，初期核为黄色，随病情进展核的颜色逐渐加深而呈黄褐色、棕色、棕黑色甚至黑色，如图2(c)-1所示；皮质性晶状体浑浊，初期在晶状体皮质中可见到有空泡和水隙形成，水隙从周边向中央扩大，形成轮辐状混浊，晶状体周边前、后皮质出现楔形混浊，呈羽毛状，随着晶状体浑浊的深入晶状体混浊加重直至呈乳白色完全混浊，如图2(c)-2所示；后囊性晶状体浑浊，用裂隙灯显微镜检查可以看到后囊下由许多黄色小点、小空泡、结晶样颗粒构成的盘状浑浊，如图2(c)-3所示。在实际操作中，核性晶状体浑浊及皮质性晶状体浑浊比较常见，而后囊性晶状体浑浊较少。According to the different parts of the eye lens opacity, it is divided into cortical lens opacity, nuclear lens opacity and posterior capsular lens opacity. In nuclear lens turbidity, the initial nucleus is yellow, and as the disease progresses, the color of the nucleus gradually becomes yellowish-brown, brown, brown-black or even black, as shown in Figure 2(c)-1; The formation of vacuoles and water gaps can be seen in the middle, and the water gaps expand from the periphery to the center, forming spoke-shaped opacities. Wedge-shaped opacities appear in the anterior and posterior cortex of the lens, which is feather-like. As the lens opacity deepens, the lens opacity increases until it becomes milky white Complete turbidity, as shown in Figure 2(c)-2; the posterior capsule lens is turbid, and a disk-shaped turbidity composed of many yellow dots, small vacuoles, and crystal-like particles under the posterior capsule can be seen under the slit lamp microscope. As shown in Figure 2(c)-3. In practice, nuclear lens opacities and cortical lens opacities are more common, while posterior capsular lens opacities are less common.

MSLPP数据集有如下特点：The MSLPP data set has the following characteristics:

(1)样本均采自何氏眼科医院，由眼科医生利用裂隙灯设备为待筛查者拍摄其眼部晶体图像，并由何氏眼科三名主治医师进行筛查分类；(1) The samples were all collected from He's Eye Hospital, and the ophthalmologists used slit lamp equipment to take images of the eye crystals of the candidates to be screened, and the three attending physicians of He's Ophthalmology performed screening and classification;

(2)样本采集环境复杂多样，亮环境、暗环境及一些反光等情况均有涉及；(2) The sample collection environment is complex and diverse, involving bright environments, dark environments, and some reflections;

(3)样本包含种类多样，可针对核性晶状体浑浊，皮质性晶状体浑浊及后囊性晶状体浑浊。(3) The samples contain a variety of types, which can be used for nuclear lens opacities, cortical lens opacities and posterior capsular lens opacities.

光照增强light enhancement

亮度是图像处理过程中重点关注的部分。该数据集采集时，由于实际筛查环境复杂多样，所拍摄样本图像亮度差异大，会影响深度学习准确率，因此，需要对样本图像进行光照调节，突出样本特征，减小由图像亮度差异带来的影响，具体做法如下：Brightness is the most important part of image processing. During the collection of this data set, due to the complex and diverse actual screening environment, the brightness of the captured sample images varies greatly, which will affect the accuracy of deep learning. The impact is as follows:

将输入图像尺寸压缩为299×299，设任意一点A(x,y)的三通道像素则图像的平均像素值可表示为：Compress the input image size to 299×299, set the three-channel pixel of any point A(x,y) Then the average pixel value of the image Can be expressed as:

若则图像各点像素值变为若则图像各点像素值变为若则图像各点像素值不变。图3为光照增强前后对比图。like Then the pixel value of each point in the image becomes like Then the pixel value of each point in the image becomes like Then the pixel value of each point in the image remains unchanged. Figure 3 is a comparison before and after illumination enhancement.

数据扩增data augmentation

为了避免模型训练时发生过拟合情况，在进行图像预处理时，需要对样本进行数量扩增处理，有利于改善模型的性能，提高图像分类准确率。本发明所采用的数据扩增方式有以下三种：In order to avoid overfitting during model training, it is necessary to increase the number of samples during image preprocessing, which is conducive to improving the performance of the model and improving the accuracy of image classification. The data amplification mode that the present invention adopts has following three kinds:

(1)平移：将图像上下左右分别平移12个像素点(1) Translation: translate the image up, down, left, and right by 12 pixels

(2)旋转：将图像顺逆旋转15°(2) Rotation: Rotate the image forward and reverse by 15°

(3)镜像：将图像上下左右各镜像一次(3) Mirror image: Mirror the image up, down, left, and right once

由于台式裂隙灯和便携式裂隙灯的光照入射方式不同，台式裂隙灯光线从右侧倾斜30°射入，晶体切面显示在裂隙光线左侧，而手持式裂隙灯光线从左侧倾斜30°射入，切面显示在光线右侧，为了消除此特征对系统产生的影响，我们在预处理中将对图片进行左右镜像处理。具体扩增前后对比图如图4所示，三行由上到下分别为晶状体浑浊、早期晶状体浑浊和正常图像的图像数量增强前后对比图。Due to the difference in light incidence between the desktop slit lamp and the portable slit lamp, the light of the desktop slit lamp is incident from the right at an angle of 30°, and the crystal section is displayed on the left side of the slit light, while the light of the hand-held slit lamp is incident from the left at an angle of 30° , the cut plane is displayed on the right side of the light. In order to eliminate the impact of this feature on the system, we will mirror the image left and right in the preprocessing. The specific comparison chart before and after amplification is shown in Figure 4, and the three lines from top to bottom are the comparison charts before and after the image quantity enhancement of lens opacity, early lens opacity and normal image respectively.

模型与方法Models and Methods

卷积神经网络convolutional neural network

本发明采用的方法是卷积神经网络，主要分为卷积层、池化层和全连接层三个部分。最终选用的卷积神经网络结构是谷歌基于GoogLeNet(ILSVRC2014,the ImageNetLarge Scale Visual Recognition Challenge of 2014比赛的冠军模型)提出的Inception-V3模型。The method adopted in the present invention is a convolutional neural network, which is mainly divided into three parts: a convolutional layer, a pooling layer and a fully connected layer. The final convolutional neural network structure is the Inception-V3 model proposed by Google based on GoogLeNet (ILSVRC2014, the champion model of the ImageNet Large Scale Visual Recognition Challenge of 2014 competition).

在整个网络中，利用卷积层和池化层提取图像中的有效特征，在网络中引入非线性激活函数，减少有效特征所占维度，输出能够表示输入图像的高级特征，最后，由全连接层将这些特征用于对所要筛查的输入图像的分类。除了上述所提到的基本网络结构，本发明还在最后的全连接层中增加了Dropout策略，能够有效避免过拟合，提高网络泛化能力，加快网络的训练过程。In the whole network, the convolutional layer and the pooling layer are used to extract the effective features in the image, the nonlinear activation function is introduced in the network, the dimension occupied by the effective features is reduced, and the advanced features that can represent the input image are output. Finally, the full connection The layers use these features to classify the input image to be screened. In addition to the basic network structure mentioned above, the present invention also adds a Dropout strategy in the last fully connected layer, which can effectively avoid over-fitting, improve the network generalization ability, and speed up the training process of the network.

迁移学习transfer learning

在医学图像领域，缺乏大量公开已标注的数据集将是深度学习应用于医疗图像处理中的难题之一。在样本不足的情况下，易导致模型训练过程中出现不收敛或者所训练出来的模型泛化能力差等一系列问题。因此，本发明采用迁移学习的方法，来解决以上问题。In the field of medical images, the lack of a large number of publicly annotated datasets will be one of the difficulties in applying deep learning to medical image processing. In the case of insufficient samples, it is easy to cause a series of problems such as non-convergence in the model training process or poor generalization ability of the trained model. Therefore, the present invention adopts the method of transfer learning to solve the above problems.

基于卷积神经网络模型迁移学习的晶状体浑浊分类方法流程图如图5所示。首先，基于ImageNet图像标注的数据集，在Inception-V3模型上进行预训练，提取一个2048维的特征向量。这一阶段充分利用知识迁移，使用预训练权重进行特征提取，不对Inception-V3的权重参数进行训练，与传统方法相比，提取特征更加高效。然后，将特征向量输入一个单层的全连接神经网络，因为训练好的Inception-V3模型已经将原始的图像抽象成更加容易分类的特征向量，因此使用一个包含Softmax分类器的单层全连接神经网络，再经过已分类的晶状体浑浊晶体图像训练后即得到最终分类结果。这一阶段，输入的特征向量主要承担对分类器的训练任务，使得分类器能够更好地基于已提取的特征完成分类。The flow chart of the lens opacity classification method based on the convolutional neural network model transfer learning is shown in Figure 5. First, based on the ImageNet image annotation dataset, the Inception-V3 model is pre-trained to extract a 2048-dimensional feature vector. This stage makes full use of knowledge transfer, uses pre-trained weights for feature extraction, and does not train the weight parameters of Inception-V3. Compared with traditional methods, feature extraction is more efficient. Then, the feature vector is input into a single-layer fully connected neural network, because the trained Inception-V3 model has abstracted the original image into a feature vector that is easier to classify, so a single-layer fully connected neural network containing a Softmax classifier is used The network is trained on the classified cloudy crystal images of the lens to obtain the final classification result. At this stage, the input feature vector is mainly responsible for the training task of the classifier, so that the classifier can better complete the classification based on the extracted features.

实验experiment

实验数据Experimental data

MSLPP数据集共包含5302张晶状体浑浊晶体图像，5400张早期晶状体浑浊晶体图像，5537张正常晶体图像。本发明将该数据集划分训练集、验证集和测试集，其中，在三个类别中分别随机取出500张作为测试集，其余样本按6:1的比例随机分为训练集和验证集。其中，训练集共12630张，包含晶状体浑浊4083张、早期晶状体浑浊4241张、正常4306张；验证集共2109张，包含晶状体浑浊719张、早期晶状体浑浊659张、正常731张。具体各类别下图片数量如表1所示。The MSLPP dataset contains a total of 5302 turbid crystal images, 5400 early turbid crystal images, and 5537 normal crystal images. The present invention divides the data set into a training set, a verification set and a test set, wherein 500 samples are randomly selected from the three categories as a test set, and the remaining samples are randomly divided into a training set and a verification set according to a ratio of 6:1. Among them, the training set has a total of 12,630 images, including 4,083 lens opacities, 4,241 early lens opacities, and 4,306 normal images; the verification set has a total of 2,109 images, including 719 lens opacities, 659 early lens opacities, and 731 normal images. The number of pictures in each category is shown in Table 1.

表1各分类下图片数量对比Table 1 Comparison of the number of pictures under each category

将训练集和验证集数据进行扩增，测试集保持不变。扩增后，训练集和验证集的总数量由原来的14739张增加到了132651张，其中晶状体浑浊患者晶体裂隙灯图像变为43218张，早期晶状体浑浊患者晶体裂隙灯图像变为44100张，正常人的晶体裂隙灯图像变为45333张，具体各分类数量见表2。The training set and validation set data are augmented, and the test set remains unchanged. After the expansion, the total number of training and verification sets increased from 14,739 to 132,651, of which 43,218 were lens slit lamp images of patients with lens opacity, and 44,100 were images of crystal slit lamp images of patients with early lens opacity. The number of crystal slit lamp images changed to 45333, and the specific number of each category is shown in Table 2.

表2扩增前后数量对比图Table 2 Quantity comparison before and after amplification

训练过程training process

本发明的所有代码都是以Keras(Keras是一个高层神经网络API，由纯Python编写而成)为前端、以TensorFlow(TensorFlow是谷歌研发的第二代人工智能学习系统)为后端完成的，该框架基于Ubuntu16.04(64位)+CUDA9.1+CUDNN9.0系统。采用的编程语言为python。All codes of the present invention are completed with Keras (Keras is a high-level neural network API written in pure Python) as the front end and TensorFlow (TensorFlow is the second-generation artificial intelligence learning system developed by Google) as the back end. The framework is based on Ubuntu16.04 (64-bit)+CUDA9.1+CUDNN9.0 system. The programming language used is python.

训练过程如下：The training process is as follows:

(1)加载去掉全连接层的Inception-V3模型以及用ImageNet数据集预训练得出的权重参数；(1) Load the Inception-V3 model with the fully connected layer removed and the weight parameters obtained by pre-training with the ImageNet dataset;

(2)在初始化后的Inception-V3网络上添加全连接层结构，并在全连接层中加入Dropout策略，比率设置为0.75；(2) Add a fully connected layer structure to the initialized Inception-V3 network, and add a Dropout strategy to the fully connected layer, and set the ratio to 0.75;

(3)将除了全连接层以外的所有特征提取层冻结，然后将学习率设为0.001，利用预处理后的训练集训练1个epoch，迭代550次；(冻结特征提取层可以只选择全连接层进行训练，一方面能够防止过拟合，另一方面，由于在一个已训练好的模型上继续训练，故应在小范围内更新权值，以免破坏预训练好的特征)(3) Freeze all feature extraction layers except the fully connected layer, then set the learning rate to 0.001, use the preprocessed training set to train 1 epoch, and iterate 550 times; (the frozen feature extraction layer can only choose full connection Layer training, on the one hand, can prevent overfitting, on the other hand, since the training continues on a trained model, the weights should be updated in a small range to avoid destroying the pre-trained features)

(4)将所有层解冻，利用微调迁移学习(fine-tune)，继续用MSLPP数据集进行训练，采用随机梯度下降的方法，初始学习率设为0.01，训练100个epoch，每个epoch迭代550次，每结束一个epoch，利用验证集测试模型准确率，若准确率较上次提高，保存此次训练参数，若准确率降低，则利用之前保存的参数继续训练。批样本数batch_size设置为32，动量momentum设置为0.9。(4) Unfreeze all layers, use fine-tune transfer learning (fine-tune), continue to use the MSLPP data set for training, use the stochastic gradient descent method, set the initial learning rate to 0.01, train 100 epochs, and each epoch iterates 550 Every time an epoch ends, use the verification set to test the accuracy of the model. If the accuracy is higher than last time, save the training parameters. If the accuracy is lower, use the previously saved parameters to continue training. The number of batch samples batch_size is set to 32, and the momentum momentum is set to 0.9.

利用quiver(针对Keras平台的一个可视化工具)将所得到的模型进行可视化，得到的特征图如图6所示。Use quiver (a visualization tool for the Keras platform) to visualize the obtained model, and the obtained feature map is shown in Figure 6.

系统评价system assesment

模型训练完成后，利用验证集对模型进行验证，其中晶状体浑浊的召回率为88.24％，早期晶状体浑浊的召回率为86.63％，正常的召回率为97.51％。随后，我们又用训练时模型未接触过的测试集进行测试，测试集包含图像共1500张，其中操作者判定为晶状体浑浊的有500张，早期晶状体浑浊500张，正常500张，经系统分类后情况如下表3所示。After the model training is completed, the verification set is used to verify the model. The recall rate of lens opacity is 88.24%, the recall rate of early lens opacity is 86.63%, and the recall rate of normal lens is 97.51%. Then, we tested with the test set that the model had not been exposed to during training. The test set contained a total of 1,500 images, of which 500 were judged to be cloudy by the operator, 500 were early cloudy, and 500 were normal, which were classified by the system. The latter situation is shown in Table 3 below.

本发明采用四种常用的指标来评估系统的性能：准确率Accuracy、召回率Recall、精确率Precision和F1指标F1_meature。准确率是分类性能的总体度量，它是分类正确的样本数与总样本数之比；召回率是所有正例样本中被分对的比例；精确率是被分为正例的样本中实际为正例的比例；F1指标是精确率和召回率的调和均值。它们的计算方法是：The present invention uses four commonly used indexes to evaluate the performance of the system: Accuracy, Recall, Precision and F1 index F1_meature. Accuracy is the overall measure of classification performance, which is the ratio of the number of correctly classified samples to the total number of samples; the recall rate is the proportion of all positive samples that are paired; the precision rate is the actual number of samples classified as positive samples. The proportion of positive examples; the F1 indicator is the harmonic mean of precision and recall. They are calculated as:

其中，TP、TN、FP、FN分别是真阳性(true positive)，真阴性(true negative)，假阳性(false positive)和假阴性(false negative)的数量。以晶状体浑浊样本为例，“真阳性”意味着晶状体浑浊样本被正确分类为晶状体浑浊。如果晶状体浑浊样本被错误地归为其他分类，我们将其称为“假阴性”。“真阴性”和“假阳性”的含义类似，“真阴性”意味着其他分类样本未被错误地归为晶状体浑浊，而“假阳性”意味着其他分类样本被错误地归为晶状体浑浊。Among them, TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively. Using the lens opacity sample as an example, a "true positive" means that the lens opacity sample was correctly classified as lens opacity. If a lens opacity sample is misclassified as another classification, we refer to this as a "false negative". The meanings of "true negative" and "false positive" are similar, with "true negative" meaning that other classified samples were not incorrectly classified as lens opacity, and "false positive" meaning that other classified samples were incorrectly classified as lens opacity.

按照上述测评方式，该系统的性能如下表4所示。According to the above evaluation method, the performance of the system is shown in Table 4 below.

表4模型可靠性判定Table 4 Model reliability judgment

基于卷积神经网络的晶状体浑浊程度检测系统，该系统包括图像预处理模块和检测模块；A lens turbidity detection system based on a convolutional neural network, which includes an image preprocessing module and a detection module;

Claims

1. the lenticular opacities degree detecting method based on convolutional neural networks, it is characterised in that: the method steps are as follows:

(1), pretreatment is carried out using illumination enhancing method to testing image and forms preprocessing image data；

(2), the preprocessing image data in (1) step is substituted into lenticular opacities degree detecting learning model and realizes detection.

2. the lenticular opacities degree detecting method according to claim 1 based on convolutional neural networks, it is characterised in that:

(2) construction method of the lenticular opacities degree detecting model in step is as follows:

(2.1), MSLPP data set is constructed；Then the image concentrated to data pre-processes；And by pretreated MSLPP number It is divided into training set, verifying collection and test set according to collection；

(2.2), it is muddy that the data set progress model training formed using the image after the pretreatment in (2.1) step obtains crystalline lens Turbid degree detecting learning model.

3. the lenticular opacities degree detecting method according to claim 2 based on convolutional neural networks, it is characterised in that: (2.1) building mode of MSLPP data set is as follows in step: the eye crystal image acquired in clinic by slit-lamp is collected, and It is classified as normal, early stage lenticular opacities and lenticular opacities three classes.

4. the lenticular opacities degree detecting method according to claim 2 based on convolutional neural networks, it is characterised in that: (2.1) preprocess method in step are as follows: first data set is handled using illumination enhancing method, then to illumination enhancing method processing Data set afterwards is handled using data amplification.

5. the lenticular opacities degree detecting method according to claim 1 or 4 based on convolutional neural networks, feature exist In:

Illumination enhances method with the following method:

By 299 × 299 pixel of input image size boil down to, if the triple channel pixel of image any point A (x, y)For [R(x,y),G(x,y),B(x,y)]^T, wherein R (x, y), G (x, y), B (x, y) respectively represent red, green, blue three in point A (x, y) The brightness value in a channel, the range of luminance values in three channels are 0-255, then the average pixel value of imageIt indicates are as follows:

IfThen image each point pixel value becomesIfThen image each point pixel value becomes IfThen image each point pixel value is constant.

6. the lenticular opacities degree detecting method according to claim 5 based on convolutional neural networks, it is characterised in that:

2.1) the data amplification in the pretreatment of step has following three kinds, in three kinds of methods selection one of them, two or All:

(1) it translates: the enhanced image of illumination is translated to 5-20 pixel respectively up and down；So that image translation is expanded to Multiple originally, the multiple are the number of translation；

(2) it rotates: by the enhanced image of illumination along 5 ° -20 ° of reverse rotation；So that image rotation is expanded to original multiple, it should Multiple is the number of rotation；

(3) mirror image: by the enhanced image of illumination, each mirror image is primary up and down, that is, spins upside down once, and left and right overturning is primary, So that image mirrors are expanded to original multiple, which is the number of mirror image；

If expanded image all just for the enhanced figure of the same illumination when two or the whole of three kinds of methods of selection wherein As carrying out, then enhanced image is used together.

7. the lenticular opacities degree detecting method according to claim 2 based on convolutional neural networks, it is characterised in that: (2.2) step model training recycles pretreated MSLPP data set pair after carrying out transfer learning using convolutional neural networks Model after transfer learning continues to train；

Transfer learning and the step of training are as follows:

The first step carries out pre-training based on the data set of ImageNet image labeling on Inception-V3 model, extracts one The feature vector of a 2048 dimension；

Described eigenvector is inputted the full Connection Neural Network of a single layer by second step, is classified using one comprising Softmax The full Connection Neural Network of the single layer of device obtains final classification result after training using pretreated MSLPP data set.

8. the lenticular opacities degree detecting method according to claim 7 based on convolutional neural networks, it is characterised in that:

Pre-training process is carried out in the first step on Inception-V3 model are as follows:

(1) load has been removed the Inception-V3 model of full articulamentum and has been obtained with ImageNet data set pre-training Weight parameter；

(2) full articulamentum structure is added on Inception-V3 network after initialization, and is added in full articulamentum Dropout strategy, ratio are set as 0.75；Extract the feature vector of one 2048 dimension；

One full Connection Neural Network of single layer comprising Softmax classifier of use in second step, using pretreated The step of obtaining final classification result after the training of MSLPP data set is as follows:

(1) all feature extraction layers other than full articulamentum are freezed, learning rate is then set as 0.001, utilize pre- place After reason MSLPP data training set training 1 epoch, iteration 550 times；

(2) all layers are thawed, using fine tuning transfer learning, continues to be trained with MSLPP data training set, using boarding steps The method of decline is spent, initial learning rate is set as 0.01, trains 100 epoch, and each epoch iteration 550 times is every to terminate one Epoch collects test model accuracy rate using verifying, if accuracy rate was improved compared with last time, saves this training parameter, if accuracy rate drops It is low, then continue to train using previously stored parameter；It criticizes sample number batch_size and is set as 32, momentum momentum is set as 0.9。

9. the lenticular opacities degree detecting method according to claim 8 based on convolutional neural networks, it is characterised in that: (3) freeze feature extract layer only selects full articulamentum to be trained in step, should update weight in a small range, in order to avoid destroy The good feature of pre-training.

10. lenticular opacities degree-measuring system of the base according to claim 1 based on convolutional neural networks, feature exist In: the system includes image pre-processing module and detection module；

Image pre-processing module carries out pretreatment to testing image and forms preprocessing image data；Detection module is by image preprocessing Image data in module, which substitutes into, realizes detection in the lenticular opacities degree detecting learning model in detection module；

Detection module includes MSLPP data set construction part module, image pre-processing module and model training module；

MSLPP data set construction part module constructs MSLPP data set；Then image data concentrated by image pre-processing module It is pre-processed；

The data set that model training module forms the image after image pre-processing module pretreatment, which carries out model training, to be learned Practise model.