CN113421250A - Intelligent fundus disease diagnosis method based on lesion-free image training - Google Patents
- Publication number
- CN113421250A (application number CN202110756395.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- decoder
- encoder
- loss
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T 7/0012 — Biomedical image inspection
- G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N 3/045 — Combinations of networks
- G06N 3/048 — Activation functions
- G06N 3/08 — Learning methods
- G06T 2207/20081 — Training; Learning
- G06T 2207/20084 — Artificial neural networks [ANN]
- G06T 2207/30041 — Eye; Retina; Ophthalmic
- G06T 2207/30168 — Image quality inspection
Abstract
The invention relates to an intelligent diagnosis method for fundus diseases based on training with non-pathological images, and belongs to the technical field of image classification and disease diagnosis. The method comprises the following steps: (1) construct a training set and a test set and complete preprocessing of the data set; (2) construct the encoder, decoder, discriminator and restoration-decoder models for training on non-pathological images; (3) construct an agent task based on image transformation; (4) construct a weighted loss function based on reconstruction loss, discrimination loss and restoration loss; (5) train the model; and (6) test the image to be examined with the trained encoder-decoder model. Through its image-reconstruction training regime, the method is freed from the requirement that different classes of data coexist in the training set; the agent task reduces the model's demand for data; constraints in both image space and feature space strengthen the model's learning of the tissue structure in the image; together, these characteristics improve the model's ability to recognize disease images.
Description
Technical Field
The invention relates to an intelligent diagnosis method for fundus diseases based on non-pathological image training, and belongs to the technical field of image classification and disease diagnosis.
Background
Fundus images are of great significance for medical diagnosis and are routinely used by ophthalmologists to diagnose a variety of diseases. Many diseases of the eye, as well as diseases affecting the blood circulation and the brain, are visible in fundus images, including blindness-causing macular degeneration and glaucoma, and complications of systemic diseases such as diabetic retinopathy and hypertension. Compared with other medical imaging modalities, the equipment for acquiring ophthalmic images is less demanding, which suits wide-range screening at the primary-care level and provides efficient diagnostic service to primary-care patients; the approach therefore has broad application prospects and practical social value. Artificial intelligence offers high speed and high accuracy in computer-aided diagnosis of medical images, and plays an important role in helping doctors analyze and identify lesions and in improving diagnostic efficiency.
Current medical image diagnosis algorithms are mainly based on deep neural networks, are trained with both healthy and diseased samples, and require a large amount of labeled data as a training basis. Clinically, labeled lesion data are rare, and for some novel diseases it is very difficult to obtain a large number of lesion labels in a short time. When only a small number of samples are available for training, model performance degrades. On the other hand, in medical image analysis, a large number of healthy samples cannot be used effectively because of the scarcity of lesion samples. Although some research has begun to explore classification algorithms trained on a single class of samples, such algorithms cannot yet be applied clinically because of long test times and low accuracy. Combining these two problems, how to build a high-performance diagnosis system without any lesion data, i.e., using only data from healthy subjects, is an open research problem in medical image analysis.
The aim of the invention is to address clinical disease diagnosis in the lesion-free-image scenario by using an unsupervised diagnosis algorithm combined with the distribution characteristics of fundus images in feature space. It provides a deep-learning method for intelligent fundus disease diagnosis trained on non-pathological images, assisting doctors in completing high-accuracy disease diagnosis.
Disclosure of Invention
The invention aims to remedy the following two defects of existing fundus image classification and diagnosis algorithms: 1) existing algorithms rely on a large amount of labeled data, and perform poorly when data are lacking; 2) they over-depend on balanced data labels, and perform poorly when only one class of labels is available. To this end, an intelligent fundus disease diagnosis method based on non-pathological image training is provided.
In order to achieve the above object, the present invention adopts the following technical solution.
The intelligent fundus disease diagnosis method based on the non-pathological image training is realized by the following steps:
Step one: preprocess the collected images to construct a data set, specifically: screen the collected images, eliminate images of poor quality, rescale the screened images to the same dimension W × W × c, and normalize pixel values to the range [-1, 1];
where c is greater than or equal to 1;
the data set is divided into a training set and a test set of clinically acquired ophthalmic images; the training set consists of ophthalmic images of healthy individuals, while the test set is a mixture of images of healthy and diseased individuals; images of poor quality specifically include: images that are too dark, images taken at a strongly deviated viewing angle, and images blurred by camera shake;
step two: designing a network model comprising an encoder, a decoder, a discriminator 1, a discriminator 2 and a restoration decoder, and specifically comprising the following substeps:
step 2.1, constructing an encoder of the multilayer convolution layer;
the encoder comprises N serially connected groups of "down-sampling layer, regularization layer, activation function"; the encoder input is an image of dimension W × W × c produced in step one, and the output is a 1 × 1 × 2^(N+1) feature vector z expressing the essence of the image;
the value of N is less than or equal to log2(W);
the down-sampling layer is a convolution layer with kernel size n × n and stride 2, and the number of channels increases from 2^2 by successive factors of 2 up to 2^(N+1);
the activation function is a LeakyReLU activation function with negative slope L;
the value of n is in [3, 5]; L is in [0, 1];
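As an aside, the shape bookkeeping implied by step 2.1 can be sketched in plain Python (this is an illustrative helper, not part of the patent: each stride-2 stage halves the spatial side while the channel count doubles from 2^2 up to 2^(N+1)):

```python
import math

def encoder_shapes(W, N):
    """Return [(side, channels), ...] after each of the N downsampling stages."""
    assert N <= math.log2(W), "the patent requires N <= log2(W)"
    shapes = []
    side, channels = W, 2 ** 2          # first stage outputs 2^2 channels
    for stage in range(N):
        side //= 2                      # a stride-2 convolution halves the side
        shapes.append((side, channels))
        channels *= 2                   # channel count doubles each stage
    return shapes
```

For W = 256 and N = 8 (the values used in Example 1 below), the final stage yields a 1 × 1 spatial map with 512 channels, matching the 1 × 1 × 512 feature vector z.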
step 2.2, constructing a decoder of the multilayer deconvolution layer;
the decoder comprises N serially connected groups of "up-sampling layer, regularization layer, activation function"; the decoder input is the feature vector z output by the encoder in step 2.1, and the output is an image of dimension W × W × c;
the up-sampling layer is a convolution layer with kernel size n × n and stride 2, and the number of channels decreases from 2^(N+1) by successive factors of 2 down to 2^2;
the activation function is a LeakyReLU activation function with negative slope L, and the activation function of the outermost layer is a Tanh function;
step 2.3, constructing a discriminator 1 of a feature space by adopting a multi-layer perceptron structure;
where the input of discriminator 1 is the feature vector z expressing the essence of the image, output in step 2.1; discriminator 1 comprises K serially connected fully-connected layers, Dropout layers and activation functions;
the number of neurons per fully-connected layer decreases from 2^K by successive factors of 2 down to 2^0;
the random drop probability of the Dropout layer is p;
except for the last layer, whose activation is sigmoid, the remaining K - 1 activation functions are LeakyReLU with negative slope L;
where K takes values in [5, log2(W)] and p takes values in [0, 1];
Step 2.4, constructing a discriminator 2 of an image space by adopting a PatchGAN structure;
where discriminator 2 comprises P convolution layers; each of the first P - 1 layers consists of a convolution layer, a regularization layer and an activation function;
the convolution layers have kernel size n × n and stride 2, and the number of channels increases from 2^2 by successive factors of 2 up to 2^P;
the final convolution layer is output directly, and the remaining P - 1 activation functions are LeakyReLU functions with negative slope L;
where P takes values in [5, log2(W)];
Step 2.5, a restoration decoder is constructed by adopting a plurality of layers of deconvolution layers;
the restoration decoder comprises N serially connected groups of "up-sampling layer, regularization layer, activation function"; its input is the output vector of the encoder of step two, and its output is an image of dimension W × W × c;
the up-sampling unit is a convolution layer with kernel size n × n and stride 2, and the number of channels decreases from 2^(N+1) by successive factors of 2 down to 2^2;
the activation function in the first N - 1 groups of "up-sampling layer, regularization layer, activation function" is a LeakyReLU activation function with negative slope L, and the activation function in the last group is a Tanh function;
Step three: construct the agent task, specifically: using the images in the training set, construct a restoration-based agent task by means of local pixel conversion, nonlinear transformation of image brightness, and local-area patching, comprising the following substeps:
Step 3.1, local pixel conversion: randomly exchange the pixel values at different positions within local areas of the image and output the randomly converted image, specifically: randomly select M1 image blocks whose side length is an integer in [1, 7], and randomly exchange the pixels inside each block;
Step 3.2, nonlinear transformation of image brightness, outputting the nonlinearly transformed image, specifically: construct a Bezier mapping curve B(t) from three randomly given control points according to formula (1), and map the image pixel values through this curve:
B(t) = (1 - t)^2 · P0 + 2t(1 - t) · P1 + t^2 · P2,  t ∈ [0, 1]   (1)
where t denotes the pixel luminance and P0, P1, P2 are three randomly chosen control points;
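The brightness transform of equation (1) can be sketched as follows (an illustrative implementation, not patent text; the sorting of the random control points is an added assumption to keep the mapping roughly monotone):

```python
import random

def bezier_map(t, p0, p1, p2):
    """Equation (1): map one normalized luminance t through the quadratic Bezier curve."""
    return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

def random_brightness_transform(pixels, rng=random):
    """Apply a randomly parameterized Bezier mapping to a list of luminances in [0, 1]."""
    p0, p1, p2 = sorted(rng.uniform(0.0, 1.0) for _ in range(3))
    return [bezier_map(t, p0, p1, p2) for t in pixels]
```

Note that B(0) = P0 and B(1) = P2, so the control points directly set the new darkest and brightest values.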
Step 3.3, local-area patching: fill randomly selected areas of the image with random pixel values to obtain the locally patched image;
specifically, randomly select M2 image blocks with integer side lengths in the image, and fill the pixels contained in each block with random noise values drawn from a uniform distribution;
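The two block-wise transforms of steps 3.1 and 3.3 can be sketched on an image stored as a list of lists of floats in [-1, 1] (a hedged illustration; block positions and sizes are chosen by the caller, since the patent leaves M1/M2 and the exact sampling to the implementer):

```python
import random

def shuffle_block(img, top, left, size, rng=random):
    """Step 3.1: randomly exchange the pixel values inside one local block (in place)."""
    coords = [(r, c) for r in range(top, top + size) for c in range(left, left + size)]
    values = [img[r][c] for r, c in coords]
    rng.shuffle(values)                      # same pixels, permuted positions
    for (r, c), v in zip(coords, values):
        img[r][c] = v

def patch_block(img, top, left, size, rng=random):
    """Step 3.3: fill one local block with uniform random noise in [-1, 1] (in place)."""
    for r in range(top, top + size):
        for c in range(left, left + size):
            img[r][c] = rng.uniform(-1.0, 1.0)
```

Shuffling preserves the multiset of pixel values (only structure is destroyed), while patching replaces them entirely; both give the restoration decoder a different recovery problem.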
step four: constructing a total loss function, specifically a weighted sum of image reconstruction loss, feature reconstruction loss, discrimination loss of an image space and a feature space and restoration loss based on an agent task;
where the image reconstruction loss L_rec constrains the difference between the real image and the reconstructed image with an L1 loss, computed as in (2):
L_rec = ‖x - De(En(x))‖_1   (2)
where x denotes the input image, En(·) and De(·) denote the encoder and the decoder respectively, En(x) is the encoded output of the input image x, De(En(x)) is the decoded output obtained by encoding and then decoding x, and ‖·‖_1 denotes the 1-norm;
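A minimal sketch of the L1 loss of equation (2) on flat pixel lists (illustrative; whether the 1-norm is summed or averaged over pixels is an implementation choice — the mean is used here):

```python
def l1_loss(x, x_hat):
    """Mean absolute error between two equally sized flat pixel lists."""
    assert len(x) == len(x_hat)
    return sum(abs(a - b) for a, b in zip(x, x_hat)) / len(x)
```

The same function serves for the feature reconstruction loss of equation (3), applied to feature vectors instead of pixels.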
the feature reconstruction loss L_feat measures the difference between the feature-space representation of the image and that of the reconstructed image, again with an L1 loss, computed as in (3):
L_feat = ‖z - En(De(z))‖_1   (3)
where z is the feature vector of the essence of the image output by the encoder in step 2.1, De(z) is the decoded output obtained by decoding the feature vector z, and En(De(z)) is the output of decoding and then re-encoding z;
the discriminator loss over the image space and the feature space constrains the outputs of the encoder and decoder to match real images and real features, computed as in (4):
L_adv = -log D_I(x̂) - log D_F(ẑ)   (4)
where D_I and D_F are the image-space discriminator (discriminator 2) and the feature-space discriminator (discriminator 1), respectively; x̂ = De(En(x)) is the output of passing the input image x through the encoder and the decoder; ẑ = En(De(z)) is the output of passing the feature vector z through the decoder and the encoder. D_I and D_F are themselves iteratively optimized through the following loss equations (5) and (6):
L_{D_I} = -log D_I(x) - log(1 - D_I(x̂))   (5)
L_{D_F} = -log D_F(z) - log(1 - D_F(ẑ))   (6)
where D_I(x) is the output of the image-space discriminator for a real input image x, and D_F(z) is the output of the feature-space discriminator for a real feature vector z;
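Read as standard binary-cross-entropy GAN terms, equations (4)-(6) can be sketched with plain floats standing in for the sigmoid discriminator outputs (a hedged reconstruction of the garbled source, not verbatim patent code):

```python
import math

def generator_adv_loss(d_fake_image, d_fake_feature):
    """Eq. (4): push reconstructions to be scored as real by both discriminators."""
    return -math.log(d_fake_image) - math.log(d_fake_feature)

def discriminator_loss(d_real, d_fake):
    """Eqs. (5)/(6): score real samples high and reconstructed (fake) ones low."""
    return -math.log(d_real) - math.log(1.0 - d_fake)
```

Both losses reach zero only in their respective ideal cases: the generator loss when the discriminators score the reconstructions as certainly real, the discriminator loss when it separates real from fake perfectly.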
the agent-task restoration loss L_res uses the restoration agent task to strengthen the encoder's ability to extract image features, computed as in (7):
L_res = ‖x - Dr(En(x_a))‖_1   (7)
where Dr(·) denotes the restoration decoder, x_a denotes the transformed image produced by the agent task, and Dr(En(x_a)) is the output of passing x_a through the encoder and the restoration decoder;
the total loss function is computed as in (8):
L_total = a·L_rec + b·L_feat + c·L_adv + d·L_res   (8)
where a, b, c and d are the weight coefficients of the image reconstruction loss, the feature reconstruction loss, the image-space and feature-space discrimination loss, and the restoration loss, respectively;
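The weighted sum of equation (8) is trivial to state in code; the default weight values below are purely illustrative, since the patent leaves a, b, c, d unspecified:

```python
def total_loss(l_rec, l_feat, l_adv, l_res, a=1.0, b=1.0, c=0.1, d=1.0):
    """Eq. (8): weighted sum of the four loss components."""
    return a * l_rec + b * l_feat + c * l_adv + d * l_res
```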
step five: model training to obtain a trained encoder and decoder, comprising the following substeps:
Step 5.1: input a healthy fundus image of a normal subject into the encoder and propagate forward to obtain the image feature vector; input that vector into the decoder to obtain the reconstructed image in the forward pass; input the reconstructed image into the encoder again and propagate forward to obtain the reconstructed feature vector. Take the three transformed images constructed in steps 3.1 to 3.3 as inputs of the encoder-restoration-decoder, with the original image as training label, to complete learning of the agent task, specifically: randomly transform the input healthy fundus image according to the agent task, input the transformed image into the encoder and the restoration decoder, and obtain the restored image in the forward pass;
step 5.2, calculating image reconstruction loss and characteristic reconstruction loss;
Step 5.3: input the reconstructed image into the image-space discriminator (discriminator 2) and the reconstructed feature vector into the feature-space discriminator (discriminator 1), and compute the discrimination losses of the image space and the feature space;
step 5.4, calculating the recovery loss of the agent task;
Step 5.5: perform back-propagation and parameter optimization, alternately optimizing the discriminators and the encoder-decoder;
Step 5.6: repeat steps 5.1-5.5 to traverse all images in the training set once, record the total loss value during the process and plot its curve, and adjust the learning rate once the loss curve has converged stably so that the model can continue learning;
step 5.7, storing the trained encoder and decoder;
Step six: test the images of the test set with the trained encoder and decoder, select a threshold by evaluating the images in batch, and output a conclusion on whether each image is normal, specifically:
step 6.1, the input image is processed by an encoder to obtain an image characteristic vector, then processed by a decoder to obtain a reconstructed image, and the reconstructed image is input into the encoder again to obtain a reconstructed characteristic vector;
step 6.2, calculating the difference d1 between the input image and the reconstructed image, and calculating the difference d2 between the image feature vector and the reconstructed feature vector;
Step 6.3: average the two differences d1 and d2 to obtain the score of the input image; the larger the score, the higher the probability that the image is a lesion image, and vice versa. Select the optimal threshold according to the test-set labels; if the score is larger than the threshold, the input image is judged to be a pathological image, otherwise it is judged to be a normal image;
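The test-time decision of steps 6.1-6.3 reduces to a few lines (an illustrative sketch; how d1 and d2 are normalized before averaging, and how the threshold is selected, are left open by the patent):

```python
def anomaly_score(d1, d2):
    """Average the image-level and feature-level reconstruction differences."""
    return (d1 + d2) / 2.0

def is_pathological(d1, d2, threshold):
    """True when the score exceeds the selected threshold (lesion image)."""
    return anomaly_score(d1, d2) > threshold
```

Because the model is trained only on healthy images, lesion images reconstruct poorly in both spaces, so their d1 and d2 — and hence the score — tend to be large.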
The disease diagnosis method trained on lesion-free images is thus completed through steps one to six.
Advantageous effects
Compared with existing disease diagnosis algorithms, the intelligent fundus disease diagnosis method based on non-pathological image training has the following beneficial effects:
1. The method trains directly on images of healthy subjects, without requiring a class-balanced training set containing lesion data; it effectively removes the dependence of existing algorithms on disease image data and fits the current clinical situation in which disease data are rare;
2. The method constrains image reconstruction in two dimensions, image space and feature space, improving the model's perception of the features of normal images;
3. On top of reconstruction, the method introduces discrimination losses in image and feature space, further improving the feature-learning capacity of each structural part of the model (the encoder and decoder) on the healthy training images, and thereby the model's ability to recognize disease images;
4. The method introduces an agent task: for a given amount of data, the model learns deep image boundary and structure information through restoration tasks of different forms, which helps the healthy-image reconstruction task and improves the model's performance in disease recognition.
Drawings
FIG. 1 is a schematic diagram of a network structure supported by the method for intelligently diagnosing fundus diseases based on non-pathological image training of the present invention;
FIG. 2 is a schematic flow chart of an example of the method for intelligently diagnosing fundus diseases based on the training of non-pathological images according to the present invention;
FIG. 3 is a test flow of the intelligent diagnosis method for fundus diseases based on the training of non-pathological images according to the present invention;
FIG. 4 is a ROC curve chart comparing the fundus disease intelligent diagnosis method based on the non-pathological image training with the prior method.
Detailed Description
The intelligent diagnosis method for fundus diseases based on the non-pathological image training of the invention is further explained and described in detail with reference to the accompanying drawings and embodiments.
Example 1
This embodiment describes a specific implementation of the fundus disease intelligent diagnosis method based on the non-pathological image training of the present invention.
The invention can be applied to disease screening in hospitals and medical institutions of different scales: medical images are acquired from the patient to be diagnosed and classified by the model, and whether the patient is a healthy or diseased individual is judged from the image differences computed in step six. In institutions of different sizes, the uniform image size W may be adjusted according to the computing power of the local equipment.
FIG. 1 is a schematic diagram of the network structure supporting the intelligent fundus disease diagnosis method based on non-pathological image training. In FIG. 1, En is the encoder, De the decoder, Dr the restoration decoder, D_F the feature-space discriminator and D_I the image-space discriminator; x is the input image, z the feature vector of the image, x̂ the reconstructed image, and x'_a the image restored by the agent task. The model parameters are updated through the reconstruction losses of the image space and the feature space, the discrimination losses of the image space and the feature space, and the restoration loss of the agent task;
fig. 2 is a flowchart of a disease identification algorithm in an embodiment of the present invention, taking a fundus OCT image as a specific example, including the following steps:
step A: constructing a training set and a testing set;
10,000 images acquired with optical coherence tomography (OCT) are used as the implementation images, with only lesion-free healthy data used as the training set; 200 healthy images and 200 lesion images are used as the test set;
after image quality screening, data that are too blurred or too dark and do not meet the requirements are removed, the images are scaled to 256 × 256 × 1 pixels, and pixel values are normalized to the interval [-1, 1];
completing the construction of a training set and a test set through the step A;
and B: constructing a network model;
an encoder, a decoder, a restoration decoder, discriminator model 1 and discriminator model 2 are constructed; the model parameters are updated according to the loss function, and the trained models, namely the encoder and the decoder, are saved for model testing;
Step B.1: the encoder model comprises 8 serially connected groups of "down-sampling layer, regularization layer, activation function"; its input is the 256 × 256 × 1 image produced in step A, and its output is a feature vector z of dimension 1 × 1 × 512; the down-sampling layer is a convolution layer with kernel size 4 × 4 and stride 2, and the number of channels increases from 4 to 512 by successive factors of 2; the activation function is a LeakyReLU activation function with a negative slope of 0.2;
Step B.2: the decoder comprises 8 serially connected groups of "up-sampling layer, regularization layer, activation function"; the decoder input is the output vector of the encoder, and the output is an image of dimension 256 × 256 × 1; the up-sampling layer is a convolution layer with kernel size 4 × 4 and stride 2, and the number of channels decreases from 512 to 4 by successive factors of 2; the activation function is a LeakyReLU activation function with a negative slope of 0.2, and the activation function of the outermost layer is a Tanh function;
Step B.3: the restoration decoder comprises 8 serially connected groups of "up-sampling layer, regularization layer, activation function"; its input is the output vector of the encoder, and its output is an image of dimension 256 × 256 × 1; the up-sampling unit is a convolution layer with kernel size 4 × 4 and stride 2, and the number of channels decreases from 512 to 4 by successive factors of 2; the activation function is a LeakyReLU activation function with a negative slope of 0.2, and the activation function of the outermost layer is a Tanh function;
step B.4, the input of the discriminator model 1 is the feature vector z output by the encoder; the discriminator model 1 comprises 7 serially connected fully-connected layers, Dropout layers and activation functions; the number of neurons halves successively from 128 to 1; the random discard parameter of the Dropout layers is 0.5; except for the last layer, whose activation function is sigmoid, the activation functions are LeakyReLU with a negative slope of 0.2;
step B.5, the discriminator model 2 comprises 5 convolution layers; each of the first 4 layers comprises a convolution layer, a regularization layer and an activation function; the convolution layers have a 4 × 4 kernel and a stride of 2, and the number of channels doubles successively from 4 to 32; the activation function is LeakyReLU with a negative slope of 0.2, and the output of the final convolution layer is used directly;
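Each output unit of a PatchGAN discriminator such as discriminator model 2 judges one image patch, whose size is the receptive field of the stacked convolutions. A sketch of the receptive-field arithmetic for 5 layers of 4 × 4 / stride-2 convolutions (assuming, since the text does not say, that the final layer also uses stride 2):

```python
def receptive_field(n_layers, kernel=4, stride=2):
    """Receptive field of n stacked conv layers with equal kernel/stride."""
    rf, jump = 1, 1           # field size and input-pixel step ("jump")
    for _ in range(n_layers):
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# One layer sees a 4x4 patch; the full 5-layer stack sees a 94x94 patch.
patch = receptive_field(5)
```

So each scalar in the discriminator's output map scores a roughly 94 × 94 region of the 256 × 256 input, which is what makes the image-space discrimination local rather than global.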
step C, constructing an agent task;
in the training images of step A, transformations among local pixel conversion, nonlinear luminance transformation and local area patching are randomly applied to each image;
the local pixel conversion randomly selects 6 image blocks whose side length is an integer in [1, 6], and the pixels within each selected block are randomly exchanged;
the nonlinear luminance transformation adjusts the pixel values of the image using the Bezier mapping curve constructed by formula (1), completing the transformation of the pixel values;
the local area patching randomly selects 6 image blocks in the image whose side length is an integer in [42, 51], and the pixels within these blocks are filled with random noise values drawn from a uniform distribution;
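The three agent-task transformations of step C can be sketched with NumPy as follows; this is an illustrative reading of the text (the block positions, the [0, 1] pixel range for the Bezier mapping, the control-point values, and the fixed random seed are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def local_pixel_shuffle(img, n_blocks=6, max_size=6):
    """Local pixel conversion: randomly permute pixels inside small blocks."""
    out = img.copy()
    h, w = out.shape
    for _ in range(n_blocks):
        s = int(rng.integers(1, max_size + 1))        # side length in [1, 6]
        y = int(rng.integers(0, h - s + 1))
        x = int(rng.integers(0, w - s + 1))
        block = out[y:y + s, x:x + s].ravel()         # copy of the block
        rng.shuffle(block)
        out[y:y + s, x:x + s] = block.reshape(s, s)
    return out

def bezier_brightness(t, p0, p1, p2):
    """Nonlinear luminance transformation via the quadratic Bezier of (1)."""
    return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

def local_patch(img, n_blocks=6, lo=42, hi=51):
    """Local area patching: overwrite random blocks with uniform noise."""
    out = img.copy()
    h, w = out.shape
    for _ in range(n_blocks):
        s = int(rng.integers(lo, hi + 1))             # side length in [42, 51]
        y = int(rng.integers(0, h - s + 1))
        x = int(rng.integers(0, w - s + 1))
        out[y:y + s, x:x + s] = rng.uniform(0.0, 1.0, (s, s))
    return out

img = rng.uniform(0.0, 1.0, (256, 256))               # stand-in fundus image
shuffled = local_pixel_shuffle(img)
brightened = bezier_brightness(img, 0.0, 0.8, 1.0)    # control points assumed
patched = local_patch(img)
```

Shuffling only reorders pixel values, so global statistics such as the image sum are preserved, while patching replaces content outright; this difference in severity is what gives the restoration decoder restoration tasks of different difficulty.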
step D: loss function construction and model training;
step D.1, constructing the model loss function; the data are denoted x, where x represents a healthy training image and x_a represents the image processed by the agent task; the method updates the parameters by optimizing a weighted loss function L, specifically including the following contents:
wherein a = 1, b = 10, c = 10 and d = 10 are the weights of the image reconstruction loss, the feature reconstruction loss, the discrimination loss and the restoration loss, respectively;
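Under the reading that the weights a, b, c, d of step D.1 multiply the image reconstruction, feature reconstruction, discrimination and restoration losses in that order (the ordering follows claim 10; the function and symbol names are illustrative), the weighted loss amounts to:

```python
import numpy as np

def l1(x, y):
    """L1 loss used by the reconstruction terms: mean absolute difference."""
    return float(np.mean(np.abs(x - y)))

def total_loss(x, x_rec, z, z_rec, l_dis, l_res,
               a=1.0, b=10.0, c=10.0, d=10.0):
    """Weighted total loss of step D.1 with a = 1 and b = c = d = 10."""
    return a * l1(x, x_rec) + b * l1(z, z_rec) + c * l_dis + d * l_res

# Toy example: perfect feature reconstruction, no adversarial/restoration term,
# so only the image reconstruction term (weight a = 1) contributes.
example = total_loss(np.ones((4, 4)), np.zeros((4, 4)),
                     np.ones(8), np.ones(8), l_dis=0.0, l_res=0.0)
# example -> 1.0
```

The tenfold weighting of the feature, discrimination and restoration terms relative to the plain image reconstruction reflects the method's emphasis on feature-space fidelity over raw pixel agreement.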
step E: model training, parameter updating and storage; as shown in FIG. 2, the processed normal images are first passed through the encoder-decoder-encoder in sequence to obtain the feature vector, the reconstructed image and the reconstructed feature vector, and the image reconstruction loss and the feature reconstruction loss are calculated; the reconstructed feature vector and the reconstructed image are then input into the discriminator model 1 and the discriminator model 2, respectively, to calculate the discriminator loss; the images of step A.2 are passed through the encoder-restoration decoder in sequence to obtain the restored feature vector and the reconstructed restored image, and the image restoration loss is calculated; the losses are weighted into the final loss according to step D.1, and back propagation and parameter optimization are performed, wherein Adam is selected as the optimizer, the batch size is 32, for each batch the discriminator is optimized once and then the generator is optimized once, and the learning rate is set to 2e-4; finally, the trained model is stored;
wherein the generator comprises an encoder, a decoder and a restoration decoder;
step F: model testing and image classification;
as shown in FIG. 3, the image is input into the model, the difference between the input image and the reconstructed image and the difference between the image features and the reconstructed image features are calculated, the average of the two is taken as the probability that the image is a lesion image, and lesion images are identified by threshold screening.
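The test-time scoring of step F reduces to averaging two L1 differences; a minimal sketch (the array values and the threshold are illustrative, with the threshold chosen from labelled test data as in step 6.3):

```python
import numpy as np

def anomaly_score(x, x_rec, z, z_rec):
    """Step F score: mean of image-space and feature-space L1 differences."""
    d1 = float(np.mean(np.abs(x - x_rec)))    # input vs reconstructed image
    d2 = float(np.mean(np.abs(z - z_rec)))    # feature vs reconstructed feature
    return 0.5 * (d1 + d2)

def classify(score, threshold):
    """Threshold screening: a high score marks a lesion image."""
    return "lesion" if score > threshold else "normal"

# An input whose reconstruction differs by 0.2 per pixel in image space
# (and not at all in feature space) scores 0.5 * (0.2 + 0.0) = 0.1 ...
score_bad = anomaly_score(np.zeros(4), np.full(4, 0.2),
                          np.zeros(4), np.zeros(4))
# ... which exceeds an illustrative threshold of 0.05, so it is flagged.
verdict = classify(score_bad, threshold=0.05)
```

Because the model is trained only on healthy images, reconstruction stays accurate for healthy inputs and degrades for pathological ones, which is why this score separates the two classes.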
Thus, the whole process of the disease diagnosis method based on non-pathological images is realized. The ROC curves of the experimental results are shown in FIG. 4; the curve of the method lies well above those of the other methods, showing that the method can distinguish healthy images from pathological images when trained without pathological images, and solving the problem that existing algorithms cannot perform clinical classification tasks when disease categories are missing.
The method trains directly on images of healthy people, without requiring a class-balanced training set containing pathological images; it effectively removes the dependence of existing algorithms on disease image data and suits the current clinical situation in which disease data are rare. Specifically: the image transformations of step A reduce the algorithm's demand for data volume; the training process of step C removes the dependence of existing algorithms on multi-class training images;
The method constrains the image reconstruction in two dimensions, the image space and the feature space, improving the model's perception of the features of normal images. Steps D.2 and D.3 compare the original input with the reconstructed output in the two spaces using the L1 loss; constraining this difference improves the model's perception of image features;
On the basis of reconstruction, the method introduces discrimination losses in the image and feature spaces, further improving each structural part of the model: the encoder and the decoder gain feature learning capability for the healthy training images, and the model's ability to handle disease images is further improved, embodied as follows: step D.4 uses the discriminator models with adversarial discrimination losses, improving the encoder's ability to encode images and the decoder's ability to reconstruct images from feature vectors; the model responds strongly to non-training images (disease images), improving its disease recognition capability.
The method proposes an agent task: given a certain amount of data, the model learns deep image information such as boundaries and structure through restoration tasks of different forms, which helps the healthy-image reconstruction task and improves the model's performance on disease recognition. Step C applies the agent-task transformations to the existing healthy images and, combined with the restoration loss of step D.6, promotes the model's learning of boundary and structural information according to the different tasks, strengthening the model's accuracy on the reconstruction task and further improving its recognition of disease images.
While the foregoing is directed to the preferred embodiment of the present invention, the invention is not limited to the embodiment and the drawings disclosed herein. Equivalents and modifications made without departing from the spirit of the disclosure are considered to be within the scope of the invention.
Claims (10)
1. An intelligent diagnosis method for fundus diseases based on non-pathological image training is characterized in that: the method is realized by the following steps:
the method comprises the following steps: preprocessing the collected images to construct a data set; wherein, the data set is divided into a training set and a testing set;
step two: designing a network model comprising an encoder, a decoder, a discriminator 1, a discriminator 2 and a restoration decoder, and specifically comprising the following substeps:
step 2.1, constructing an encoder of the multilayer convolution layer;
the encoder comprises N groups of down-sampling layers, a regularization layer and an activation function which are connected in series;
step 2.2, constructing a decoder of the multilayer deconvolution layer;
the decoder comprises N groups of up-sampling layers, a regularization layer and an activation function which are connected in series;
step 2.3, constructing a discriminator 1 of a feature space by adopting a multi-layer perceptron structure;
step 2.4, constructing a discriminator 2 of an image space by adopting a PatchGAN structure;
wherein, the discriminator 2 comprises P convolution layers; the front P-1 convolutional layers comprise convolutional layers, regularization layers and activation functions;
step 2.5, a restoration decoder is constructed by adopting a plurality of layers of deconvolution layers;
the recovery decoder comprises N groups of up-sampling layers, a regularization layer and an activation function which are connected in series;
step three: constructing an agent task, specifically: a reconstruction-based agent task is constructed using the images in the training set, namely by means of local pixel conversion, nonlinear image luminance transformation and local region patching, comprising the following substeps:
step 3.1, local pixel conversion, namely, randomly exchanging pixel values of different positions of a local area in an image and outputting the image after random conversion;
step 3.2, carrying out nonlinear transformation on the image brightness, and outputting the image subjected to the nonlinear transformation;
step 3.3, local area patching, namely, carrying out random pixel value filling on the randomly selected area in the image to obtain the patched image of the local area;
step four: constructing a total loss function, specifically a weighted sum of image reconstruction loss, feature reconstruction loss, discrimination loss of an image space and a feature space and restoration loss based on an agent task;
step five: model training to obtain a trained encoder and decoder, comprising the following substeps:
step 5.1, inputting the healthy fundus image of a normal person into the encoder and propagating forward to obtain the image feature vector; inputting the feature vector into the decoder to obtain the reconstructed image in the forward pass; inputting the reconstructed image into the encoder and propagating forward to obtain the reconstructed feature vector; taking the three images constructed in steps 3.1 to 3.3 as the input of the encoder-restoration decoder and the original image as the training label to complete the learning of the agent task, specifically: randomly transforming the input healthy fundus images according to the agent task, inputting the transformed images into the encoder and the restoration decoder, and obtaining the restored image in the forward pass;
step 5.2, calculating image reconstruction loss and characteristic reconstruction loss;
step 5.3, inputting the reconstructed image and the reconstructed feature vector into discriminators 1 and 2 of an image space and a feature space respectively, and calculating discrimination loss of the image space and the feature space;
step 5.4, calculating the recovery loss of the agent task;
step 5.5, performing back propagation and parameter optimization, and optimizing the discriminator and the encoder-decoder by adopting a mode of alternately optimizing the discriminator and the encoder-decoder;
step 5.6, repeating the steps 5.1-5.5, traversing all images in the training set once, recording the value of the total loss function in the process, drawing a curve, and adjusting the learning rate after the loss curve is converged stably so as to facilitate the model to continue learning;
step 5.7, storing the trained encoder and decoder;
step six: testing the images of the test set with the trained encoder and decoder, selecting a threshold for judging images in large batches, and outputting a conclusion on whether each image is normal, specifically:
step 6.1, the input image is processed by an encoder to obtain an image characteristic vector, then processed by a decoder to obtain a reconstructed image, and the reconstructed image is input into the encoder again to obtain the characteristic vector of the reconstructed image;
step 6.2, calculating the difference d1 between the input image and the reconstructed image, and calculating the difference d2 between the image feature vector and the reconstructed feature vector;
step 6.3, averaging the differences d1 and d2 as the score of the input image, wherein the larger the score, the higher the probability that the image is a lesion image, and vice versa; an optimal threshold is selected according to the labels of the test set; if the score is larger than the threshold, the input image is judged to be a lesion image, otherwise a normal image.
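The optimal-threshold selection of step 6.3 can be sketched as a sweep over candidate thresholds on the labelled test scores; maximizing accuracy is one reasonable criterion (the text does not fix the criterion, so this is an assumption):

```python
import numpy as np

def best_threshold(scores, labels):
    """Pick the score threshold that maximizes accuracy on a labelled set.
    labels: 1 = lesion, 0 = normal; score > threshold predicts lesion."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    best_t, best_acc = float(scores.min()) - 1.0, -1.0
    for t in np.sort(scores):                 # each observed score is a candidate
        pred = (scores > t).astype(int)
        acc = float(np.mean(pred == labels))
        if acc > best_acc:
            best_t, best_acc = float(t), acc
    return best_t, best_acc

# Healthy images score low, lesion images score high:
thr, acc = best_threshold([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
# thr -> 0.2, acc -> 1.0 (everything scoring above 0.2 is called a lesion)
```

Sweeping every observed score as a candidate threshold is the same enumeration an ROC curve performs, which matches the ROC evaluation shown in FIG. 4.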
2. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 1, wherein step one comprises: screening the collected images, eliminating images of poor quality, unifying the resolution of the screened images to the same dimension W × W × c, and normalizing the pixel values to the range [-1, 1].
3. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 2, wherein c is greater than or equal to 1.
4. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 3, wherein: both the training set and the test set adopt clinically collected ophthalmic images; the images in the training set consist of ophthalmic images of healthy individuals, and the images in the test set are a mixture of images of healthy and diseased individuals; the images of poor quality specifically include: images that are too dark, images with a large deviation in shooting angle, and images blurred by camera shake.
5. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 4, wherein: in step 2.1 the encoder input is the image of dimension W × W × c obtained in step one, and the output is a 1 × 2^(N+1) feature vector z expressing the essence of the image; the value of N satisfies N ≤ log2(W); the down-sampling layer comprises convolution layers with an n × n kernel and a stride of 2, and the number of channels doubles successively from 2^2 to 2^(N+1); the activation function is the LeakyReLU activation function with a negative slope of L; n ∈ [3, 5]; L ∈ [0, 1].
6. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 5, wherein: in step 2.2, the decoder input is the feature vector z output by the encoder in step two, and the output is an image of dimension W × W × c; the up-sampling layer comprises convolution layers with an n × n kernel and a stride of 2, and the number of channels halves successively from 2^(N+1) to 2^2; the activation function is the ReLU activation function, and the outermost activation function is the Tanh function.
7. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 6, wherein: in step 2.3, the input of the discriminator 1 is the feature vector z expressing the essence of the image output in step 2.1; the discriminator 1 comprises K serially connected fully-connected layers, Dropout layers and activation functions; the number of neurons per fully-connected layer halves successively from 2^K to 2^0; the random discard parameter of the Dropout layers is p; except for the last layer, whose activation function is sigmoid, the remaining K-1 activation functions are LeakyReLU with a negative slope of L; wherein K ∈ [5, log2(W)]; p ∈ [0, 1].
8. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 7, wherein: in step 2.4, the convolution layers have an n × n kernel and a stride of 2, and the number of channels doubles successively from 2^2 to 2^P; the final convolution layer outputs directly, and the remaining P-1 activation functions are LeakyReLU functions with a negative slope of L; wherein P ∈ [5, log2(W)];
in step 2.5, the restoration decoder input is the output vector of the encoder in step two, and the output is an image of dimension W × W × c; the up-sampling unit comprises convolution layers with an n × n kernel and a stride of 2, and the number of channels halves successively from 2^(N+1) to 2^2; the activation function in the first N-1 groups of up-sampling layer, regularization layer and activation function is the ReLU activation function, and in the last group it is the Tanh function.
9. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 8, wherein step 3.1 specifically comprises: randomly selecting M1 image blocks in the image whose side length is an integer in [1, 7], and randomly exchanging the pixels within each block;
step 3.2 specifically comprises: constructing the Bezier mapping curve B(t) based on formula (1) from three randomly given control points, and mapping the image pixel values based on the curve:
B(t) = (1 - t)^2 · P0 + 2t(1 - t) · P1 + t^2 · P2, t ∈ [0, 1] (1)
where t denotes the luminance of the pixel, and P0, P1, P2 are three randomly acquired control points;
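A quick sanity check on formula (1): a quadratic Bezier curve starts at P0 (t = 0), ends at P2 (t = 1), and is pulled toward P1 in between, which is what makes it a smooth, adjustable luminance map. The control-point values below are illustrative:

```python
def bezier(t, p0, p1, p2):
    """Quadratic Bezier mapping curve of formula (1)."""
    return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

# Endpoints are fixed by P0 and P2; the interior is pulled toward P1.
start = bezier(0.0, 0.2, 0.5, 0.9)   # -> 0.2 (equals P0)
end = bezier(1.0, 0.2, 0.5, 0.9)     # -> 0.9 (equals P2)
mid = bezier(0.5, 0.0, 1.0, 0.0)     # -> 0.5 (halfway toward P1)
```

Because the endpoints stay fixed while the interior bends, the mapping reshapes mid-tone brightness without clipping the extremes of the pixel range.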
10. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 9, wherein: in step four, the image reconstruction loss L_I constrains the difference between the real image and the reconstructed image using the L1 loss function, calculated as in formula (2):
L_I = || x − D(E(x)) ||_1 (2)
wherein x represents the input image; E and D represent the encoder and the decoder, respectively; E(x) represents the coded output obtained by encoding the input image x; D(E(x)) represents the output obtained by encoding the input image x and then decoding; || · ||_1 represents the 1-norm;
the feature reconstruction loss L_F constrains, with the L1 loss function, the difference between the feature expression of the image in the feature space and that of the reconstructed image, calculated as in formula (3):
L_F = || z − E(D(z)) ||_1 (3)
wherein z is the feature vector expressing the essence of the image output by the encoder in step 2.1; D(z) represents the decoded output obtained by decoding the feature vector z; E(D(z)) represents the output obtained by decoding and then encoding the feature vector z;
the discrimination loss L_D of the image space and the feature space constrains the difference between the outputs of the encoder and the decoder and the real images and features, calculated as in formula (4):
L_D = − log D_I(D(z)) − log D_F(E(x)) (4)
wherein D_I and D_F are the discriminator of the image space (discriminator 2) and the discriminator of the feature space (discriminator 1), respectively; D_F(E(x)) is the output of the input image x passed through the encoder and then the feature-space discriminator; D_I(D(z)) is the output of the feature vector z passed through the decoder and then the image-space discriminator; D_I and D_F are iteratively optimized by the following loss equations (5) and (6):
L_DI = − log D_I(x) − log(1 − D_I(D(z))) (5)
L_DF = − log D_F(z) − log(1 − D_F(E(x))) (6)
wherein D_I(x) is the output of the real input image x through the image-space discriminator, and D_F(z) is the output of the real feature vector z through the feature-space discriminator;
the restoration loss L_R based on the agent task enhances the encoder's ability to extract image features through the restoration agent task, calculated as in formula (7):
L_R = || x − R(E(x_a)) ||_1 (7)
wherein R represents the restoration decoder; x_a represents the transformed image input obtained by the agent task; R(E(x_a)) represents the output of the transformed image x_a after the encoder and the restoration decoder;
the final loss function is calculated as in formula (8):
L = a · L_I + b · L_F + c · L_D + d · L_R (8)
wherein a, b, c and d are the weight coefficients of the image reconstruction loss, the feature reconstruction loss, the discrimination loss of the image space and the feature space, and the restoration loss, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110756395.8A CN113421250A (en) | 2021-07-05 | 2021-07-05 | Intelligent fundus disease diagnosis method based on lesion-free image training |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113421250A true CN113421250A (en) | 2021-09-21 |
Family
ID=77720146
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110756395.8A Pending CN113421250A (en) | 2021-07-05 | 2021-07-05 | Intelligent fundus disease diagnosis method based on lesion-free image training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113421250A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476757A (en) * | 2020-03-10 | 2020-07-31 | 西北大学 | Coronary artery patch data detection method, system, storage medium and terminal |
CN111798464A (en) * | 2020-06-30 | 2020-10-20 | 天津深析智能科技有限公司 | Lymphoma pathological image intelligent identification method based on deep learning |
CN112598658A (en) * | 2020-12-29 | 2021-04-02 | 哈尔滨工业大学芜湖机器人产业技术研究院 | Disease identification method based on lightweight twin convolutional neural network |
Non-Patent Citations (2)
Title |
---|
CHUAN LI等: "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks", 《ARXIV》 * |
HE ZHAO等: "Anomaly Detection for Medical Images using Self-supervised and Translation-consistent Features", 《IEEE》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113902036A (en) * | 2021-11-08 | 2022-01-07 | 哈尔滨理工大学 | Multi-feature fusion type fundus retinal disease identification method |
CN117173543A (en) * | 2023-11-02 | 2023-12-05 | 天津大学 | Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis |
CN117173543B (en) * | 2023-11-02 | 2024-02-02 | 天津大学 | Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis |
Legal Events
Date | Code | Title | Description |
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | | Application publication date: 20210921