CN110120055B - Deep-learning-based automatic segmentation method for non-perfusion areas in fundus fluorescein angiography images

Info

Publication number: CN110120055B
Application number: CN201910294122.9A
Authority: CN
Inventors: 叶娟; 吴健; 金凯; 尤堃; 潘相吉; 陆逸飞; 刘志芳
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2019-04-12
Filing date: 2019-04-12
Publication date: 2023-04-18
Anticipated expiration: 2039-04-12
Also published as: CN110120055A

Abstract

The invention discloses a deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography (FFA) images. A convolutional neural network is constructed and trained on FFA images in which the non-perfusion areas have been manually segmented and labeled by doctors, so that the final output of the network agrees with the doctors' labels; the trained network can then be used to automatically segment and identify retinal non-perfusion areas in diabetic retinopathy. Using FFA images marked with the positions of the non-perfusion areas, the method applies deep learning to automatically learn the required features from a library of training FFA images and to perform semantic segmentation. The parameters of the convolutional neural network are continuously optimized and data features are extracted during training, so that in clinical use the method identifies the non-perfusion areas that need treatment when fundus laser treatment is performed for diabetic retinopathy, and thereby assists precise fundus laser treatment.

Description

Deep-learning-based automatic segmentation method for non-perfusion areas in fundus fluorescein angiography images

Technical Field

The invention belongs to the technical field of image processing, and in particular relates to a deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography images.

Background

Diabetic retinopathy (DR), one of the most common complications of diabetes, arises from abnormal insulin metabolism, which alters the ocular tissue, nerves, and vascular microcirculation and impairs ocular nutrition and visual function. At present there are more than 6 million DR patients in the world. The disease shows no obvious symptoms in its initial stage but eventually causes vision loss, and it has become one of the four leading blinding diseases. Early detection and early treatment of DR are therefore very important and are closely tied to the patient's visual prognosis.

Fundus laser is currently the most important treatment for DR. At present, however, an experienced fundus specialist must locate the lesions and set the laser energy according to fundus fluorescein angiography (FFA), and it is difficult to identify in real time and accurately the multiple DR fundus lesions that need treatment, so the non-perfusion areas on an FFA image cannot be used to guide fundus laser treatment precisely. Computer-aided diagnosis (CADx) systems for DR greatly reduce the image-reading burden on doctors and improve working efficiency; most importantly, they reduce the influence of personal subjective factors on diagnosis and improve the accuracy of clinical diagnosis. Existing DR CADx systems, however, cannot intelligently identify the lesions that need laser treatment and do not tightly couple assisted diagnosis with treatment, which leaves considerable room for improvement. To intelligently identify the DR lesions that need fundus laser treatment, ophthalmologists must accurately label the non-perfusion areas requiring treatment in FFA images, and a convolutional neural network (CNN) is then constructed and trained on the labeled FFA images. Building a non-perfusion-area recognition system based on deep learning from FFA images, one that accurately identifies the DR lesions needing laser treatment, is a key technology for an intelligent DR fundus laser navigation system and answers an urgent clinical need in DR diagnosis and treatment.

Disclosure of Invention

To solve the problems described in the background, the invention provides a deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography images. The method makes full use of the information in an FFA image to achieve semantic segmentation and identification of DR non-perfusion areas.

The technical scheme adopted by the invention is as follows:

The invention comprises the following steps:

Step 1: collecting fundus fluorescein angiography images and segmenting and labeling them, classifying the images into three types, namely images containing a non-perfusion area, images without a non-perfusion area, and images containing laser spots, and putting the images containing a non-perfusion area into a data set;

Step 2: preprocessing the images containing non-perfusion areas in the data set of step 1, the preprocessed images forming the training database; the preprocessing comprises sequentially performing image denoising and smoothing, image contrast enhancement, image downsampling, and pixel normalization on the images containing non-perfusion areas;

Step 3: augmenting the training database of step 2 to obtain an augmented training data set, the augmentation using image flipping and the random addition of Gaussian noise;

Step 4: marking contour lines on the images containing non-perfusion areas in the augmented training data set, and converting the contour-marked images into binary segmentation images by flood fill;

Step 5: constructing a convolutional neural network;

Step 6: training the convolutional neural network of step 5 with the training data set processed in step 4, and adjusting the parameters of the network according to a set learning rate during training, so as to obtain, after multiple rounds of training, a convolutional neural network for semantic segmentation of non-perfusion areas;

Step 7: inputting the image to be segmented into the convolutional neural network trained in step 6, and passing the output of the last layer of the network through a softmax function to obtain the classification probability of each pixel in the image to be segmented, thereby achieving semantic segmentation of the non-perfusion areas of the image to be segmented.

In step 1, the fundus fluorescein angiography images mainly comprise two types, images containing laser spots and images without laser spots, and the images containing laser spots are divided into images containing a non-perfusion area and images without a non-perfusion area.

The convolutional neural network of step 5 is constructed as follows:

The convolutional neural network is mainly formed by connecting, in data-flow order, an input layer, four up-sampling modules, four down-sampling modules, an output convolution module, and an output convolution layer. The input layer feeds the first up-sampling module, the four up-sampling modules are connected in sequence, the fourth up-sampling module feeds the first down-sampling module, the four down-sampling modules are connected in sequence, and the output of the fourth down-sampling module passes through the output convolution module to the output convolution layer;

each up-sampling module comprises two convolution modules and a max pooling layer, the two convolution modules being connected in sequence and outputting to the max pooling layer; each down-sampling module comprises two convolution modules and a down-sampling layer, the two convolution modules being connected in sequence and outputting to the down-sampling layer;

attention mechanisms are fused between the second convolution modules of paired modules: between the first up-sampling module and the fourth down-sampling module, between the second up-sampling module and the third down-sampling module, between the third up-sampling module and the second down-sampling module, and between the fourth up-sampling module and the first down-sampling module;

each attention mechanism comprises an input layer, a rectified linear unit (ReLU) layer, a 1×1 convolution layer, a Sigmoid function layer, and a resampling layer connected in sequence. The input layer consists of two 1×1 convolution layers whose inputs are, respectively, the output of the second convolution module in the corresponding up-sampling module and the output of the second convolution module in the corresponding down-sampling module. The outputs of these two 1×1 convolution layers are added and passed through the ReLU layer to the 1×1 convolution layer, whose output passes through the Sigmoid function layer to the resampling layer. The output of the resampling layer is multiplied by the output of the second convolution module in the corresponding down-sampling module to give the output of the attention mechanism, and the output of the attention mechanism is concatenated with the output of the down-sampling layer in the corresponding down-sampling module to give the output of that down-sampling module.

The convolution module and the output convolution module are each formed by connecting, in sequence, a convolution layer, a batch normalization layer, and a rectified linear unit (ReLU) layer.

In step 5, the learning rate is set to 0.1 and training runs for 500 epochs; the learning rate is decayed at epoch 250 and at epoch 400 with a decay factor of 0.1, so that the learning rate of each epoch is less than or equal to that of the previous epoch.

Advantages of the Invention

The method can be applied to databases with a relatively small volume of data. After data augmentation, deep learning automatically learns the required features from the training database and performs classification. During training, the features used for discrimination are continuously corrected and the parameters of the convolutional neural network are adjusted, which improves sensitivity and specificity in clinical application. The accuracy and reliability of the semantic segmentation improve further as the number of FFA images in the training examples increases.

Drawings

FIG. 1 shows the structure of the fully convolutional neural network model of the invention.

FIG. 2 shows the attention mechanism of the invention.

Detailed Description

The invention is further described below with reference to the figures and examples.

The core of the semantic segmentation method of the invention is as follows: a multilayer convolutional neural network is built for FFA images that contain non-perfusion areas and carry doctors' segmentation labels, and the network is trained on these FFA images so that its final semantic segmentation output agrees with the doctors' labels. The trained network can then be used to automatically segment and identify non-perfusion areas.

A specific embodiment is as follows:

The deep-learning-based method for intelligent identification of DR non-perfusion areas comprises the following steps:

Step 1: Acquiring, segmenting, and labeling fundus fluorescein angiography images

The FFA images were acquired at the Eye Center of the Second Affiliated Hospital of Zhejiang University School of Medicine. Over a 25-month period between 2016 and September 2018, fundus fluorescein angiography (FFA) was performed on 74 eyes of 67 patients aged 28 to 84 years using a Heidelberg confocal angiography instrument (HRA). The images were captured by two ophthalmologists at a resolution of 768 × 768 pixels. Mydriatic agents and fluorescein sodium contrast agent were administered before fundus imaging, and patients whose fundus images could not be captured because of severe ametropia were excluded from the study. The non-perfusion areas were labeled by five experienced ophthalmologists according to the diabetic retinopathy clinical guideline (Diabetic Retinopathy PPP, updated 2016). The expert annotators were blinded and had no access to the deep learning predictions.

Step 2: Preprocessing the images containing non-perfusion areas: image denoising and smoothing, image contrast enhancement, image downsampling, and pixel normalization are applied in sequence to the images containing non-perfusion areas in the data set of step 1, and the preprocessed images form the training database.
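The text names the four preprocessing operations but not the algorithms behind them. The following is a minimal sketch of the sequence, assuming a 3×3 mean filter for denoising, a linear min-max stretch for contrast enhancement, nearest-neighbour resizing for downsampling, and division by 255 for pixel normalization. All function names and all four algorithm choices are illustrative assumptions, not the method stated in the patent.

```python
def mean_filter3(img):
    """Smooth with a 3x3 mean filter, replicating edge pixels (assumed denoiser)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    total += img[rr][cc]
            out[r][c] = total / 9.0
    return out


def stretch_contrast(img):
    """Linearly stretch intensities to the range 0..255 (assumed enhancement)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in img]
    return [[(v - lo) / (hi - lo) * 255.0 for v in row] for row in img]


def resize_nearest(img, out_h, out_w):
    """Downsample by nearest-neighbour sampling (assumed resampling method)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def normalize(img):
    """Map 0..255 intensities to 0..1."""
    return [[v / 255.0 for v in row] for row in img]


def preprocess(img, out_h, out_w):
    """Apply the four operations in the order given in step 2."""
    return normalize(resize_nearest(stretch_contrast(mean_filter3(img)), out_h, out_w))
```

With the image sizes mentioned in this document, a 768 × 768 acquisition would be passed as `preprocess(img, 512, 512)`.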

Step 3: Augmenting the training database of step 2 by flipping the images and randomly adding Gaussian noise.
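The augmentation of step 3 can be sketched as follows. The noise standard deviation, the probability of adding noise, the clipping to the range 0..1, and the function names are assumptions made for illustration; the text only states that flipping and random Gaussian noise are used.

```python
import random


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip to the normalized range 0..1."""
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in image]


def augment(image, rng, sigma=0.02, noise_probability=0.5):
    """Return the original and its two flips, each randomly perturbed by noise.

    sigma and noise_probability are assumed values, not taken from the text.
    """
    variants = [image, flip_horizontal(image), flip_vertical(image)]
    augmented = []
    for variant in variants:
        if rng.random() < noise_probability:
            variant = add_gaussian_noise(variant, sigma, rng)
        augmented.append(variant)
    return augmented
```

If a label image already exists when an image is flipped, the same flip has to be applied to the label so that the two stay aligned; noise is applied to the image only.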

Step 4: Marking contour lines on the images containing non-perfusion areas in the augmented training data set, and converting the contour-marked images into binary segmentation images by flood fill.
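One common way to turn a closed contour into a binary mask with flood fill is to fill the background from the image border and treat every pixel that was not reached as belonging to the region. The sketch below follows that approach. It assumes the contour is given as a grid of 0/1 values, that it is closed, and that 4-connectivity is used; the function name is hypothetical and the patent does not specify these details.

```python
from collections import deque


def contour_to_mask(contour):
    """Convert a closed contour (1 = contour pixel) into a filled binary mask."""
    h, w = len(contour), len(contour[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()

    # Seed the fill with every non-contour pixel on the image border.
    for r in range(h):
        for c in range(w):
            on_border = r in (0, h - 1) or c in (0, w - 1)
            if on_border and contour[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))

    # Breadth-first flood fill over 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if (0 <= rr < h and 0 <= cc < w
                    and not outside[rr][cc] and contour[rr][cc] == 0):
                outside[rr][cc] = True
                queue.append((rr, cc))

    # Everything the fill could not reach is the contour or its interior.
    return [[0 if outside[r][c] else 1 for c in range(w)] for r in range(h)]
```

Filling from the outside avoids having to pick a seed point inside each labeled region, which matters when one image contains several non-perfusion areas.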

Step 5: Constructing the convolutional neural network

The convolutional neural network uses a fully convolutional neural network model with an attention mechanism for semantic segmentation of non-perfusion areas; the model structure is shown in FIG. 1. An input image of size 512 × 512 passes through two convolution modules, each consisting of a 3×3 convolution, batch normalization (BN), and a rectified linear unit (ReLU), and is then downsampled to half its height and width by a max pooling layer. Repeating this operation four times yields the final features. The network then enters the upsampling stage. In each step of this stage, the features are upsampled, the features of the same size from the downsampling stage are fed into the attention mechanism to produce re-weighted features, and the two sets of features are concatenated and fed into a convolution module. This operation is performed four times, after which one further convolution module produces a result of the same size as the original image, and a convolution layer with a 1×1 kernel followed by a softmax function yields the binary semantic segmentation result.
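The spatial sizes implied by this paragraph can be checked with a short calculation. The sketch assumes that each max pooling halves height and width exactly and that each upsampling step doubles them; the doubling factor is not stated explicitly but follows from the result having the same size as the original image. The function name is hypothetical.

```python
def feature_map_sizes(input_size=512, stages=4):
    """Return the side lengths seen in the pooling stage and the upsampling stage."""
    down = [input_size]
    for _ in range(stages):
        down.append(down[-1] // 2)  # max pooling halves height and width
    up = [down[-1]]
    for _ in range(stages):
        up.append(up[-1] * 2)       # assumed: each upsampling doubles the size
    return down, up


down_sizes, up_sizes = feature_map_sizes()
print(down_sizes)  # [512, 256, 128, 64, 32]
print(up_sizes)    # [32, 64, 128, 256, 512]
```

Each size in the upsampling stage has a counterpart of the same size in the pooling stage, which is what allows the attention mechanism to pair and concatenate the two feature maps.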

The attention mechanism of the invention is shown in FIG. 2, where x denotes a feature map before downsampling and m denotes the gating signal, which is the next-stage feature of x and has the size of x after one downsampling. The two are added and passed through a ReLU activation function. A 1×1 convolution then reduces the number of channels to 1, a sigmoid activation function produces the weights of the feature map, and the weights are resampled to the size of the feature map x. The weights are combined with x through the skip connection by multiplying every channel of x by them, which gives the weighted feature x'. The attention mechanism lets the model emphasize key regions during identification.
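The data flow of the attention mechanism can be sketched on small nested lists. The sketch assumes that x is brought to the size of the gating signal by taking every second pixel, that the weights are resampled by nearest-neighbour repetition, and that feature maps are stored as height × width × channels lists. These choices, the function names, and the weight arguments `w_x`, `w_m`, and `psi` are illustrative assumptions, since the text does not say how the two sizes are matched.

```python
import math


def conv1x1(feat, weight):
    """1x1 convolution: feat is H x W x C_in, weight is C_out x C_in."""
    return [[[sum(w * v for w, v in zip(w_row, pixel)) for w_row in weight]
             for pixel in row] for row in feat]


def attention_gate(x, m, w_x, w_m, psi):
    """Weight feature map x (H x W x C) by a gate computed from x and m.

    m is the gating signal of size H/2 x W/2; H and W are assumed even.
    """
    # Assumed size matching: sample every second pixel of x.
    x_small = [row[::2] for row in x[::2]]
    a = conv1x1(x_small, w_x)
    b = conv1x1(m, w_m)

    weights = []
    for r in range(len(m)):
        weight_row = []
        for c in range(len(m[0])):
            # Add the two projections, then apply ReLU.
            hidden = [max(0.0, p + q) for p, q in zip(a[r][c], b[r][c])]
            # 1x1 convolution down to a single channel, then sigmoid.
            score = sum(p * h for p, h in zip(psi, hidden))
            weight_row.append(1.0 / (1.0 + math.exp(-score)))
        weights.append(weight_row)

    # Resample the weights to the size of x and multiply every channel by them.
    return [[[v * weights[r // 2][c // 2] for v in x[r][c]]
             for c in range(len(x[0]))] for r in range(len(x))]
```

With all weights set to zero the sigmoid returns 0.5 everywhere, so every value of x is halved; trained weights would instead push the gate toward 1 in relevant regions and toward 0 elsewhere.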

The convolutional neural network takes an FFA image, that is, an image in the training data set, as input. A fully convolutional neural network replaces the fully connected layers with upsampling in order to perform semantic segmentation: the extracted features are upsampled and restored to the output size, which gives a classification result for every pixel of the original image and hence the segmentation result. The attention mechanism is inspired by human visual attention: when observing a scene, people tend to focus on local regions as needed. The attention mechanism models this process through weight assignment, using the underlying features (those from earlier layers in the forward pass) to derive weights through an attention function. The convolution layers extract image features and determine the recognition performance of the model. ReLU serves as the activation function, passing salient features through the model and filtering out useless ones.

Step 6: Training the convolutional neural network

The neural network architecture is trained over multiple rounds on the FFA images of the semantic segmentation training examples, with a set learning rate of 0.1 and 500 training epochs; the learning rate is decayed at epoch 250 and at epoch 400 with a decay factor of 0.1. The parameters of the network are optimized with the SGD algorithm, so as to obtain, after multiple rounds of training, a convolutional neural network for identifying non-perfusion areas;

Preferably, a cross-entropy function is used as the loss function of the model during training and the SGD algorithm is used as the optimizer, with a momentum of 0.9, an initial learning rate of 0.1, and a weight decay of 0.0001. The cross-entropy cost function is non-negative and approaches 0 as the actual output value approaches the expected value. The expression is as follows:

where $y_i$ is the expected output value of the i-th neuron, $a_i$ is its actual output value, and $n$ is the total number of neurons involved in the calculation.
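The expression itself is absent from this text. A standard cross-entropy cost that matches the symbol definitions and the two stated properties (non-negative, approaching 0 as $a_i$ approaches $y_i$) is $C = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \ln a_i + (1 - y_i)\ln(1 - a_i)\right]$. This form is an assumption and has not been recovered from the source. The sketch below evaluates it; the function name and the clipping constant `eps` are likewise illustrative.

```python
import math


def cross_entropy(expected, actual, eps=1e-12):
    """Assumed form: C = -(1/n) * sum(y*ln(a) + (1-y)*ln(1-a))."""
    n = len(expected)
    total = 0.0
    for y, a in zip(expected, actual):
        a = min(max(a, eps), 1.0 - eps)  # keep the logarithms finite
        total += y * math.log(a) + (1.0 - y) * math.log(1.0 - a)
    return -total / n


# The cost shrinks toward 0 as the outputs approach the expected values.
for outputs in ([0.5, 0.5], [0.9, 0.1], [0.999, 0.001]):
    print(outputs, cross_entropy([1.0, 0.0], outputs))
```

Both logarithm terms are at most 0 for outputs between 0 and 1, which is why the cost under this assumed form can never be negative.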

Preferably, stochastic gradient descent with momentum imitates the inertia of a moving object: each update retains part of the previous update direction, and the current gradient fine-tunes the final update direction. This improves the stability of learning and gives the optimizer some ability to escape local optima. The expression is as follows:

$$\Delta x_t = m\,\Delta x_{t-1} - \alpha\, g_t$$

where $\Delta x_t$ and $\Delta x_{t-1}$ are the displacement updates at times t and t-1, m is the momentum, $\alpha$ is the learning rate, and $g_t$ is the gradient at time t.
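The update rule can be tried on a one-dimensional example. The sketch below applies it to the function f(x) = x², whose gradient is 2x. The test function, the starting point, and the function name are illustrative assumptions; the momentum of 0.9 and learning rate of 0.1 are the values given in this document, and weight decay is left out for brevity.

```python
def momentum_step(x, previous_update, gradient, learning_rate=0.1, momentum=0.9):
    """Apply delta_x_t = m * delta_x_{t-1} - alpha * g_t and move x by it."""
    update = momentum * previous_update - learning_rate * gradient
    return x + update, update


# Minimize f(x) = x**2 starting from x = 5 with no initial velocity.
x, update = 5.0, 0.0
for step in range(200):
    x, update = momentum_step(x, update, gradient=2.0 * x)
print(x)  # close to the minimum at 0
```

Because part of the previous update is carried over, the iterate overshoots and oscillates around the minimum before settling, which is the inertia described in the text.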

Preferably, when the convolutional neural network is trained, the learning rate is reduced in stages as training proceeds. In this example the initial learning rate is 0.1 and training runs for 500 epochs; the learning rate is decayed at epoch 250 and at epoch 400, each time with a decay factor of 0.1.
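The staged schedule can be written as a small function. The sketch assumes that epochs are counted from 0, that a decay takes effect from the milestone epoch onward, and that "decay factor 0.1" means multiplying the current rate by 0.1; the function and parameter names are hypothetical.

```python
def learning_rate(epoch, base_lr=0.1, milestones=(250, 400), gamma=0.1):
    """Step decay: multiply the rate by gamma at every milestone already reached."""
    rate = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            rate *= gamma
    return rate


for epoch in (0, 249, 250, 399, 400, 499):
    print(epoch, learning_rate(epoch))
```

Under these assumptions the rate is about 0.1 for epochs 0 to 249, 0.01 for epochs 250 to 399, and 0.001 for epochs 400 to 499, so it never increases from one epoch to the next.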

Step 7: The output of the last layer of the convolutional neural network is passed through a softmax function to obtain the classification probability of each pixel in the image to be segmented, and the index of the maximum value along the channel dimension is taken to obtain the semantic segmentation result for the non-perfusion areas.
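Step 7 can be sketched per pixel as a softmax followed by an arg-max over the channel dimension. The sketch assumes the network output is stored as a height × width × 2 nested list and that channel 0 stands for background and channel 1 for non-perfusion area; the channel order and the function names are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's channel values."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def segment(logit_map):
    """Return, for every pixel, the index of its most probable class."""
    labels = []
    for row in logit_map:
        label_row = []
        for pixel in row:
            probs = softmax(pixel)
            label_row.append(probs.index(max(probs)))
        labels.append(label_row)
    return labels


# One row with two pixels: the first favours channel 0, the second channel 1.
print(segment([[[2.0, 0.0], [0.0, 3.0]]]))  # [[0, 1]]
```

Since softmax preserves the ordering of its inputs, the arg-max of the probabilities equals the arg-max of the raw outputs; the softmax is needed only when the probabilities themselves are of interest.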

Through deep learning, the method automatically learns the required features from the training data set and performs semantic segmentation, continuously optimizing the features and parameters used for discrimination during training. In a preliminary test, 332 FFA images with non-perfusion areas labeled by ophthalmologists were used for training, and the test set comprised 60 fundus images containing non-perfusion areas. After the convolutional neural network was trained, the coincidence degree of the semantically segmented non-perfusion areas reached 65.24%. The deep-learning-based system for automatic segmentation and identification of DR non-perfusion areas can be applied in hospital clinics, telemedicine, assisted treatment, and similar settings.
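The text does not define how the coincidence degree is computed. One common overlap measure for binary segmentation is intersection over union (IoU), sketched below purely as an assumption about what such a measure could look like; it is not confirmed to be the metric behind the reported figure, and the function name is hypothetical.

```python
def intersection_over_union(predicted, reference):
    """Overlap of two binary masks: |A and B| / |A or B|."""
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, t in zip(pred_row, ref_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree completely.
    return intersection / union if union else 1.0


predicted = [[1, 1], [0, 0]]
reference = [[1, 0], [0, 0]]
print(intersection_over_union(predicted, reference))  # 0.5
```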

Claims

1. A deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography images, characterized by comprising the following steps:

step 1: collecting fundus fluorescein angiography images and segmenting and labeling them, classifying the images into three types, namely images containing a non-perfusion area, images without a non-perfusion area, and images containing laser spots, and putting the images containing a non-perfusion area into a data set;

step 2: preprocessing the images containing non-perfusion areas in the data set of step 1, the preprocessed images forming the training database; the preprocessing comprises sequentially performing image denoising and smoothing, image contrast enhancement, image downsampling, and pixel normalization on the images containing non-perfusion areas;

step 5: constructing a convolutional neural network;

each attention mechanism comprises an input layer, a rectified linear unit (ReLU) layer, a 1×1 convolution layer, a Sigmoid function layer, and a resampling layer connected in sequence, wherein the input layer consists of two 1×1 convolution layers whose inputs are, respectively, the output of the second convolution module in the corresponding up-sampling module and the output of the second convolution module in the corresponding down-sampling module; the outputs of these two 1×1 convolution layers are added and passed through the ReLU layer to the 1×1 convolution layer, whose output passes through the Sigmoid function layer to the resampling layer; the output of the resampling layer is multiplied by the output of the second convolution module in the corresponding down-sampling module to give the output of the attention mechanism, and the output of the attention mechanism is concatenated with the output of the down-sampling layer in the corresponding down-sampling module to give the output of that down-sampling module;

2. The deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography images according to claim 1, wherein in step 1 the fundus fluorescein angiography images mainly comprise two types, images containing laser spots and images without laser spots, and the images containing laser spots are divided into images containing a non-perfusion area and images without a non-perfusion area.

3. The deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography images according to claim 1, wherein the convolution module and the output convolution module are each formed by connecting, in sequence, a convolution layer, a batch normalization layer, and a rectified linear unit (ReLU) layer.

4. The deep-learning-based method for automatically segmenting non-perfusion areas in fundus fluorescein angiography images according to claim 1, wherein in the training of step 5 the learning rate is set to 0.1 and training runs for 500 epochs; the learning rate is decayed at epoch 250 and at epoch 400 with a decay factor of 0.1, so that the learning rate of each epoch is less than or equal to that of the previous epoch.