CN114066884A - Retinal blood vessel segmentation method and device, electronic device and storage medium - Google Patents


Info

Publication number
CN114066884A
CN114066884A (application CN202210023966.1A; granted as CN114066884B)
Authority
CN
China
Prior art keywords
network
segmentation
training
training set
fundus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210023966.1A
Other languages
Chinese (zh)
Other versions
CN114066884B (en)
Inventor
杨卫华
邵怡韦
万程
蒋沁
张杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eye Hospital Nanjing Medical University
Original Assignee
Eye Hospital Nanjing Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eye Hospital Nanjing Medical University filed Critical Eye Hospital Nanjing Medical University
Priority to CN202210023966.1A priority Critical patent/CN114066884B/en
Publication of CN114066884A publication Critical patent/CN114066884A/en
Application granted granted Critical
Publication of CN114066884B publication Critical patent/CN114066884B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20024Filtering details
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30101Blood vessel; Artery; Vein; Vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Eye Examination Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a retinal vessel segmentation method and apparatus, an electronic device, and a storage medium. The method comprises the following steps: acquiring fundus images, dividing the fundus images into a training set and a test set, and performing corresponding preprocessing operations on the fundus images in the training set and the test set; respectively constructing a segmentation network and a discrimination network; inputting the labeled fundus images in the training set into the segmentation network for training, inputting the unlabeled fundus images in the training set into the segmentation network after a preset number of training rounds, and training the segmentation network and the discrimination network alternately to obtain a trained retinal vessel segmentation model; inputting the fundus image to be segmented into the retinal vessel segmentation model to obtain a segmented output image; and stitching all the output images to obtain a retinal vessel segmentation result image. The method automatically and accurately extracts the retinal blood vessels in the fundus image; the segmentation result contains the fine details of the vessels, and its richer detail information can be used for computer-aided clinical diagnosis.

Description

Retinal blood vessel segmentation method and device, electronic device and storage medium
Technical Field
The disclosure belongs to the technical field of medical image processing, and particularly relates to a retinal blood vessel segmentation method and device, electronic equipment and a storage medium.
Background
Retinal blood vessels are an important component of the eyeball, and many of their features, such as the network morphology of the vessels, can directly reflect certain diseases. In particular, chronic diseases such as diabetes and hypertension, as well as various ophthalmic diseases such as retinal vascular disease, can cause retinal vascular deformation to some extent (e.g., changes in vessel diameter and thickness) and cause retinal vascular hemorrhage, edema, sclerosis, exudation and hemangioma-like changes, thereby reflecting known or unknown pathological changes of the human body and their nature, characteristics and degree. Meanwhile, normal retinal vessels must be avoided during retinal disease treatment (such as retinal photocoagulation); accurate vessel segmentation is a basic technology of the fully intelligent retinal laser systems currently under development and can reduce the damage of laser treatment to normal retinal vessels. Therefore, segmenting retinal blood vessels to obtain retinal vascular morphology plays a very important role in the diagnosis and treatment of such diseases.
In three dimensions, however, the topological structure of retinal blood vessels is very complex: there are very many tiny branches forming a tree structure, and the arteries and veins are independent of each other, so they never overlap. Projection onto a two-dimensional fundus image inevitably loses this three-dimensional information, so that vessels in the image appear to cross and wind around one another. In addition, during acquisition of the fundus image by the medical imaging apparatus, a certain degree of distortion may also arise from noise, uneven illumination, and the poor contrast between fine blood vessels and the background. Therefore, developing a method capable of accurately segmenting retinal blood vessels has become a research focus in the field of medical image processing in recent years.
At present, retinal vessel segmentation mostly relies on traditional methods or on fully supervised machine learning. The performance of traditional methods, such as those based on matched filtering, depends on the degree of matching between the template and the vessel, and is affected by factors such as central vessel light reflex, radius variation and lesion interference; such methods can bring the sensitivity of retinal vessel segmentation into an acceptable range, but the specificity generally remains to be improved. The segmentation performance of methods based on fully supervised machine learning depends on a large amount of labeled data. In practice, however, annotating medical images, especially retinal vessels with complex topology, takes a great deal of time and requires experienced clinical experts, so labeled data are often very scarce, making fully supervised retinal vessel segmentation difficult to apply effectively in actual clinical practice.
Disclosure of Invention
The present disclosure is directed to at least one of the technical problems in the prior art, and provides a retinal vessel segmentation method and apparatus, an electronic device, and a storage medium.
In one aspect of the present disclosure, a retinal vessel segmentation method is provided, including:
acquiring fundus images, dividing the fundus images into a training set and a testing set, and performing corresponding preprocessing operation on the fundus images in the training set and the testing set; wherein the training set comprises a marked fundus image and an unmarked fundus image;
respectively constructing a segmentation network and a discrimination network;
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training rounds, and alternately training the segmentation network and the discrimination network to obtain a trained retinal blood vessel segmentation model;
inputting the fundus image to be segmented into the retinal vessel segmentation model to obtain a segmented output image;
and splicing all the output images to obtain a retinal blood vessel segmentation result image.
In some embodiments, the pre-processing operation comprises:
filtering the fundus image, and performing gray-level transformation, normalization, contrast-limited adaptive histogram equalization and gamma transformation to highlight the blood vessel region, so as to obtain a denoised and enhanced fundus image;
and randomly cropping the original-size fundus images in the training set by a patch-cropping method, so as to perform data augmentation on the fundus images in the training set.
In some embodiments, the segmentation network is based on a U-Net network and comprises three parts: an encoder, a skip-connection module and a decoder;
the encoder comprises 5 downsampling layers, wherein each downsampling layer comprises two 3 × 3 convolutional layers and one 2 × 2 max-pooling layer; the numbers of convolution kernels in the 1st, 2nd, 3rd, 4th and 5th downsampling layers are 64, 128, 256, 512 and 1024, respectively, with stride 1;
the decoder comprises 4 upsampling layers, wherein each upsampling layer comprises one 2 × 2 deconvolution layer and two 3 × 3 convolutional layers; in the upsampling layers, the numbers of kernels of the deconvolution layers are 512, 256, 128 and 64, respectively, with stride 2, and the numbers of kernels of the convolutional layers are 512, 256, 128 and 64, respectively, with stride 1; and/or,
the discrimination network is based on a fully convolutional neural network and comprises 5 convolutional layers, whose numbers of convolution kernels are 64, 128, 256, 512 and 1, respectively; each kernel is 4 × 4 in size; the strides of the 5 convolutional layers are 2, 2, 2, 2 and 1, respectively;
a Leaky ReLU is used as the activation function in the discrimination network; to keep the model fully convolutional, an upsampling layer is added after the last layer to rescale the output to the size of the input map and output a confidence map.
In some embodiments, inputting the labeled fundus images in the training set into the segmentation network for training, inputting the unlabeled fundus images in the training set into the segmentation network after a preset number of training rounds, and training the segmentation network and the discrimination network alternately to obtain a trained retinal vessel segmentation model, including:
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training turns to obtain a prediction probability map, inputting the prediction probability map into the discrimination network for authenticity judgment, and outputting a confidence map;
and binarizing the pixels of high authenticity and adding them to the segmentation network for self-supervised training, computing the multi-task loss function of the segmentation network and the spatial cross-entropy loss function of the discrimination network, and iteratively updating the parameters so that the generated semantic labels match the real segmentation labels in feature distribution, thereby enhancing the segmentation performance of the segmentation network and obtaining the trained retinal vessel segmentation model.
In another aspect of the present disclosure, there is provided a retinal vessel segmentation apparatus including:
the system comprises an acquisition module, a preprocessing module and a processing module, wherein the acquisition module is used for acquiring fundus images, dividing the fundus images into a training set and a testing set, and performing corresponding preprocessing operation on the fundus images in the training set and the testing set; wherein the training set comprises a marked fundus image and an unmarked fundus image;
the construction module is used for respectively constructing a segmentation network and a discrimination network;
the training module is used for inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training rounds, and alternately training the segmentation network and the discrimination network to obtain a trained retinal blood vessel segmentation model;
the prediction module is used for inputting the fundus image to be segmented into the retinal blood vessel segmentation model to obtain a segmented output image;
and the splicing module is used for splicing all the output images to obtain a retinal blood vessel segmentation result image.
In some embodiments, the obtaining module is further specifically configured to:
filtering the fundus image, and performing gray-level transformation, normalization, contrast-limited adaptive histogram equalization and gamma transformation to highlight the blood vessel region, so as to obtain a denoised and enhanced fundus image;
and randomly cropping the original-size fundus images in the training set by a patch-cropping method, so as to perform data augmentation on the fundus images in the training set.
In some embodiments, the segmentation network is based on a U-Net network and comprises three parts: an encoder, a skip-connection module and a decoder;
the encoder comprises 5 downsampling layers, wherein each downsampling layer comprises two 3 × 3 convolutional layers and one 2 × 2 max-pooling layer; the numbers of convolution kernels in the 1st, 2nd, 3rd, 4th and 5th downsampling layers are 64, 128, 256, 512 and 1024, respectively, with stride 1;
the decoder comprises 4 upsampling layers, wherein each upsampling layer comprises one 2 × 2 deconvolution layer and two 3 × 3 convolutional layers; in the upsampling layers, the numbers of kernels of the deconvolution layers are 512, 256, 128 and 64, respectively, with stride 2, and the numbers of kernels of the convolutional layers are 512, 256, 128 and 64, respectively, with stride 1; and/or,
the discrimination network is based on a fully convolutional neural network and comprises 5 convolutional layers, whose numbers of convolution kernels are 64, 128, 256, 512 and 1, respectively; each kernel is 4 × 4 in size; the strides of the 5 convolutional layers are 2, 2, 2, 2 and 1, respectively;
a Leaky ReLU is used as the activation function in the discrimination network; to keep the model fully convolutional, an upsampling layer is added after the last layer to rescale the output to the size of the input map and output a confidence map.
In some embodiments, the training module is further specifically configured to:
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training turns to obtain a prediction probability map, inputting the prediction probability map into the discrimination network for authenticity judgment, and outputting a confidence map;
and binarizing the pixels of high authenticity and adding them to the segmentation network for self-supervised training, computing the multi-task loss function of the segmentation network and the spatial cross-entropy loss function of the discrimination network, and iteratively updating the parameters so that the generated semantic labels match the real segmentation labels in feature distribution, thereby enhancing the segmentation performance of the segmentation network and obtaining the trained retinal vessel segmentation model.
In another aspect of the present disclosure, an electronic device is provided, including:
one or more processors;
a storage unit for storing one or more programs which, when executed by the one or more processors, enable the one or more processors to implement the method according to the preceding description.
In another aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, is adapted to carry out the method according to the above.
The retinal blood vessel segmentation method and apparatus of the present disclosure can automatically and accurately extract the retinal blood vessels in the fundus image; the segmentation result contains fine vessel details, and the richer detail information of the image can be used for computer-aided clinical diagnosis. Under the condition of limited labeled data, a large amount of easily obtained unlabeled data is exploited by adding an adversarial generative learning model, improving the segmentation precision of the retinal blood vessels and increasing the feasibility of applying retinal vessel segmentation in actual clinical practice.
Drawings
FIG. 1 is a block diagram of a retinal vessel segmentation method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a retinal vessel segmentation method according to another embodiment of the present disclosure;
FIG. 3 is an enhanced image according to another embodiment of the present disclosure;
FIG. 4 is a diagram illustrating the effect of patch cropping according to another embodiment of the present disclosure;
FIG. 5 shows a retinal vessel segmentation result 1 according to another embodiment of the present disclosure;
FIG. 6 shows a retinal vessel segmentation result 2 according to another embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a retinal blood vessel segmentation apparatus according to another embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical aspects of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
The retinal vessel segmentation method provided by the present disclosure is based on adversarial generative semi-supervised learning. First, the input fundus image is preprocessed to highlight the vessel region; then the labeled fundus images in the training set are input into a segmentation network for training; after a preset number of training rounds, unlabeled fundus images are input for semi-supervised learning, enhancing the performance of the segmentation network, and the segmentation result is output. As shown in fig. 1, the method mainly comprises the following steps.
As shown in fig. 1 and 2, an embodiment of the present disclosure relates to a retinal vessel segmentation method S100, including:
s110, acquiring fundus images, dividing the fundus images into a training set and a testing set, and performing corresponding preprocessing operation on the fundus images in the training set and the testing set; wherein the training set comprises marked fundus images and unmarked fundus images.
Specifically, in this step, public data sets may be selected as the training and test sets of fundus images, for example the public DRIVE and STARE data sets. Alternatively, a number of fundus images may be acquired and divided proportionally into a training set and a test set. The present embodiment is described taking the public data sets as an example, but is not limited thereto.
In this step, a number of marked fundus images and unmarked fundus images are included in the training set, where the number of unmarked fundus images is much greater than the number of marked fundus images in order to improve the subsequent model training accuracy. In the public data set, 20 pictures in the DRIVE data set may be used as labeled data for training, 20 pictures may be used as test data, and all 28 pictures in the STARE data set may be used as unlabeled data for training. To remove noise and highlight blood vessels, the training and test sets are image enhanced using gray scale transformation, normalization, contrast limited adaptive histogram equalization, and gamma transformation, as shown with reference to fig. 3.
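The enhancement chain described above (gray-scale transformation, normalization, histogram equalization, gamma transformation) might be sketched as follows. This is a minimal NumPy sketch with illustrative names, not the patent's disclosed code; global histogram equalization stands in for CLAHE to keep the sketch dependency-free (in practice OpenCV's `cv2.createCLAHE` would typically be used), and the gamma value is an assumed parameter:

```python
import numpy as np

def preprocess_fundus(img, gamma=1.2):
    """Grayscale -> z-score normalization -> histogram equalization -> gamma.

    Global histogram equalization stands in for CLAHE here; CLAHE applies
    the same idea per tile with contrast clipping.
    """
    if img.ndim == 3:                                   # RGB -> grayscale
        img = img.mean(axis=2)
    g = img.astype(np.float64)
    g = (g - g.mean()) / (g.std() + 1e-8)               # z-score normalization
    g = (g - g.min()) / (g.max() - g.min() + 1e-8)      # rescale to [0, 1]
    # histogram equalization: map intensities through the empirical CDF
    hist, bins = np.histogram(g, bins=256, range=(0.0, 1.0))
    cdf = hist.cumsum() / hist.sum()
    g = np.interp(g, bins[:-1], cdf)
    # gamma transform to emphasize the darker vessel pixels
    return g ** gamma
```

The output is a single-channel image in [0, 1]; the gamma value and equalization granularity are tuning knobs that the patent does not disclose.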
In addition to the above-described preprocessing of the fundus image, data enhancement may be performed on the fundus image in order to improve the model generalization performance.
Specifically, both the labeled and the unlabeled fundus images in the training set are randomly cropped, cutting the original-size images into 128 × 128 patches. Finally, 2000 labeled and 20000 unlabeled 128 × 128 patches are obtained for training, as shown with reference to fig. 4.
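A minimal sketch of such random patch cropping might look like the following (function and parameter names are illustrative; the label mask, when present, is cropped at the same positions):

```python
import numpy as np

def random_patches(image, mask=None, patch_size=128, n_patches=100, seed=0):
    """Randomly crop square patches (and matching label patches) from one image."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    patches, labels = [], []
    for _ in range(n_patches):
        y = rng.integers(0, h - patch_size + 1)
        x = rng.integers(0, w - patch_size + 1)
        patches.append(image[y:y + patch_size, x:x + patch_size])
        if mask is not None:
            labels.append(mask[y:y + patch_size, x:x + patch_size])
    return np.stack(patches), (np.stack(labels) if mask is not None else None)
```

Applied across the DRIVE and STARE images, such a routine would yield the 2000 labeled and 20000 unlabeled 128 × 128 patches described above.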
And S120, respectively constructing a segmentation network and a judgment network.
Specifically, in this step, the model of the present disclosure comprises two parts, a segmentation network and a discrimination network. The segmentation network, built from an encoder, a skip-connection module and a decoder, is used for self-supervised training and, for a given unlabeled fundus image, generates a prediction probability map; the decoder restores the high-level features extracted by the encoder into an image and outputs a segmentation result image. The discrimination network builds an image discriminator from a fully convolutional neural network; it can judge, pixel by pixel, the authenticity of the generated prediction probability map and outputs a confidence map.
In one embodiment, the segmentation network is based on a U-Net network and comprises three parts: an encoder, a skip-connection module and a decoder;
the encoder comprises 5 downsampling layers, wherein each downsampling layer comprises two 3 × 3 convolutional layers and one 2 × 2 max-pooling layer; the numbers of convolution kernels in the 1st, 2nd, 3rd, 4th and 5th downsampling layers are 64, 128, 256, 512 and 1024, respectively, with stride 1;
the decoder comprises 4 upsampling layers, wherein each upsampling layer comprises one 2 × 2 deconvolution layer and two 3 × 3 convolutional layers; in the upsampling layers, the numbers of kernels of the deconvolution layers are 512, 256, 128 and 64, respectively, with stride 2, and the numbers of kernels of the convolutional layers are 512, 256, 128 and 64, respectively, with stride 1.
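Under the architecture just described (5 encoder stages with 64–1024 kernels, 4 decoder stages with 2 × 2 deconvolutions, and skip connections), a hedged PyTorch sketch might look as follows. This is an illustrative reconstruction, not the patent's disclosed code; the sigmoid head for a binary vessel map is an added assumption:

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # two 3x3 convolutions (stride 1) with ReLU, as in each (up/down)sampling stage
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class UNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=1, widths=(64, 128, 256, 512, 1024)):
        super().__init__()
        self.downs = nn.ModuleList()
        ch = in_ch
        for w in widths:
            self.downs.append(double_conv(ch, w))
            ch = w
        self.pool = nn.MaxPool2d(2)  # 2x2 max pooling between encoder stages
        self.ups, self.up_convs = nn.ModuleList(), nn.ModuleList()
        for w in reversed(widths[:-1]):              # 512, 256, 128, 64
            self.ups.append(nn.ConvTranspose2d(ch, w, 2, stride=2))  # 2x2 deconv
            self.up_convs.append(double_conv(w * 2, w))  # after skip concat
            ch = w
        self.head = nn.Conv2d(ch, n_classes, 1)

    def forward(self, x):
        skips = []
        for i, down in enumerate(self.downs):
            x = down(x)
            if i < len(self.downs) - 1:              # bottleneck is not pooled
                skips.append(x)
                x = self.pool(x)
        for up, conv, skip in zip(self.ups, self.up_convs, reversed(skips)):
            x = conv(torch.cat([skip, up(x)], dim=1))  # skip connection
        return torch.sigmoid(self.head(x))
```

On a 128 × 128 input patch this produces a 128 × 128 prediction probability map.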
In one embodiment, the discrimination network is based on a fully convolutional neural network and comprises 5 convolutional layers; the numbers of convolution kernels of the 5 layers are 64, 128, 256, 512 and 1, respectively; each kernel is 4 × 4 in size; and the strides of the 5 layers are 2, 2, 2, 2 and 1, respectively.
A Leaky ReLU is used as the activation function in the discrimination network. To keep the model fully convolutional, an upsampling layer is added after the last layer to rescale the output to the size of the input map and output a confidence map.
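Likewise, the discrimination network described above (five 4 × 4 convolutions with 64, 128, 256, 512 and 1 kernels, strides 2, 2, 2, 2, 1, Leaky ReLU activations, and upsampling back to the input size) might be sketched as follows; a single-channel input for the binary vessel case is an assumption:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """FCN discriminator: 5 conv layers (64, 128, 256, 512, 1), 4x4 kernels,
    strides (2, 2, 2, 2, 1), Leaky ReLU, then upsampling back to input size."""
    def __init__(self, in_ch=1):
        super().__init__()
        chs, strides = [64, 128, 256, 512, 1], [2, 2, 2, 2, 1]
        layers, c = [], in_ch
        for out_c, s in zip(chs, strides):
            layers.append(nn.Conv2d(c, out_c, 4, stride=s, padding=1))
            if out_c != 1:                      # no activation after final layer
                layers.append(nn.LeakyReLU(0.2, inplace=True))
            c = out_c
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        h, w = x.shape[-2:]
        out = self.net(x)
        # rescale the 1-channel map back to the input size -> per-pixel confidence
        out = nn.functional.interpolate(out, size=(h, w), mode="bilinear",
                                        align_corners=False)
        return torch.sigmoid(out)
```

Each output pixel is the discriminator's confidence that the corresponding input pixel comes from a true label rather than a predicted probability map.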
And S130, inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training rounds, and alternately training the segmentation network and the discrimination network to obtain a trained retinal blood vessel segmentation model.
Specifically, in this step, labeled fundus images in the training set are input to the segmentation network for training, after a certain number of training rounds (for example, 100 training rounds), unlabeled fundus images in the training set are input to the segmentation network to obtain a prediction probability map, and are input to the discrimination network for authenticity determination, and a confidence map is output. And carrying out binarization on pixels with high authenticity, adding the pixels into the segmentation network for self-supervision training, calculating a multitask loss function of the segmentation network and a space cross entropy loss function of the discrimination network, and iteratively updating parameters to enable the generated semantic labels to accord with real segmentation labels in feature distribution so as to enhance the segmentation performance of the segmentation network and obtain the trained retinal vessel segmentation model.
The prediction probability map and the marked fundus image generated by the segmentation network are sent to the discrimination network. And finally, the network restores the output of the convolutional layer to the mapping size through up-sampling convolution, and outputs a probability confidence map that each pixel in the image is a real label. The confidence map is used to determine a region sufficiently close to the distribution of the annotated fundus image. Then, a threshold value is used to binarize the confidence map to highlight the confidence region. And finally, adding the binarized confidence map and the unmarked fundus image in each training round into a marked fundus image input segmentation network for self-supervision training, and inputting the unmarked fundus image into the segmentation network to output a prediction probability map. And inputting the prediction probability map into a discrimination network for authenticity discrimination to output a confidence map. And each training turn is carried out, a multitask loss function of the segmentation network and a space cross entropy loss function of the discrimination network are calculated, parameters are updated in an iterative mode, and the generated semantic tags are enabled to accord with real segmentation tags in feature distribution, so that the segmentation performance of the segmentation network is enhanced, and the effect of semi-supervised learning is achieved.
The training process is illustrated below.
Assume a given input image $X_n$ of size $H \times W \times 3$. Let $S(\cdot)$ denote the segmentation network; $S(X_n)$, of size $H \times W \times C$, denotes the prediction probability map, where $C$ is the number of classes. Let $D(\cdot)$ denote the discrimination network, whose input is a prediction probability map of size $H \times W \times C$ and whose output is a confidence map of size $H \times W \times 1$. The input of the discrimination network is either the prediction probability map $S(X_n)$ or a true label $Y_n$. The spatial cross-entropy loss of the discrimination network is therefore:

$$L_D = -\sum_{h,w}\left[(1-y_n)\log\left(1-D(S(X_n))^{(h,w)}\right) + y_n \log D(Y_n)^{(h,w)}\right]$$
where $y_n = 0$ indicates that the sample is a prediction probability map output by the segmentation network, and $y_n = 1$ indicates that the sample is a true label. In addition, $D(S(X_n))^{(h,w)}$ is the confidence that the pixel at $(h,w)$ is genuine, and likewise for $D(Y_n)^{(h,w)}$.
The segmentation network is optimized by minimizing a multi-task loss function:

$$L_{seg} = L_{ce} + \lambda_{adv} L_{adv} + \lambda_{semi} L_{semi}$$
where $L_{ce}$, $L_{adv}$ and $L_{semi}$ are the BCE-Dice loss, the adversarial loss and the semi-supervised loss, respectively, and $\lambda_{adv}$ and $\lambda_{semi}$ are the weights used in minimizing the multi-task loss function.
$L_{ce}$ is the loss (BCE-Dice; its cross-entropy component is shown here) between the prediction probability map of the labeled data in the segmentation network and the true label:

$$L_{ce} = -\sum_{h,w}\sum_{c \in C} Y_n^{(h,w,c)} \log S(X_n)^{(h,w,c)}$$
L adv in order to add the influence of the output result of the discrimination network into the loss function of the segmentation network, the model achieves the effect of counterstudy. By minimizingL adv Loss, the segmentation network can be trained to deceive the discrimination network by maximizing the probability of generating a prediction result from the real distribution, thereby achieving a better semi-supervised learning effect. The formula is as follows:
L_{adv} = -\sum_{h,w} \log D(S(X_n))^{(h,w)}
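The adversarial term can be sketched the same way; the function name and plain-list representation are again hypothetical stand-ins for the PyTorch tensors used in practice:

```python
import math

def adversarial_loss(conf_map):
    """L_adv: encourage the segmentation output to be judged 'real'.

    conf_map: H x W confidences D(S(X_n)) in (0, 1). Minimizing this
    loss pushes every pixel's confidence toward 1, i.e. the segmenter
    learns to fool the discriminator.
    """
    eps = 1e-7
    return -sum(math.log(min(max(p, eps), 1.0 - eps))
                for row in conf_map for p in row)

# Predictions the discriminator already finds convincing incur a
# smaller adversarial penalty than unconvincing ones:
low = adversarial_loss([[0.9, 0.9]])
high = adversarial_loss([[0.1, 0.1]])
```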
L_semi is the loss computed on the unlabeled data, between the confidence map produced by the discrimination network and the prediction probability map output by the segmentation network. The formula is as follows:
L_{semi} = -\sum_{h,w}\sum_{c \in C} I\left(D(S(X_n))^{(h,w)} > T_{semi}\right)\, \hat{Y}_n^{(h,w,c)} \log S(X_n)^{(h,w,c)}
where I(·) is the indicator function and T_semi is a sensitivity threshold controlling the self-supervised learning process. The pseudo label \hat{Y}_n = \mathrm{one\_hot}(\arg\max(S(X_n))) is set element-wise: for each pixel, binarization determines the pixel category in the image as c^* = \arg\max_c S(X_n)^{(h,w,c)}, and \hat{Y}_n^{(h,w,c^*)} is set to 1.
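The masked pseudo-label loss above can be sketched in plain Python; the names and the nested-list data layout are hypothetical, and the real implementation operates on PyTorch tensors:

```python
import math

def semi_loss(prob_map, conf_map, t_semi):
    """L_semi on unlabeled data: masked cross-entropy with pseudo labels.

    prob_map: H x W x C segmentation probabilities S(X_n).
    conf_map: H x W discriminator confidences D(S(X_n)).
    Only pixels whose confidence exceeds t_semi contribute; the pseudo
    label for such a pixel is the one-hot argmax of its probabilities.
    """
    eps = 1e-7
    total = 0.0
    for h, row in enumerate(prob_map):
        for w, probs in enumerate(row):
            if conf_map[h][w] > t_semi:           # indicator I(.)
                c_star = max(range(len(probs)), key=lambda c: probs[c])
                total -= math.log(max(probs[c_star], eps))
    return total

# One trusted pixel (confidence 0.9 > 0.2) and one masked-out pixel
# (confidence 0.1): only the first pixel contributes to the loss.
probs = [[[0.8, 0.2], [0.3, 0.7]]]
conf = [[0.9, 0.1]]
loss = semi_loss(probs, conf, 0.2)
```

Raising t_semi makes the self-supervision more conservative: fewer pixels pass the indicator, but those that do carry more reliable pseudo labels.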
In the specific training, 2000 labeled 128 × 128 patches and 20000 unlabeled 128 × 128 patches from the training set are input into the model. Training is carried out on a workstation equipped with an NVIDIA GTX 1080 GPU; the programming language is Python and the PyTorch deep learning framework is adopted. The weights λ_adv and λ_semi are set to 0.01 and 0.1, and T_semi is set to 0.2. The 2000 labeled patches are first input into the segmentation network for training; after 100 training rounds, the unlabeled data are added to the model, and the segmentation network and the discrimination network are trained alternately, with only the segmentation network parameters or only the discrimination network parameters updated in each iteration, until a satisfactory result is achieved. To verify the performance of the model, testing is performed on the test set, which is likewise sliced into patches; after the predictions are output, the predicted patches are stitched back into a full-size image according to parameters such as the translation stride, and this serves as the final prediction. The retinal vessel segmentation results are shown in Fig. 5 and Fig. 6.
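The patch slicing and stitching step can be sketched as follows for a single-channel image; the function names are hypothetical, plain nested lists stand in for arrays, and overlapping predictions are resolved by averaging (one common choice — the patent does not state how overlaps are merged):

```python
def extract_patches(img, patch, stride):
    """Slide a patch x patch window over a 2-D image with the given stride."""
    h, w = len(img), len(img[0])
    patches, coords = [], []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            patches.append([row[left:left + patch]
                            for row in img[top:top + patch]])
            coords.append((top, left))
    return patches, coords

def stitch(patches, coords, h, w, patch):
    """Recompose predicted patches into a full image, averaging overlaps."""
    acc = [[0.0] * w for _ in range(h)]
    cnt = [[0] * w for _ in range(h)]
    for p, (top, left) in zip(patches, coords):
        for i in range(patch):
            for j in range(patch):
                acc[top + i][left + j] += p[i][j]
                cnt[top + i][left + j] += 1
    return [[acc[i][j] / max(cnt[i][j], 1) for j in range(w)]
            for i in range(h)]

# Round-trip a 4 x 4 image with 2 x 2 patches and stride 2 (no overlap):
img = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
patches, coords = extract_patches(img, 2, 2)
restored = stitch(patches, coords, 4, 4, 2)
```

With stride equal to the patch size the round trip is exact; a smaller stride produces overlapping patches, which the averaging in `stitch` smooths at patch borders.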
S140, inputting the fundus image to be segmented into the retinal blood vessel segmentation model to obtain a segmented output image.
S150, splicing all the output images to obtain a retinal blood vessel segmentation result image.
The retinal blood vessel segmentation method of the embodiments of the present disclosure can automatically and accurately extract the retinal blood vessels in a fundus image; the segmentation result preserves fine vessel details, so the image carries richer detail information and can be used for clinically assisted diagnosis. Under the condition of limited labeled data, a large amount of easily obtained unlabeled data is exploited by adding an adversarial generative learning model, which improves the segmentation accuracy of retinal blood vessels and increases the feasibility of applying retinal blood vessel segmentation in actual clinical practice.
In the following, the semi-supervised network of the present disclosure is compared with a fully supervised network based on U-Net; the segmentation metrics of the two models on the test set are shown in Table 1 below. The AUC and Sensitivity of the semi-supervised model, which introduces additional unlabeled training samples, improve substantially on the test set, indicating that the semi-supervised model trained by this method recognizes more detail when segmenting blood vessels and thus classifies pixels correctly. The improvement in the Jaccard similarity coefficient likewise indicates that the algorithm's predictions are more similar to the ground-truth sample set. In addition, the improved F1 score shows that the stability of the network has increased. This is attributable to: 1) the generator in the semi-supervised model adopts the classic U-Net, which extracts high-level features through four successive down-sampling steps, restores low-level features through four corresponding up-sampling steps, and fuses the high-level and low-level features of the original image via long skip connections at each level, so the network better recognizes edge detail features; 2) a fully convolutional neural network is adopted as the discrimination network, which classifies the generator's output as real or fake while minimizing the discrimination loss, so the information in the unlabeled images is exploited to the greatest extent, the adversarial learning effect is realized, and the segmentation quality of the generator improves.
[Table 1: segmentation metrics (AUC, Sensitivity, Jaccard, F1) of the fully supervised and semi-supervised models on the test set; provided as an image in the original publication]
In another aspect of the present disclosure, as shown in fig. 7, a retinal vessel segmentation apparatus 100 is provided, and the apparatus 100 may be applied to the method described above, and specifically, refer to the related description, which is not repeated herein. The apparatus 100 comprises:
an obtaining module 110, configured to obtain fundus images, divide the fundus images into a training set and a test set, and perform corresponding preprocessing operations on the fundus images in the training set and the test set; wherein the training set comprises a marked fundus image and an unmarked fundus image;
a construction module 120, configured to respectively construct a segmentation network and a discrimination network;
a training module 130, configured to input the labeled fundus images in the training set into the segmentation network for training, after a preset number of training rounds, input the unlabeled fundus images in the training set into the segmentation network, and train the segmentation network and the discrimination network alternately to obtain a trained retinal blood vessel segmentation model;
the prediction module 140 is configured to input the fundus image to be segmented into the retinal blood vessel segmentation model to obtain a segmented output image;
and the splicing module 150 is used for splicing all the output images to obtain a retinal blood vessel segmentation result image.
The retinal blood vessel segmentation apparatus of the embodiments of the present disclosure can automatically and accurately extract the retinal blood vessels in a fundus image; the segmentation result preserves fine vessel details, so the image carries richer detail information and can be used for clinically assisted diagnosis. Under the condition of limited labeled data, a large amount of easily obtained unlabeled data is exploited by adding an adversarial generative learning model, which improves the segmentation accuracy of retinal blood vessels and increases the feasibility of applying retinal blood vessel segmentation in actual clinical practice.
In some embodiments, the obtaining module 110 is further specifically configured to:
filtering the fundus image, and performing gray-scale transformation, normalization, contrast-limited adaptive histogram equalization, and gamma transformation to highlight the blood vessel region, obtaining a denoised and enhanced fundus image;
and randomly cropping the original-size fundus images in the training set using a patch-cropping method, so as to perform data augmentation on the fundus images in the training set.
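Two of the enhancement steps above can be sketched in plain Python; the function names and the list-of-lists image representation are hypothetical (in practice these operations would run on image arrays, e.g. via OpenCV or NumPy):

```python
def gamma_correct(gray, gamma):
    """Gamma transform for a grayscale image with values in [0, 255].

    gamma < 1 brightens mid-tones, lifting faint vessel pixels;
    gamma > 1 darkens them.
    """
    return [[255.0 * (v / 255.0) ** gamma for v in row] for row in gray]

def normalize(gray):
    """Min-max normalize pixel values to [0, 1]."""
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0   # avoid division by zero on flat images
    return [[(v - lo) / span for v in row] for row in gray]

# A dim vessel pixel (64) is lifted well above its original value:
out = gamma_correct([[0.0, 64.0, 255.0]], 0.5)
norm = normalize([[0.0, 5.0, 10.0]])
```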
In some embodiments, the segmentation network is based on the U-Net network and comprises three parts: an encoder, a long skip connection module, and a decoder;
said encoder comprises 5 down-sampling layers, wherein each down-sampling layer comprises two 3 × 3 convolutional layers and one 2 × 2 max-pooling layer; the numbers of convolution kernels in the 1st to 5th down-sampling layers are 64, 128, 256, 512, and 1024, respectively, and the stride is 1;
the decoder comprises 4 up-sampling layers, wherein each up-sampling layer comprises one 2 × 2 deconvolution layer and two 3 × 3 convolutional layers; in the up-sampling layers, the numbers of kernels in the deconvolution layers are 512, 256, 128, and 64, respectively, with a stride of 2; the numbers of kernels in the convolutional layers are 512, 256, 128, and 64, respectively, with a stride of 1.
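The feature-map dimensions implied by this configuration can be traced with a short sketch. Note that the description lists a pooling layer in each of the five down-sampling stages but only four up-sampling layers; the sketch below assumes the common U-Net reading in which the fifth (bottleneck) stage does not halve the resolution, and that the 3 × 3 convolutions are padded so they preserve spatial size. Both are assumptions, and the function name is hypothetical:

```python
def unet_encoder_shapes(size, channels=(64, 128, 256, 512, 1024)):
    """Spatial size and channel count after each encoder stage.

    Assumes padded 3x3 convs keep the spatial size and each 2x2
    max-pool halves it; no pooling after the bottleneck stage.
    """
    shapes = []
    for i, c in enumerate(channels):
        shapes.append((size, size, c))
        if i < len(channels) - 1:
            size //= 2   # 2x2 max pooling
    return shapes

# Trace a 128 x 128 training patch through the encoder:
shapes = unet_encoder_shapes(128)
```

Under these assumptions a 128 × 128 patch reaches the 1024-channel bottleneck at 8 × 8, and the four up-sampling layers restore it to 128 × 128.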
The discrimination network is based on a fully convolutional neural network and comprises 5 convolutional layers, wherein the numbers of convolution kernels in the 5 convolutional layers are 64, 128, 256, 512, and 1, respectively; each convolution kernel has a size of 4 × 4; the strides of the 5 convolutional layers are 2, 2, 2, 2, and 1, respectively;
a Leaky ReLU is used as the activation function in the discrimination network; to convert the model into a fully convolutional network, an up-sampling layer is added as the last layer, which rescales the output to the size of the input map and outputs a confidence map.
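The discriminator's spatial sizes can be traced the same way, assuming a padding of 1 for each 4 × 4 convolution (the padding is not stated in the description, so the sizes are illustrative); the function names are hypothetical:

```python
def conv_out(n, k=4, s=2, p=1):
    """Output spatial size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def discriminator_sizes(n, strides=(2, 2, 2, 2, 1)):
    """Trace the spatial size through the five 4x4 convolutional layers."""
    sizes = [n]
    for s in strides:
        n = conv_out(n, s=s)
        sizes.append(n)
    return sizes

# A 128 x 128 probability map shrinks through the four stride-2
# layers, then passes the final stride-1 layer:
sizes = discriminator_sizes(128)
```

This is why the final up-sampling layer is needed: the last 1-channel map is smaller than the input and must be rescaled back to 128 × 128 to give a per-pixel confidence map.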
In some embodiments, the training module 130 is further specifically configured to:
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training turns to obtain a prediction probability map, inputting the prediction probability map into the discrimination network for authenticity judgment, and outputting a confidence map;
and binarizing the pixels with high authenticity, feeding them into the segmentation network for self-supervised training, computing the multi-task loss function of the segmentation network and the spatial cross-entropy loss function of the discrimination network, and iteratively updating the parameters so that the generated semantic labels match the real segmentation labels in feature distribution, thereby enhancing the segmentation performance of the segmentation network and obtaining the trained retinal blood vessel segmentation model.
In another aspect of the present disclosure, an electronic device is provided, including:
one or more processors;
a storage unit for storing one or more programs which, when executed by the one or more processors, enable the one or more processors to implement the method according to the preceding description.
In another aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, is adapted to carry out the method according to the above.
The computer readable medium may be included in the apparatus, device, system, or may exist separately.
The computer readable storage medium may be any tangible medium that can contain or store a program, and may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, more specific examples of which include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, an optical fiber, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof.
The computer-readable medium may also include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave, where the propagated signal may take any suitable form capable of carrying the program code.
It is to be understood that the above embodiments are merely exemplary embodiments that are employed to illustrate the principles of the present disclosure, and that the present disclosure is not limited thereto. It will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the disclosure, and these are to be considered as the scope of the disclosure.

Claims (10)

1. A retinal vessel segmentation method, comprising:
acquiring fundus images, dividing the fundus images into a training set and a testing set, and performing corresponding preprocessing operation on the fundus images in the training set and the testing set; wherein the training set comprises a marked fundus image and an unmarked fundus image;
respectively constructing a segmentation network and a discrimination network;
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training rounds, and alternately training the segmentation network and the discrimination network to obtain a trained retinal blood vessel segmentation model;
inputting the fundus image to be segmented into the retinal vessel segmentation model to obtain a segmented output image;
and splicing all the output images to obtain a retinal blood vessel segmentation result image.
2. The method of claim 1, wherein the preprocessing operation comprises:
filtering the fundus image, and performing gray-scale transformation, normalization, contrast-limited adaptive histogram equalization, and gamma transformation to highlight the blood vessel region, obtaining a denoised and enhanced fundus image;
and randomly cropping the original-size fundus images in the training set using a patch-cropping method, so as to perform data augmentation on the fundus images in the training set.
3. The method of claim 1, wherein the segmentation network is based on a U-Net network and comprises three parts: an encoder, a long skip connection module, and a decoder;
said encoder comprises 5 down-sampling layers, wherein each down-sampling layer comprises two 3 × 3 convolutional layers and one 2 × 2 max-pooling layer; the numbers of convolution kernels in the 1st to 5th down-sampling layers are 64, 128, 256, 512, and 1024, respectively, and the stride is 1;
the decoder comprises 4 up-sampling layers, wherein each up-sampling layer comprises one 2 × 2 deconvolution layer and two 3 × 3 convolutional layers; in the up-sampling layers, the numbers of kernels in the deconvolution layers are 512, 256, 128, and 64, respectively, with a stride of 2; the numbers of kernels in the convolutional layers are 512, 256, 128, and 64, respectively, with a stride of 1; and/or,
the discrimination network is based on a fully convolutional neural network and comprises 5 convolutional layers, wherein the numbers of convolution kernels in the 5 convolutional layers are 64, 128, 256, 512, and 1, respectively; each convolution kernel has a size of 4 × 4; the strides of the 5 convolutional layers are 2, 2, 2, 2, and 1, respectively;
a Leaky ReLU is used as the activation function in the discrimination network; to convert the model into a fully convolutional network, an up-sampling layer is added as the last layer, which rescales the output to the size of the input map and outputs a confidence map.
4. The method according to any one of claims 1 to 3, wherein inputting the labeled fundus images in the training set into the segmentation network for training, inputting the unlabeled fundus images in the training set into the segmentation network after a preset number of training rounds, and training the segmentation network and the discrimination network alternately to obtain a trained retinal vessel segmentation model comprises:
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training turns to obtain a prediction probability map, inputting the prediction probability map into the discrimination network for authenticity judgment, and outputting a confidence map;
and binarizing the pixels with high authenticity, feeding them into the segmentation network for self-supervised training, computing the multi-task loss function of the segmentation network and the spatial cross-entropy loss function of the discrimination network, and iteratively updating the parameters so that the generated semantic labels match the real segmentation labels in feature distribution, thereby enhancing the segmentation performance of the segmentation network and obtaining the trained retinal blood vessel segmentation model.
5. A retinal vessel segmentation apparatus, comprising:
the system comprises an acquisition module, a preprocessing module and a processing module, wherein the acquisition module is used for acquiring fundus images, dividing the fundus images into a training set and a testing set, and performing corresponding preprocessing operation on the fundus images in the training set and the testing set; wherein the training set comprises a marked fundus image and an unmarked fundus image;
the construction module is used for respectively constructing a segmentation network and a discrimination network;
the training module is used for inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training rounds, and alternately training the segmentation network and the discrimination network to obtain a trained retinal blood vessel segmentation model;
the prediction module is used for inputting the fundus image to be segmented into the retinal blood vessel segmentation model to obtain a segmented output image;
and the splicing module is used for splicing all the output images to obtain a retinal blood vessel segmentation result image.
6. The apparatus of claim 5, wherein the obtaining module is further configured to:
filtering the fundus image, and performing gray-scale transformation, normalization, contrast-limited adaptive histogram equalization, and gamma transformation to highlight the blood vessel region, obtaining a denoised and enhanced fundus image;
and randomly cropping the original-size fundus images in the training set using a patch-cropping method, so as to perform data augmentation on the fundus images in the training set.
7. The apparatus of claim 5, wherein the segmentation network is based on a U-Net network and comprises three parts: an encoder, a long skip connection module, and a decoder;
said encoder comprises 5 down-sampling layers, wherein each down-sampling layer comprises two 3 × 3 convolutional layers and one 2 × 2 max-pooling layer; the numbers of convolution kernels in the 1st to 5th down-sampling layers are 64, 128, 256, 512, and 1024, respectively, and the stride is 1;
the decoder comprises 4 up-sampling layers, wherein each up-sampling layer comprises one 2 × 2 deconvolution layer and two 3 × 3 convolutional layers; in the up-sampling layers, the numbers of kernels in the deconvolution layers are 512, 256, 128, and 64, respectively, with a stride of 2; the numbers of kernels in the convolutional layers are 512, 256, 128, and 64, respectively, with a stride of 1; and/or,
the discrimination network is based on a fully convolutional neural network and comprises 5 convolutional layers, wherein the numbers of convolution kernels in the 5 convolutional layers are 64, 128, 256, 512, and 1, respectively; each convolution kernel has a size of 4 × 4; the strides of the 5 convolutional layers are 2, 2, 2, 2, and 1, respectively;
a Leaky ReLU is used as the activation function in the discrimination network; to convert the model into a fully convolutional network, an up-sampling layer is added as the last layer, which rescales the output to the size of the input map and outputs a confidence map.
8. The apparatus according to any one of claims 5 to 7, wherein the training module is further configured to:
inputting the marked fundus images in the training set into the segmentation network for training, inputting the unmarked fundus images in the training set into the segmentation network after a preset number of training turns to obtain a prediction probability map, inputting the prediction probability map into the discrimination network for authenticity judgment, and outputting a confidence map;
and binarizing the pixels with high authenticity, feeding them into the segmentation network for self-supervised training, computing the multi-task loss function of the segmentation network and the spatial cross-entropy loss function of the discrimination network, and iteratively updating the parameters so that the generated semantic labels match the real segmentation labels in feature distribution, thereby enhancing the segmentation performance of the segmentation network and obtaining the trained retinal blood vessel segmentation model.
9. An electronic device, comprising:
one or more processors;
a storage unit to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 4.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is able to carry out a method according to any one of claims 1 to 4.
CN202210023966.1A 2022-01-11 2022-01-11 Retinal blood vessel segmentation method and device, electronic device and storage medium Active CN114066884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210023966.1A CN114066884B (en) 2022-01-11 2022-01-11 Retinal blood vessel segmentation method and device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210023966.1A CN114066884B (en) 2022-01-11 2022-01-11 Retinal blood vessel segmentation method and device, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN114066884A true CN114066884A (en) 2022-02-18
CN114066884B CN114066884B (en) 2022-09-27

Family

ID=80230687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210023966.1A Active CN114066884B (en) 2022-01-11 2022-01-11 Retinal blood vessel segmentation method and device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN114066884B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565620A (en) * 2022-03-01 2022-05-31 电子科技大学 Fundus image blood vessel segmentation method based on skeleton prior and contrast loss
CN115760807A (en) * 2022-11-24 2023-03-07 湖南至真明扬技术服务有限公司 Retinal fundus image registration method and system
CN116091515A (en) * 2023-03-13 2023-05-09 同心智医科技(北京)有限公司 Method for establishing cerebrovascular segmentation model, and cerebrovascular segmentation method and device
CN116269198A (en) * 2023-05-11 2023-06-23 深圳市眼科医院(深圳市眼病防治研究所) Eyeball rotation angle measurement method and device based on convolutional neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197493A (en) * 2019-05-24 2019-09-03 清华大学深圳研究生院 Eye fundus image blood vessel segmentation method
CN112541928A (en) * 2020-12-18 2021-03-23 上海商汤智能科技有限公司 Network training method and device, image segmentation method and device and electronic equipment
CN113591773A (en) * 2021-08-10 2021-11-02 武汉中电智慧科技有限公司 Power distribution room object detection method, device and equipment based on convolutional neural network



Also Published As

Publication number Publication date
CN114066884B (en) 2022-09-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant