CN113763292A - Fundus retina image segmentation method based on deep convolutional neural network - Google Patents

Fundus retina image segmentation method based on deep convolutional neural network

Info

Publication number
CN113763292A
CN113763292A
Authority
CN
China
Prior art keywords
image
segmentation
layer
encoder
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010795314.0A
Other languages
Chinese (zh)
Inventor
蒋芸
高静
王发林
姚慧霞
马泽琪
张婧瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest Normal University
Original Assignee
Northwest Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest Normal University filed Critical Northwest Normal University
Priority to CN202010795314.0A
Publication of CN113763292A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/40: Image enhancement or restoration using histogram techniques
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 7/0012: Biomedical image inspection
    • G06T 7/10: Segmentation; Edge detection
    • G06T 7/11: Region-based segmentation
    • G06T 7/90: Determination of colour characteristics
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30004: Biomedical image processing
    • G06T 2207/30041: Eye; Retina; Ophthalmic


Abstract

The invention discloses a fundus retina image segmentation method based on a deep convolutional neural network. The method uses a deep convolutional neural network to map the features of vascular tissue, optic disc and optic cup tissue and lesion tissue in medical images, and segments the images with the convolutional network. In addition, to increase segmentation accuracy, the method enhances the images with a new data preprocessing method for fundus retina images; it solves the problem of fine blood vessel segmentation with an end-to-end deep convolutional network, and obtains and visualizes deep salient features of lesion areas; and it combines multiple deep neural networks to solve a series of problems caused by the large pixel counts of medical images.

Description

Fundus retina image segmentation method based on deep convolutional neural network
Technical Field
The invention belongs to the technical field of medical image processing, relates to a retinal fundus image segmentation method, and particularly relates to a fundus retinal image segmentation method based on a deep convolutional neural network.
Background
Retinal images are widely used in the diagnosis, screening and treatment of ophthalmic and cardiovascular diseases, including age-related macular degeneration, glaucoma, hypertension, arteriosclerosis and other cardiovascular conditions. Segmentation of the blood vessels, optic disc and optic cup is an essential step in the quantitative analysis of retinal images. First, optic disc and optic cup segmentation is commonly used for glaucoma detection: optic nerve head assessment, in which clinicians estimate glaucoma risk from the vertical cup-to-disc ratio, is a convenient and widely accepted method, so accurate disc and cup segmentation is of great importance. Second, the upper and lower branch arteries and the branching capillaries in the retinal image together form the data distribution that the various methods need to learn. Some pathological conditions in the human body, such as diabetic retinopathy and atherosclerosis, can be detected through changes in retinal vascular morphology. For example, diabetic retinopathy, caused by hyperglycemia and later hypertension, can lead to vision loss, and it is a leading cause of blindness in the elderly worldwide. The World Health Organization recommends annual eye screening, because diabetic retinopathy found at an early stage can be treated effectively with laser therapy. Observing changes in retinal vascular morphology therefore allows early diagnosis and treatment of these health-threatening conditions.
Manual segmentation of the optic disc and cup in retinal images is a tedious task that requires experience and skill, and it does not scale to large-scale screening: so many patients need to be examined and treated that many cannot receive adequate care. In addition, medical image acquisition is expensive and complex, and in most cases the training data for medical images lack standard labels. There is therefore a need for a method that automatically segments the vessels, optic disc and optic cup in retinal images. Using the powerful feature-learning capability of deep convolutional neural networks to segment diseased tissue in retinal images can effectively solve many problems in computer-aided diagnosis, reduce the workload of medical staff, and at the same time provide them with valuable medical image references.
In the prior art, Convolutional Neural Networks (CNNs) have been applied successfully to the automatic segmentation of retinal blood vessels, and these methods have proven useful for restoring the full spatial resolution at the network output. However, they have several problems: 1) simple image preprocessing cannot make effective use of the structural information in the feature maps and may limit segmentation performance; 2) the receptive field is enlarged by down-sampling, but when the optic disc and cup are segmented jointly, the disc region in the label map is small, so an overly large down-sampling factor causes loss of feature information; 3) for very large segmentation regions the receptive field of these methods is not large enough, global information cannot be fully exploited, and some large regions cannot be identified accurately; 4) these methods up-sample with deconvolution, which increases the parameters and computation of the model and affects its performance.
Disclosure of Invention
The invention aims to provide a fundus retina image segmentation method based on a deep convolutional neural network that reduces the number of model parameters, enlarges the receptive field and improves the context understanding of the model, and improves the accuracy and speed of retinal vessel, optic disc and optic cup segmentation models.
In order to solve the technical problems, the invention provides the following technical scheme:
the invention relates to a method for preprocessing a fundus retina image, which comprises the following steps:
A. acquiring a retinal image dataset;
B. preprocessing: sequentially performing grayscale processing, normalization, contrast-limited adaptive histogram equalization and gamma non-linearization on the retinal image to obtain a preprocessed image.
Preferably, the grayscale processing comprises the following steps:
decomposing the retina image into a red R channel, a green G channel and a blue B channel by adopting three RGB components, and fusing the images of the three channels according to a proportion to convert the images into a gray image;
the conversion formula is as follows:
Igray=0.299×r+0.587×g+0.114×b
in the formula: r, G, B represent the values of the R channel, G channel and B channel, respectively;
the normalization comprises the following steps:
let X = {x1, x2, ..., xn} be the image data set; dimension normalization is carried out on the data set X by using the Z-score normalization method; the Z-score normalization formula is as follows:
Xnorm=(X-μ)/σ
where μ is the mean of x and σ is the standard deviation of x; at the moment, positive and negative values exist in x, and the mean value of x is 0 and the standard deviation is 1;
then carrying out minimum-maximum normalization on each image data in the data set X, and remapping the value of X into the range of 0-255; the min-max normalization formula is as follows:
xi' = 255 × (xi - min(Xnorm)) / (max(Xnorm) - min(Xnorm))
in the formula, xi ∈ Xnorm, i ∈ [1, 2, ..., n]
The contrast-limited adaptive histogram equalization adopts a CLAHE algorithm to enhance the contrast between blood vessels and the background of the whole data set, and then gamma nonlinearity is used for carrying out nonlinear operation on the brightness or tristimulus values in the image to obtain a preprocessed image, wherein the gamma value is 1.2.
The invention also relates to an optic disc and optic cup image segmentation method based on a deep convolutional neural network, which comprises the following steps: A1, preprocessing the optic disc and optic cup images with the above image preprocessing method;
B1, generator: performing cascaded convolution on the preprocessed image with convolutional layers of different dilation rates and outputting a feature map, periodically rearranging the information in the feature map through scale reconstruction of the feature map, and generating a segmented optic disc and optic cup image of the same size as the target;
C1, discriminator: constructing a discriminator with the same convolution unit form as the generator, forming a generative adversarial network model from the discriminator and the generator, automatically adjusting the tuning parameters, and outputting the final image of the generator.
Preferably, the generator comprises an encoder and a decoder, and the training data samples of the generator are S = {(xi, yi), i = 1, ..., N}, where xi is a fundus image and yi is the corresponding optic disc and optic cup label map, xi ∈ R^(H×W×C1) and yi ∈ R^(H×W×C2); H is the picture length, W is the picture width, C1 is the number of channels of xi and C2 is the number of channels of yi. The encoder performs down-sampling with cascaded dilated convolution and obtains an output feature map u;
the decoder performs up-sampling with scale reconstruction, converting the feature map u to the same size as yi and obtaining a feature map of the same size; the generator outputs the segmentation result C(xi);
the discriminator discriminates between the training data sample image set {(xi, yi)} and the image set {(xi, C(xi))} generated by the generator; if the discriminator classifies the training data sample image set and the generated image set into one class, the tuning parameters of the discriminator are adjusted; if the discriminator divides them into two classes, the tuning parameters of the generator are adjusted and the generator regenerates the segmentation image; the objective for the final image of the generator is:
min_θC max_θD E_(x,y)[log D(x, y)] + E_x[log(1 - D(x, C(x)))]
in the formula, θC denotes the parameters of the generator C; θD denotes the parameters of the discriminator D; E[·] denotes the expected value.
Preferably, the cascaded convolution cascades N dilated convolutional layers at a time, the convolution kernel size is k × k, and the dilation rates are [r1, ..., ri, ..., rn]; after the N convolutional layers, all the feature information in the receptive field region can be completely covered;
the maximum allowed dilation rate of the i-th layer of the dilated convolution kernel is defined as Ri, with the dilation rate ri ≤ Ri, and Ri is calculated as:
Ri = max[ R(i+1) - 2ri, R(i+1) - 2(R(i+1) - ri), ri ]
with (R1 - 1) + (R2 - 1) < k, Ri > 0 and Ri ≠ R(i+1); the dilation rates ri within each group must not share a common factor; ri denotes the dilation rate of the i-th layer of the dilated convolution kernel, and k is the convolution kernel size.
The invention further relates to a retinal blood vessel image segmentation method based on a deep convolutional neural network, which comprises the following steps:
1. performing image preprocessing on the retinal blood vessel images with the above image preprocessing method;
2. down-sampling the preprocessed image with an average pooling layer to form down-sampled images at different levels, expanding the channels of each down-sampled image with an ordinary convolutional layer, and fusing the channel-expanded down-sampled image of a level with the output image of the previous-level multi-path FCN (fully convolutional network) model encoder into the input image of that level, which is fed into the multi-path FCN model encoder;
3. the multi-path FCN model encoder encodes the input image to obtain the blood vessel feature images in the retinal image, forming blood vessel feature images at different levels; the blood vessel feature image of a level and the output image of the next-level multi-path FCN model decoder are fused into the input image of that level and fed into the multi-path FCN model decoder, which performs the inverse operation of the encoder and decodes the input image to form output feature images of different depths;
4. expanding the output feature images of different depths to the same size as the input image by up-sampling, feeding them into different convolutional layers for channel compression, classifying them with a Softmax function, fusing the resulting probability maps into a two-channel probability map, and then performing threshold segmentation to obtain the final blood vessel segmentation map.
Preferably, the multi-path FCN model is composed of two encoder paths and two decoder paths, each of the encoder paths and the decoder paths includes a convolutional layer, a residual block, a batch normalization layer, and a ReLU activation layer, the encoder paths generate a set of encoder feature maps using the residual block and the convolutional layer, normalize each layer of feature maps using the batch normalization layer, and then activate them using a ReLU activation function, and the decoder paths decode images generated by the encoder using the deconvolution layer and the residual block, normalize each layer of feature maps using the batch normalization layer, and then activate them using the ReLU activation function.
Compared with the prior art, the invention has the following beneficial effects:
1) Various deep neural networks are combined to solve different problems in fundus image segmentation. Several novel deep models are provided, based on a deep convolutional network combined with models such as a denoising convolutional autoencoder, a convolutional network and an attention mechanism. Together with the proposed data enhancement method, the new models can denoise, enhance, feature-map and segment the original retinal image.
2) The medical image segmentation scheme has practical and reference value. Several segmentation methods are constructed for different medical images. On the basis of the segmented tissues, the overall features of the whole image can be combined to assist in diagnosing the patient, which has important practical value. For medical staff, the results of each stage of the segmentation method have great reference value, and the deep learning algorithm data in the invention are also a useful reference for later research.
3) A novel retinal image preprocessing model is provided, which effectively improves the model's ability to extract and recognize image features; stacked dilated convolutions improve the model's understanding of local context information while keeping the relevance of the feature information within the receptive field; and the scale reconstruction layer replaces up-sampling methods such as deconvolution and bilinear interpolation, introducing no extra parameters or computation while retaining the ability to learn to restore detail information.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a diagram of the data pre-processing of a retinal image of the eye by the segmentation method of the present invention.
Fig. 2 is a diagram of a disc and cup segmentation network.
FIG. 3 is a diagram of the cascaded dilated convolution structure employed in the segmentation method of the present invention.
Fig. 4 is a diagram of a structure of a scale reconstruction layer used in the segmentation method of the present invention.
FIG. 5: a retinal vessel segmentation network structure diagram;
FIG. 6: optic disc and optic cup segmentation result graph;
FIG. 7: a retinal blood vessel segmentation result graph.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
In addition, if a detailed description of the known art is not necessary to show the features of the present invention, it is omitted.
Example 1
The segmentation method is based on a deep convolutional neural network, and relates to a data preprocessing model of a fundus retina image, an end-to-end multi-label deep convolutional network model and application of the model to segmentation of fundus retina blood vessels, optic discs and optic cups.
The segmentation method adopts a deep convolutional neural network to map the characteristics of vascular tissues, optic disc cup tissues and lesion tissues in the medical images, and uses the convolutional network to segment the images. In addition, in order to increase the segmentation accuracy, the image is subjected to enhancement processing by using a new data preprocessing method of the fundus retina image; the method comprises the steps of solving the problem of fine blood vessel segmentation by using an end-to-end deep convolutional network, obtaining deep significant features of a lesion area and visualizing the deep significant features; a series of problems caused by large pixels of various medical images are solved by using a method of combining various deep neural networks. The segmentation method specifically comprises the following steps:
1) selecting a data set
Retinal vessel segmentation dataset: digital Recovery Images for Vessel Extraction (DRIVE), Structured Analysis of the recovery (STARE), CHASE _ DB1 (CHASE). The retinal images of the three data sets are obtained under different acquisition devices, lighting conditions, and the like, which can verify the robustness and generalization of the framework proposed herein. The DRIVE data set contained 40 images with a resolution of 565 x 584 px; the STARE data set has 20 images of 700 × 605px, of which half have lesions of different conditions; the CHASE data set consisted of 28 images with a resolution of 999 × 960 px;
optic disc cup segmentation dataset: the Drishti-GS1 dataset contained a total of 101 fundus images, 31 normal and 70 diseased; wherein the training set comprises 50 images, and the test set comprises 51 images; all image optic discs and optic cup areas are marked by 4 ophthalmologists with different clinical experience;
2) novel data preprocessing method for fundus retina image
After appropriate preprocessing of the image, the model can learn the implied data distribution more efficiently. The segmentation method of the invention applies a four-step preprocessing strategy to the images in sequence; figure 1 shows images under the different treatments. The first preprocessing strategy converts the color image into a grayscale image. The color original retina image shown in fig. 1(a) is decomposed into the R channel shown in fig. 1(b), the G channel shown in fig. 1(c) and the B channel shown in fig. 1(d). It can be seen that the G channel has higher discrimination between vessels and background, while the R and B channels are noisier and have lower contrast. Although more noise exists in the R channel and the B channel, they still carry corresponding blood vessel feature information. Therefore, in order to keep the loss of feature information as low as possible while making the discrimination between vessels and background higher, the R channel and the B channel are fused proportionally into the G channel to convert the image into a grayscale image. The conversion formula is as follows:
Igray=0.299×r+0.587×g+0.114×b (1)
(1) in the formula: r, G, B represent the values of the R channel, G channel and B channel, respectively; according to the expression (1), in the converted grayscale image shown in fig. 1(e), the R channel, the G channel, and the B channel of the original image account for 29.9%, 58.7%, and 11.4%, respectively, which not only retains part of the feature information of the R channel and the B channel, but also maximally utilizes the information of the G channel.
The second preprocessing strategy is data normalization. Let X = {x1, x2, ..., xn} be the image data set; the data set X is first dimension-normalized using the Z-score normalization method. The Z-score normalization formula is as follows:
Xnorm=(X-μ)/σ (2)
(2) where μ is the mean of x and σ is the standard deviation of x. At this time, positive and negative values exist in x, and the mean value of x is 0 and the standard deviation is 1. Next, each piece of image data in the data set X is subjected to minimum-maximum normalization, and the value of X is remapped to the range of 0-255. The min-max normalization formula is as follows:
xi' = 255 × (xi - min(Xnorm)) / (max(Xnorm) - min(Xnorm))   (3)
In formula (3), xi ∈ Xnorm, i ∈ [1, 2, ..., n]. The effect of data normalization is shown in fig. 1(f): normalization reduces the interference caused by uneven illumination in the fundus retina image, makes the picture robust to geometric transformations and reveals the invariants in the picture, which benefits the segmentation of retinal vessel details.
A third pre-processing strategy is to use CLAHE to enhance the vessel-to-background contrast of the entire data set. The CLAHE image is shown in fig. 1(g), which is based on fig. 1(f) and uses CLAHE to enhance the contrast of retinal blood vessels with the background, highlighting fine blood vessels. The last pre-processing strategy is a non-linear operation using Gamma non-linearity to handle luminance or tristimulus values in the image shown in fig. 1(g), to expand the difference between different classes of pixel values by balancing the color range due to uneven illumination. In the segmentation method of the present invention, the gamma value is set to 1.2, and the processed image is visualized as shown in fig. 1 (h). The present invention uses OpenCV to implement CLAHE and gamma non-linearization strategies.
Aiming at different problems of different data set acquisition equipment, shooting angles, illumination and the like, the four steps of gray processing, normalization, contrast-limited adaptive histogram equalization and gamma nonlinearity are performed on the retinal image, so that the discrimination between the blood vessel and the background is enhanced, and the influence caused by other interference factors in the acquisition process is reduced. The effect of each step of preprocessing is verified through a comparison experiment, so that the four-step preprocessing strategy has positive and effective influence, and the extraction and identification capability of the model on the image characteristics can be improved.
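As an illustration of this four-step strategy, a minimal sketch in Python is given below. It relies on OpenCV and NumPy; the CLAHE clip limit and tile size, the per-image (rather than whole-data-set) normalization, and the direction of the gamma exponent are assumptions made for illustration and are not taken from the original filing.

import cv2
import numpy as np

def preprocess_fundus(bgr_image, gamma=1.2):
    # Step 1: weighted fusion of the R, G and B channels into a gray image
    b, g, r = cv2.split(bgr_image.astype(np.float32))
    gray = 0.299 * r + 0.587 * g + 0.114 * b

    # Step 2: Z-score normalization (per image here for simplicity; the filing
    # normalizes over the whole data set), then min-max remapping to 0-255
    z = (gray - gray.mean()) / (gray.std() + 1e-8)
    norm = (255.0 * (z - z.min()) / (z.max() - z.min() + 1e-8)).astype(np.uint8)

    # Step 3: contrast-limited adaptive histogram equalization (CLAHE);
    # clipLimit and tileGridSize are assumed values
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(norm)

    # Step 4: gamma correction with gamma = 1.2 through a lookup table
    # (the direction of the exponent is a convention assumption)
    table = np.array([(i / 255.0) ** (1.0 / gamma) * 255 for i in range(256)],
                     dtype=np.uint8)
    return cv2.LUT(enhanced, table)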
3) New segmentation model for the fundus optic disc and optic cup
The segmentation of the optic disc and cup is a very time-consuming task that is currently performed only by professionals. Using a computer to segment the disc and cup automatically is attractive, because a computer segments more objectively and faster than a human. The segmentation method of the invention realizes simultaneous segmentation of the optic disc and the optic cup in the fundus image with an end-to-end multi-label deep learning model; its network structure is shown in figure 2 and mainly comprises two networks, a generator (C) and a discriminator (D). C is a fully convolutional segmentation network composed mainly of an encoder and a decoder. The training data samples of C are S = {(xi, yi), i = 1, ..., N}, where xi is a fundus image and yi is the corresponding optic disc and optic cup label map, with xi ∈ R^(H×W×C1) and yi ∈ R^(H×W×C2); H is the picture length, W the picture width, C1 the number of channels of xi and C2 the number of channels of yi. The encoder takes the fundus image xi as input. To improve the feature extraction capability, the segmentation method of the invention reduces the down-sampling factor so as to sample features densely, and introduces dilated convolution to enlarge the receptive field; cascading dilated convolutions with different dilation rates avoids the 'gridding' problem of dilated convolution. The encoder finally outputs a feature map u. The decoder module takes the feature map u as input and, using scale reconstruction instead of ordinary up-sampling, converts u to the same size as yi, outputting a feature map of the same size after scale reconstruction. Finally, C outputs the segmentation result C(xi). D performs the 'real/fake' (1/0) judgment: according to its output on the two kinds of inputs, yi and C(xi), it guides the training of C. During training, the input sample images of D are divided into two classes, {(xi, yi)} and {(xi, C(xi))}, so the output is also divided into two classes: when (x, y) is input, the output D(x, y) is 1; when (x, C(x)) is input, the output D(x, C(x)) is 0. The final objective function of the model is:
min_θC max_θD E_(x,y)[log D(x, y)] + E_x[log(1 - D(x, C(x)))]   (4)
In formula (4), θC denotes the parameters of the generator C; θD denotes the parameters of the discriminator D; E[·] denotes the expected value.
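A condensed PyTorch sketch of this adversarial training scheme is shown below. The generator C and discriminator D are assumed to be modules defined elsewhere, with D taking the image and the mask and ending in a sigmoid that outputs one probability per sample; the binary cross-entropy form and the non-saturating generator loss follow common conditional-GAN practice and are assumptions of this sketch, not details taken from the filing.

import torch
import torch.nn as nn

def train_step(C, D, opt_C, opt_D, x, y):
    """One adversarial update for objective (4): D learns to output 1 on (x, y)
    and 0 on (x, C(x)); C is then pushed to make D output 1 on (x, C(x))."""
    bce = nn.BCELoss()
    real = torch.ones(x.size(0), 1, device=x.device)
    fake = torch.zeros(x.size(0), 1, device=x.device)

    # Discriminator update: D(x, y) -> 1, D(x, C(x)) -> 0
    with torch.no_grad():
        generated = C(x)
    opt_D.zero_grad()
    loss_D = bce(D(x, y), real) + bce(D(x, generated), fake)
    loss_D.backward()
    opt_D.step()

    # Generator update (non-saturating form of the second term of objective (4))
    opt_C.zero_grad()
    loss_C = bce(D(x, C(x)), real)
    loss_C.backward()
    opt_C.step()
    return loss_C.item(), loss_D.item()

In practice a supervised segmentation loss between C(x) and y is often added to the generator objective; objective (4) above covers only the adversarial term.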
Cascaded dilated convolution: a larger receptive field improves the model's understanding of the global context of the fundus image, so the difference between the optic disc/cup region and the lesion region can be distinguished more accurately. Typically, down-sampling with pooling layers or strided convolutional layers in a standard FCN is used to expand the model's receptive field. However, excessive down-sampling causes loss of feature information, is not conducive to model learning, and increases the complexity of up-sampling.
To alleviate this problem, the model reduces the down-sampling operations while keeping the receptive field unchanged or larger. The segmentation method of the invention cascades convolutional layers with different dilation rates, as shown in fig. 3(a). Assuming N dilated convolutional layers are cascaded at a time, the convolution kernel size is k × k and the dilation rates are [r1, ..., ri, ..., rn]; after the N convolutional layers, all the feature information in the receptive field region can be completely covered.
The maximum allowed dilation rate of the i-th layer of the dilated convolution kernel is defined as:
Ri = max[ R(i+1) - 2ri, R(i+1) - 2(R(i+1) - ri), ri ]
with (R1 - 1) + (R2 - 1) < k, Ri > 0 and Ri ≠ R(i+1). Therefore the dilation rate satisfies ri ≤ Ri. In addition, the dilation rates ri within each group must not share a common factor (e.g. 2, 4, 8). Here ri denotes the dilation rate of the i-th layer of the dilated convolution kernel and k is the convolution kernel size.
In fig. 3(b), the dilation rates are set to (r1, r2, r3) = (1, 2, 5), so the equivalent sizes of the three convolution kernels are (k_r1, k_r2, k_r3) = (3, 5, 11). After this series of convolution operations, the feature value p4 in layer L4 has a receptive field Rb of 17, and the features contributing to p4 come from all the features within the Rb region, without the void portions present in fig. 3(a). The effective ratio of the feature values is 100%, which guarantees the relevance among the feature values. Thus, by setting r1, r2, r3 reasonably, the final convolution kernel can cover, and remain correlated with, all the feature values within the receptive field region, thereby effectively alleviating the 'gridding' problem. In fig. 3, Fa denotes the contribution region of the feature value under dilation rates (r1, r2, r3) = (2, 2, 4), and Fb denotes the contribution region of the feature value under dilation rates (r1, r2, r3) = (1, 2, 5).
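The numbers in this example can be checked with a short sketch: with 3 × 3 kernels, a dilation rate r gives an equivalent kernel size k + (k - 1)(r - 1), so the rates (1, 2, 5) give equivalent sizes (3, 5, 11) and a stacked receptive field of 1 + 2 + 4 + 10 = 17, matching Rb above. The PyTorch module below is a minimal sketch of such a cascade; the channel width and the choice of padding equal to the dilation rate (to preserve the spatial size) are assumptions, not values from the filing.

import torch.nn as nn

def equivalent_kernel(k, r):
    # Equivalent kernel size of a k x k convolution with dilation rate r
    return k + (k - 1) * (r - 1)

def stacked_receptive_field(k, rates):
    # Receptive field of stride-1 dilated convolutions applied in cascade
    rf = 1
    for r in rates:
        rf += equivalent_kernel(k, r) - 1
    return rf

class CascadedDilatedConv(nn.Module):
    # Cascade of 3x3 dilated convolutions with rates (1, 2, 5)
    def __init__(self, channels=64, rates=(1, 2, 5)):
        super().__init__()
        self.block = nn.Sequential(*[
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True))
            for r in rates])

    def forward(self, x):
        return self.block(x)

print([equivalent_kernel(3, r) for r in (1, 2, 5)])   # [3, 5, 11]
print(stacked_receptive_field(3, (1, 2, 5)))          # 17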
Scale reconstruction layer: to improve the performance of the model and reduce its parameters and computation, the segmentation method of the invention up-samples with a scale reconstruction layer, which periodically rearranges the information in the feature map and compresses its number of channels so that the height and width are enlarged, achieving the effect of up-sampling, as shown in fig. 4. Assuming the dimension of the input feature map is W × H × (d × d × C), where d is the down-sampling factor, the dimension of the feature map produced by the scale reconstruction layer is (W × d) × (H × d) × C. As an up-sampling method the scale reconstruction layer has the following advantages: 1. compared with deconvolution, it adds no extra parameters or computational overhead, which improves the speed of the model, and since it is learnable it can capture and recover the detail information lost during down-sampling; 2. like bilinear interpolation it introduces no extra parameters or computation, but unlike bilinear interpolation, which cannot learn and cannot accurately restore lost feature information, it retains the ability to learn. The scale reconstruction layer thus combines the advantages of both.
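The rearrangement described here, a W × H × (d × d × C) feature map turned into a (W × d) × (H × d) × C map, corresponds to the pixel-shuffle operation, so a minimal sketch can rely on torch.nn.PixelShuffle; pairing it with a preceding 1 × 1 convolution that produces the d × d × C channels (which is where the learnable part lives) is an assumption of this sketch.

import torch
import torch.nn as nn

class ScaleReconstruction(nn.Module):
    # Scale-reconstruction upsampling sketch: a 1x1 convolution produces
    # d*d*out_channels channels, which PixelShuffle rearranges into a map
    # that is d times larger in height and width.
    def __init__(self, in_channels, out_channels, d):
        super().__init__()
        self.expand = nn.Conv2d(in_channels, out_channels * d * d, kernel_size=1)
        self.shuffle = nn.PixelShuffle(d)

    def forward(self, x):
        return self.shuffle(self.expand(x))

u = torch.randn(1, 256, 32, 32)              # encoder feature map
up = ScaleReconstruction(256, 2, d=8)(u)     # -> torch.Size([1, 2, 256, 256])
print(up.shape)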
Accurate segmentation of the optic disc and cup is of great practical significance for assisting the physician in screening for glaucoma. The multi-label deep convolutional neural network model proposed by the segmentation method of the invention segments the optic disc and the optic cup in the fundus image. The idea of GAN is used: GAN is combined into the model, the generator is divided into an encoder and a decoder, and the down-sampling factor is reduced in the encoder to avoid the loss of feature information. Dilated convolution replaces ordinary convolution to expand the receptive field of the model, and cascading dilated convolutions with different dilation rates enlarges the receptive field while avoiding the gridding problem. In the decoder, a scale reconstruction layer is proposed to replace up-sampling methods such as deconvolution, which reduces parameters while keeping the ability to learn and improves the performance of the model. Finally, the network model was verified on the Drishti-GS1 data set, and the segmentation results are shown in fig. 6: fig. 6(a) is the fundus retina image before segmentation; fig. 6(b) shows the optic disc and optic cup areas manually delineated by an expert, the small circle representing the optic cup and the large circle the optic disc; fig. 6(c) is the result of the proposed multi-label deep learning model. Comparing fig. 6(b) and 6(c), the regions segmented by the model are very close to the disc and cup areas manually segmented by the expert, which indicates that the proposed method has very good segmentation performance.
4) Fundus retinal vessel segmentation
The fundus is the only part of the human body where blood vessels can be observed directly, and changes in the fundus, such as the width, angle and branching pattern of the vessels, provide a basis for early diagnosis of disease. Fundus blood vessel analysis is currently the main way of diagnosing fundus diseases, and vessel segmentation in fundus images is a necessary step for the quantitative analysis of such diseases. Inspired by the FCN, the invention designs a new structure for fundus retinal vessel segmentation, called the baseline fully convolutional neural network, shown in figure 5. The network comprises an encoding path and a decoding path with symmetrical structures, both composed of convolutional layers, residual modules, batch normalization layers, ReLU activation layers and the like. The encoder uses rich convolutional layers to encode the low-dimensional input image so as to extract contextual semantic information, reduce the influence of background noise and acquire the vessel features in the retinal image. The decoder performs the inverse of the encoding and recovers spatial information by up-sampling and fusing low-dimensional features so that the retinal vessels can be located accurately. The network mainly comprises three parts: first, a multi-scale input layer, which builds an image pyramid input to achieve multi-level reuse of image feature information; second, the multi-path FCN, used as the backbone structure to learn a rich hierarchical representation; and finally, a multi-output fusion module, which combines low-level and high-level features to make full use of feature information at different depths and achieve a better result through feature fusion. The three parts are described in detail below.
Multi-path FCN: the network consists of two encoder paths and two decoder paths. Each encoder path generates a set of encoder feature maps using residual modules and convolutional layers, normalizes each layer's feature maps with a batch normalization layer and then activates them with the ReLU activation function. The decoder paths decode the features produced by the encoder paths using deconvolution layers and residual modules, normalize each layer's feature maps with a batch normalization layer and then activate them with the ReLU activation function. Skip connections fuse the encoder feature maps with the decoder feature maps so as to reduce the loss of feature information and fuse the feature information (a code sketch of these building blocks, together with the multi-scale input and multi-output fusion layers, follows the description of the three parts below);
Multi-scale input layer: the multi-scale input is integrated into the first encoder path to ensure that features of the original image are passed on, which effectively improves segmentation quality. In the multi-scale input layer, the segmentation method down-samples the image with an average pooling layer, expands the channels of the down-sampled image with an ordinary convolutional layer, and then fuses the expanded down-sampled image with the outputs of different layers of that encoder path as the input of the next layer;
Multi-output fusion layer: in the final decoder path, the output features of residual modules at different depths are extracted and expanded by up-sampling to the same size as the input image, then fed into different convolutional layers for channel compression and classified with a Softmax function; the resulting probability maps are fused into a two-channel probability map in which channel 0 is the probability of being segmented as background and channel 1 is the probability of being segmented as retinal vessel. Finally, threshold segmentation is performed on the channel-0 feature map to obtain the final blood vessel segmentation map.
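A minimal PyTorch sketch of the pieces described in these three parts is given below: the residual module, an encoder and a decoder stage of the multi-path FCN, the multi-scale input layer and the multi-output fusion layer. The channel widths, kernel sizes, the strided-convolution down-sampling in the encoder stage, the single-channel (grayscale) input and the use of bilinear interpolation as the up-sampling method are assumptions of this sketch, since figure 5 and these details are not reproduced in this text.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # Conv-BN-ReLU-Conv-BN with an identity shortcut
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))

class EncoderStage(nn.Module):
    # Strided convolution for down-sampling followed by a residual block
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            ResidualBlock(out_ch))

    def forward(self, x):
        return self.block(x)

class DecoderStage(nn.Module):
    # Deconvolution up-sampling, fusion with the skip feature map, residual block
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, 2, stride=2)
        self.fuse = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            ResidualBlock(out_ch))

    def forward(self, x, skip):
        return self.fuse(torch.cat([self.up(x), skip], dim=1))

class MultiScaleInput(nn.Module):
    # Down-sample the grayscale input with average pooling and expand its
    # channels so it can be fused with the encoder output of the previous level
    def __init__(self, out_channels, scale):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=scale, stride=scale)
        self.expand = nn.Conv2d(1, out_channels, kernel_size=3, padding=1)

    def forward(self, image):
        return self.expand(self.pool(image))

class MultiOutputFusion(nn.Module):
    # Up-sample decoder features of several depths to the input size, compress
    # each to two channels, apply softmax, average the probability maps and
    # threshold (the vessel channel is used here for illustration)
    def __init__(self, channel_list, threshold=0.5):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Conv2d(c, 2, kernel_size=1) for c in channel_list])
        self.threshold = threshold

    def forward(self, features, out_size):
        probs = []
        for feat, head in zip(features, self.heads):
            up = F.interpolate(feat, size=out_size, mode='bilinear',
                               align_corners=False)
            probs.append(torch.softmax(head(up), dim=1))
        fused = torch.stack(probs).mean(dim=0)   # two-channel probability map
        vessels = (fused[:, 1] > self.threshold).float()
        return fused, vessels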
The three modules above improve the feature identification and reuse capability of the network; by constructing the multi-scale input layer, the multi-path FCN and the multi-output fusion module, a structure superior to the FCN is obtained. The model has a deeper network structure, which increases its capacity; skip connections over different distances fuse low-level and high-level features, which greatly reduces the difficulty of training; and the parameters of the convolution kernels adapt to the shape features of the vessels during training, which improves the network's ability to distinguish vessels from background and thus the segmentation accuracy of the model. Finally, the invention trains and tests the proposed framework on the DRIVE, STARE and CHASE data sets. The experimental results show that the proposed network structure achieves competitive results on all three data sets and that the improved model obtains better results than the FCN. Taking the DRIVE data set as an example, the segmentation results are shown in fig. 7: fig. 7(a) is the fundus retina image before segmentation; fig. 7(b) shows the retinal vessels manually segmented by an expert; fig. 7(c) is the result of the proposed baseline fully convolutional neural network model. Comparing fig. 7(b) and 7(c), the segmentation result of the model is very close to the retinal vessels manually segmented by the expert, and the segmentation of small vessels is very complete, which indicates that the proposed method has very good segmentation performance.
Analysis of processing results
All experiments of the invention were run on the same hardware platform and software environment, detailed in Table 1. The data preprocessing code runs on the CPU and is based on the open-source computer vision library OpenCV. The data preprocessing method, the optic disc and cup segmentation network and the retinal vessel segmentation network are implemented with the PyTorch framework, and network training is carried out on the GPU. The proposed method has low hardware and software requirements, and in practical use it can be trained and used without upgrading existing equipment, or with only a simple upgrade.
TABLE 1 Experimental hardware platform and software Environment
(Table 1 is reproduced only as an image in the original publication; its contents are not shown here.)
The setting of the hyper-parameters is important both for the experimental results of the trained model and for reproducing the model. The experimental hyper-parameter settings of the invention are detailed in Table 2. The network weights and biases are assigned with the default initialization method of the PyTorch convolutional layers, and the network is then trained end-to-end by back-propagation with the Adam optimizer.
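A minimal sketch of this setup is shown below; the model is a stand-in and the Adam hyper-parameters are placeholders, since Table 2 is reproduced only as an image in this copy.

import torch
import torch.nn as nn

# PyTorch's default layer initialization is applied simply by constructing the
# modules; the stand-in model below is only a placeholder for the real networks.
model = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 2, 1))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

images = torch.randn(2, 1, 64, 64)          # dummy fundus patches
labels = torch.randint(0, 2, (2, 64, 64))   # dummy vessel masks
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()                              # end-to-end back-propagation
optimizer.step()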
Table 2 experimental parameter settings
(Table 2 is reproduced only as an image in the original publication; its contents are not shown here.)
The present invention uses several commonly used evaluation indices to quantitatively evaluate the performance of the proposed method, including Accuracy, Sensitivity, Specificity and the F1 score. Through comprehensive evaluation, every index of the retinal vessel segmentation method obtains very competitive results on the DRIVE, STARE and CHASE data sets, with a segmentation accuracy of more than 97% and an F1 score above 80%. The optic disc and cup segmentation method achieves good segmentation on the Drishti-GS1 data set, with an F1 score of 97% on the optic disc and 92% on the optic cup.
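These indices can be computed from a binary segmentation and its ground-truth mask as in the sketch below (a plain confusion-matrix implementation; the exact averaging over images used in the experiments is not specified here).

import numpy as np

def segmentation_metrics(pred, truth):
    # Accuracy, sensitivity, specificity and F1 score from binary masks
    pred = pred.astype(bool).ravel()
    truth = truth.astype(bool).ravel()
    tp = np.sum(pred & truth)
    tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn + 1e-8)
    specificity = tn / (tn + fp + 1e-8)
    precision = tp / (tp + fp + 1e-8)
    f1 = 2 * precision * sensitivity / (precision + sensitivity + 1e-8)
    return accuracy, sensitivity, specificity, f1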
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A fundus retina image preprocessing method, comprising the steps of:
A. acquiring a retinal image dataset;
B. preprocessing: sequentially performing grayscale processing, normalization, contrast-limited adaptive histogram equalization and gamma non-linearization on the retinal image to obtain a preprocessed image.
2. The fundus retina image preprocessing method according to claim 1, wherein the grayscale processing comprises the steps of:
decomposing the retina image into a red R channel, a green G channel and a blue B channel by adopting three RGB components, and fusing the images of the three channels according to a proportion to convert the images into a gray image;
the conversion formula is as follows:
Igray=0.299×r+0.587×g+0.114×b
in the formula: r, G, B represent the values of the R channel, G channel and B channel, respectively;
the normalization comprises the following steps:
let X = {x1, x2, ..., xn} be the image data set; carrying out dimension normalization on the data set X by using the Z-score normalization method; the Z-score normalization formula is as follows:
Xnorm=(X-μ)/σ
where μ is the mean of x and σ is the standard deviation of x; at the moment, positive and negative values exist in x, and the mean value of x is 0 and the standard deviation is 1;
then carrying out minimum-maximum normalization on each image data in the data set X, and remapping the value of X into the range of 0-255; the min-max normalization formula is as follows:
xi' = 255 × (xi - min(Xnorm)) / (max(Xnorm) - min(Xnorm))
in the formula, xi ∈ Xnorm, i ∈ [1, 2, ..., n]
The contrast-limited adaptive histogram equalization adopts a CLAHE algorithm to enhance the contrast between blood vessels and the background of the whole data set, and then gamma nonlinearity is used for carrying out nonlinear operation on the brightness or tristimulus values in the image to obtain a preprocessed image, wherein the gamma value is 1.2.
3. The optic disk and optic cup image segmentation method based on the deep convolutional neural network is characterized by comprising the following steps of:
a1, performing image preprocessing on the optic disc and the optic cup by adopting the image preprocessing method of any one of claims 1-2;
b1, generator: performing cascaded convolution on the preprocessed image with convolutional layers of different dilation rates and outputting a feature map, periodically rearranging the information in the feature map through scale reconstruction of the feature map, and generating a segmented optic disc and optic cup image of the same size as the target;
c1, discriminator: constructing a discriminator having the same convolution unit form as the generator, forming a generative adversarial network model from the discriminator and the generator, automatically adjusting the tuning parameters and outputting the final image of the generator.
4. The method of claim 3, wherein the generator comprises an encoder and a decoder, and the training data samples of the generator are S = {(xi, yi), i = 1, ..., N}, where xi is a fundus image and yi is the corresponding optic disc and optic cup label map, xi ∈ R^(H×W×C1) and yi ∈ R^(H×W×C2); H is the picture length, W is the picture width, C1 is the number of channels of xi and C2 is the number of channels of yi; the encoder performs down-sampling with cascaded dilated convolution and obtains an output feature map u;
the decoder performs up-sampling with scale reconstruction, converting the feature map u to the same size as yi and obtaining a feature map of the same size; the generator outputs the segmentation result C(xi);
the discriminator discriminates between the training data sample image set {(xi, yi)} and the image set {(xi, C(xi))} generated by the generator; if the discriminator classifies the training data sample image set and the generated image set into one class, the tuning parameters of the discriminator are adjusted; if the discriminator divides them into two classes, the tuning parameters of the generator are adjusted and the generator regenerates the segmentation image; the objective for the final image of the generator is:
min_θC max_θD E_(x,y)[log D(x, y)] + E_x[log(1 - D(x, C(x)))]
in the formula, θC denotes the parameters of the generator C; θD denotes the parameters of the discriminator D; E[·] denotes the expected value.
5. The method of claim 3, wherein the cascaded convolution cascades N dilated convolutional layers at a time, the convolution kernel size is k × k, and the dilation rates are [r1, ..., ri, ..., rn]; after the N convolutional layers, all the feature information in the receptive field region can be completely covered;
the maximum allowed dilation rate of the i-th layer of the dilated convolution kernel is defined as Ri, with the dilation rate ri ≤ Ri, and Ri is calculated as:
Ri = max[ R(i+1) - 2ri, R(i+1) - 2(R(i+1) - ri), ri ]
with (R1 - 1) + (R2 - 1) < k, Ri > 0 and Ri ≠ R(i+1); the dilation rates ri within each group must not share a common factor; ri denotes the dilation rate of the i-th layer of the dilated convolution kernel, and k is the convolution kernel size.
6. The retinal vessel image segmentation method based on the deep convolutional neural network is characterized by comprising the following steps of:
1. image preprocessing is carried out on retinal blood vessels by adopting the image preprocessing method of any one of claims 1-2;
2. the preprocessed image is down-sampled by an average pooling layer to form down-sampled images of different levels, then an ordinary convolutional layer is used to expand the channels of the down-sampled image, and the channel-expanded down-sampled image of the level and the output image of the previous-level multi-path FCN (fully convolutional network) model encoder are fused into the input image of the level and input into the multi-path FCN model encoder;
3. the multipath FCN model encoder encodes an input image, obtains a blood vessel characteristic image in a retina image, forms blood vessel characteristic images of different levels, fuses the blood vessel characteristic image of the level and an output image of a next level multipath FCN model decoder into the input image of the level, inputs the input image into the multipath FCN model decoder, and the multipath FCN model decoder is used for inverse operation of the encoder and decodes the input image to form output characteristic images of different depths;
4. and expanding the features of the output feature images with different depths to the same size as the input image by using an up-sampling method, then respectively inputting the feature images into different convolutional layers for channel compression, then classifying the feature images by using a Softmax function, fusing the obtained multiple probability maps into a double-channel probability map, and then performing threshold segmentation to obtain a final blood vessel segmentation map.
7. The retinal vessel image segmentation method based on the deep convolutional neural network of claim 6, wherein the multi-path FCN model is composed of two encoder paths and two decoder paths, the encoder path and the decoder path each include a convolutional layer, a residual module, a batch normalization layer and a ReLU activation layer, the encoder path generates a set of encoder feature maps using the residual module and the convolutional layer and normalizes each layer of feature maps using the batch normalization layer and then activates them using a ReLU activation function, and the decoder path decodes the images generated by the encoder using the deconvolution layer and the residual module and normalizes each layer of feature maps using the batch normalization layer and then activates them using the ReLU activation function.
CN202010795314.0A 2020-08-10 2020-08-10 Fundus retina image segmentation method based on deep convolutional neural network Pending CN113763292A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010795314.0A CN113763292A (en) 2020-08-10 2020-08-10 Fundus retina image segmentation method based on deep convolutional neural network


Publications (1)

Publication Number Publication Date
CN113763292A true CN113763292A (en) 2021-12-07

Family

ID=78785675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010795314.0A Pending CN113763292A (en) 2020-08-10 2020-08-10 Fundus retina image segmentation method based on deep convolutional neural network

Country Status (1)

Country Link
CN (1) CN113763292A (en)


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YUN JIANG et al.: "Automatic Retinal Blood Vessel Segmentation Based on Fully Convolutional Neural Networks", Symmetry, vol. 11, pages 1-22 *
YUN JIANG et al.: "Optic Disc and Cup Segmentation Based on Deep Convolutional Generative Adversarial Networks", IEEE Access, vol. 7, pages 64483-64493, XP011726080, DOI: 10.1109/ACCESS.2019.2917508 *
YUN JIANG et al.: "Retinal Vessels Segmentation Based on Dilated Multi-Scale Convolutional Neural Network", IEEE Access, vol. 7, pages 76342-76352, XP011731842, DOI: 10.1109/ACCESS.2019.2922365 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11540798B2 (en) 2019-08-30 2023-01-03 The Research Foundation For The State University Of New York Dilated convolutional neural network system and method for positron emission tomography (PET) image denoising
CN115153478A (en) * 2022-08-05 2022-10-11 上海跃扬医疗科技有限公司 Heart rate monitoring method and system, storage medium and terminal
CN117274278A (en) * 2023-09-28 2023-12-22 武汉大学人民医院(湖北省人民医院) Retina image focus part segmentation method and system based on simulated receptive field
CN117274278B (en) * 2023-09-28 2024-04-02 武汉大学人民医院(湖北省人民医院) Retina image focus part segmentation method and system based on simulated receptive field
CN117689669A (en) * 2023-11-17 2024-03-12 重庆邮电大学 Retina blood vessel segmentation method based on structure self-adaptive context sensitivity
CN117689669B (en) * 2023-11-17 2024-08-27 重庆邮电大学 Retina blood vessel segmentation method based on structure self-adaptive context sensitivity


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination