CN110807762B - Intelligent retinal blood vessel image segmentation method based on GAN - Google Patents

Intelligent retinal blood vessel image segmentation method based on GAN

Info

Publication number
CN110807762B
Authority
CN
China
Prior art keywords
network
image
retina
training
generator
Prior art date
Legal status
Active
Application number
CN201910884346.5A
Other languages
Chinese (zh)
Other versions
CN110807762A (en)
Inventor
赵汉理
卢望龙
邱夏青
黄辉
Current Assignee
Wenzhou University
Original Assignee
Wenzhou University
Priority date
Filing date
Publication date
Application filed by Wenzhou University filed Critical Wenzhou University
Priority to CN201910884346.5A priority Critical patent/CN110807762B/en
Publication of CN110807762A publication Critical patent/CN110807762A/en
Application granted granted Critical
Publication of CN110807762B publication Critical patent/CN110807762B/en

Classifications

    • G06T7/0012 Biomedical image inspection (under G06T7/00 Image analysis)
    • G06N3/045 Combinations of networks (under G06N3/04 Neural network architecture)
    • G06N3/084 Backpropagation, e.g. using gradient descent (under G06N3/08 Learning methods)
    • G06T7/11 Region-based segmentation (under G06T7/10 Segmentation; Edge detection)
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30041 Eye; Retina; Ophthalmic
    • G06T2207/30101 Blood vessel; Artery; Vein; Vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a GAN-based retinal vessel image intelligent segmentation method, which comprises the following steps: 1. a retinal image set is given and divided into a training set and a test set; 2. a generator network G and a discriminator network D are designed, and an Adam optimizer is constructed; 3. the training set is input to G; 4. G generates a blood vessel segmentation image; 5. D discriminates the segmentation image generated by G and the loss is computed; 6. the parameters of G and D are updated; 7. G is evaluated to obtain the optimal model G', and steps 3-7 are repeated until the iterations finish; 8. retinal images are input to G' to generate blood vessel segmentation images. The invention uses a large-receptive-field network model to intelligently segment retinal images and obtain the final retinal vessel segmentation image. The network model of the invention is robust, the obtained blood vessel segmentation images contain less noise, and the method is generally superior to existing retinal blood vessel image segmentation methods.

Description

Intelligent retinal blood vessel image segmentation method based on GAN
Technical Field
The invention belongs to the field of intelligent segmentation of retinal vessel images, and particularly relates to a GAN-based intelligent segmentation method for retinal vessel images, which addresses the low accuracy of conventional retinal vessel image segmentation methods.
Background
In clinical medicine, doctors often analyze ophthalmic and systemic diseases, such as diabetes, glaucoma, hypertension, and cardiovascular and cerebrovascular diseases, by observing the morphological characteristics of retinal images. These diseases generally affect retinal morphology, for example the number, branching, width, and angles of retinal blood vessels. Segmentation of color retinal images has therefore become an important precondition for diagnosis by ophthalmologists, but manual segmentation is time-consuming and labor-intensive, and the result depends on the operator's experience and technique; it is highly subjective and poorly reproducible. Intelligent and accurate segmentation of retinal vessel images therefore has important medical research significance and application value. With the continuous development of computer-aided diagnosis systems in medicine, intelligent retinal segmentation has become a current research hotspot.
Disclosure of Invention
The invention provides an intelligent retinal vessel image segmentation method based on a GAN (generative adversarial network), aiming at the strong subjectivity and low efficiency of manual retinal vessel image segmentation and the low segmentation accuracy of existing supervised retinal vessel segmentation methods.
In order to solve the technical problems in the prior art, the technical scheme of the invention is as follows: a retinal vessel image intelligent segmentation method based on GAN comprises the following steps:
In step S1, a retinal image sample set is given, containing sample pairs of a retinal image and a reference blood vessel segmentation image, defined here as (a, b). Define the retinal image corpus C = {(a_i, b_i) | i ∈ [1, R]}, where R denotes the total number of samples, i denotes the sample index, a denotes the retinal image, and b denotes the reference blood vessel segmentation image. The retinal image sample set is copied and divided into a retinal image training set E = {(a_i, b_i) | i ∈ [1, M]} and a retinal image test set F = {(a_i, b_i) | i ∈ [1, N]}, where N + M = R, and M and N denote the corresponding numbers of samples.
Step S2, a GAN-based retina intelligent segmentation network model is constructed, comprising a generator network G and a discriminator network D, and an Adam optimizer is constructed to help the network training converge quickly:
the overall architecture of the generator network G includes two parts, a contracting path (contracting path) and an expanding path (expanding path). In order to utilize the characteristic diagram information in the network training process to a greater extent, the characteristic diagram extracted and processed on the network contraction path is spliced with the characteristic diagram in the expansion path with the same size in the process of upsampling. In addition, a cavity convolution structure is introduced into the bottom layer of network downsampling, and the structure can increase the receptive field of the generated network, so that the network can better grasp the global characteristics of the retinal vessel image, and accurate segmentation of the retinal vessel image is realized. The generator network G performs 4 downsampling, 4 upsampling, and 3 feature concatenating operations in total, and the selected feature map is the feature map after downsampling, so that the generator network performs only 3 feature map concatenating operations although downsampling is performed 4 times. The down-sampling operation used in the generator network G is performed using a convolution operation with a convolution kernel size of 3x3 steps of 2.
The discriminator network D is a deep convolutional neural network whose main role is to judge whether an input blood vessel segmentation image is a reference segmentation image or one generated by the generator network G. A residual block (ResBlock) structure is also used in the discriminator; it increases the number of network layers while preventing overfitting and easing training, so the network captures image features better and converges faster. In the discriminator network D, the convolution kernel size is 3×3, and downsampling is performed with a max-pooling (MaxPooling) operation of stride 2, highlighting the main features in the feature map.
It is noted that the generative adversarial network is composed of the generator network G and the discriminator network D. The main process is that the generator network G continuously fits the distribution of the retina training set E: a retina training sample pair (a_i, b_i) is input, and the generator network G generates a blood vessel segmentation image z_i, yielding a generated retina sample pair (a_i, z_i). The discriminator network D simultaneously discriminates the retina training set E sample pairs (a_i, b_i) and the generated retina sample pairs (a_i, z_i), where i = 1, 2, 3, …, M, in each case giving a discrimination confidence q in [0, 1] that represents the probability that the sample pair is a retina training set E sample pair. The loss function computes the loss value between the generated vessel segmentation sample z_i and the reference blood vessel segmentation sample b_i, in preparation for the subsequent back propagation.
An Adam optimizer is constructed to assist network training, with an initial learning rate of 0.0002, β1 = 0.5, and β2 = 0.999; the learning rate is adjusted adaptively during training so that the network converges quickly.
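As an illustration, this optimizer construction maps directly onto the PyTorch API; the following minimal sketch uses trivial stand-in modules for G and D (the real architectures are described below) together with the stated hyper-parameters:

import torch
from torch import nn

# Stand-ins for the real networks; only the parameter collections matter here.
G = nn.Conv2d(3, 1, 3, padding=1)   # placeholder for the generator network G
D = nn.Conv2d(4, 1, 3, padding=1)   # placeholder for the discriminator network D

# Adam with the stated settings: initial lr = 0.0002, beta1 = 0.5, beta2 = 0.999.
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))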
Step S3, the retina training set E is loaded into computer memory as input, and the retina training set E = {(a_i, b_i) | i ∈ [1, M]} is randomly shuffled to prepare for the next training stage.
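A hedged sketch of this loading and shuffling step, assuming the sample pairs have already been converted to tensors (the toy shapes here are assumptions for illustration only):

import torch
from torch.utils.data import DataLoader, TensorDataset

M = 20                                        # number of training pairs in E
A = torch.rand(M, 3, 64, 64)                  # retinal images a_i
B = (torch.rand(M, 1, 64, 64) > 0.5).float()  # reference vessel maps b_i
E_loader = DataLoader(TensorDataset(A, B), batch_size=1, shuffle=True)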
Steps S4, S5, and S6 form the main training phase of the generative adversarial network. The game between the discriminator network D and the generator network G can be viewed as a minimax problem; by learning the features mapping retinal images to blood vessel segmentation images, the two networks come to understand the underlying image-to-image mapping. The objective function is shown in equation (1):
L_cGAN(G, D) = E_{(a,b)}[log D(a, b)] + E_{a}[log(1 − D(a, G(a)))]   (1)
At each iteration, an image pair (a_i, b_i) is drawn from the retina training set E, where a_i denotes a retinal image and b_i a reference blood vessel segmentation image, i = 1, 2, 3, …, M.
In step S4, the generator takes a_i as input and generates the corresponding blood vessel segmentation image G(a_i), i.e., z_i. The generator network G tries to minimize the objective function L_cGAN(G, D); to make the final objective as small as possible, the generated vessel segmentation image z_i should resemble the reference retinal segmentation image b_i as closely as possible in image style and vascular structure.
In step S5, the discriminator network D attempts to distinguish the distribution of the retina training set E from the distribution of the retina training synthetic set E′ so as to maximize the objective. The discriminator network D simultaneously discriminates the sample pairs (a_i, b_i) of the retina training set E and the sample pairs (a_i, z_i) of the retina training synthetic set E′, where i = 1, 2, 3, …, M, in each case giving a discrimination confidence q in [0, 1] that represents the probability that the sample pair is a retina training set E sample pair (a_i, b_i).
Finally, the game between the discriminator network D and the generator network G reaches a "Nash equilibrium": the discriminator network D can no longer judge whether an input image sample pair is a sample pair (a_i, b_i) of the retina training set E or a sample pair (a_i, z_i) of the retina training synthetic set E′; for the sample pairs (a_i, z_i) of the synthetic set E′, D outputs a confidence q of 0.5 for each. At this point the distribution of the segmented images generated by the generator network G fits the distribution of the reference blood vessel segmentation images, so the accurate mapping from retinal image to blood vessel segmentation image has been learned, and the generated segmentation images are the desired target images. The game can be regarded as a minimax process, which can be expressed as:
G* = arg min_G max_D L_cGAN(G, D)   (2)
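A minimal PyTorch sketch of one round of this minimax game, under the assumption that D(a, y) accepts the retinal image together with a segmentation map and returns a confidence in [0, 1] (the call signature is an assumption, not stated verbatim in the patent):

import torch
from torch import nn

bce = nn.BCELoss()

def adversarial_step(G, D, opt_G, opt_D, a, b):
    z = G(a)                                   # generated segmentation z_i
    # Discriminator ascent on Eq. (1): push D(a,b) toward 1, D(a,z) toward 0.
    opt_D.zero_grad()
    q_real, q_fake = D(a, b), D(a, z.detach())
    loss_D = bce(q_real, torch.ones_like(q_real)) + \
             bce(q_fake, torch.zeros_like(q_fake))
    loss_D.backward()
    opt_D.step()
    # Generator descent: make D label the generated pair as real.
    opt_G.zero_grad()
    q_fake = D(a, z)
    loss_G = bce(q_fake, torch.ones_like(q_fake))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()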
Since segmenting a retinal image into a blood vessel segmentation image is essentially a "black or white" classification prediction for each pixel, i.e., a pixel-to-pixel classification task, the invention additionally uses a binary cross-entropy loss in the generator network G to penalize the distance between the generated blood vessel segmentation image and the reference blood vessel segmentation image, so that the generated image better approximates the reference. The binary cross-entropy loss function is defined as follows:
L_BCE(G) = E_{(a,b)}[ −b · log G(a) − (1 − b) · log(1 − G(a)) ]   (3)
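In PyTorch this pixel-wise penalty is the standard binary cross-entropy; a small sketch of the combined generator loss follows, where the weighting factor lam between the adversarial term and Eq. (3) is an assumed hyper-parameter (the patent states no coefficient). In a full training step this combined loss would replace the purely adversarial generator loss in the sketch above:

import torch
from torch.nn import functional as F

def generator_loss(q_fake, z, b, lam=1.0):
    # Adversarial term: the generator wants D's confidence q_fake near 1.
    adv = F.binary_cross_entropy(q_fake, torch.ones_like(q_fake))
    # Eq. (3): pixel-wise BCE between generated map z = G(a) and reference b.
    pix = F.binary_cross_entropy(z, b)
    return adv + lam * pix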
where a is the retinal image, b is the reference vessel segmentation map, and G is the generator network. In step S6, based on the loss functions given by equations (1) and (3), the total loss value at the current iteration can be calculated. To minimize this loss, the gradient of every parameter at each step is obtained from the computation graph, and gradient updates drive the objective toward a minimum, achieving the fitting goal. The corresponding parameter update formula is:
θ_t ← θ_t − η · ∇_{θ_t} L   (4)

where θ_t denotes the t-th parameter of the generator network G or the discriminator network D, η denotes the learning-rate hyper-parameter, and ∇_{θ_t} L denotes the gradient of the loss with respect to the corresponding parameter.
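Written out by hand, the update of Eq. (4) is what a plain SGD step performs after backward(); a sketch for illustration (in practice the Adam optimizer above applies a per-parameter adaptive variant of the same rule):

import torch

def sgd_update(parameters, eta=2e-4):
    with torch.no_grad():
        for theta in parameters:           # theta_t, one tensor per component
            if theta.grad is not None:
                theta -= eta * theta.grad  # theta_t <- theta_t - eta * grad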
In step S7, the generator is evaluated using the retinal image test samples, and the optimal model parameters are retained, as follows: at the end of the training phase, the retina test set F is input into the generator network G, which produces a retina test synthetic set F′. The reference vessel segmentation map b_i in the retina test set F sample pairs (a_i, b_i) and the generated segmentation map z_i in the generated retina sample pairs (a_i, z_i) are compared pixel by pixel, where i = 1, 2, 3, …, N, and each pixel is classified as a vessel point or a non-vessel point. To test the performance of the current generator network G, objective quantitative analysis via performance indexes is needed. Accuracy (Acc), Specificity (Sp), Sensitivity (Se), the Dice coefficient, F-measure, the area under the receiver operating characteristic (ROC) curve (AUC), and the area under the precision-recall (PR) curve (mAP) are adopted to measure the effectiveness of the model.
AUC is widely used for performance measurement in medical image processing; the closer its value is to 1, the better the segmentation.
Acc = (TP + TN) / (TP + TN + FP + FN)

Se = TP / (TP + FN)

Sp = TN / (TN + FP)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F-measure = 2 · Precision · Recall / (Precision + Recall)
Here TP (true positive) denotes the number of correctly segmented vessel pixels; TN (true negative) denotes the number of correctly segmented non-vessel (background) pixels; FP (false positive) denotes the number of non-vessel pixels wrongly segmented as vessels; FN (false negative) denotes the number of vessel pixels wrongly segmented as non-vessels. TP + FN + FP + TN is the total number of pixels in the region of interest in the image.
Since the above evaluation indexes depend on thresholding the output, the ROC curve is obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity) as the threshold varies, and AUC is the area under the ROC curve. All evaluation indexes are computed over the pixels inside a mask that marks the retinal region (the field of view). After screening these indexes, the model with the largest Acc, Se, Sp, Precision, Recall, and F-measure is selected as the optimal model. Finally, at the end of the parameter-update stage, it is checked whether the number of training iterations has reached the maximum; if so, the training phase ends, yielding the optimal generator network G' and discriminator network D', and the method proceeds to the next step. Otherwise, the method returns to step S3 and continues the training loop.
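A hedged sketch of computing these indexes inside the mask, assuming pred, ref, and mask are equally shaped binary NumPy arrays (the names are placeholders):

import numpy as np

def vessel_metrics(pred, ref, mask, eps=1e-8):
    p = pred[mask > 0].astype(bool)        # predicted vessel pixels in the mask
    r = ref[mask > 0].astype(bool)         # reference vessel pixels in the mask
    tp = np.sum(p & r);  tn = np.sum(~p & ~r)
    fp = np.sum(p & ~r); fn = np.sum(~p & r)
    acc = (tp + tn) / (tp + tn + fp + fn + eps)
    se = tp / (tp + fn + eps)              # sensitivity = recall
    sp = tn / (tn + fp + eps)              # specificity
    precision = tp / (tp + fp + eps)
    f_measure = 2 * precision * se / (precision + se + eps)
    return dict(Acc=acc, Se=se, Sp=sp, Precision=precision,
                Recall=se, F=f_measure)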
The training and test images input during training and testing are whole images. The invention uses the Adam optimization method to optimize against the loss function, yielding the final parameters of the generative adversarial network model; once saved, these parameters can be reused in subsequent retinal segmentation tasks.
In step S8: given a retinal image sample set F1 = {a_i | i ∈ [1, R_F]}, each retinal image a_i is taken as input to the optimal generator network G', which outputs the corresponding vessel segmentation image z_i, i = 1, 2, 3, …, R_F, where R_F denotes the number of samples in the retinal image sample set; the final segmentation images have good accuracy.
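A minimal inference sketch for this step, assuming G_prime is the saved optimal generator and images is a tensor of R_F retinal images (both names are placeholders):

import torch

@torch.no_grad()
def segment_all(G_prime, images):          # images: (R_F, 3, h, w) in [0, 1]
    G_prime.eval()                         # disable BatchNorm training behavior
    z = torch.stack([G_prime(a.unsqueeze(0)).squeeze(0) for a in images])
    return z                               # (R_F, 1, h, w) vessel maps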
The invention provides an intelligent retinal vessel image segmentation method based on a generative adversarial network, with the following beneficial effects:
the network is mainly characterized by being based on a countermeasure training mechanism, and has a larger receptive field, so that the global information of the image can be captured well. Compared with other segmentation networks, the network has deeper network layers and can better capture and utilize abstract features of images. The method achieves the advanced effects in the aspects of accuracy, sensitivity and specificity. In addition, good segmentation effect can be achieved in a blood vessel region and a lesion region with low contrast, the method achieves high precision and good robustness of retinal blood vessel segmentation, and has good value and prospect in practical application.
Drawings
FIG. 1 is a general flow diagram of the present invention;
FIG. 2 is an overall framework of the invention;
FIG. 3 is a generator architecture diagram of the present invention;
FIG. 4 is a block diagram of a Residual block used in the present invention;
FIG. 5 is a block diagram of the dilated residual block of the present invention;
FIG. 6 is a discriminator architecture diagram of the present invention;
FIG. 7 is a diagram of the final segmentation effect of the present invention.
Detailed Description
For completeness and clarity, the technical solutions in the embodiments of the present invention are further described below with reference to the accompanying drawings. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
As shown in fig. 1, the present invention provides a technical solution: a retinal vessel image intelligent segmentation method based on GAN comprises the following steps:
step S1: given a sample set of retinal images, a sample pair comprising a retinal image and a reference vessel segmentation image, defined herein as (a, b); defining a retina image corpus C { (a)i,bi)|i∈[1,R]R denotes the total number of samples, i denotes a sample index, a denotes a retina image, and b denotes a reference blood vessel segmentation image. Copying and dividing a retina image sample set into a retina image training set E { (a)i,bi)|i∈[1,M]And a retinal image test set F { (a) }i,bi)|i∈[1,N]Where N + M ═ R, and M and N respectively denote the corresponding number of samples.
In the invention, the main steps comprise data division and adversarial-loss training with the constructed generator network G and discriminator network D; finally the generator network G and the discriminator network D reach the optimal equilibrium point, and accurate segmentation of input retinal images with the generator network G is achieved. The relationship between the overall network structure and the loss functions is shown in FIG. 2.
Step S2: the generator network G is obtained by combining feature down/upsampling, the residual block structure, and dilated convolution operations; the discriminator network D is designed with an improved residual structure, giving a discriminator with a large receptive field; the combination of the generator network G and the discriminator network D constitutes the generative adversarial network. The network parameters of the generator network G and the discriminator network D are initialized using the Adam optimization method in the PyTorch framework, yielding the initial parameters of the generator and discriminator networks, and the relevant training hyper-parameters are set for training and optimizing the network model.
The generator network G is specifically constructed by combining the advantages of feature down/upsampling and the ResBlock structure, so that the network fuses shallow image features while upsampling, thereby using feature-map information more comprehensively and preventing the degradation problem of deep networks. In addition, the generator uses dilated convolution operations, which enlarge its receptive field without increasing the number of network parameters. All convolution kernels in the generator are 3×3; after each downsampling operation, the output feature map is fed through 2 residual blocks (ResBlock) with skip connections, and the feature map obtained after the 4th downsampling passes through dilated convolutions with dilation rates 5 and 3 to enlarge the receptive field of the network. During upsampling, the features extracted by the shallow layers are concatenated in, and the network finally uses a 1×1 convolution so that the number of output color channels matches that of the target segmentation map. The network structure is shown in FIG. 3. In the legend, an "upward arrow" denotes a deconvolution with kernel size 3×3 and stride 2, a BatchNorm operation, and a ReLU activation; a "downward arrow" denotes a convolution with kernel size 3×3 and stride 2, BatchNorm, and ReLU; a "thin arrow to the right" denotes a convolution with kernel size 3×3 and stride 1, BatchNorm, and ReLU; a "thick arrow to the right" denotes a convolution with kernel size 1×1 and stride 1, BatchNorm, and ReLU; a "feature map" is the output image after the corresponding convolution operation; a "dotted arrow" denotes concatenation of the corresponding output feature maps along the color-channel dimension, e.g., concatenating a feature map of size (m, h, w) with one of size (n, h, w) yields a feature map of size (m + n, h, w); a "residual block" denotes the residual-block operation shown in FIG. 4 (the arrows there carry no operational meaning and only indicate flow): it contains two convolutions with kernel size 3×3 and stride 1, each with a batch normalization operation and a ReLU activation, and at the output the original input feature map is added to the output of the last convolution, yielding a feature map with the same number of color channels, height, and width as the input; a "dilated residual block" is the same as the "residual block" except that the two convolutions are given different dilation rates, as shown in FIG. 5.
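A sketch of the residual block and its dilated variant as described above, in PyTorch (the channel count ch is a parameter; the rest follows the legend: two 3×3 stride-1 convolutions, each with BatchNorm and ReLU, plus the identity addition):

import torch
from torch import nn

class ResBlock(nn.Module):
    def __init__(self, ch, dilations=(1, 1)):
        super().__init__()
        def conv(d):
            # padding = dilation keeps the spatial size for a 3x3 kernel
            return nn.Sequential(
                nn.Conv2d(ch, ch, 3, stride=1, padding=d, dilation=d),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.body = nn.Sequential(conv(dilations[0]), conv(dilations[1]))

    def forward(self, x):
        # add the input to the last convolution's output: same (c, h, w)
        return x + self.body(x)

# The dilated ("hole") residual block of FIG. 5 only changes the dilation
# rates, e.g. ResBlock(ch, dilations=(5, 3)) for the bottleneck.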
The overall architecture of the generator network G includes two parts: a contracting path and an expanding path. The contracting path is the U-shaped trunk, i.e., the path from the image input through the four downsamplings, then the dilated residual block, and then the 4 upsamplings to the output; the expanding path denotes the concatenation paths outside the U-shaped trunk. To use the feature information in the network to a greater extent, the feature maps extracted on the contracting path are concatenated, during upsampling, with the feature maps of the same size in the expanding path.
In the generator network G, 4 downsampling operations, 4 upsampling operations, and 3 feature-concatenation operations are performed in total, and the feature maps selected for concatenation are those produced by downsampling. All downsampling operations in the generator use convolutions with kernel size 3×3 and stride 2.
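Putting the pieces together, a hedged skeleton of the generator follows; the channel widths are assumptions, while the operation counts, strides, dilation rates, skip concatenations, and the 1×1 output convolution follow the description above (it reuses the ResBlock sketch given earlier):

import torch
from torch import nn

class GeneratorSketch(nn.Module):
    def __init__(self, ch=(32, 64, 128, 256)):
        super().__init__()
        def down(ci, co):   # 3x3 stride-2 conv + BN + ReLU, then 2 ResBlocks
            return nn.Sequential(nn.Conv2d(ci, co, 3, 2, 1),
                                 nn.BatchNorm2d(co), nn.ReLU(True),
                                 ResBlock(co), ResBlock(co))
        def up(ci, co):     # 3x3 stride-2 deconvolution + BN + ReLU
            return nn.Sequential(
                nn.ConvTranspose2d(ci, co, 3, 2, 1, output_padding=1),
                nn.BatchNorm2d(co), nn.ReLU(True))
        self.d1, self.d2 = down(3, ch[0]), down(ch[0], ch[1])
        self.d3, self.d4 = down(ch[1], ch[2]), down(ch[2], ch[3])
        self.bottleneck = ResBlock(ch[3], dilations=(5, 3))   # dilated block
        self.u4, self.u3 = up(ch[3], ch[2]), up(2 * ch[2], ch[1])
        self.u2, self.u1 = up(2 * ch[1], ch[0]), up(2 * ch[0], ch[0])
        self.out = nn.Sequential(nn.Conv2d(ch[0], 1, 1), nn.Sigmoid())

    def forward(self, x):                   # x: (batch, 3, h, w), h and w % 16 == 0
        f1 = self.d1(x); f2 = self.d2(f1)
        f3 = self.d3(f2); f4 = self.d4(f3)
        y = self.u4(self.bottleneck(f4))
        y = self.u3(torch.cat([y, f3], 1))  # 3 concatenations of downsampled maps
        y = self.u2(torch.cat([y, f2], 1))
        y = self.u1(torch.cat([y, f1], 1))
        return self.out(y)                  # (batch, 1, h, w) per step S4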
The discriminator network D is constructed as follows: the ResBlock structure is introduced into D to prevent deep-network degradation; the skip connections it uses still allow the network to optimize learning while the network depth increases, again preventing degradation. Every convolution kernel in the discriminator is 3×3; the network downsamples via 4 MaxPooling operations in total, and finally uses a fully connected layer to classify the vector, judging whether the image input to the discriminator comes from a real image or an image generated by the generator. The structure of the discriminator is shown in FIG. 6. In the legend, "Scalar" denotes the final output value of the discriminator network D, lying in [0, 1] and representing the discriminator's confidence that the input image is real; "feature map" denotes the output image after a convolution operation; a "thin arrow to the right" denotes a convolution with kernel size 3×3 and stride 1, BatchNorm, and ReLU; a "thick arrow to the right" denotes a convolution with kernel size 3×3 and stride 2, BatchNorm, and ReLU; a "thin dashed arrow to the right" denotes a flattening operation that stretches the multi-channel feature map into a one-dimensional vector; the "residual block" is the same as in the generator network G; "maximum pooling layer" denotes max downsampling with kernel size 2×2 and stride 2, halving the height and width of the image; "global average pooling" denotes a downsampling operation whose kernel size equals the image size, averaging each channel, so an image of size (c, h, w) is pooled to (c, 1, 1), where c is the number of color channels, h the height, and w the width of the feature map; a "thick dotted arrow to the right" denotes a fully connected operation, i.e., the vector is multiplied by a weight matrix to obtain the result vector.
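Likewise, a hedged skeleton of the discriminator follows; the channel widths, the simplified interleaving of stride-1 convolutions with pooling, and the channel-wise concatenation of the retinal image a with the segmentation map y at the input are assumptions, while the 3×3 convolutions, residual blocks, four 2×2 MaxPooling steps, global average pooling, flattening, and fully connected output in [0, 1] follow the description (it reuses the ResBlock sketch above):

import torch
from torch import nn

class DiscriminatorSketch(nn.Module):
    def __init__(self, ch=(32, 64, 128, 256)):
        super().__init__()
        layers, ci = [], 4                 # 3 (retina image) + 1 (vessel map)
        for co in ch:                      # four stages, each ends in MaxPool
            layers += [nn.Conv2d(ci, co, 3, 1, 1), nn.BatchNorm2d(co),
                       nn.ReLU(True), ResBlock(co), nn.MaxPool2d(2, 2)]
            ci = co
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1),   # global avg pool
                                  nn.Flatten(),
                                  nn.Linear(ch[-1], 1), nn.Sigmoid())

    def forward(self, a, y):               # returns confidence q in [0, 1]
        return self.head(self.features(torch.cat([a, y], dim=1)))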
It is noted that the numbers at the upper-left corner of all feature maps in the generator network (FIG. 3) and the discriminator network (FIG. 6) denote the number of color channels output after each "arrow operation". The "K" at the lower-left corner of a feature map denotes the original image size; after several downsamplings the feature-map size becomes "K/p", i.e., K divided by the corresponding factor p.
Meanwhile, the Adam optimization method in the PyTorch framework is adopted and its hyper-parameters are tuned during training, assisting the initialization and optimization of the parameters of the generator network G and the discriminator network D.
Step S3: the retina training set E is loaded into memory and randomly shuffled; for each training step a retina training sample pair (a_i, b_i), comprising a retinal image and its reference blood vessel segmentation image, is drawn in preparation for the next stage of training.
Step S4: a sample pair (a_i, b_i) is drawn from the retina training set E, i = 1, 2, 3, …, M. The retinal image a_i is loaded into the generator network G; the corresponding image size is (3 × h × w), where 3 denotes the number of color channels with the corresponding color channel d ∈ {red, green, blue}, h denotes the height of a single picture, and w denotes its width. Through the layer-by-layer computation of the network, the generated retinal blood vessel segmentation map z_i is finally obtained, with image size (1 × h × w); it is a single-channel grayscale map, i.e., the degree to which a pixel belongs to a vessel is represented by its grayscale value.
Step S5: the discriminator network D attempts to distinguish the distribution of the retina training set E from the distribution of the retina training synthetic set E′ so as to maximize the loss function. The discriminator network D simultaneously discriminates the retina training set E sample pairs (a_i, b_i) and the retina training synthetic set E′ sample pairs (a_i, z_i), where i = 1, 2, 3, …, M, in each case giving a discrimination confidence q in [0, 1] that represents the probability that the sample pair is a retina training set E sample pair.
Step S6: the error between the generated retina training synthetic set E′ and the retina training set E is computed via the loss function to obtain a loss value. The obtained loss value is back-propagated, and the network parameters of the discriminator network D and the generator network G are adjusted respectively. According to the given loss function, the gradients of the parameters in the generator network G and the discriminator network D are computed with the chain rule, and the corresponding parameters are updated by stochastic gradient descent. The corresponding parameter update formula is:
θ_t ← θ_t − η · ∇_{θ_t} L   (4)

where θ_t denotes the t-th parameter of the generator network G or the discriminator network D, η denotes the learning-rate hyper-parameter, and ∇_{θ_t} L denotes the gradient of the loss with respect to the corresponding parameter.
Step S7: the generators were evaluated using the retina test set F, with the optimal model parameters retained. And meanwhile, judging at the parameter updating end stage, judging whether the training iteration number reaches the maximum iteration number, if so, ending the training stage, and entering the next step. Otherwise, the training is continued, and the step S3 is continued to continue the loop iteration training.
At the end of the model training phase, the retina test set F is input into the generator network G, which generates a retina test synthetic set F′. The reference vessel segmentation map b_i in the retina test set F sample pairs (a_i, b_i) and the generated segmentation map z_i in the generated retina sample pairs (a_i, z_i) are compared pixel by pixel, where i = 1, 2, 3, …, N, and each pixel is classified as a vessel point or a non-vessel point. To test the performance of the current generator network G, objective quantitative analysis via performance indexes is needed. Accuracy (Acc), Specificity (Sp), Sensitivity (Se), the Dice coefficient, F-measure, the area under the receiver operating characteristic (ROC) curve (AUC), and the area under the precision-recall (PR) curve (mAP) are adopted to measure the effectiveness of the model.
AUC is widely used for performance measurement in medical image processing; the closer its value is to 1, the better the segmentation.
Acc = (TP + TN) / (TP + TN + FP + FN)

Se = TP / (TP + FN)

Sp = TN / (TN + FP)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F-measure = 2 · Precision · Recall / (Precision + Recall)
Here TP (true positive) denotes the number of correctly segmented vessel pixels; TN (true negative) denotes the number of correctly segmented non-vessel (background) pixels; FP (false positive) denotes the number of non-vessel pixels wrongly segmented as vessels; FN (false negative) denotes the number of vessel pixels wrongly segmented as non-vessels. TP + FN + FP + TN is the total number of pixels in the region of interest in the image.
Since the above evaluation indexes depend on thresholding the output, the ROC curve is obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity), and AUC is the area under the ROC curve. All evaluation indexes are computed over the pixels inside a mask that marks the retinal region (the field of view). After screening these indexes, the model with the largest Acc, Se, Sp, Precision, Recall, and F-measure is selected as the optimal model. Finally, at the end of the parameter-update stage, it is checked whether the number of training iterations has reached the maximum; if so, the training phase ends, yielding the optimal generator network G' and discriminator network D', and the method proceeds to the next step. Otherwise, the method returns to step S3 and continues the training loop.
Step S8: the test sample images are tested using the trained optimal generator network G'; an original retinal image is input, and the corresponding retinal blood vessel segmentation image is output.
Given a retinal image sample set F1 = {a_i | i ∈ [1, R_F]}, each retinal image a_i is taken as input to the optimal generator network G', and the corresponding vessel segmentation image z_i is output, i = 1, 2, 3, …, R_F, where R_F denotes the number of samples in the retinal image sample set. The segmentation effect of the invention on a sample set of retinal images is shown in FIG. 7.
In summary, the invention adopts a GAN-based retinal vessel image intelligent segmentation method. The generator uses a residual structure and dilated convolution operations, so the network enlarges its receptive field without introducing extra parameters, captures image characteristics more comprehensively, and completes the retinal segmentation task better. In addition, the feature-concatenation operation in the expanding path of the generator lets the network better exploit both shallow and deep image features for the segmentation task. In the discriminator, the residual-module structure avoids the degradation problem caused by deepening the network, better exploits the discriminative capability of a deep network, and strengthens the network's supervisory capability.
It will be appreciated by persons skilled in the art that the invention is not limited to details of the foregoing embodiments, and that the invention can be embodied in other specific forms without departing from the spirit or scope of the invention. In addition, various modifications and alterations of this invention may be made by those skilled in the art without departing from the spirit and scope of this invention, and such modifications and alterations should also be viewed as being within the scope of this invention.

Claims (1)

1. A retinal blood vessel image intelligent segmentation method based on GAN is characterized by comprising the following steps:
step S1: a retinal image sample set is given, containing sample pairs of a retinal image and a reference blood vessel segmentation image; the retinal image sample set is denoted C = {(a_i, b_i) | i ∈ [1, R]}, where a denotes a retinal image, b denotes a reference blood vessel segmentation image, R denotes the number of samples, and i denotes the sample index; the retinal image corpus C is copied and divided into a retinal image training set E = {(a_i, b_i) | i ∈ [1, M]} and a retinal image test set F = {(a_i, b_i) | i ∈ [1, N]}, where N + M = R, and M and N denote the numbers of divided samples;
step S2: a generator network G is constructed, comprising a residual block structure and dilated convolution operations; a discriminator network D is designed, comprising a residual structure, giving a discriminator network D with a large receptive field; the combination of the generator network G and the discriminator network D constitutes the generative adversarial network; the network parameters of the generator network G and the discriminator network D are initialized using the Adam optimization method in the PyTorch framework, yielding their initial parameters, and the relevant training hyper-parameters are set for training and optimizing the network model;
step S3: the retina training set E is loaded into computer memory as input, and the retina training set E = {(a_i, b_i) | i ∈ [1, M]} is randomly shuffled to prepare for the next training stage;
step S4: the generator network G takes the retina training set E as input and computes the output layer by layer, generating the corresponding retina training synthetic set E′ = {(a_i, z_i) | i ∈ [1, M]}, where a_i in E′ is the same retinal image as a_i in E, z_i in E′ denotes the vessel segmentation map generated by the generator network G, and b_i in E denotes the reference vessel segmentation map;
step S5: the discriminator network D judges, one by one, the image samples in the retina training synthetic set E′ generated by the generator network G and in the retina training set E; each generated retina sample pair (a_i, z_i) and each retina training sample pair (a_i, b_i) is given a discrimination confidence q, i = 1, 2, 3, …, M, with q ranging over [0, 1];
step S6: the error between the generated retina training synthetic set E′ and the retina training set E is computed via the loss function to obtain a loss value; the obtained loss value is back-propagated, and the network parameters of the discriminator network D and the generator network G are adjusted respectively;
step S7: the generator network G is evaluated using the retina test set F, and the optimal generator network G′ and discriminator network D′ are retained; meanwhile, at the end of the parameter-update stage it is checked whether the number of training iterations has reached the maximum; if so, the training phase ends and the method proceeds to the next step; otherwise, the method returns to step S3 for further loop iteration training;
step S8: a retinal image sample set F1 = {a_i | i ∈ [1, R_F]} is input into the optimal generator G′ to generate the blood vessel segmentation images, where R_F denotes the number of samples in the retinal image sample set;
the construction and initialization of the generator, the discriminator, and the Adam optimizer in step S2 are specifically as follows:
in the generator network G, the network concatenates the feature maps obtained in the expanding path with the high-resolution feature maps obtained in the contracting path, and then applies convolutions to the concatenated feature maps to extract features, so that the shallow and deep features of the image are used more fully; a residual module structure is also used in the generator network G to overcome the degradation problem that easily occurs in deep networks and to help the network learn image features better; to enlarge the receptive field of the generator network G, the invention adds dilated convolution operations to it, enlarging the receptive field without increasing the number of network parameters; all convolution kernels in the generator network G are 3×3; after each downsampling operation, the output feature map is fed through 2 residual blocks (ResBlock) with skip connections, and the feature map obtained after the 4 downsampling operations passes through dilated convolutions with dilation rates 5 and 3 to enlarge the receptive field of the network; during upsampling, the features extracted by the shallow layers are concatenated in, and the network finally uses a 1×1 convolution so that the number of output color channels matches that of the target segmentation map;
the construction of the discriminator network D is specifically as follows: the discriminator network D is a deep convolutional neural network whose main task is to judge whether an input image is a real image or an image generated by the generator network G; to prevent network degradation, the ResBlock structure is added to the discriminator network D, and its skip connections still allow gradient propagation as the network depth increases, ensuring convergence; every convolution kernel in the discriminator network D is 3×3; the network downsamples via 4 MaxPooling operations in total, and finally a fully connected layer performs the final dimension change so that a confidence q in [0, 1] is output, judging whether the input is a reference retinal image pair (a, b) or a retinal image pair (a, z) generated by the generator network G;
the construction of the Adam optimizer is specifically as follows: the Adam optimization method in the PyTorch framework is adopted to dynamically adjust the training hyper-parameters and optimize training; during training, the Adam optimizer adaptively adjusts the learning rate and accelerates network convergence; the initial learning rate is 0.0002, the first-moment coefficient β1 is 0.5, and the second-moment coefficient β2 is 0.999;
the generation of the segmented image by the generator network G in step S4 is specifically as follows: a retina training sample pair (a_i, b_i) is drawn from the retina training set E, i = 1, 2, 3, …, M, and the retinal image a_i is input into the generator network G; the corresponding image size is (3 × h × w), where 3 denotes the number of color channels with the corresponding color channel d ∈ {red, green, blue}, h denotes the height of a single picture, and w denotes its width; through the layer-by-layer computation of the network, the generated retinal blood vessel segmentation map z_i is obtained, with image size (1 × h × w); the corresponding image is a single-channel grayscale map, i.e., the degree to which a pixel belongs to a vessel is represented by its grayscale value;
in step S5, the generated blood vessel segmentation images and the reference blood vessel segmentation images are judged respectively, specifically: the discriminator network D simultaneously discriminates the sample pairs (a_i, b_i) of the retina training set E and the generated retina sample pairs (a_i, z_i), where i = 1, 2, 3, …, M, in each case giving a discrimination confidence q in [0, 1] that represents the probability that the sample pair is a sample pair of the retina training set E; the loss function computes the loss value between the generated vessel segmentation sample z_i and the reference blood vessel segmentation sample b_i, in preparation for the next back propagation;
the gradient update of the generator network G and the discriminator network D with the adversarial loss in step S6 is specifically as follows: according to the given loss function, the gradients of the parameters in the generator network G and the discriminator network D are computed with the chain rule, and the corresponding parameters are updated by stochastic gradient descent; the corresponding parameter update formula is:
θ_t ← θ_t − η · ∇_{θ_t} L   (4)

where θ_t denotes the t-th parameter of the generator network G or the discriminator network D, η denotes the learning-rate hyper-parameter, and ∇_{θ_t} L denotes the gradient of the corresponding parameter;
in step S7, the generator is evaluated using the retinal image test samples and the optimal model parameters are retained, specifically: at the end of the training phase of the model, the retina test set F is input into the generator network G, which generates a retina test synthetic set F′; the reference vessel segmentation map b_i in the sample pairs (a_i, b_i) of the retina test set F and the generated segmentation map z_i in the generated retina sample pairs (a_i, z_i) are compared pixel by pixel, where i = 1, 2, 3, …, N, and each pixel is classified as a vessel point or a non-vessel point; to test the performance of the current generator network G, objective quantitative analysis via performance indexes is required; Accuracy (Acc), Specificity (Sp), Sensitivity (Se), the Dice coefficient, F-measure, the area under the receiver operating characteristic (ROC) curve (AUC), and the area under the precision-recall (PR) curve (mAP) are adopted to measure the effectiveness of the model;
AUC is mainly used for performance measurement in medical image processing; the closer its value is to 1, the better the segmentation;
Acc = (TP + TN) / (TP + TN + FP + FN)

Se = TP / (TP + FN)

Sp = TN / (TN + FP)

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F-measure = 2 · Precision · Recall / (Precision + Recall)
wherein TP (true positive) denotes the number of correctly segmented vessel pixels; TN (true negative) denotes the number of correctly segmented non-vessel (background) pixels; FP (false positive) denotes the number of non-vessel pixels wrongly segmented as vessels; FN (false negative) denotes the number of vessel pixels wrongly segmented as non-vessels; TP + FN + FP + TN is the total number of pixels of the region of interest in the image;
since the performance indexes depend on thresholding the output, the ROC curve is obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity), and AUC is the area under the ROC curve; all evaluation indexes are computed over the pixels inside a mask that marks the retinal region (the field of view);
after screening these indexes, the model with the largest Acc, Se, Sp, Precision, Recall, and F-measure is selected as the optimal model; finally, at the end of the parameter-update stage it is checked whether the number of training iterations has reached the maximum; if so, the training phase ends, yielding the optimal generator network G′ and discriminator network D′, and the method proceeds to the next step; otherwise, the method returns to step S3 for further loop iteration training;
the step S8 is specifically as follows: given a retinal image sample set F1 = {a_i | i ∈ [1, R_F]}, each retinal image a_i is taken as input to the optimal generator network G′, and the corresponding vessel segmentation image z_i is output, i = 1, 2, 3, …, R_F, where R_F denotes the number of samples in the retinal image sample set.
CN201910884346.5A 2019-09-19 2019-09-19 Intelligent retinal blood vessel image segmentation method based on GAN Active CN110807762B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910884346.5A CN110807762B (en) 2019-09-19 2019-09-19 Intelligent retinal blood vessel image segmentation method based on GAN

Publications (2)

Publication Number Publication Date
CN110807762A CN110807762A (en) 2020-02-18
CN110807762B (en) 2021-07-06

Family

ID=69487697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910884346.5A Active CN110807762B (en) 2019-09-19 2019-09-19 Intelligent retinal blood vessel image segmentation method based on GAN

Country Status (1)

Country Link
CN (1) CN110807762B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382785B (en) * 2020-03-04 2023-09-01 武汉精立电子技术有限公司 GAN network model and method for realizing automatic cleaning and auxiliary marking of samples
CN111524142B (en) * 2020-03-10 2023-06-30 浙江工业大学 Automatic segmentation method for cerebrovascular images
WO2021179205A1 (en) * 2020-03-11 2021-09-16 深圳先进技术研究院 Medical image segmentation method, medical image segmentation apparatus and terminal device
CN111429464B (en) * 2020-03-11 2023-04-25 深圳先进技术研究院 Medical image segmentation method, medical image segmentation device and terminal equipment
US11990224B2 (en) * 2020-03-26 2024-05-21 The Regents Of The University Of California Synthetically generating medical images using deep convolutional generative adversarial networks
CN111724344A (en) * 2020-05-18 2020-09-29 天津大学 Method for generating medical ultrasonic image data based on countermeasure network
CN111833348B (en) * 2020-08-10 2023-07-14 上海工程技术大学 Automatic detection method for vascular sediment based on image processing
US11580646B2 (en) 2021-03-26 2023-02-14 Nanjing University Of Posts And Telecommunications Medical image segmentation method based on U-Net
CN114240951B (en) * 2021-12-13 2023-04-07 电子科技大学 Black box attack method of medical image segmentation neural network based on query
CN114648724B (en) * 2022-05-18 2022-08-12 成都航空职业技术学院 Lightweight efficient target segmentation and counting method based on generation countermeasure network
CN116958175B (en) * 2023-09-21 2023-12-26 无锡学院 Construction method of blood cell segmentation network and blood cell segmentation method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109035142A (en) * 2018-07-16 2018-12-18 西安交通大学 A kind of satellite image ultra-resolution method fighting network integration Aerial Images priori
CN109087243A (en) * 2018-06-29 2018-12-25 中山大学 A kind of video super-resolution generation method generating confrontation network based on depth convolution
CN109191471A (en) * 2018-08-28 2019-01-11 杭州电子科技大学 Based on the pancreatic cell image partition method for improving U-Net network
CN109242839A (en) * 2018-08-29 2019-01-18 上海市肺科医院 A kind of good pernicious classification method of CT images Lung neoplasm based on new neural network model
CN110189253A (en) * 2019-04-16 2019-08-30 浙江工业大学 A kind of image super-resolution rebuilding method generating confrontation network based on improvement

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11257259B2 (en) * 2017-08-15 2022-02-22 Siemens Healthcare Gmbh Topogram prediction from surface data in medical imaging
CN109671094B (en) * 2018-11-09 2023-04-18 杭州电子科技大学 Fundus image blood vessel segmentation method based on frequency domain classification
CN110009576B (en) * 2019-02-28 2023-04-18 西北大学 Mural image restoration model establishing and restoration method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jaemin Son et al., "Towards Accurate Segmentation of Retinal Vessels and the Optic Disc in Fundoscopic Images with Generative Adversarial Networks," Journal of Digital Imaging, 2018-10-05, pp. 500-509. *

Also Published As

Publication number Publication date
CN110807762A (en) 2020-02-18

Similar Documents

Publication Publication Date Title
CN110807762B (en) Intelligent retinal blood vessel image segmentation method based on GAN
CN110197493B (en) Fundus image blood vessel segmentation method
Zhou et al. A benchmark for studying diabetic retinopathy: segmentation, grading, and transferability
Li et al. Automatic detection of diabetic retinopathy in retinal fundus photographs based on deep learning algorithm
CN109166126B (en) Method for segmenting paint cracks on ICGA image based on condition generation type countermeasure network
CN111814741B (en) Method for detecting embryo-sheltered pronucleus and blastomere based on attention mechanism
CN112508864B (en) Retinal vessel image segmentation method based on improved UNet +
CN111667490B (en) Fundus picture cup optic disc segmentation method
CN114821189B (en) Focus image classification and identification method based on fundus image
CN108764342B (en) Semantic segmentation method for optic discs and optic cups in fundus image
CN113610118B (en) Glaucoma diagnosis method, device, equipment and method based on multitasking course learning
CN115457051A (en) Liver CT image segmentation method based on global self-attention and multi-scale feature fusion
CN111833334A (en) Fundus image feature processing and analyzing method based on twin network architecture
CN117058676B (en) Blood vessel segmentation method, device and system based on fundus examination image
Li et al. Vessel recognition of retinal fundus images based on fully convolutional network
CN115908358A (en) Myocardial image segmentation and classification method based on multi-task learning
Firke et al. Convolutional neural network for diabetic retinopathy detection
CN113343755A (en) System and method for classifying red blood cells in red blood cell image
CN110610480A (en) MCASPP neural network eyeground image optic cup optic disc segmentation model based on Attention mechanism
Nirmala et al. HoG based Naive Bayes classifier for glaucoma detection
Himami et al. Deep learning in image classification using dense networks and residual networks for pathologic myopia detection
Haider et al. Modified Anam-Net Based Lightweight Deep Learning Model for Retinal Vessel Segmentation.
KR20050043869A (en) Developing a computer aided diagnostic system on breast cancer using adaptive neuro-fuzzy inference system
Al-Gburi et al. Optical disk segmentation in human retina images with golden eagle optimizer
CN116503593A (en) Retina OCT image hydrops segmentation method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant