CN114283158A - Retinal blood vessel image segmentation method and device and computer equipment - Google Patents

Info

Publication number: CN114283158A
Application number: CN202111490173.2A
Authority: CN (China)
Legal status: Pending
Inventors: 胡敏 (Hu Min), 万飞龙 (Wan Feilong), 黄宏程 (Huang Hongcheng)
Applicant/Assignee: Chongqing University of Posts and Telecommunications
Original language: Chinese (zh)
Abstract

The invention belongs to the field of medical image processing and specifically relates to a retinal blood vessel image segmentation method, a retinal blood vessel image segmentation device, and computer equipment. The method comprises: acquiring a retinal blood vessel image and preprocessing it; inputting the preprocessed image into a trained U-Net network; extracting features with each residual pyramid convolution layer of the encoder in the U-Net network and its corresponding pooling layer; passing the convolution features of each layer to the corresponding attention mechanism layer through skip connections, and selecting the attention features; and, using each residual pyramid convolution layer of the decoder in the U-Net network and its corresponding upsampling layer, concatenating the sampling features of the upsampling layer with the corresponding attention features and passing the result to the last residual pyramid convolution layer of the decoder to obtain the retinal blood vessel segmentation result. The invention thus realizes segmentation of the retinal blood vessel image on the basis of the U-Net network.

Description

Retinal blood vessel image segmentation method and device and computer equipment
Technical Field
The invention belongs to the field of medical image processing, and particularly relates to a retinal blood vessel image segmentation method, a retinal blood vessel image segmentation device and computer equipment.
Background
Retinal blood vessels exhibit a variety of morphological structures, varying in length, width, and angle. Because the vessels in fundus images are complex in distribution and uneven in size, and the contrast between target vessels and the background is low, manual segmentation of retinal vessels is time-consuming, labor-intensive, and highly subjective. Accurate and efficient retinal vessel segmentation by computer therefore has very important significance for computer-aided diagnosis and treatment.
Many scholars at home and abroad have studied retinal vessel segmentation. Delibasis segments retinal vessels with a model-based automatic vessel-tracking algorithm that introduces a multi-scale filter when initializing the seed pixels. Alhussein denoises the image with morphological filtering, extracts thick-vessel and thin-vessel enhanced images using Hessian matrices at different scales, and finally segments the thin and thick vessels with different thresholding methods to obtain the final retinal vessel segmentation. Li extracts the vessel network with a connected-tube marked point process (MPP) model and then applies a tube segmentation algorithm to the dilated tubular target region to perform vessel segmentation.
With the development of artificial intelligence, convolutional neural networks have been widely applied in medical image processing. Francia proposes interlinking two convolutional neural networks, where the second CNN adopts a residual-block design and receives the information flow from the first module, achieving accurate vessel segmentation. Jin proposes a deformable vessel segmentation network that segments vessels end to end using local features of retinal vessels. Zhou extracts vessel features with a CNN, enhances fine vessels with a set of filters to reduce the intensity difference between fine and coarse vessels, and finally segments the vessels with a dense CRF. Another approach combines U-Net with a DenseNet network, making full use of the feature information of the output layers, and integrates dilated convolution into the network to enlarge the receptive field so that more vessels can be segmented. Li proposes a retinal vessel segmentation method based on a U-shaped network that exploits the advantages of deformable convolution and a dual-attention module. Guo proposes replacing the skip connections of the traditional U-shaped network with dense blocks for feature fusion and adopting a generative adversarial network (GAN) in the training stage, using a dense U-network based on inception modules as the GAN generator and a multilayer neural network as the GAN discriminator to realize retinal vessel segmentation.
Although the methods above generally perform well on retinal vessel segmentation, the complicated structure of retinal vessels, their large shape differences, and the influence of lesions cause some vessel contour information to be lost, making it difficult to segment the complete vessel tree. An image segmentation model better suited to retinal blood vessel images is therefore urgently needed.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a retinal blood vessel image segmentation method, device, and computer equipment that achieve complete segmentation of the retinal vessel image by improving the U-Net network: the retinal vessel image to be segmented is acquired, preprocessed, and expanded by data augmentation; the processed image is then input into a trained improved U-Net network for recognition and segmentation to obtain the segmented vessel image. The improved U-Net network comprises an encoder, a decoder, residual pyramid convolutions, and skip connections.
In a first aspect thereof, the present invention provides a retinal blood vessel image segmentation method, including:
obtaining a retinal blood vessel image, and preprocessing the retinal blood vessel image;
inputting the preprocessed retinal blood vessel image into a trained U-Net network;
extracting convolution features and pooling features of the retinal vessel image at different levels using each residual pyramid convolution layer of the encoder in the U-Net network and its corresponding pooling layer;
passing the convolution features of each layer to the corresponding attention mechanism layer through skip connections, and selecting from them the attention features of the target region of the retinal vessel image;
inputting the pooling features of the last layer into the first residual pyramid convolution layer of the decoder in the U-Net network, and outputting sampling features with an upsampling layer;
concatenating the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer of the decoder in the U-Net network and its corresponding upsampling layer, passing the concatenated features to the last residual pyramid convolution layer of the decoder, and outputting the resulting feature map;
applying a 1 × 1 convolution to the output feature map to finally obtain the retinal vessel segmentation result.
In a second aspect, the present invention also provides a retinal blood vessel image segmentation apparatus, including:
the image acquisition module is used for acquiring a retinal blood vessel image;
the image processing module is used for preprocessing the acquired retinal blood vessel image;
the image input module is used for inputting the preprocessed retinal blood vessel images into the trained U-Net network;
the encoder module, used to extract convolution features and pooling features of the retinal vessel image at different levels using each residual pyramid convolution layer of the encoder in the U-Net network and its corresponding pooling layer;
the skip connection module, used to pass the convolution features of each layer to the corresponding attention mechanism layer through skip connections;
the attention mechanism module, used to select the attention features of the target region of the retinal vessel image from the convolution features of each layer;
the feature connection module, used to input the pooling features of the last layer into the first residual pyramid convolution layer of the decoder in the U-Net network and to output sampling features with an upsampling layer;
the decoder module, used to concatenate the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer of the decoder and its corresponding upsampling layer, and to pass the concatenated features to the last residual pyramid convolution layer of the decoder;
and the image output module, used to apply a 1 × 1 convolution to the feature map of the last residual pyramid convolution layer to obtain the retinal vessel segmentation result.
In a third aspect, the invention also provides a computer device comprising a memory storing a computer program and a processor that implements the steps of the method according to the first aspect when executing the computer program.
The invention has the beneficial effects that:
the invention provides a retinal vessel image segmentation method, a retinal vessel image segmentation device and computer equipment, aiming at the problems of low segmentation precision and the like caused by complexity and changeability of retinal vessel scale information and morphological structures. The method is based on a U-Net network, a residual pyramid module RPC is used in an encoding stage, convolution cores with different sizes and depths are utilized to check retinal vessel images for feature extraction, and therefore retinal vessel information with different scales is captured; an attention mechanism is introduced into the jump connection, the characteristic information of a target area is focused, and the interference is reduced; and finally, obtaining a final segmentation result through a SoftMax activation function.
Drawings
FIG. 1 is a flowchart of a retinal blood vessel image segmentation method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a retinal vessel image segmentation method in accordance with a preferred embodiment of the present invention;
FIG. 3 is a schematic diagram of an improved U-Net network structure of the present invention;
FIG. 4 is a diagram of a pyramid convolution module of the present invention;
FIG. 5 is a diagram of a residual pyramid convolution layer of the present invention;
FIG. 6 is a block diagram of the residual pyramid convolution of the present invention;
FIG. 7 is a block diagram of an improved attention mechanism of the present invention;
FIG. 8 is a diagram illustrating a structure of a retinal blood vessel image segmentation apparatus according to an embodiment of the present invention;
FIG. 9 is a data set image enhancement diagram of the present invention;
FIG. 10 is a graph of the segmentation results of the blood vessel according to the present invention and the prior art;
FIG. 11 is a diagram showing the comparison result between the present invention and the prior art.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The retinal blood vessel image segmentation method can be applied in a server environment. Specifically, the server acquires a retinal vessel image and preprocesses it; inputs the preprocessed image into a trained U-Net network; extracts convolution features and pooling features of the image at different levels using each residual pyramid convolution layer of the encoder in the U-Net network and its corresponding pooling layer; passes the convolution features of each layer to the corresponding attention mechanism layer through skip connections and selects from them the attention features of the target region of the retinal vessel image; inputs the pooling features of the last layer into the first residual pyramid convolution layer of the decoder and outputs sampling features with an upsampling layer; and concatenates the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer of the decoder and its corresponding upsampling layer, passing the result to the last residual pyramid convolution layer of the decoder to obtain the retinal vessel segmentation result.
As will be appreciated by those skilled in the art, a "server," as used herein, may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
Fig. 1 is a flowchart of the retinal blood vessel image segmentation method in an embodiment of the present invention. As shown in Fig. 1, the method comprises: inputting the retinal vessel image to be segmented and applying denoising, enhancement, and data expansion to it; training the U-Net network model with the processed retinal vessel images until training finishes and meets the qualification standard; saving the network model; and outputting the segmentation result of the retinal vessel image to be segmented with the trained model.
Fig. 2 is a flowchart of a retinal blood vessel image segmentation method in a preferred embodiment of the present invention, as shown in fig. 2, the method includes:
101. obtaining a retinal blood vessel image, and preprocessing the retinal blood vessel image;
in the embodiment of the invention, a large number of retinal blood vessel images exist in actual medical practice; these images are produced by medical (examination) staff using instrument systems during medical activities, and the imaged vessels have various morphological structures. Since the aim of the invention is to segment retinal vessel images with different morphological structures, the instrument system can be accessed to acquire actual retinal vessel images.
In the embodiment of the invention, preprocessing the retinal vessel image mainly comprises denoising and contrast enhancement of the acquired image, and processing it in different color channels to obtain the retinal vessel image with the highest contrast between vessels and background.
In a preferred embodiment of the present invention, preprocessing the retinal vessel image further includes a data expansion operation on the acquired images: random crops of identical size are taken from each retinal vessel image, thereby obtaining an expanded set of retinal vessel images.
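The random same-size cropping used for data expansion can be sketched as follows. This is a minimal NumPy illustration under assumed values: the patch size, patch count, and the 584 × 565 image size used in the test are illustrative, not taken from the patent.

```python
import numpy as np

def random_crops(image, mask, patch_size=48, n_patches=8, seed=0):
    """Cut random same-size patch pairs from a fundus image and its vessel
    label mask, expanding one image into many training samples.
    image: (H, W) or (H, W, C) array; mask: (H, W) array."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    img_patches, mask_patches = [], []
    for _ in range(n_patches):
        top = rng.integers(0, h - patch_size + 1)
        left = rng.integers(0, w - patch_size + 1)
        img_patches.append(image[top:top + patch_size, left:left + patch_size])
        mask_patches.append(mask[top:top + patch_size, left:left + patch_size])
    return np.stack(img_patches), np.stack(mask_patches)
```

Cropping the image and its label with the same offsets keeps each patch aligned with its segmentation ground truth.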
102. Inputting the preprocessed retinal blood vessel image into a trained U-Net network;
in the embodiment of the invention, a preprocessed retinal vessel image to be segmented can be input directly into the trained U-Net network for subsequent segmentation and recognition. The U-Net network here is the network improved by the invention; as shown in fig. 3, the improved U-Net network likewise comprises an encoder, a decoder, and a skip connection module. The encoder extracts shallow features, deep features, and fine-vessel features of the retinal vessel image, i.e., each residual pyramid convolution layer of the encoder and its corresponding pooling layer extract convolution features and pooling features at different levels. The decoder is composed of transposed convolution layers and recovers the size of the feature maps output by the encoder, i.e., each residual pyramid convolution layer of the decoder and its corresponding upsampling layer concatenate the sampling features of the upsampling layer with the corresponding attention features and pass them to the last residual pyramid convolution layer of the decoder. The skip connection module passes the convolution features of each layer to the corresponding attention mechanism layer.
The core improvement of the invention is that the convolution layers in the encoder and decoder are replaced with residual pyramid convolution layers. Specifically, a residual pyramid convolution layer (RPC) comprises two pyramid convolution modules connected in series, with the input of the first pyramid convolution module connected to the output of the second through a residual connection. Each pyramid convolution module comprises two first units joined by a second unit: each first unit consists of a batch normalization layer (BN), a convolution layer (Conv), and an activation layer (ReLU); the second unit consists of a batch normalization layer (BN), a pyramid convolution layer (PyConv), and an activation layer (ReLU). Each pyramid convolution layer contains convolution kernels of several different sizes.
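The RPC layer described above can be sketched in PyTorch. This is a hedged reconstruction, not the patent's reference implementation: the group counts in `PyConv`, the channel split, and the 1 × 1 projection on the residual branch are assumptions made so that the module is runnable; the residual here connects the block input to the output of the second PyConvBlock, following Fig. 5.

```python
import torch
import torch.nn as nn

class PyConv(nn.Module):
    """Pyramid convolution: parallel convolutions with kernels 3/5/7/9.
    Grouped convolutions shrink the kernel depth at larger kernel sizes;
    the group counts here are illustrative assumptions."""
    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7, 9), groups=(1, 4, 8, 16)):
        super().__init__()
        split = out_ch // len(kernels)  # equal feature maps per pyramid level
        self.levels = nn.ModuleList([
            nn.Conv2d(in_ch, split, k, padding=k // 2, groups=g)
            for k, g in zip(kernels, groups)
        ])
    def forward(self, x):
        return torch.cat([conv(x) for conv in self.levels], dim=1)

class PyConvBlock(nn.Module):
    """Two BN-Conv(1x1)-ReLU units joined by a BN-PyConv-ReLU unit (Fig. 6)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.Conv2d(in_ch, out_ch, 1), nn.ReLU(),
            nn.BatchNorm2d(out_ch), PyConv(out_ch, out_ch), nn.ReLU(),
            nn.BatchNorm2d(out_ch), nn.Conv2d(out_ch, out_ch, 1), nn.ReLU(),
        )
    def forward(self, x):
        return self.body(x)

class RPC(nn.Module):
    """Residual pyramid convolution layer: two PyConvBlocks in series plus a
    residual connection from the input, followed by a final ReLU (Fig. 5)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.blocks = nn.Sequential(PyConvBlock(in_ch, out_ch),
                                    PyConvBlock(out_ch, out_ch))
        # 1x1 projection so the residual matches the output channel count
        self.skip = nn.Conv2d(in_ch, out_ch, 1)
    def forward(self, x):
        return torch.relu(self.blocks(x) + self.skip(x))
```

Padding of `k // 2` keeps every pyramid level at the same spatial size, so the four branches can be concatenated along the channel dimension.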
In an embodiment of the present invention, pyramid convolution (PyConv) processes the input at multiple filter scales. PyConv contains a pyramid of kernels in which each level holds a different type of filter, differing in size and depth, able to capture different levels of detail in a scene. Besides this improved recognition ability, PyConv is efficient, adding no computational cost or parameters compared with standard convolution, and it is flexible and extensible, offering a huge space of potential network architectures for different applications. As shown in fig. 4, the kernel size increases from the bottom of the pyramid (level 1 of PyConv) to the top (level n), while the kernel depth decreases from level 1 to level n as the spatial size grows. This yields two interconnected pyramids facing opposite directions: one has its base at the bottom, where the kernel depth is largest (shrinking toward the top), and the other has its base at the top, where the kernel spatial size is largest (shrinking toward the bottom). For an input feature map FM_i, each level {1, 2, 3, ..., n} of the pyramid convolution uses a convolution kernel of a different size {K_1^2, K_2^2, K_3^2, ..., K_n^2} and a different kernel depth FM_i / (K_m^2 / K_1^2), and each level may output a different number of feature maps {FM_o1, FM_o2, ..., FM_on}. The parameter count and computational cost of the pyramid convolution are therefore:

Parameter count:

params = K_n^2 · (FM_i / (K_n^2 / K_1^2)) · FM_on + ... + K_2^2 · (FM_i / (K_2^2 / K_1^2)) · FM_o2 + K_1^2 · FM_i · FM_o1

Computational cost:

FLOPs = W · H · params
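The two cost formulas above can be checked numerically. A minimal plain-Python sketch, with the function name and example values my own; kernel sizes are chosen so that the depth ratios K_m^2 / K_1^2 come out integral (in practice the depth reduction is arranged via grouped convolutions):

```python
def pyconv_costs(fm_i, kernels, fm_o_levels, width, height):
    """Parameter count and FLOPs of a pyramid convolution.

    Level m uses kernel size K_m and kernel depth FM_i / (K_m^2 / K_1^2),
    following the formulas above."""
    k1 = kernels[0]
    params = 0
    for k, fm_o in zip(kernels, fm_o_levels):
        depth = fm_i // (k * k // (k1 * k1))  # kernel depth shrinks as size grows
        params += k * k * depth * fm_o        # one term per pyramid level
    flops = width * height * params  # each parameter used once per output position
    return params, flops
```

With FM_i = 18, kernels 3 and 9, and 4 output maps per level, both levels contribute 9 · 18 · 4 = 81 · 2 · 4 = 648 parameters, illustrating the even distribution claimed when each level outputs equally many feature maps.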
Inspired by residual learning and pyramid convolution, a residual pyramid module is proposed; its structure is shown in fig. 5. The input feature map X is processed by two PyConvBlocks, connected with the original input through a residual connection, and the final feature map is output after a ReLU activation. The residual connection strengthens feature propagation and improves network performance. The PyConvBlock structure, shown in fig. 6, comprises normalization, convolution, and ReLU operations. The second convolution in the PyConvBlock is a PyConv with four kernel sizes: 3 × 3, 5 × 5, 7 × 7, and 9 × 9. The smaller kernels have smaller receptive fields and capture small targets and local detail; the larger kernels have larger receptive fields and capture large targets and global semantic information. In the RPC module (fig. 5), PyConv with different kernels captures tiny vessel branches, and features of different levels in the retinal vessel image are extracted and combined, improving the completeness and precision of the segmentation. Finally, the residual structure at the output alleviates the degradation problem caused by cascading too many network layers and accelerates network convergence.
In an embodiment of the present invention, the encoder may be composed of 4 residual pyramid convolution (RPC) modules and 4 pooling layers, each RPC module followed by one pooling layer. The processed retinal vessel image enters an RPC module; each RPC module comprises two pyramid convolution layers (PyConv), each containing four convolution kernels of different sizes that extract and fuse retinal vessel information at different scales, after which the multi-scale outputs are concatenated (Cat) and passed through a 1 × 1 convolution to reduce the number of input feature maps. A 2 × 2 max pooling layer follows each RPC module and reduces the retinal vessel feature map to half the size of the previous layer's feature map. The encoder thus consists of 4 RPC modules and 4 max pooling layers of size 2 × 2; each RPC module has a 1 × 1 convolution layer at its front and back, with a pyramid convolution between the two ordinary convolutions whose kernel sizes are 3 × 3, 5 × 5, 7 × 7, and 9 × 9.
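The four-stage halving behaviour of the encoder can be sketched as follows. Plain 3 × 3 convolutions stand in for the RPC modules here, and the channel widths and 48 × 48 input are illustrative assumptions; the point is only the stage layout in which each 2 × 2 max pool halves the spatial size.

```python
import torch
import torch.nn as nn

# Plain convolutions stand in for the RPC modules; each stage ends in a
# 2x2 max pool that halves the feature-map height and width.
channels = [1, 64, 128, 256, 512]  # illustrative widths
stages = nn.ModuleList()
for c_in, c_out in zip(channels[:-1], channels[1:]):
    stages.append(nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),  # halves H and W
    ))

x = torch.randn(1, 1, 48, 48)
sizes = []
for stage in stages:
    x = stage(x)
    sizes.append(x.shape[-1])
# spatial size shrinks 48 -> 24 -> 12 -> 6 -> 3 across the four stages
```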
In the embodiment of the present invention, the decoder may comprise upsampling layers and RPC modules, where each upsampling layer is a transposed convolution with kernel size 2 × 2 that upsamples the output feature map to restore the original size; a SoftMax activation function then classifies vessel and background pixels and outputs the segmentation result. An attention mechanism is introduced into the skip connection module to weigh the background image against the vessels and so reduce the influence of the background on vessel segmentation. The decoder thus consists of 4 layers of 2 × 2 transposed convolution and 1 layer of ordinary 1 × 1 convolution.
Based on the above analysis, the training process for the U-Net network may include:
acquiring original retinal vessel images and preprocessing them to obtain a training data set in which each image has a corresponding segmentation label (label image); inputting the image data of the training set into the improved U-Net network; the RPC modules of the encoder extract shallow features from the input data, with residual connections inside the RPC modules avoiding the network degradation caused by too many layers; the skip connection module passes the extracted shallow features to the attention mechanism module; the attention mechanism selects the features of the target region and passes them onward; the transposed convolution layers of the decoder recover the feature-map size from the deep features obtained by the encoder's repeated convolution and downsampling; the upsampled features are concatenated with the features output by the attention mechanism, and the concatenated feature map is passed to the last convolution layer to obtain the final feature map; the final feature map is compared with the label image pixel by pixel to obtain the error; the loss function of the model is computed from the error, its gradient is obtained by backpropagation, and its minimum is sought by stochastic gradient descent; training ends when the loss function is minimized.
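One step of the training loop just described can be sketched in PyTorch. This is a minimal illustration under stated assumptions: a single 1 × 1 convolution stands in for the full improved U-Net, and the pixel-wise cross-entropy loss, SGD learning rate, and tensor shapes are my own choices, not specified by the patent.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(1, 2, 1)        # stand-in for the improved U-Net; 2 classes
criterion = nn.CrossEntropyLoss() # pixel-by-pixel comparison with the label image
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # stochastic gradient descent

images = torch.randn(4, 1, 48, 48)         # preprocessed training patches
labels = torch.randint(0, 2, (4, 48, 48))  # vessel / background label image

logits = model(images)            # (4, 2, 48, 48) per-pixel class scores
loss = criterion(logits, labels)  # error between prediction and label
optimizer.zero_grad()
loss.backward()                   # back-propagate the loss gradient
optimizer.step()                  # one descent step toward the loss minimum
```

In the real procedure this step repeats over the training set until the loss stops decreasing, at which point the model weights are saved.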
103. Extracting convolution characteristics and pooling characteristics of the retinal vessel image at different levels by using each residual pyramid convolution layer and the corresponding pooling layer of an encoder in the U-Net network;
in the embodiment of the invention, the encoder of the U-Net network comprises several residual pyramid convolution layers and several pooling layers connected in an interleaved manner: the output of each residual pyramid convolution layer is input into the next pooling layer, and the output of each pooling layer is input into the next residual pyramid convolution layer. The convolution features and pooling features at different levels reflect the shallow features, deep features, and fine-vessel features of the retinal vessel image.
104. The convolution features of each layer are passed to the corresponding attention mechanism layer through skip connections, and the attention features of the target region of the retinal vessel image are selected from them;
in the embodiment of the invention, during segmentation, noise such as the background of the retinal image degrades the segmentation result and lowers its accuracy, so an attention mechanism is introduced. Because the vessels in a retinal image are scattered, the original attention mechanism may err when assigning weights to the vessel region and the background; the attention mechanism is therefore improved by adding one input to the original input and summing the outputs of the two inputs before the subsequent operations, so that the target region is highlighted and the model concentrates on learning the features of the target region of the retinal vessel image. As shown in fig. 7, the attention mechanism produces attention coefficients α_i ∈ [0, 1] that identify the salient retinal vessel regions and prune the feature responses, retaining only the activations relevant to the vessel information. The output of the attention mechanism is the element-wise multiplication of the input feature map and the attention coefficients:

x̂_{i,c}^l = x_{i,c}^l · α_i^l

where the input feature map is x_i^l ∈ R^{F_l}, the output features are x̂_i^l, l indexes the network layer (which corresponds to an attention mechanism layer and a pooling layer), c indexes the channel, and i indexes the pixel position. In image segmentation with multiple semantic categories, multidimensional attention coefficients can be used, each attention gate (AG) concentrating on its own segmentation target. Additive attention is used instead of multiplicative attention to obtain the gating coefficients, giving better segmentation. The additive attention formulas are:

q_att^l = ψ^T ( σ_1( W_x^T x_i^l + W_g^T g_i + b_g ) ) + b_ψ

α_i^l = σ_2( q_att^l (x_i^l, g_i; Θ_att) )

where q_att^l is the attention measure, ψ ∈ R^{F_int × 1} is the weight parameter vector of a 1 × 1 convolution, F_int is the length of the intermediate pixel feature vector, α_i^l is the attention coefficient, σ_2 is the Sigmoid activation function, and σ_1 is the ReLU activation function. The characteristics of the AG are determined by the set of linear transformation parameters Θ_att, which comprises the linear transformation coefficient matrices W_x ∈ R^{F_l × F_int}, applied to the input features, and W_g ∈ R^{F_g × F_int}, applied to the gating signal; the vector ψ ∈ R^{F_int × 1}; and the bias terms, where b_g ∈ R^{F_int} denotes the first bias term and b_ψ ∈ R the second bias term. Here x_i and g_i are, respectively, the input feature map and the gating signal.
By analyzing the input features and the gating signal, the attention mechanism model obtains the gating coefficients, so that during segmentation it can focus on the feature information of the retinal vessels and suppress noise information such as the background. As can be seen from the improved U-Net structure, the attention mechanism layer directly cascades the pooling and convolution layers of the encoding part to the decoding part through skip connections, fusing complementary information, and a 1 × 1 convolution layer performs the linear transformation. This helps further reduce the breaks or gaps that appear during segmentation when the tiny vessels of the retinal image are insufficiently recovered, improving the accuracy and completeness of vessel segmentation.
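For illustration only (not part of the claimed invention), the additive attention gate described by the formulas above can be sketched in PyTorch. Realizing $W_x$, $W_g$ and $\psi$ as 1 × 1 convolutions follows the text; the concrete channel sizes $F_l$, $F_g$, $F_{int}$ below are assumptions:

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate following the formulas above.

    W_x, W_g and psi are realized as 1x1 convolutions; the channel
    sizes (F_l, F_g, F_int) are illustrative assumptions.
    """
    def __init__(self, f_l: int, f_g: int, f_int: int):
        super().__init__()
        self.w_x = nn.Conv2d(f_l, f_int, kernel_size=1, bias=False)  # W_x^T x_i
        self.w_g = nn.Conv2d(f_g, f_int, kernel_size=1, bias=True)   # W_g^T g_i + b_g
        self.psi = nn.Conv2d(f_int, 1, kernel_size=1, bias=True)     # psi^T(.) + b_psi
        self.relu = nn.ReLU()          # Omega_1
        self.sigmoid = nn.Sigmoid()    # Omega_2

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # q_att = psi^T( ReLU(W_x x + W_g g + b_g) ) + b_psi
        q_att = self.psi(self.relu(self.w_x(x) + self.w_g(g)))
        alpha = self.sigmoid(q_att)    # attention coefficients in (0, 1)
        return x * alpha               # element-wise gating of the skip features

# Gate a 64-channel skip feature with a 128-channel gating signal; the
# gating signal is assumed already resampled to the same spatial size.
gate = AttentionGate(f_l=64, f_g=128, f_int=32)
x = torch.randn(1, 64, 48, 48)
g = torch.randn(1, 128, 48, 48)
out = gate(x, g)
```

In the network, `x` would be a skip-connection feature map from the encoding path and `g` the coarser gating signal from the decoding path; here both are random tensors of matching spatial size.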
105. Inputting the pooling feature of the last layer into the first residual pyramid convolution layer of a decoder in the U-Net network, and outputting a sampling feature using an upsampling layer.

In the embodiment of the present invention, the pooling feature of the last layer of the encoder is used as the input of the first residual pyramid convolution layer of the decoder, realizing the direct connection between the encoder and the decoder; the full connection between the encoder and the decoder is then completed by the skip connections.

106. Splicing the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer and the corresponding upsampling layer of the decoder in the U-Net network, transmitting the spliced features to the last residual pyramid convolution layer of the decoder, and outputting the resulting feature map.

107. Performing a 1 × 1 convolution on the output feature map to obtain the final segmentation result of the retinal blood vessels.
In the embodiment of the present invention, since the encoder and the decoder in the U-Net network are approximately symmetric structures, the decoder comprises a plurality of residual pyramid convolution layers and a plurality of upsampling layers connected in an alternating manner: the output of each lower residual pyramid convolution layer is input into the upsampling layer above it, and the output of each lower upsampling layer is input into the residual pyramid convolution layer above it.
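As an illustrative sketch only, a residual pyramid convolution layer as described here (and in claims 5 and 6) might be organized as follows; the kernel sizes (3, 5, 7), the even channel split across branches, and the 1 × 1 shortcut are assumptions not fixed by the text:

```python
import torch
import torch.nn as nn

class PyramidConv2d(nn.Module):
    """Parallel convolutions with several kernel sizes, concatenated.

    The kernel sizes and even channel split are illustrative; the text
    only states that kernels of multiple sizes (and depths) are used.
    """
    def __init__(self, in_ch: int, out_ch: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        assert out_ch % len(kernel_sizes) == 0
        branch_ch = out_ch // len(kernel_sizes)
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, k, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        # "Same" padding keeps every branch at the input spatial size.
        return torch.cat([b(x) for b in self.branches], dim=1)

class ResidualPyramidBlock(nn.Module):
    """BN -> conv -> ReLU units around a BN -> pyramid-conv -> ReLU unit,
    with a residual shortcut; loosely follows the structure of claim 5."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.BatchNorm2d(out_ch), PyramidConv2d(out_ch, out_ch), nn.ReLU(),
            nn.BatchNorm2d(out_ch), nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
        )
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)  # match channels for the add

    def forward(self, x):
        return self.body(x) + self.shortcut(x)

block = ResidualPyramidBlock(32, 48)
y = block(torch.randn(1, 32, 64, 64))
```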
Fig. 8 is a structural diagram of a retinal blood vessel image segmentation apparatus according to an embodiment of the present invention, and as shown in fig. 8, the image segmentation apparatus 200 includes:
an image acquisition module 201, configured to acquire a retinal blood vessel image;
an image processing module 202, configured to perform preprocessing on the obtained retinal blood vessel image;
the image input module 203 is used for inputting the preprocessed retinal blood vessel image into the trained U-Net network;
the encoder module 204 is used for extracting convolution characteristics and pooling characteristics of the retinal blood vessel image at different levels by utilizing each residual pyramid convolution layer and the corresponding pooling layer of the encoder in the U-Net network;
a skip connection module 205, configured to transfer the convolution features of each layer to the corresponding attention mechanism layer by means of skip connections;
an attention mechanism module 206, configured to select an attention feature of the target region of the retinal blood vessel image from the convolution features of each layer;
the feature connection module 207 is used for inputting the pooling feature of the last layer into the first residual pyramid convolution layer of a decoder in the U-Net network and outputting a sampling feature using an upsampling layer;
the decoder module 208 is configured to splice the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer and the corresponding upsampling layer of the decoder in the U-Net network, and to transmit the spliced features to the last residual pyramid convolution layer of the decoder;
and the image output module 209 is configured to perform a 1 × 1 convolution on the feature map of the last residual pyramid convolution layer to obtain the segmentation result of the retinal blood vessels.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: obtaining a retinal blood vessel image and preprocessing it; inputting the preprocessed retinal blood vessel image into a trained U-Net network; extracting convolution features and pooling features of the retinal blood vessel image at different levels using each residual pyramid convolution layer of the encoder in the U-Net network and the corresponding pooling layer; transmitting the convolution features of each layer to the corresponding attention mechanism layer through skip connections, and selecting the attention features of the target region of the retinal blood vessel image from the convolution features of each layer; inputting the pooling feature of the last layer into the first residual pyramid convolution layer of the decoder in the U-Net network, and outputting a sampling feature using an upsampling layer; and splicing the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer of the decoder and the corresponding upsampling layer, and transmitting the spliced features to the last residual pyramid convolution layer of the decoder to obtain the segmentation result of the retinal blood vessels.
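The sequence of steps executed by the processor can be wired together, for illustration, in a toy two-level model. Plain 3 × 3 convolutions stand in for the residual pyramid convolution layers and a single 1 × 1 convolution stands in for the attention gate, so this is a sketch of the data flow, not the claimed architecture:

```python
import torch
import torch.nn as nn

class MiniAttentionUNet(nn.Module):
    """Toy 2-level encoder-decoder wiring the steps listed above:
    conv -> pool going down, an attention-gated skip, then upsample ->
    concatenate -> conv going up, and a final 1x1 convolution. All
    channel sizes are illustrative stand-ins."""
    def __init__(self):
        super().__init__()
        conv = lambda ci, co: nn.Sequential(
            nn.Conv2d(ci, co, 3, padding=1), nn.ReLU())
        self.enc1, self.enc2 = conv(1, 16), conv(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.att = nn.Conv2d(16, 1, 1)       # stand-in for the attention gate
        self.dec = conv(32, 16)
        self.head = nn.Conv2d(16, 1, 1)      # 1x1 conv -> vessel probability map

    def forward(self, x):
        s1 = self.enc1(x)                    # encoder convolution features
        b = self.enc2(self.pool(s1))         # pooled bottleneck features
        g = self.up(b)                       # upsampling layer output
        a = torch.sigmoid(self.att(s1)) * s1 # attention-gated skip features
        d = self.dec(torch.cat([g, a], 1))   # splice and convolve
        return torch.sigmoid(self.head(d))   # final segmentation map

net = MiniAttentionUNet()
seg = net(torch.randn(1, 1, 32, 32))
```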
In the embodiment of the present invention, the images in the databases selected for training are RGB images, whose pixels are mixtures of the three colors red, green, and blue. Since the contrast between retinal blood vessels and the background is low, fundus image enhancement is required to highlight this contrast and capture the features of fine vessels. As can be seen from Fig. 9, vessels in the green channel show high contrast with the background and low noise interference, while vessels in the red and blue channels show low contrast with the background and high noise interference. Although the fundus image as a whole appears red, calculation and analysis of the RGB channels show that after grayscale conversion the difference in pixel values between vessels and background in the red channel is small, which is unfavorable for vessel segmentation; in addition, the highlighted optic disc area causes the loss of part of the vessel information. In contrast, retinal vessels in the green channel differ more strongly in pixel value from the background, which benefits vessel segmentation. Therefore, the green channel of the fundus image is used for processing.
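A minimal sketch of this channel selection step: extract the green channel and apply a simple linear contrast stretch. The patent does not fix the enhancement method in this passage, so the stretch is an assumption standing in for whatever enhancement (e.g. histogram equalization) is actually used:

```python
import numpy as np

def preprocess_fundus(rgb: np.ndarray) -> np.ndarray:
    """Extract the green channel of an H x W x 3 RGB fundus image and
    stretch its contrast to [0, 1]. Illustrative sketch only."""
    assert rgb.ndim == 3 and rgb.shape[2] == 3, "expect H x W x 3 RGB"
    green = rgb[:, :, 1].astype(np.float64)  # channel index 1 = green
    lo, hi = green.min(), green.max()
    if hi == lo:                             # flat image: nothing to stretch
        return np.zeros_like(green)
    return (green - lo) / (hi - lo)          # linear contrast stretch

# Toy 2x2 "fundus image": reddish pixels with varying green values.
img = np.array([[[200, 10, 30], [190, 60, 25]],
                [[210, 110, 40], [180, 160, 35]]], dtype=np.uint8)
out = preprocess_fundus(img)
```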
Fig. 10 shows the results of retinal blood vessel image segmentation using the network model proposed by the present invention: column (a) shows the original images, column (b) the ground-truth segmentation maps, column (c) the U-Net segmentation images, and column (d) the experimental results obtained with the proposed network model. The U-Net segmentation results exhibit vessel breaks and incomplete vessel segmentation, with poor performance at the ends of small vessels; in addition, they contain more noise information. Compared with the U-Net algorithm, the proposed network model is greatly improved: it effectively suppresses the influence of noise information, accurately segments more detailed vessel information, and improves the segmentation of tiny vessels.
As shown in Fig. 11, (a) is the original image, and (b)-(e) are local detail views of the original image, the ground truth, the U-Net result, and the result of the present invention, respectively. It can be seen intuitively that near vessel regions the U-Net segmentation of small vessels is blurred, vessels are lost or broken at vessel crossings, and interference information appears. The experimental results show that introducing the RPC module and the attention mechanism improves the original U-Net segmentation of the different vessel regions and yields better segmentation results.
The final results on the DRIVE dataset are shown in Table 1.

[Table 1: quantitative comparison of Accuracy and Sensitivity on the DRIVE dataset; the original table image is not reproduced here.]
According to the comparison in Table 1, the segmentation results of the present invention show great improvements in Accuracy and Sensitivity over U-Net, Recurrent U-Net, Residual U-Net, RCBAM-Net, R2U-Net, and LadderNet; the overall segmentation effect is better and the results are more accurate.
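For reference, the two metrics reported in Table 1 can be computed from binary masks as follows. These are the standard per-pixel definitions (Acc = (TP + TN) / N, Sen = TP / (TP + FN)), assumed rather than quoted from the text:

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray):
    """Accuracy and sensitivity for binary vessel masks (illustrative)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    tp = np.sum(pred & gt)       # vessel pixels correctly detected
    tn = np.sum(~pred & ~gt)     # background pixels correctly rejected
    fn = np.sum(~pred & gt)      # vessel pixels missed
    accuracy = (tp + tn) / pred.size
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity

# Toy example: 4 pixels, 3 classified correctly, one vessel pixel missed.
pred = np.array([1, 0, 0, 1])
gt   = np.array([1, 1, 0, 1])
acc, sen = segmentation_metrics(pred, gt)
```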
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least a portion of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In the description of the present invention, it is to be understood that the terms "coaxial", "bottom", "one end", "top", "middle", "other end", "upper", "one side", "top", "inner", "outer", "front", "center", "both ends", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "disposed," "connected," "fixed," "rotated," and the like are to be construed broadly, e.g., as meaning fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; the terms may be directly connected or indirectly connected through an intermediate, and may be communication between two elements or interaction relationship between two elements, unless otherwise specifically limited, and the specific meaning of the terms in the present invention will be understood by those skilled in the art according to specific situations.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A retinal blood vessel image segmentation method, characterized in that the method comprises:
obtaining a retinal blood vessel image, and preprocessing the retinal blood vessel image;
inputting the preprocessed retinal blood vessel image into a trained U-Net network;
extracting convolution characteristics and pooling characteristics of the retinal vessel image at different levels by using each residual pyramid convolution layer and the corresponding pooling layer of an encoder in the U-Net network;
the convolution characteristics of each layer are transmitted to the corresponding attention mechanism layer in a jumping connection mode, and the attention characteristics of the target area of the retinal blood vessel image are selected from the convolution characteristics of each layer;
inputting the pooling feature of the last layer into the first residual pyramid convolution layer of a decoder in the U-Net network, and outputting a sampling feature using an upsampling layer;
splicing the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer and the corresponding upsampling layer of the decoder in the U-Net network, transmitting the spliced features to the last residual pyramid convolution layer of the decoder, and outputting the resulting feature map; and
performing a 1 × 1 convolution on the output feature map to obtain the final segmentation result of the retinal blood vessels.
2. The retinal blood vessel image segmentation method according to claim 1, wherein the preprocessing the retinal blood vessel image includes denoising and contrast enhancement processing the obtained retinal blood vessel image, and processing the retinal blood vessel image by using different color channels to obtain a retinal blood vessel image with the highest blood vessel and background contrast.
3. The retinal blood vessel image segmentation method according to claim 1, wherein the preprocessing the retinal blood vessel image further comprises performing a data expansion operation on the acquired retinal blood vessel image, and performing random cropping combination of the same size on each retinal blood vessel image, thereby obtaining an expanded retinal blood vessel image.
4. The retinal blood vessel image segmentation method according to claim 1, wherein the training process of the U-Net network includes comparing the segmentation result of the retinal blood vessel image with the corresponding label image pixel by pixel to obtain an error image; and calculating a target loss function of the U-Net network according to the error image, calculating a gradient value of the target loss function by adopting back propagation, determining the minimum value of the target loss function by adopting a random descent algorithm, and finishing the training of the U-Net network model when the target loss function is minimum.
5. The retinal vessel image segmentation method according to claim 1, wherein the residual pyramid convolution layer comprises two pyramid convolution modules connected in series, and the output of a first pyramid convolution module is connected with the output of a second pyramid convolution module through a residual connection layer; each pyramid convolution module comprises two first units and a second unit, the two first units are connected through the second unit, and each first unit comprises a batch normalization layer, a convolution layer and an activation function layer; the second unit comprises a batch normalization layer, a pyramid convolution layer and an activation function layer; each of the pyramid convolution layers includes convolution kernels of a plurality of different sizes.
6. The retinal vessel image segmentation method according to claim 5, wherein each convolution kernel in the pyramid convolution layer has a different depth.
7. The retinal blood vessel image segmentation method according to claim 1, wherein the improved attention mechanism formula adopted by the attention mechanism layer is expressed as:

$$\hat{x}_{i,c}^{l} = x_{i,c}^{l} \cdot \alpha_{i}^{l,c}$$

wherein $\hat{x}_{i,c}^{l}$ denotes the attention feature output by the c-th attention mechanism layer at pixel position i of the l-th channel; $x_{i,c}^{l}$ denotes the pooling feature output by the c-th pooling layer at pixel position i of the l-th channel; $\alpha_{i}^{l,c}$ is the attention coefficient of the c-th attention mechanism layer, with

$$\alpha_{i}^{l,c} = \Omega_{2}\left(q_{att}^{l,c}\left(x_{i}^{l}, g_{i}; \Theta_{att}\right)\right),$$

where $\Omega_{2}$ denotes the Sigmoid activation function; $q_{att}^{l,c}$ denotes the attention mechanism of the l-th channel, with

$$q_{att}^{l,c} = \psi^{T}\left(\Omega_{1}\left(W_{x}^{T} x_{i}^{l} + W_{g}^{T} g_{i} + b_{g}\right)\right) + b_{\psi},$$

where $x_{i}^{l}$ denotes the input feature map at pixel position i of the l-th channel; $g_{i}$ denotes the gating signal at pixel position i; $b_{g}$ denotes the first bias term; $b_{\psi}$ denotes the second bias term; $\psi^{T}$ denotes the weight parameter vector of the 1 × 1 convolution; and $\Omega_{1}$ is the ReLU activation function; the attention feature is obtained through the set of linear transformation parameters $\Theta_{att}$, which comprises $W_{x} \in \mathbb{R}^{F_{l} \times F_{int}}$ and $W_{g} \in \mathbb{R}^{F_{g} \times F_{int}}$, wherein $W_{x}^{T}$ denotes the transpose of the first linear transformation coefficient matrix, applied to the input features, and $W_{g}^{T}$ denotes the transpose of the second linear transformation coefficient matrix, applied to the gating signal.
8. A retinal blood vessel image segmentation apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring a retinal blood vessel image;
the image processing module is used for preprocessing the acquired retinal blood vessel image;
the image input module is used for inputting the preprocessed retinal blood vessel images into the trained U-Net network;
the encoder module is used for extracting convolution characteristics and pooling characteristics of the retinal vessel image at different levels by utilizing each residual pyramid convolution layer and the corresponding pooling layer of the encoder in the U-Net network;
the jump connection module is used for transmitting the convolution characteristics of each layer to the corresponding attention mechanism layer in a jump connection mode;
the attention mechanism module is used for selecting attention characteristics of a target area of the retinal blood vessel image from the convolution characteristics of each layer;
the feature connection module is used for inputting the pooling feature of the last layer into the first residual pyramid convolution layer of a decoder in the U-Net network and outputting a sampling feature using an upsampling layer;
the decoder module is used for splicing the sampling features of each upsampling layer with the corresponding attention features using each residual pyramid convolution layer and the corresponding upsampling layer of the decoder in the U-Net network, and transmitting the spliced features to the last residual pyramid convolution layer of the decoder;
and the image output module is used for performing a 1 × 1 convolution on the feature map of the last residual pyramid convolution layer to obtain the segmentation result of the retinal blood vessels.
9. A computer device comprising a memory storing a computer program and a processor implementing the steps of the method according to any one of claims 1 to 7 when the computer program is executed.
CN202111490173.2A 2021-12-08 2021-12-08 Retinal blood vessel image segmentation method and device and computer equipment Pending CN114283158A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111490173.2A CN114283158A (en) 2021-12-08 2021-12-08 Retinal blood vessel image segmentation method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111490173.2A CN114283158A (en) 2021-12-08 2021-12-08 Retinal blood vessel image segmentation method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN114283158A true CN114283158A (en) 2022-04-05

Family

ID=80871238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111490173.2A Pending CN114283158A (en) 2021-12-08 2021-12-08 Retinal blood vessel image segmentation method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN114283158A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663421A (en) * 2022-04-08 2022-06-24 皖南医学院第一附属医院(皖南医学院弋矶山医院) Retina image intelligent analysis system and method based on information migration and ordered classification
CN115272369A (en) * 2022-07-29 2022-11-01 苏州大学 Dynamic aggregation converter network and retinal vessel segmentation method
CN116188492A (en) * 2023-02-21 2023-05-30 北京长木谷医疗科技有限公司 Hip joint segmentation method, device, electronic equipment and computer readable storage medium
CN116188492B (en) * 2023-02-21 2024-04-26 北京长木谷医疗科技股份有限公司 Hip joint segmentation method, device, electronic equipment and computer readable storage medium
CN116580194A (en) * 2023-05-04 2023-08-11 山东省人工智能研究院 Blood vessel segmentation method of soft attention network fused with geometric information
CN116580194B (en) * 2023-05-04 2024-02-06 山东省人工智能研究院 Blood vessel segmentation method of soft attention network fused with geometric information
CN116403212A (en) * 2023-05-16 2023-07-07 西安石油大学 Method for identifying small particles in pixels of metallographic image based on improved U-net network
CN116403212B (en) * 2023-05-16 2024-02-02 西安石油大学 Method for identifying small particles in pixels of metallographic image based on improved U-net network
CN116645287A (en) * 2023-05-22 2023-08-25 北京科技大学 Diffusion model-based image deblurring method
CN116645287B (en) * 2023-05-22 2024-03-29 北京科技大学 Diffusion model-based image deblurring method
CN117541883A (en) * 2024-01-09 2024-02-09 四川见山科技有限责任公司 Image generation model training, image generation method, system and electronic equipment
CN117541883B (en) * 2024-01-09 2024-04-09 四川见山科技有限责任公司 Image generation model training, image generation method, system and electronic equipment

Similar Documents

Publication Publication Date Title
CN114283158A (en) Retinal blood vessel image segmentation method and device and computer equipment
US20200311871A1 (en) Image reconstruction method and device
CN107977932B (en) Face image super-resolution reconstruction method based on discriminable attribute constraint generation countermeasure network
CN110969124B (en) Two-dimensional human body posture estimation method and system based on lightweight multi-branch network
CN112819910B (en) Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network
CN110751636B (en) Fundus image retinal arteriosclerosis detection method based on improved coding and decoding network
CN108764342B (en) Semantic segmentation method for optic discs and optic cups in fundus image
CN113554665A (en) Blood vessel segmentation method and device
CN116309648A (en) Medical image segmentation model construction method based on multi-attention fusion
CN110674824A (en) Finger vein segmentation method and device based on R2U-Net and storage medium
CN113888412B (en) Image super-resolution reconstruction method for diabetic retinopathy classification
CN111161271A (en) Ultrasonic image segmentation method
CN112288749A (en) Skull image segmentation method based on depth iterative fusion depth learning model
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
CN114511502A (en) Gastrointestinal endoscope image polyp detection system based on artificial intelligence, terminal and storage medium
CN115661459A (en) 2D mean teacher model using difference information
CN117274759A (en) Infrared and visible light image fusion system based on distillation-fusion-semantic joint driving
CN113781489B (en) Polyp image semantic segmentation method and device
CN113487530B (en) Infrared and visible light fusion imaging method based on deep learning
CN113538363A (en) Lung medical image segmentation method and device based on improved U-Net
CN117392496A (en) Target detection method and system based on infrared and visible light image fusion
CN116091793A (en) Light field significance detection method based on optical flow fusion
CN115578262A (en) Polarization image super-resolution reconstruction method based on AFAN model
CN115018855A (en) Optic disk and optic cup segmentation method based on blood vessel characteristic guidance
CN116310391B (en) Identification method for tea diseases

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination