CN117315258A - Lightweight retinal vessel segmentation method based on graph convolution network and partial convolution


Info

Publication number
CN117315258A
Authority
CN
China
Prior art keywords: convolution, feature, network, image, layer
Prior art date
Legal status
Pending
Application number
CN202311404768.0A
Other languages
Chinese (zh)
Inventor
崔少国
张乐迁
万皓明
王海祥
Current Assignee
Chongqing Normal University
Original Assignee
Chongqing Normal University
Priority date
Filing date
Publication date
Application filed by Chongqing Normal University
Priority to CN202311404768.0A
Publication of CN117315258A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/18 Eye characteristics, e.g. of the iris


Abstract

The invention provides a lightweight retinal blood vessel segmentation method based on a graph convolution network and partial convolution, which comprises the steps of constructing a lightweight retinal blood vessel segmentation network model based on the graph convolution network and partial convolution, training the retinal blood vessel segmentation network model and optimizing its parameters, and automatically segmenting the retinal blood vessel structure, wherein the lightweight retinal blood vessel segmentation network model comprises a feature encoder, a multi-scale feature fusion device, a feature decoder and a label predictor. The method builds a symmetric encoder-decoder deep learning model and adopts partial convolution in place of conventional convolution to reduce the computational complexity of the model; a K-nearest-neighbor algorithm converts the feature map into a graph structure, after which graph convolution extracts the global features of the image; a multi-scale feature fusion module is introduced to better fuse encoder and decoder features; and a novel up-sampling scheme, in which an adjacent block expansion layer replaces the bilinear interpolation algorithm, improves the generalization of the model and realizes precise segmentation of retinal blood vessels.

Description

Lightweight retinal vessel segmentation method based on graph convolution network and partial convolution
Technical Field
The invention relates to the technical field of medical image semantic segmentation, in particular to a lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution.
Background
The fundus is the only part of the human body where arteries, veins and capillaries can be directly observed. By observing the geometric form of retinal blood vessels, such as vessel angle, diameter and topological structure, diabetes, hypertension and various eye diseases can be effectively diagnosed. Among these diseases, diabetes can lead to retinopathy and neovascularization, and even to blindness of the patient; hypertension can cause abnormal vascular proliferation, with symptoms such as hemangiomas and cotton wool spots. In addition, diffuse or patchy retinal atrophy plaques may appear on the fundus of highly myopic populations. Therefore, in clinical studies, analysis of the visual information of retinal blood vessels is important in helping doctors diagnose diabetes, hypertension and various eye diseases, and an excellent retinal blood vessel segmentation method can improve the diagnostic efficiency and accuracy of doctors.
However, fundus retinal blood vessels have a complex tree-like topology, and very fine vessels are easily broken in segmentation results. The brightness of the optic disc area is higher than that of other areas and its contrast is lower, which increases the difficulty of segmenting retinal blood vessels. In addition, underexposure or overexposure can reduce image contrast, resulting in unclear retinal vessel boundaries. In summary, due to the limitations of retinal blood vessel imaging and interference factors such as the light source during imaging, vessel contrast is reduced, so that vessel information is lost or the topological structure of the vessels is directly affected, which greatly increases the difficulty of retinal blood vessel segmentation.
With the rapid development of convolutional neural networks in the field of deep learning and the advent of various network optimization techniques, more and more researchers at home and abroad focus on realizing automatic segmentation of retinal blood vessels and auxiliary diagnosis of disease conditions. In recent years, automatic retinal blood vessel segmentation algorithms have mostly been based on an encoder-decoder structure, using a convolutional neural network to extract image vessel features and finally obtaining a good segmentation effect. However, the inventors of the present application have found that such methods are limited by the size of the convolution kernel and cannot model global information, and that spatial information is lost during downsampling, so that phenomena such as unclear vessel boundaries or noise appear in the segmentation results.
Disclosure of Invention
Aiming at the technical problem that existing retinal vessel segmentation methods based on the encoder-decoder structure cannot resolve vessel fracture and unclear boundaries in the segmentation map, the invention provides a lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution.
In order to solve the technical problems, the invention adopts the following technical scheme:
a lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution comprises the following steps:
s1, constructing a lightweight retinal vascular segmentation network model based on a graph convolution network and partial convolution:
s11, a lightweight retinal vessel segmentation network based on a graph convolution network and partial convolution comprises a feature encoder, a multi-scale feature fusion device, a feature decoder and a label predictor, wherein the feature encoder captures local features of an image through partial convolution and global features of the image through graph convolution, performs downsampling through a maximum pooling operation, and increases the semantic information contained in the image while reducing its resolution; the multi-scale feature fusion device performs weighted fusion of the local and global features, suppressing noise information in the feature layers and focusing on vessel feature information; the feature decoder gradually restores the resolution of the feature map to the size of the original image through adjacent block expansion layers and convolution layers; the label predictor applies convolution and an activation function to the multi-channel feature map output by the feature decoder to obtain the final segmentation probability map;
s12, the feature encoder comprises five feature encoding layer groups, and the feature decoder comprises three feature decoding layer groups and four adjacent block expansion layers; each feature encoding layer group and each feature decoding layer group consists of two partial convolution layers and one conventional convolution layer arranged in sequence; an adjacent block expansion layer is arranged between the stages of the feature decoder and replaces bilinear interpolation for the up-sampling operation on the feature map, gradually restoring the feature map to the original image size; the multi-scale feature fusion device comprises two pooling layer groups, two convolution layers and a softmax weight conversion layer, each pooling layer group consisting of a maximum pooling layer and an average pooling layer, with a convolution layer arranged after each pooling layer group; the label predictor consists of a class prediction layer and a softmax probability conversion layer, the softmax probability conversion layer converting the class prediction scores into a probability distribution;
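As an illustrative sketch only (not part of the claimed subject matter), the multi-scale feature fusion device of step S12 could be rendered in PyTorch roughly as follows. The use of global pooling in the pooling groups, the per-branch scalar fusion weights, and the class name `MFFB` are assumptions; the text above fixes only the layer types (max/average pooling groups, two 1×1 convolutions, softmax weight conversion):

```python
import torch
import torch.nn as nn

class MFFB(nn.Module):
    """Sketch of the multi-scale feature fusion device: each branch is
    summarized by a pooling group (max + average pooling) followed by a
    1x1 convolution, and a softmax converts the two branch scores into
    fusion weights for the local (partial-convolution) and global
    (graph-convolution) features."""

    def __init__(self, channels: int):
        super().__init__()
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.score_local = nn.Conv2d(2 * channels, 1, kernel_size=1, stride=1)
        self.score_global = nn.Conv2d(2 * channels, 1, kernel_size=1, stride=1)

    def forward(self, f_local: torch.Tensor, f_global: torch.Tensor) -> torch.Tensor:
        # Pooling group: concatenate max- and average-pooled summaries.
        s_loc = self.score_local(torch.cat([self.max_pool(f_local),
                                            self.avg_pool(f_local)], dim=1))
        s_glo = self.score_global(torch.cat([self.max_pool(f_global),
                                             self.avg_pool(f_global)], dim=1))
        # Softmax weight conversion layer across the two branches.
        w = torch.softmax(torch.cat([s_loc, s_glo], dim=1), dim=1)
        return w[:, 0:1] * f_local + w[:, 1:2] * f_global

fused = MFFB(64)(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```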
s2, training a retina blood vessel segmentation network model and optimizing parameters:
s21, initializing network parameters: initializing the model parameters of the lightweight retinal vascular segmentation network constructed in the step S1 and based on the graph convolution network and partial convolution by adopting an Xavier method;
s22, data set preparation: separating the original retinal blood vessel images, which are divided into a training set and a validation set and provided with pixel-level segmentation labels, into RGB three-channel feature maps and weighting them into a single channel; then using histogram equalization to make the histogram of the retinal image approximately uniformly distributed, enhancing the contrast of the image; then performing a Gamma transform to more effectively retain the brightness information of the image; and finally performing data enhancement on the training image data samples in the training set using an online data enhancement technique, thereby completing the preprocessing of the acquired retinal blood vessel images;
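For concreteness, the preprocessing chain of step S22 (channel weighting, histogram equalization, Gamma transform) can be sketched with OpenCV as below; the channel weights are taken from the detailed embodiment (0.299, 0.587, 0.114), while the gamma value and the RGB channel order are assumptions:

```python
import numpy as np
import cv2

def preprocess_fundus(rgb: np.ndarray, gamma: float = 1.2) -> np.ndarray:
    """Weighted channel merge -> histogram equalization -> Gamma transform."""
    # Merge the R, G, B channels into one grayscale image.
    gray = (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
            + 0.114 * rgb[..., 2]).astype(np.uint8)
    # Histogram equalization spreads the intensity distribution,
    # enhancing the contrast between vessels and background.
    eq = cv2.equalizeHist(gray)
    # Gamma transform via a lookup table to retain brightness information.
    lut = ((np.arange(256) / 255.0) ** gamma * 255).astype(np.uint8)
    return cv2.LUT(eq, lut)

img = (np.random.rand(584, 768, 3) * 255).astype(np.uint8)  # stand-in image
out = preprocess_fundus(img)
```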
s23, training a data set: training on the preprocessed training set image data with pixel-level segmentation labels by adopting a 5-fold cross-validation method;
s24, inputting the color fundus image of the same retinal blood vessel section into a network, and generating a retinal blood vessel segmentation result through forward calculation of the network;
s25, adopting a combination of the classification cross entropy loss function and the set similarity loss function as a segmentation network target optimization function, and defining the following steps:
L_1(θ') = −(1/S) Σ_i Σ_j Y′_ij · log(Y_ij)

L_2(θ') = 1 − Jaccard(Y′_ij, Y_ij)

L(θ') = λ_1·L_1(θ') + λ_2·L_2(θ')

wherein θ' denotes the segmentation network parameters; L_1(θ') is the cross-entropy loss function, with the sums running over the S image pixels (index i) and the C pixel classes (index j); L_2(θ') is the set similarity loss function; Y′_ij is the segmentation label and Y_ij is the predicted label; Jaccard(Y′_ij, Y_ij) is the similarity between the segmentation labels and the prediction results, i.e. the ratio of the intersection size to the union size of the two sets; L(θ') is the target optimization function of the segmentation network; λ_1 is the weight factor of the cross-entropy loss and λ_2 the weight factor of the set similarity loss in the weighted sum;
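A hedged sketch of this combined objective, assuming a differentiable (soft) Jaccard over the vessel-class probabilities and the λ_1 = λ_2 = 0.5 weighting mentioned later in the detailed description:

```python
import torch
import torch.nn.functional as F

def segmentation_loss(logits, target, lam1=0.5, lam2=0.5, eps=1e-6):
    """L(theta') = lam1 * L1 (cross-entropy) + lam2 * L2 (1 - Jaccard).
    logits: (N, 2, H, W) class scores; target: (N, H, W) integer labels."""
    ce = F.cross_entropy(logits, target)              # L1: pixel-wise CE
    prob = torch.softmax(logits, dim=1)[:, 1]         # vessel probability
    tgt = (target == 1).float()
    inter = (prob * tgt).sum()
    union = prob.sum() + tgt.sum() - inter
    jaccard = (inter + eps) / (union + eps)           # soft intersection/union
    return lam1 * ce + lam2 * (1.0 - jaccard)         # L2 = 1 - Jaccard

loss = segmentation_loss(torch.randn(2, 2, 64, 64),
                         torch.randint(0, 2, (2, 64, 64)))
```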
s26, optimizing the objective function L(θ') by adopting an adaptive moment estimation gradient descent algorithm, and updating the retinal vessel segmentation network model parameters by error back-propagation to obtain the optimal network model parameters θ_best;
S3, automatic semantic segmentation of the retinal vascular structure:
s31, using the optimal network model parameters θ_best obtained by learning to establish an automatic semantic segmentation network for the lightweight retinal vessel color fundus image based on the graph convolution network and partial convolution;
s32, firstly carrying out online data enhancement on the color fundus image, then weighting and merging the RGB three-channel image into a single-channel image, dividing the single-channel image into image blocks of size 64×64, and inputting the image blocks into the feature encoder to extract feature maps of different scales;
s33, obtaining a feature map with the same size as the original map through a feature decoder, then sending the feature map to a category prediction layer in a label predictor to obtain pixel category prediction scores on two categories, namely a blood vessel area and a background area, and finally converting the prediction scores into probability distribution by using a softmax probability conversion layer;
s34, taking the index of the component where each pixel's maximum probability lies as the pixel class label, and obtaining the final retinal blood vessel semantic segmentation binary map.
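Steps S33-S34 amount to a small post-processing step; a sketch in PyTorch, assuming the two-channel score map described in S12:

```python
import torch

def to_binary_map(scores: torch.Tensor) -> torch.Tensor:
    """scores: (N, 2, H, W) class prediction scores from the label predictor.
    Softmax turns scores into probabilities (S33); the index of the maximum
    probability per pixel becomes the class label, 1 = vessel (S34)."""
    probs = torch.softmax(scores, dim=1)
    return probs.argmax(dim=1)

binary_map = to_binary_map(torch.randn(1, 2, 64, 64))   # (1, 64, 64), values 0/1
```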
Further, in step S12, the convolution kernel size of the two partial convolution layers in each feature encoding layer group and feature decoding layer group is 3×3 with a step size of 1, and the convolution kernel size of the one conventional convolution layer is 1×1 with a step size of 1; the number of convolution kernels of the former partial convolution layer is only 1/4 of the number of feature channels output by the preceding feature coding/decoding stage, that is, conventional convolution for spatial feature extraction is applied only to the first 1/4 of the input feature channels while the remaining channels are left unchanged, and the subsequent 1×1 convolution layer serves to promote information exchange between the channels that passed through the convolution and those that did not; the numbers of convolution kernels of the five feature encoding layer groups are 64, 16, 128, 32, 256, 64, 512, 128 and 1024 in sequence; the numbers of convolution kernels of the three feature decoding layer groups are 128, 512, 64, 256, 32 and 128 in sequence; the convolution kernel size of the two convolution layers in the multi-scale feature fusion device is 1×1 with a step size of 1; the numbers of convolution kernels of the label predictor are 16 and 2 in sequence.
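For concreteness, one such layer group might look like the following sketch; treating both partial convolutions as channel-preserving and letting the 1×1 convolution change the channel count is an interpretation of the kernel-count list above, not something the text states explicitly:

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: a regular 3x3 convolution is applied to the
    first 1/4 of the input channels only; the remaining channels pass
    through unchanged (channel counts divisible by 4 are assumed)."""

    def __init__(self, channels: int, ratio: int = 4):
        super().__init__()
        self.n_conv = channels // ratio
        self.conv = nn.Conv2d(self.n_conv, self.n_conv, 3, stride=1, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        head, tail = x[:, :self.n_conv], x[:, self.n_conv:]
        return torch.cat([self.conv(head), tail], dim=1)

class EncodingLayerGroup(nn.Sequential):
    """Two partial convolutions followed by a 1x1 pointwise convolution that
    exchanges information between convolved and untouched channels."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__(PConv(in_ch), PConv(in_ch),
                         nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=1))

y = EncodingLayerGroup(64, 128)(torch.randn(1, 64, 32, 32))  # -> (1, 128, 32, 32)
```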
Further, in step S22, the training image data samples in the training set are increased to 10 times the initial number of training image data samples by online data enhancement techniques of random horizontal and vertical flipping and rotation by 45°, 90°, 135°, 180°, 225°, 270° and 315°.
Further, the network forward computation in step S24 includes convolution operation, graph convolution operation, batch normalization, nonlinear excitation and probability value conversion.
Further, in the convolution operation, the output feature map Z_i corresponding to any one convolution kernel is calculated as follows:

Z_i = f( Σ_{r=1}^{k} W_ir ⊗ X_r + b_i )

wherein f represents the nonlinear excitation function, b_i represents the offset corresponding to the i-th convolution kernel, r represents the index number of the input channel, k represents the number of input channels, W_ir represents the r-th channel weight matrix of the i-th convolution kernel, ⊗ denotes the convolution operation, and X_r represents the r-th input channel image.
Further, in the graph convolution operation, graph convolution is defined using the Laplacian matrix: for an undirected graph G = (V, E), A represents the adjacency matrix and D represents the diagonal degree matrix; L = I − D^(−1/2)·A·D^(−1/2) represents the normalized Laplacian matrix of G, which can be decomposed as L = U·Λ·U^T, where U is the eigenvector matrix and Λ = diag[λ_1, ..., λ_n] is the eigenvalue matrix; the graph convolution network introduces the first-order approximation of ChebNet and iteratively aggregates information from neighbor nodes, the forward propagation process for node v_i being:

h_i^(l+1) = σ( Σ_{v_j ∈ N(v_i)} Â_ij·h_j^(l)·W^(l) )

wherein σ(·) is a nonlinear activation function, Â represents the renormalized adjacency matrix A, W^(l) represents the learnable transformation matrix of the l-th layer, v_j represents a neighbor node of node v_i, and H^(l) represents the node feature matrix of the l-th layer.
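A dense-matrix toy sketch of one such propagation step, with self-loops added and symmetric renormalization to form Â (sparse adjacency and batching are omitted for clarity):

```python
import torch

def gcn_layer(H, A, W, act=torch.relu):
    """One first-order GCN step: H_next = act(A_hat @ H @ W).
    H: (n, d) node features, A: (n, n) adjacency, W: (d, d') weights."""
    A_tilde = A + torch.eye(A.size(0))          # add self-loops
    d = A_tilde.sum(dim=1)                      # node degrees (all >= 1)
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # renormalized adjacency
    return act(A_hat @ H @ W)

A = (torch.rand(64, 64) < 0.05).float()
A = torch.maximum(A, A.t())                     # make the graph undirected
H_next = gcn_layer(torch.randn(64, 16), A, torch.randn(16, 16))
```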
Further, the nonlinear excitation adopts the rectified linear unit ReLU as the activation function of the feature map Z_i, performing a nonlinear transformation on each value of the feature map Z_i; the rectified linear unit ReLU is defined as follows:

f(x) = max(0, x)

where max takes the maximum value and x is an input value.
Further, the probability value conversion converts the class prediction scores into a probability distribution using the Softmax function, defined as follows:

Y_j = exp(O_j) / Σ_{i=1}^{K} exp(O_i)

wherein Y_j is the probability that a pixel belongs to class j, O_j and O_i are the prediction scores finally output by the segmentation network for the pixel on the j-th and i-th classes, and K denotes the number of classes.
Further, the specific optimization procedure in step S26 is as follows:

g_t = ∇_θ L(θ_{t−1})

m_t = β_1·m_{t−1} + (1 − β_1)·g_t

v_t = β_2·v_{t−1} + (1 − β_2)·g_t²

m̂_t = m_t / (1 − β_1^t)

v̂_t = v_t / (1 − β_2^t)

θ_t = θ_{t−1} − η·m̂_t / (√v̂_t + ε)

where t represents the number of update steps; θ is the network parameter to be updated, corresponding to θ' in the target optimization function of step S25; L(θ) is the loss function with parameter θ; g_t is the gradient obtained by differentiating the target loss function L(θ) with respect to θ; β_1 is the first-moment decay coefficient and β_2 is the second-moment decay coefficient; m_t and v_t are respectively the first moment and the second moment of the gradient g_t; m̂_t is the bias-corrected m_t and v̂_t is the bias-corrected v_t; the learning rate η is used to control the stride, with its initial value set to 5e−4 and gradually decayed to 1e−5 by a cosine annealing algorithm; and ε is a small positive number that prevents the denominator from being zero.
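In PyTorch, the optimizer setup described here could be sketched as follows; the stand-in network, placeholder loss, and the number of annealing steps (T_max) are assumptions:

```python
import torch

net = torch.nn.Conv2d(1, 2, 3, padding=1)   # stand-in for the segmentation net
opt = torch.optim.Adam(net.parameters(), lr=5e-4)
# Cosine annealing decays the learning rate from 5e-4 down to 1e-5.
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=100, eta_min=1e-5)

for step in range(100):
    opt.zero_grad()
    loss = net(torch.randn(1, 1, 64, 64)).sum()   # placeholder loss
    loss.backward()                               # error back-propagation
    opt.step()                                    # Adam parameter update
    sched.step()                                  # learning-rate decay
```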
Compared with the prior art, the lightweight retinal vessel segmentation method based on the graph convolution network and partial convolution has the following advantages: (1) the invention introduces a graph convolution module between the feature encoder and the feature decoder to model the global context information of the image and capture long-distance dependencies among retinal vessel pixels, thereby alleviating vessel fracture in the segmentation result; (2) the invention replaces conventional convolution in the feature codec with partial convolution, which reduces redundant information in the feature layers and reduces the memory accesses of convolution operations in the feature extraction stage, thereby improving the training and inference speed of the model; (3) the invention introduces a multi-scale feature fusion module at the skip connections, mainly to fuse the global features of the graph convolution layers with the local features of the partial convolution layers, reducing noise information in the image features while preserving vessel information; (4) in the up-sampling stage of the feature decoder, the invention adopts the adjacent block expansion operation to enlarge the feature map while improving the ability to fill in missing and blank values when restoring the image from small to large size, thereby avoiding discontinuities in the retinal vessel segmentation result and further improving segmentation accuracy.
Drawings
FIG. 1 is a schematic diagram of a lightweight retinal vascular segmentation network model based on a graph convolution network and partial convolution provided by the present invention.
Fig. 2 is a schematic diagram of the partial convolution (Partial convolution, PConv) operation provided by the present invention.
Fig. 3 is a schematic diagram of a Multi-scale feature fusion apparatus (Multi-scale Feature Fusion Block, MFFB) provided by the present invention.
FIG. 4 is a schematic diagram of the operation process of the adjacent block expansion layer (PatchExpand) provided by the present invention.
Fig. 5 is a schematic diagram of the final result of preprocessing image data provided by the present invention.
Detailed Description
The invention is further described with reference to the following detailed drawings in order to make the technical means, the creation characteristics, the achievement of the purpose and the effect of the implementation of the invention easy to understand.
Referring to fig. 1 to 5, the present invention provides a lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution, comprising the following steps:
s1, constructing a lightweight retinal vascular segmentation network model based on a graph convolution network and partial convolution:
s11, a lightweight retinal vessel segmentation network based on a graph convolution network and partial convolution comprises a feature encoder, a multi-scale feature fusion device, a feature decoder and a label predictor, wherein the feature encoder captures local features of an image through partial convolution and global features of the image through graph convolution, performs downsampling through a maximum pooling operation, and increases the semantic information contained in the image while reducing its resolution; the multi-scale feature fusion device performs weighted fusion of the local and global features, suppressing noise information in the feature layers and focusing on vessel feature information; the feature decoder gradually restores the resolution of the feature map to the size of the original image through adjacent block expansion layers and convolution layers; the label predictor applies convolution and an activation function to the multi-channel feature map output by the feature decoder to obtain the final segmentation probability map;
s12, the feature encoder comprises five feature encoding layer groups, and the feature decoder comprises three feature decoding layer groups and four adjacent block expansion layers; each feature encoding layer group and each feature decoding layer group consists of two partial convolution layers and one conventional convolution layer arranged in sequence; an adjacent block expansion layer (PatchExpand) is arranged between the stages of the feature decoder and replaces bilinear interpolation for the up-sampling operation on the feature map, gradually restoring the feature map to the original image size; the multi-scale feature fusion device (MFFB) comprises two pooling layer groups, two convolution layers and a softmax weight conversion layer, each pooling layer group consisting of a maximum pooling (Max pool) layer and an average pooling (Avg pool) layer, with a convolution layer arranged after each pooling layer group; the label predictor consists of a class prediction layer and a softmax probability conversion layer, the softmax probability conversion layer converting the class prediction scores into a probability distribution.
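As an illustrative sketch, the adjacent block expansion layer can be realized with a channel-expanding 1×1 convolution followed by a pixel rearrangement; rendering it via PixelShuffle is an assumption consistent with the behavior described above (a learned 2× up-sampling in place of bilinear interpolation):

```python
import torch
import torch.nn as nn

class PatchExpand(nn.Module):
    """Sketch of the adjacent block expansion layer: a 1x1 convolution
    doubles the channel count, then PixelShuffle rearranges channels into
    a 2x larger spatial grid, halving the channel count overall."""

    def __init__(self, channels: int):
        super().__init__()
        self.expand = nn.Conv2d(channels, 2 * channels, kernel_size=1)
        self.shuffle = nn.PixelShuffle(2)   # (N, 2C, H, W) -> (N, C/2, 2H, 2W)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.expand(x))

y = PatchExpand(64)(torch.randn(1, 64, 16, 16))   # -> (1, 32, 32, 32)
```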
As a specific embodiment, in step S12 the convolution kernel size of the two partial convolution layers in each feature encoding layer group and feature decoding layer group is 3×3 with a step size of 1, and the convolution kernel size of the one conventional convolution layer is 1×1 with a step size of 1; the number of convolution kernels of the former partial convolution layer is only 1/4 of the number of feature channels output by the preceding feature coding/decoding stage, that is, conventional convolution for spatial feature extraction is applied only to the first 1/4 of the input feature channels while the remaining channels are left unchanged, and the subsequent 1×1 convolution layer serves to promote information exchange between the channels that passed through the convolution and those that did not; the numbers of convolution kernels of the five feature encoding layer groups are 64, 16, 128, 32, 256, 64, 512, 128 and 1024 in sequence; the numbers of convolution kernels of the three feature decoding layer groups are 128, 512, 64, 256, 32 and 128 in sequence; the convolution kernel size of the two convolution layers in the multi-scale feature fusion device is 1×1 with a step size of 1; the numbers of convolution kernels of the label predictor are 16 and 2 in sequence. The input of the whole segmentation network is 1 channel of size 64×64, obtained by weighted addition of the red, green and blue channels of the original color image with weights of 0.299, 0.587 and 0.114 respectively; the final output of the network is 2 channels, each of size 64×64, representing the two categories of vessel pixels and background pixels. Detailed model parameters are shown in Table 1 below.
Table 1 retinal vessel segmentation network model parameter table (padding=1)
In table 1 above, to ensure that the feature map size is unchanged during the convolution, padding=1 is set, i.e., the image surroundings are filled with 0 during the convolution.
S2, training a retina blood vessel segmentation network model and optimizing parameters:
s21, initializing network parameters: initializing the model parameters of the lightweight retinal vessel segmentation network based on the graph convolution network and partial convolution constructed in step S1 by adopting the Xavier method.
S22, data set preparation: as a specific embodiment, the inventors of the present application obtained 88 patient data with pixel-level segmentation labels, and specifically adopted data sets include a DRIVE data set from the diabetic retinopathy screening project in the Netherlands, a STARE data set from san Diego division in California, and CHASEDB1 data sets from the left and right eyes of 14 students, wherein the DRIVE data set comprises 40 images, each image having a size of 768×584, the first 20 images being a training set, and the second 20 images being a test set; the STARE data set comprises 20 images with the size of 605 multiplied by 700, wherein the first 10 images are used as training sets, and the last 10 images are used as test sets; the ChaSEDB1 dataset contained 28 color retinal vessel images of size 999X 960, with the first 20 as the training set and the second 8 as the test set. Each image is annotated by two independent experts. In the retinal blood vessel image, the blood vessel of the green channel image has higher contrast with the background, so that the original retinal blood vessel image which is divided into a training set and a verification set and is provided with pixel level separation labels is separated into RGB three-channel feature images, the RGB three-channel feature images are weighted, then histogram equalization is used for approximately uniformly distributing the histogram of the retinal image, so that the contrast of the image is enhanced, gamma transformation is then carried out, the brightness information of the image is more effectively reserved, and finally, the training image data samples in the training set are subjected to data enhancement by using the random horizontal, vertical overturn, rotation 45 DEG, 90 DEG, 135 DEG, 180 DEG, 225 DEG and 315 DEG on-line data enhancement technology, so that the training image data samples are increased to be 10 times of the original data samples, and the preprocessing of the acquired retinal blood vessel image is completed; the invention cuts the training image and the label into pixel blocks with the size of 64 multiplied by 64, and each pixel block is separated by 16 pixel points.
S23, training a data set: the preprocessed training set image data with the pixel-level division labels is used for training the division network model by a 5-fold cross validation method.
S24, inputting the color fundus images of the same retinal blood vessel section into a network, and generating a retinal blood vessel segmentation result through forward calculation of the network. As specific embodiments, the network forward computation includes convolution operations, graph convolution operations, batch normalization, nonlinear excitation, and probability value conversion.
Convolution operation: in the convolution operation, the output feature map Z_i corresponding to any convolution kernel is calculated as follows:

Z_i = f( Σ_{r=1}^{k} W_ir ⊗ X_r + b_i )

wherein f represents the nonlinear excitation function, b_i represents the offset corresponding to the i-th convolution kernel, r represents the index number of the input channel, k represents the number of input channels, W_ir represents the r-th channel weight matrix of the i-th convolution kernel, ⊗ denotes the convolution operation, and X_r represents the r-th input channel image.
Graph convolution operation: the graph convolution network (Graph Convolutional Network, GCN) is a deep learning model for graph-structured data; its core idea is to extend the convolution operation to graph-structured data and update node features by aggregating the information of neighbor nodes. In the graph convolution operation of the invention, each pixel of the encoder feature map is taken as a graph node, the 3 nearest neighbors of each node are found by the K-nearest-neighbor algorithm to serve as graph edges, and graph convolution is then defined using the Laplacian matrix: for an undirected graph G = (V, E), A represents the adjacency matrix and D represents the diagonal degree matrix; L = I − D^(−1/2)·A·D^(−1/2) represents the normalized Laplacian matrix of G, which can be decomposed as L = U·Λ·U^T, where U is the eigenvector matrix and Λ = diag[λ_1, ..., λ_n] is the eigenvalue matrix; the graph convolution network introduces the first-order approximation of ChebNet (K = 1) and iteratively aggregates information from neighbor nodes, the forward propagation process for node v_i being:

h_i^(l+1) = σ( Σ_{v_j ∈ N(v_i)} Â_ij·h_j^(l)·W^(l) )

wherein σ(·) is a nonlinear activation function, Â represents the renormalized adjacency matrix A, W^(l) represents the learnable transformation matrix of the l-th layer, v_j represents a neighbor node of node v_i, and H^(l) represents the node feature matrix of the l-th layer.
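The graph construction just described (each feature-map pixel a node, its 3 nearest neighbors as edges) could be sketched as below; using Euclidean distance in feature space is an assumption, since the text only specifies the K-nearest-neighbor algorithm:

```python
import torch

def knn_adjacency(feat: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Build an undirected adjacency matrix over the pixels of a (C, H, W)
    feature map, connecting each pixel to its k nearest neighbors."""
    C, H, W = feat.shape
    nodes = feat.reshape(C, H * W).t()            # (n, C) node features
    dist = torch.cdist(nodes, nodes)              # pairwise distances
    dist.fill_diagonal_(float("inf"))             # exclude self-matches
    idx = dist.topk(k, largest=False).indices     # k nearest neighbors
    A = torch.zeros(H * W, H * W)
    A.scatter_(1, idx, 1.0)                       # directed k-NN edges
    return torch.maximum(A, A.t())                # symmetrize: undirected graph

A = knn_adjacency(torch.randn(16, 8, 8))          # 64 nodes, >= 3 edges each
```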
Batch normalization: batch normalization (Batch Normalization, BN) is a method for faster and more stable neural network training, which is used to calculate the mean and variance of each mini-batch and pull it back to a standard normal distribution with a mean of 0 and a variance of 1, and the specific operation calculation is known to those skilled in the art and will not be described here in detail.
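For completeness, the per-channel mini-batch normalization can be sketched in a few lines; the learnable scale and shift parameters of standard BN are omitted from this sketch:

```python
import torch

def batch_norm(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Pull a mini-batch of feature maps (N, C, H, W) back to zero mean and
    unit variance per channel."""
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps)

z = batch_norm(torch.randn(8, 16, 32, 32))
```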
Nonlinear excitation: the nonlinear excitation adopts the rectified linear unit ReLU as the activation function of the feature map Z_i, performing a nonlinear transformation on each value of the feature map Z_i; the rectified linear unit ReLU is defined as follows:

f(x) = max(0, x)

where max takes the maximum value and x is an input value.
Probability value conversion: the probability value conversion converts the class prediction scores into a probability distribution using the Softmax function, defined as follows:

Y_j = exp(O_j) / Σ_{i=1}^{K} exp(O_i)

wherein Y_j is the probability that a pixel belongs to class j, O_j and O_i are the prediction scores finally output by the segmentation network for the pixel on the j-th and i-th classes, and K denotes the number of classes.
S25, adopting a combination of the classification cross entropy loss function and the set similarity loss function as a segmentation network target optimization function, and defining the following steps:
L_1(θ') = −(1/S) Σ_i Σ_j Y′_ij · log(Y_ij)

L_2(θ') = 1 − Jaccard(Y′_ij, Y_ij)

L(θ') = λ_1·L_1(θ') + λ_2·L_2(θ')

wherein θ' denotes the segmentation network parameters; L_1(θ') is the cross-entropy loss function, with the sums running over the S image pixels (index i) and the C pixel classes (index j); L_2(θ') is the set similarity loss function; Y′_ij is the segmentation label and Y_ij is the predicted label; Jaccard(Y′_ij, Y_ij) is the similarity between the segmentation labels and the prediction results, i.e. the ratio of the intersection size to the union size of the two sets; L(θ') is the target optimization function of the segmentation network; λ_1 is the weight factor of the cross-entropy loss and λ_2 the weight factor of the set similarity loss in the weighted sum; λ_1 and λ_2 should both be no less than 0 and are usually set to 0.5 empirically.
S26, optimizing an objective function by adopting an adaptive moment estimation gradient descent (Adam) algorithmL (theta'), updating the retinal vessel segmentation network model parameters by using error back propagation to obtain the optimal network model parameters theta best . As a specific embodiment, the specific optimization procedure is as follows:
g_t = ∇_θ L(θ_{t−1})

m_t = β_1·m_{t−1} + (1 − β_1)·g_t

v_t = β_2·v_{t−1} + (1 − β_2)·g_t²

m̂_t = m_t / (1 − β_1^t)

v̂_t = v_t / (1 − β_2^t)

θ_t = θ_{t−1} − η·m̂_t / (√v̂_t + ε)

where t represents the number of update steps; θ is the network parameter to be updated, corresponding to θ' in the target optimization function of step S25; L(θ) is the loss function with parameter θ; g_t is the gradient obtained by differentiating the target loss function L(θ) with respect to θ; β_1 is the first-moment decay coefficient and β_2 is the second-moment decay coefficient; m_t and v_t are respectively the first moment and the second moment of the gradient g_t; m̂_t is the bias-corrected m_t and v̂_t is the bias-corrected v_t; the learning rate η is used to control the stride, with its initial value set to 5e−4 and gradually decayed to 1e−5 by a cosine annealing algorithm; and ε is a small positive number that prevents the denominator from being zero.
S3, automatic semantic segmentation of the retinal vascular structure:
s31, using the optimal network model parameters θ_best obtained by learning to construct an automatic semantic segmentation network for the lightweight retinal vessel color fundus image based on the graph convolution network and partial convolution;
s32, firstly carrying out online data enhancement on the color fundus image, then weighting and merging the RGB three-channel image into a single-channel image, dividing the single-channel image into image blocks of size 64×64, and inputting the image blocks into the feature encoder to extract feature maps of different scales;
s33, obtaining a feature map with the same size as the original map through a feature decoder, then sending the feature map to a category prediction layer in a label predictor to obtain pixel category prediction scores on two categories, namely a blood vessel area and a background area, and finally converting the prediction scores into probability distribution by using a softmax probability conversion layer;
s34, taking the index of the component where each pixel's maximum probability lies as the pixel class label, and obtaining the final retinal blood vessel semantic segmentation binary map.
Compared with the prior art, the lightweight retinal vessel segmentation method based on the graph convolution network and partial convolution has the following advantages: (1) the invention introduces a graph convolution module between the feature encoder and the feature decoder to model the global context information of the image and capture long-distance dependencies among retinal vessel pixels, thereby alleviating vessel fracture in the segmentation result; (2) the invention replaces conventional convolution in the feature codec with partial convolution, which reduces redundant information in the feature layers and reduces the memory accesses of convolution operations in the feature extraction stage, thereby improving the training and inference speed of the model; (3) the invention introduces a multi-scale feature fusion module at the skip connections, mainly to fuse the global features of the graph convolution layers with the local features of the partial convolution layers, reducing noise information in the image features while preserving vessel information; (4) in the up-sampling stage of the feature decoder, the invention adopts the adjacent block expansion operation to enlarge the feature map while improving the ability to fill in missing and blank values when restoring the image from small to large size, thereby avoiding discontinuities in the retinal vessel segmentation result and further improving segmentation accuracy.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.

Claims (9)

1. The lightweight retinal vessel segmentation method based on the graph convolution network and partial convolution is characterized by comprising the following steps of:
s1, constructing a lightweight retinal vascular segmentation network model based on a graph convolution network and partial convolution:
s11, a lightweight retinal vessel segmentation network based on a graph convolution network and partial convolution comprises a feature encoder, a multi-scale feature fusion device, a feature decoder and a label predictor, wherein the feature encoder captures local features of an image through partial convolution and global features of the image through graph convolution, performs downsampling through a maximum pooling operation, and increases the semantic information contained in the image while reducing its resolution; the multi-scale feature fusion device performs weighted fusion of the local and global features, suppressing noise information in the feature layers and focusing on vessel feature information; the feature decoder gradually restores the resolution of the feature map to the size of the original image through adjacent block expansion layers and convolution layers; the label predictor applies convolution and an activation function to the multi-channel feature map output by the feature decoder to obtain the final segmentation probability map;
s12, the feature encoder comprises five feature encoding layer groups, and the feature decoder comprises three feature decoding layer groups and four adjacent block expansion layers; each feature encoding layer group and each feature decoding layer group consists of two partial convolution layers and one conventional convolution layer arranged in sequence; an adjacent block expansion layer is arranged between the stages of the feature decoder and replaces bilinear interpolation for the up-sampling operation on the feature map, gradually restoring the feature map to the original image size; the multi-scale feature fusion device comprises two pooling layer groups, two convolution layers and a softmax weight conversion layer, each pooling layer group consisting of a maximum pooling layer and an average pooling layer, with a convolution layer arranged after each pooling layer group; the label predictor consists of a class prediction layer and a softmax probability conversion layer, the softmax probability conversion layer converting the class prediction scores into a probability distribution;
s2, training a retina blood vessel segmentation network model and optimizing parameters:
s21, initializing network parameters: initializing the model parameters of the lightweight retinal vascular segmentation network constructed in the step S1 and based on the graph convolution network and partial convolution by adopting an Xavier method;
s22, data set preparation: separating the original retinal blood vessel images, which are divided into a training set and a validation set and provided with pixel-level segmentation labels, into RGB three-channel feature maps and weighting them into a single channel; then using histogram equalization to make the histogram of the retinal image approximately uniformly distributed, enhancing the contrast of the image; then performing a Gamma transform to more effectively retain the brightness information of the image; and finally performing data enhancement on the training image data samples in the training set using an online data enhancement technique, thereby completing the preprocessing of the acquired retinal blood vessel images;
s23, training a data set: training on the preprocessed training set image data with pixel-level segmentation labels by adopting a 5-fold cross-validation method;
s24, inputting the color fundus image of the same retinal blood vessel section into a network, and generating a retinal blood vessel segmentation result through forward calculation of the network;
s25, adopting a weighted combination of the categorical cross-entropy loss function and the set similarity loss function as the segmentation network's target optimization function, defined as follows:
L_1(θ') = −(1/S) Σ_i Σ_j Y′_ij · log(Y_ij)

L_2(θ') = 1 − Jaccard(Y′_ij, Y_ij)

L(θ') = λ_1·L_1(θ') + λ_2·L_2(θ')

wherein θ' denotes the segmentation network parameters; L_1(θ') is the cross-entropy loss function, with the sums running over the S image pixels (index i) and the C pixel classes (index j); L_2(θ') is the set similarity loss function; Y′_ij is the segmentation label and Y_ij is the predicted label; Jaccard(Y′_ij, Y_ij) is the similarity between the segmentation labels and the prediction results, i.e. the ratio of the intersection size to the union size of the two sets; L(θ') is the target optimization function of the segmentation network; λ_1 is the weight factor of the cross-entropy loss and λ_2 the weight factor of the set similarity loss in the weighted sum;
s26, optimizing the objective function L(θ') by adopting an adaptive moment estimation gradient descent algorithm, and updating the retinal vessel segmentation network model parameters by error back-propagation to obtain the optimal network model parameters θ_best;
S3, automatic semantic segmentation of the retinal vascular structure:
s31, using the optimal network model parameters θ_best obtained by learning to construct an automatic semantic segmentation network for the lightweight retinal vessel color fundus image based on the graph convolution network and partial convolution;
s32, firstly carrying out online data enhancement on the color fundus image, then weighting and merging the RGB three-channel image into a single-channel image, dividing the single-channel image into image blocks of size 64×64, and inputting the image blocks into the feature encoder to extract feature maps of different scales;
s33, obtaining a feature map with the same size as the original map through a feature decoder, then sending the feature map to a category prediction layer in a label predictor to obtain pixel category prediction scores on two categories, namely a blood vessel area and a background area, and finally converting the prediction scores into probability distribution by using a softmax probability conversion layer;
s34, taking the index of the component where each pixel's maximum probability lies as the pixel class label, and obtaining the final retinal blood vessel semantic segmentation binary map.
2. The method according to claim 1, wherein in step S12 the convolution kernel size of the two partial convolution layers in each feature encoding layer group and feature decoding layer group is 3×3 with a step size of 1, and the convolution kernel size of the one conventional convolution layer is 1×1 with a step size of 1; the number of convolution kernels of the former partial convolution layer is only 1/4 of the number of feature channels output by the preceding feature coding/decoding stage, that is, conventional convolution for spatial feature extraction is applied only to the first 1/4 of the input feature channels while the remaining channels are left unchanged, and the subsequent 1×1 convolution layer serves to promote information exchange between the channels that passed through the convolution and those that did not; the numbers of convolution kernels of the five feature encoding layer groups are 64, 16, 128, 32, 256, 64, 512, 128 and 1024 in sequence; the numbers of convolution kernels of the three feature decoding layer groups are 128, 512, 64, 256, 32 and 128 in sequence; the convolution kernel size of the two convolution layers in the multi-scale feature fusion device is 1×1 with a step size of 1; the numbers of convolution kernels of the label predictor are 16 and 2 in sequence.
3. The method according to claim 1, wherein in step S22 the training image data samples in the training set are increased to 10 times the initial number of training image data samples by online data enhancement techniques of random horizontal and vertical flipping and rotation by 45°, 90°, 135°, 180°, 225°, 270° and 315°.
4. The method of claim 1, wherein the network forward computation in step S24 includes convolution operations, graph convolution operations, batch normalization, nonlinear excitation, and probability value transformation.
5. The lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution according to claim 4, wherein in the convolution operation the output feature map Z_i corresponding to any one convolution kernel is calculated as follows:

Z_i = f( Σ_{r=1}^{k} W_ir ⊗ X_r + b_i )

wherein f represents the nonlinear excitation function, b_i represents the offset corresponding to the i-th convolution kernel, r represents the index number of the input channel, k represents the number of input channels, W_ir represents the r-th channel weight matrix of the i-th convolution kernel, ⊗ denotes the convolution operation, and X_r represents the r-th input channel image.
6. The method of claim 4, wherein in the graph convolution operation each pixel of the encoder feature map is treated as a graph node, the 3 nearest neighbors of each node are found by the K-nearest-neighbor algorithm to serve as graph edges, and graph convolution is then defined using the Laplacian matrix: for an undirected graph G = (V, E), A represents the adjacency matrix and D represents the diagonal degree matrix; L = I − D^(−1/2)·A·D^(−1/2) represents the normalized Laplacian matrix of G, which can be decomposed as L = U·Λ·U^T, where U is the eigenvector matrix and Λ = diag[λ_1, ..., λ_n] is the eigenvalue matrix; the graph convolution network introduces the first-order approximation of ChebNet and iteratively aggregates information from neighbor nodes, the forward propagation process for node v_i being:

h_i^(l+1) = σ( Σ_{v_j ∈ N(v_i)} Â_ij·h_j^(l)·W^(l) )

wherein σ(·) is a nonlinear activation function, Â represents the renormalized adjacency matrix A, W^(l) represents the learnable transformation matrix of the l-th layer, v_j represents a neighbor node of node v_i, and H^(l) represents the node feature matrix of the l-th layer.
7. The lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution according to claim 4, wherein the nonlinear excitation adopts the rectified linear unit ReLU as the activation function of the feature map Z_i, performing a nonlinear transformation on each value of the feature map Z_i, the rectified linear unit ReLU being defined as follows:

f(x) = max(0, x)

where max takes the maximum value and x is an input value.
8. The lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution according to claim 4, wherein the probability value conversion converts the class prediction scores into a probability distribution using the Softmax function, defined as follows:

Y_j = exp(O_j) / Σ_{i=1}^{K} exp(O_i)

wherein Y_j is the probability that a pixel belongs to class j, O_j and O_i are the prediction scores finally output by the segmentation network for the pixel on the j-th and i-th classes, and K denotes the number of classes.
9. The lightweight retinal vessel segmentation method based on a graph convolution network and partial convolution according to claim 1, wherein the specific optimization procedure in step S26 is as follows:

g_t = ∇_θ L(θ_{t−1})

m_t = β_1·m_{t−1} + (1 − β_1)·g_t

v_t = β_2·v_{t−1} + (1 − β_2)·g_t²

m̂_t = m_t / (1 − β_1^t)

v̂_t = v_t / (1 − β_2^t)

θ_t = θ_{t−1} − η·m̂_t / (√v̂_t + ε)

where t represents the number of update steps; θ is the network parameter to be updated, corresponding to θ' in the target optimization function of step S25; L(θ) is the loss function with parameter θ; g_t is the gradient obtained by differentiating the target loss function L(θ) with respect to θ; β_1 is the first-moment decay coefficient and β_2 is the second-moment decay coefficient; m_t and v_t are respectively the first moment and the second moment of the gradient g_t; m̂_t is the bias-corrected m_t and v̂_t is the bias-corrected v_t; the learning rate η is used to control the stride, with its initial value set to 5e−4 and gradually decayed to 1e−5 by a cosine annealing algorithm; and ε is a small positive number that prevents the denominator from being zero.
CN202311404768.0A 2023-10-27 2023-10-27 Lightweight retinal vessel segmentation method based on graph convolution network and partial convolution Pending CN117315258A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311404768.0A CN117315258A (en) 2023-10-27 2023-10-27 Lightweight retinal vessel segmentation method based on graph convolution network and partial convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311404768.0A CN117315258A (en) 2023-10-27 2023-10-27 Lightweight retinal vessel segmentation method based on graph convolution network and partial convolution

Publications (1)

Publication Number Publication Date
CN117315258A true CN117315258A (en) 2023-12-29

Family

ID=89237269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311404768.0A Pending CN117315258A (en) 2023-10-27 2023-10-27 Lightweight retinal vessel segmentation method based on graph convolution network and partial convolution

Country Status (1)

Country Link
CN (1) CN117315258A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117745595A (en) * 2024-02-18 2024-03-22 珠海金山办公软件有限公司 Image processing method, device, electronic equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination