CN115205300A

CN115205300A - Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion

Info

Publication number: CN115205300A
Application number: CN202211134660.XA
Authority: CN
Inventors: 张红斌; 钟翔; 李志杰; 胡朗; 袁梦; 李广丽
Original assignee: East China Jiaotong University
Current assignee: East China Jiaotong University
Priority date: 2022-09-19
Filing date: 2022-09-19
Publication date: 2022-10-18
Anticipated expiration: 2042-09-19
Also published as: CN115205300B

Abstract

The invention provides a fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion, which comprises the following steps: acquiring a fundus blood vessel image dataset, acquiring a fundus blood vessel image from the fundus blood vessel image dataset and preprocessing the fundus blood vessel image; improving based on a U-Net model to obtain an improved neural network; designing a multilayer multi-scale cavity convolution structure at a jump connection part of an improved neural network; designing a semantic fusion structure at a decoding part of the improved neural network; performing continuous three times of convolution operation to obtain an image to be segmented, and performing two-classification discrimination on pixels in the image to be segmented to segment the fundus blood vessel image and obtain an improved neural network model; and performing auxiliary training and testing on the improved neural network model according to the first loss function. The invention can accurately and effectively segment the fundus blood vessel image, and assist the clinical diagnosis work of doctors to further realize high-quality medical service.

Description

Fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion

Technical Field

The invention relates to the technical field of computer images, in particular to a fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion.

Background

For human body, the eye is the only organ in the whole body which can see blood vessels and nerves under direct vision, and the circulation of retina has the same anatomical physiological characteristics as the brain and coronary circulation. Therefore, the fundus oculi has become a very important window for observing related diseases such as cardiovascular and cerebrovascular diseases, eyeball diseases and the like. However, because manual Diagnosis is time-consuming, labor-consuming and inefficient, computer-aided Diagnosis (CAD) is an important means for improving the working efficiency and Diagnosis accuracy of doctors. The fine and accurate fundus blood vessel image segmentation can assist a doctor to better observe the diseases, and then make a correct diagnosis decision. Therefore, the fundus blood vessel image segmentation technology has a high clinical application value, can practically improve the medical service level, and promotes the medical deep fusion.

The existing fundus blood vessel image segmentation method is mostly based on a U-Net network, obtains better segmentation performance, and effectively promotes the intelligent diagnosis development based on CAD, but the existing work has the following defects: (1) The image characteristics are limited by a limited receptive field, and the extraction of local characteristics in the fundus blood vessel image is insufficient; (2) Only convolution operation is adopted, so that the context information in the fundus blood vessel image is less, and the target blood vessel cannot be accurately and completely segmented; (3) Continuous upsampling in the decoder inevitably loses some vessel detail information. To address the above issues, the details of the blood vessels in the image should be preserved as much as possible, thereby providing the physician with intuitive clinical diagnostic information.

Therefore, an advanced and efficient fundus blood vessel image segmentation method is needed to be designed, so that global context information and local features from different receptive fields are considered, loss of detail information is reduced as much as possible, fundus blood vessel segmentation precision is improved finally, and a more accurate and complete segmentation result is provided for doctors.

Disclosure of Invention

In view of the above situation, the main objective of the present invention is to provide a fundus blood vessel image segmentation method and system based on cavity convolution and semantic fusion to solve the above technical problems.

The embodiment of the invention provides an eyeground blood vessel image segmentation method based on cavity convolution and semantic fusion, wherein the method comprises the following steps:

acquiring a fundus blood vessel image dataset, acquiring a fundus blood vessel image from the fundus blood vessel image dataset, and preprocessing the fundus blood vessel image;

performing improved design based on the U-Net model to obtain an improved neural network;

step three, during the improvement design, designing a multilayer multi-scale cavity convolution structure at a jump connection part of the improved neural network, wherein the multilayer multi-scale cavity convolution structure is used for spanning an encoding part and a decoding part of the improved neural network so as to protect detail information in the fundus blood vessel;

designing a semantic fusion structure at a decoding part of the improved neural network, wherein the semantic fusion structure is used for splicing the decoded multi-scale image features, constructing a pair of extrusion excitation modules, and screening key information of the spliced multi-scale image features according to the extrusion excitation modules to finally obtain a multi-scale image feature fusion result;

performing continuous three-time convolution operation on the multi-scale image feature fusion result to obtain an image to be segmented, and performing classification judgment on pixels in the image to be segmented to segment the fundus blood vessel image to obtain an improved neural network model;

step six, performing auxiliary training on the improved neural network model according to a first loss function so as to test the trained neural network model, thereby finally completing segmentation and verification of the fundus blood vessel image;

the expression of the first loss function is:

wherein,

the first loss function is represented as a function of,

the total number of categories of the data is represented,

representing categorieskThe true positive value of (A) is,

representing categorieskThe false negative value of (a) is,

indicating that a pixel belongs to a classkThe number of the (c) component(s),

，

representing categorieskThe geometric mean confidence of (2) is calculated,

representing categorieskThe recall rate of the (c),

representing the predicted maximum distribution of the input training data over all classes,

indicating labels as categorieskThe set of samples of (a) is,

label data representing the training data is stored in the memory,

indicating the sequence number of the training data.

The fundus blood vessel image segmentation method based on cavity convolution and semantic fusion, wherein in the first step, the fundus blood vessel image data set comprises a STARE data set, a DRIVE data set and a CHASEDB1 data set;

the method for preprocessing the fundus blood vessel image comprises the following steps:

and uniformly cutting the size of the fundus blood vessel image to 512 multiplied by 512, and then performing image overturning, image rotation and image Gaussian blurring operation to finish preprocessing.

The fundus blood vessel image segmentation method based on the cavity convolution and semantic fusion is characterized in that in the third step, the multilayer multi-scale cavity convolution structure comprises an upper-layer image feature, a middle-layer image feature and a lower-layer image feature;

and performing cascade cavity convolution comprising convolution rates with different scales on the upper layer image characteristic, the middle layer image characteristic and the lower layer image characteristic.

The fundus blood vessel image segmentation method based on the hole convolution and semantic fusion comprises the following steps of:

performing cascade cavity convolution with gradually expanded receptive field on the upper image characteristics, obtaining cavity convolution characteristics corresponding to the upper image characteristics through convolution, and performing maximum pooling operation on the cavity convolution characteristics to obtain local characteristics of the large-size image;

dividing the fundus blood vessel image corresponding to the middle layer image characteristics into image blocks with fixed sizes, vectorizing the image blocks through a flattening operation, and executing linear mapping to convert the vectorized image blocks into low-dimensional linear embedded characteristics; inputting the low-dimensional linear embedded features into 12 consecutive transform layers in a transform module to continuously perform linear mapping, and performing self-attention weighting; adding position codes to the low-dimensional linear embedded features to obtain a long-distance dependency relationship in the fundus blood vessel image through modeling, and extracting to obtain image global features;

performing cascade cavity convolution with gradually expanded receptive field on the lower-layer image features, and then performing bilinear interpolation to promote the lower-layer image features to the same size as the middle-layer image features so as to obtain local features of small-size images;

and adding the local features of the large-size image, the image global features and the local features of the small-size image to complete feature fusion.

The fundus blood vessel image segmentation method based on cavity convolution and semantic fusion is characterized in that the following formula exists in the step of processing the characteristics of the middle layer image:

wherein,

a low-dimensional linear embedded feature is represented,

a matrix for performing a linear mapping is represented,

it is shown that the position code is,

is shown as

Each of the image blocks is a block of an image,

；

the transform layer includes a normalization layer, a multi-headed self-attention and multi-layered perceptron, the first

The formula corresponding to the feature transformation of each transform layer is expressed as:

wherein,

represents passing through

The image characteristics obtained after coding by the transform layer,

representing image features weighted by multi-head attention,

a linear embedded feature representing a sequence of input images,

it is meant that the multi-layer perceptron operates,

indicating self-attention of multiple headsIn the operation of the method, the operation,

indicating the normalization layer operation.

In the fourth step, the method for splicing the decoded multi-scale image features by the semantic fusion structure comprises the following steps:

the semantic fusion structure performs upsampling on the image characteristics of each layer output by the decoding part of the improved neural network and restores the image characteristics to be consistent with the original input image in size to obtain a first upsampling characteristic diagram

Second up-sampling feature map

A third upsampling profile

And a fourth upsampled feature map

；

Feature maps incorporating coding portions of the improved neural network

And the first up-sampling feature map

Second up-sampling feature map

Third upsampled feature map

And a fourth upsampled feature map

Splicing to obtain a new characteristic diagram

；

Wherein the new characteristic diagram

。

The fundus blood vessel image segmentation method based on the cavity convolution and the semantic fusion is characterized in that the new characteristic map is obtained

Thereafter, the method further comprises:

the new feature map is used

Inputting the new feature map into a squeezing excitation module, continuously executing two SE operations, and screening the new feature map in a hierarchical manner

Key information in (1);

wherein the SE operations comprise a global average pooling operation, a non-linear activation operation and a feature channel weighting operation;

the global average pooling operation comprises the steps of:

for the new feature map

Performing global average pooling operation on each channel to obtain feature vectorsmWherein the new feature map

Has the attribute of

；

Performing a global average pooling operationTo obtain a feature vectormIs expressed as:

wherein,

showing new characteristics

The height of (a) of (b),

showing new characteristics

The width of (a) is greater than (b),

diagram showing new characteristics

The number of the channels of (a) is,

diagram showing new characteristics

To middle

A compressed representation of the global information for each channel,

representing against a new feature graph

The value of the height of (a) is,

representing against a new feature graph

The value of (a) is selected from,

。

the fundus blood vessel image segmentation method based on the cavity convolution and the semantic fusion is characterized in that the nonlinear activation operation comprises the following steps:

modeling the correlation of new characteristic diagrams between different channels by utilizing two full connection layers; wherein the first fully-connected layer converts the feature vector after nonlinear activationmIs reduced to 1/r, the second fully connected layer reduces the feature vectormAdding dimensionality to original dimensionality, and using sigmoid function to make feature vectormThe feature weight of (A) is normalized to [0,1 ]]；

The formula for the nonlinear activation operation is expressed as:

wherein,

the weight vector of the feature is represented,

a sigmoid function is represented as a function,

a parameter representing the first fully connected layer,

a parameter representing the second fully-connected layer,

representing the feature vector after global pooling,

representing the ReLU nonlinear activation function.

The fundus blood vessel image segmentation method based on the cavity convolution and the semantic fusion is characterized in that the characteristic channel weighting operation comprises the following steps:

using feature weight vectors

For the new feature map

Each feature channel in (2) performs a respective multiplicative weighting, with the corresponding formula being:

wherein,

the weighted feature channels are represented and then the weighted feature channels are obtained,

diagram showing new characteristics

To middle

And the feature weight vectors corresponding to the channels.

The invention also provides a fundus blood vessel image segmentation system based on cavity convolution and semantic fusion, wherein the system comprises:

an image acquisition module to:

a model improvement module to:

performing improved design based on a U-Net model to obtain an improved neural network;

a first design module to:

in the improved design, a multilayer multi-scale cavity convolution structure is designed at a jump connection part of the improved neural network, and the multilayer multi-scale cavity convolution structure is used for spanning an encoding part and a decoding part of the improved neural network so as to protect detail information in fundus blood vessels;

a second design module to:

designing a semantic fusion structure at a decoding part of the improved neural network, wherein the semantic fusion structure is used for splicing the decoded multi-scale image features and constructing a pair of extrusion excitation modules, and performing key information screening on the spliced multi-scale image features according to the extrusion excitation modules to finally obtain a multi-scale image feature fusion result;

an image segmentation module to:

performing continuous three-time convolution operation on the multi-scale image feature fusion result to obtain an image to be segmented, and performing two-classification discrimination on pixels in the image to be segmented to segment the fundus blood vessel image and obtain an improved neural network model;

the auxiliary training module is used for carrying out auxiliary training on the improved neural network model according to a first loss function so as to test the trained neural network model, and finally completing segmentation and verification of the fundus blood vessel image;

wherein the expression of the first loss function is:

wherein,

the first loss function is represented as a function of,

the total number of categories of the data is represented,

representing categorieskThe true positive value of (A) is,

representing categorieskThe false negative value of (a) is,

indicating that a pixel belongs to a classkThe number of the (c) component(s),

，

representing categorieskThe geometric mean confidence of (2) is calculated,

representing categorieskThe rate of recall of the (c) is,

indicating labels as categorieskThe set of samples of (a) is,

label data representing the training data is stored in the memory,

indicating the sequence number of the training data.

The invention provides an eyeground blood vessel image segmentation method based on cavity convolution and semantic fusion, which has the following beneficial effects:

(1) The invention can accurately and effectively segment the fundus blood vessel image, and assist the clinical diagnosis work of doctors to further realize high-quality medical service;

(2) The multilayer multi-scale cavity convolution structure fusing the transform module has the advantages of high efficiency, portability, strong portability and the like, and can be migrated to other visual analysis tasks such as target detection, area positioning and the like which need to gradually expand image receptive fields or need to combine global features and local features so as to play a greater role;

(3) The semantic fusion structure has the advantages of high efficiency, portability, strong transportability and the like, and can be transferred to other visual analysis tasks needing multi-scale image feature fusion or feature selection, such as tumor image identification, image emotion analysis and the like, so as to play a greater role;

(4) From the perspective of patients, accurate medical diagnosis and treatment can shorten the time of patients to see a doctor, create favorable time conditions for improving the cure rate of diseases, contribute to improving the life quality of people and create good social benefits.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

FIG. 1 is a flow chart of a fundus blood vessel image segmentation method based on cavity convolution and semantic fusion according to the present invention;

FIG. 2 is a detailed structure diagram of each module in the fundus blood vessel image segmentation method based on cavity convolution and semantic fusion according to the present invention;

FIG. 3 is a model diagram of a fundus blood vessel image segmentation method based on cavity convolution and semantic fusion according to the present invention;

fig. 4 is a schematic structural diagram of a fundus blood vessel image segmentation system based on cavity convolution and semantic fusion, which is provided by the invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention and are not to be construed as limiting the present invention.

These and other aspects of embodiments of the invention will be apparent with reference to the following description and attached drawings. In the description and drawings, particular embodiments of the invention have been disclosed in detail as being indicative of some of the ways in which the principles of the embodiments of the invention may be practiced, but it is understood that the scope of the embodiments of the invention is not limited correspondingly. On the contrary, the embodiments of the invention include all changes, modifications and equivalents coming within the spirit and terms of the claims appended hereto.

Referring to fig. 1 to 3, the present invention provides a fundus blood vessel image segmentation method based on void convolution and semantic fusion, the method includes the following steps:

s101, acquiring a fundus blood vessel image data set, acquiring a fundus blood vessel image from the fundus blood vessel image data set, and preprocessing the fundus blood vessel image.

In step S101, the fundus blood vessel image data set includes a STARE data set, a DRIVE data set, and a CHASEDB1 data set.

Specifically, the method for preprocessing the blood vessel image of the fundus comprises the following steps:

the size of the fundus blood vessel image is uniformly cut to 512 x 512, then image inversion, image rotation and image Gaussian blur operation are carried out to complete preprocessing, and finally the original fundus blood vessel image data set is expanded to 18 times of the original data set.

S102, performing improved design based on the U-Net model to obtain an improved neural network.

In a specific implementation, applying the U-Net model, the encoding portion retains its prototype, which includes four successive downsampling operations, each of which performs two successive 3 × 3 convolutions. The invention carries out re-improvement design on other parts of the U-Net model, and the steps are specifically described in step S103, step S104 and step S105.

S103, during the improvement design, a multilayer multi-scale cavity convolution structure is designed at the jump connection part of the improved neural network, and the multilayer multi-scale cavity convolution structure is used for spanning the coding part and the decoding part of the improved neural network so as to protect detail information in the fundus blood vessel.

In step S103, the multi-layer multi-scale hole convolution structure includes an upper layer image feature, a middle layer image feature, and a lower layer image feature. And performing cascade cavity convolution comprising convolution rates with different scales on the upper layer image characteristic, the middle layer image characteristic and the lower layer image characteristic.

Specifically, the method for performing the cascade type hole convolution containing different scale convolution rates on the upper layer image characteristic, the middle layer image characteristic and the lower layer image characteristic comprises the following steps:

and S1031, performing cascade type cavity convolution with gradually expanded receptive field on the upper-layer image characteristics, obtaining cavity convolution characteristics corresponding to the upper-layer image characteristics through convolution, and performing maximum pooling operation on the cavity convolution characteristics to obtain local characteristics of the large-size image.

As shown in FIG. 2 (a) Partially shown, the upper layer image features execute a cascading void convolution process, the upper layer image features are subjected to 3 × 3 void convolutions with 3 different scales, and the void convolution rates are 1, (1,3) and (1,3,5), respectively. And adopting the hole convolutions with the hole rates of 1,3 and 5 to cascade step by step so as to obtain the receptive fields with the hole rates of 3, 9 and 19. Therefore, the receptive field of the upper image is continuously enlarged, and an important foundation is laid for accurately and completely segmenting the target blood vessel. In addition, the cascading type cavity convolution is executed to obtain 3 image characteristics capable of displaying different receptive fields, each image characteristic is respectively convoluted with the image characteristic by 1 × 1, and the 1 × 1 convolution results are fused and added to obtain the cavity convolution characteristics corresponding to the upper-layer image characteristics. Finally, a maximum pooling operation is performed on the hollow convolution features, which are reduced to the same size as the middle layer image features, thereby obtaining local features from the large size image.

S1032, segmenting the fundus blood vessel image corresponding to the characteristics of the middle layer image into image blocks with fixed sizes, vectorizing the image blocks through a flattening operation, and executing linear mapping to convert the vectorized image blocks into low-dimensional linear embedded characteristics; inputting the low-dimensional linear embedded features into 12 consecutive transform layers in a transform module to continuously perform linear mapping, and performing self-attention weighting; and adding position codes to the low-dimensional linear embedded features to model and obtain a long-distance dependency relationship in the fundus blood vessel image, and extracting to obtain the global features of the image.

In the step of processing the features of the middle layer image, the following formula exists:

wherein,

representing a low-dimensional linear embedded feature that,

a matrix for performing a linear mapping is represented,

it is shown that the position code is,

is shown as

Each of the image blocks is a block of an image,

。

as shown in (A) of FIG. 2b) In part, the transform layer includes a normalization layer, a multi-headed self-attention and multi-layered perceptron, a

wherein,

is shown passing through

The image characteristics obtained after the transform layer coding,

representing image features weighted by multiple attention,

a linear embedded feature representing a sequence of input images,

it is meant that the multi-layer perceptron operates,

a multi-head self-attention operation is shown,

representing the normalization layer operations.

It can be understood that becauseMSAAndMLPby using the method, the Transformer module can model long-distance dependence in the fundus blood vessel image, further extract image details and capture image global features, and is better used for fundus blood vessel segmentation. Finally, the image features are reshaped to their original size by a 3 × 3 convolution operation in the image recovery layer.

S1033, performing cascade cavity convolution with gradually expanded receptive fields on the lower-layer image features, and then performing bilinear interpolation to increase the lower-layer image features to the same size as the middle-layer image features so as to obtain local features of small-size images.

As in (2) ((a) Partially shown, the lower layer image features execute a cascading hole convolution process, the lower layer image features are subjected to 3 × 3 hole convolutions with 3 different scales, and the hole convolution rates are 1, (1,3) and (1,3,5), respectively. And adopting the hole convolutions with the hole rates of 1,3 and 5 to cascade step by step so as to obtain the receptive fields with the hole rates of 3, 9 and 19. Therefore, the receptive field of the lower image is continuously enlarged, and an important foundation is laid for accurately and completely segmenting the target blood vessel.

In addition, performing cascade type hole convolution to obtain 3 image features capable of displaying different receptive fields, performing 1 × 1 convolution on each image feature and the image feature, and fusing and adding 1 × 1 convolution results to obtain hole convolution features corresponding to lower-layer image features. And finally, performing bilinear interpolation operation on the hollow convolution characteristic, and increasing the hollow convolution characteristic to the size which is the same as that of the middle-layer image characteristic, thereby obtaining the local characteristic of the small-size image.

S1034, adding the local features of the large-size image, the image global features and the local features of the small-size image to complete feature fusion.

In this step, the physical meaning of the fusion of the upper layer image feature, the middle layer image feature and the lower layer image feature is as follows: the global characteristics of the image output by the transform module and the local characteristics of the image output by the cascade type cavity convolution module are considered, the global characteristics and the local characteristics are complementary, and the details of the fundus blood vessel can be described more comprehensively. The size of the fused image features is the same as the size of the original image, i.e. the input and output sizes at the two ends of the jump connection are the same.

And S104, designing a semantic fusion structure in a decoding part of the improved neural network, wherein the semantic fusion structure is used for splicing the decoded multi-scale image features, constructing a pair of extrusion excitation modules, and screening key information of the spliced multi-scale image features according to the extrusion excitation modules to finally obtain a multi-scale image feature fusion result.

In step S104, the method for splicing the decoded multi-scale image features by the semantic fusion structure includes the following steps:

Second up-sampling feature map

A third upsampling profile

And a fourth upsampled feature map

；

Feature maps incorporating coding portions of the improved neural network

And the first up-sampling feature map

Second up-sampling feature map

Third upsampled feature map

And a fourth upsampled feature map

Splicing to obtain a new characteristic diagram

；

Wherein the new characteristic diagram

。

After obtaining a new characteristic diagram

Then, new feature map

Complementary information between features from different layers of the image is taken into account, helping to more fully characterize the vessel details in the image.

Further, the new feature map is provided

The key information in (1). Wherein the SE operations include a global average pooling operation, a non-linear activation operation, and a feature channel weighting operation.

Wherein, the new feature map comprises 320 feature channels

After the first SE operation is performed, a new feature map may be screened

The key channel information preliminarily inhibits noise in image characteristics and recovers important detail information; new characteristic diagram

After the second SE operation is carried out, further key information screening is carried out on the characteristic channel, and detail information in the fundus blood vessel image is accurately described for subsequent useAnd (5) dividing the task.

S1041, the global average pooling operation includes the following steps:

for new characteristic diagram

Has the attribute of

；

Performing a global average pooling operation to obtain feature vectorsmIs expressed as:

wherein,

diagram showing new characteristics

The height of (a) of (b),

showing new characteristics

The width of (a) is greater than the width of (b),

showing new characteristics

The number of the channels of (a) is,

showing new characteristics

To middle

A compressed representation of the global information for each channel,

representing against a new feature graph

The value of (a) is selected,

representing a graph against new features

The value of the width of (a) is,

。

s1042, the nonlinear activation operation includes the steps of:

modeling the correlation of the new characteristic diagram between different channels by utilizing two full-connection layers; wherein the first fully-connected layer converts the feature vector after nonlinear activationmIs reduced to 1/r, the second fully connected layer reduces the feature vectorm. law manAdding the dimensionality to the original dimensionality, and using a sigmoid function to carry out feature vectormThe feature weight of (A) is normalized to [0,1 ]]；

The formula for the nonlinear activation operation is expressed as:

wherein,

the weight vector of the feature is represented,

a sigmoid function is represented as a function,

the parameters representing the first fully-connected layer,

a parameter indicative of a second fully connected layer,

representing the feature vectors after global pooling,

representing the ReLU nonlinear activation function.

S1043, the feature channel weighting operation includes the following steps:

using feature weight vectors

For the new feature map

wherein,

showing new characteristics

To middle

And feature weight vectors corresponding to the channels.

Human perception of the outside world is a hierarchical structure that retains the most critical information by continuous filtering and screening, and the new feature X has a large number of channels with weights other than 0 and contains a lot of noise. Therefore, hierarchical structure pair new feature map is designed in DSE module (squeeze incentive module)

And (5) screening. As shown in part (c) DSE module in fig. 2:

novel feature map comprising 320 channels

After the first SE operation is executed, according to the information importance of each channel, the characteristic channel is endowed with a certain weight, the weight of 0 represents that the related characteristic channel does not contribute to the segmentation, and a new characteristic diagram

The noise information in the signal is suppressed to a certain extent; continuing to perform a second SE operation on the screened features, the feature channel weight with the original weight of 0 remains unchanged, and the feature channels with the original weights of not 0 are endowed with new weights, i.e. the channel weight containing important information becomes larger to highlight the importance of the feature channels, the channel weight containing secondary information becomes smaller to reduce the importance of the feature channels, the number of channels with weights of not 0 becomes smaller, and the new feature map is

The noise in the eye fundus image is further inhibited, so that the subsequent convolution operation and image pixel binary classification judgment are facilitated, and an important foundation is laid for high-quality eye fundus blood vessel image segmentation.

In conclusion, the multi-scale image features spliced by the high-speed neural network decoding part sequentially pass through the DSE module comprising three operations of global average pooling, nonlinear activation and feature channel weighting, the key information in the features is screened in a hierarchical mode, the multi-scale image feature fusion is completed, and preparation is made for fundus image segmentation.

S105, performing continuous three-time convolution operation on the multi-scale image feature fusion result to obtain an image to be segmented, and performing two-classification discrimination on pixels in the image to be segmented to segment the blood vessel image at the bottom of the eye to obtain an improved neural network model.

In specific implementation, a new feature map which is output by the semantic fusion structure and contains 320 feature channels passes through three continuous convolution layers and 64 filters, key features of blood vessels and the background are extracted, and a segmentation image is generated, wherein each pixel in the segmentation image has a probability value. A two-class discriminator is used to distinguish between vessels and background in the segmented image. Wherein the three convolution layers are 1 × 1,3 × 3, and 1 × 1 convolutions, respectively.

The binary classifier regards the pixels with the probability value larger than 0.5 in the segmentation image as fundus blood vessel pixels, marks the pixel value as 1, regards the pixels with the probability value smaller than 0.5 in the segmentation image as the background, marks the pixel value as 0, sets all the pixels with the pixel value of 1 as white, sets all the pixels with the pixel value of 0 as black, completes segmentation of the fundus blood vessel image, clearly distinguishes the blood vessel and the background, visually shows the segmentation effect to a doctor, and accurately and efficiently assists clinical diagnosis work of the doctor.

And S106, performing auxiliary training on the improved neural network model according to the first loss function so as to test the trained neural network model, thereby finally completing segmentation and verification of the fundus blood vessel image.

Wherein the expression of the first penalty function is:

wherein,

the first loss function is represented as a function of,

the total number of categories of the data is represented,

representing categorieskThe true positive value of (A) is,

representing categorieskThe false negative value of (a) is,

indicating that a pixel belongs to a classkThe number of the (c) component(s),

，

representing categorieskThe geometric mean confidence of the measured data points of the image,

representing categorieskThe rate of recall of the (c) is,

representing the predicted maximum distribution of the input training data across all classes,

indicating labels as categorieskThe set of samples of (a) is,

label data representing the training data is stored in the memory,

indicating the sequence number of the training data.

(1) The invention can accurately and effectively segment the fundus blood vessel image, and assists the clinical diagnosis work of doctors to realize high-quality medical service;

(2) The multilayer multi-scale cavity convolution structure integrated with the transform module has the advantages of high efficiency, portability, strong transportability and the like, and can be migrated to other visual analysis tasks such as target detection, area positioning and the like which need to gradually expand image receptive fields or need to combine global features and local features, so as to play a greater role;

(3) The semantic fusion structure provided by the invention has the advantages of high efficiency, portability, strong portability and the like, and can be transferred to other visual analysis tasks needing multi-scale image feature fusion or feature selection, such as tumor image identification, image emotion analysis and the like, so as to play a greater role;

Referring to fig. 4, the present invention further provides a fundus blood vessel image segmentation system based on cavity convolution and semantic fusion, wherein the system includes:

an image acquisition module to:

a model improvement module to:

a first design module to:

in the process of improving design, a multilayer multi-scale cavity convolution structure is designed at a jump connection part of the improved neural network, and the multilayer multi-scale cavity convolution structure is used for spanning an encoding part and a decoding part of the improved neural network so as to protect detail information in a fundus blood vessel;

a second design module to:

an image segmentation module to:

wherein the expression of the first penalty function is:

wherein,

the first loss function is represented as a function of,

the total number of categories of the data is represented,

representing categorieskThe true positive value of (A) is,

representing categorieskThe false negative value of (a) is,

indicating that a pixel belongs to a classkThe number of the (c) component(s),

，

representing categorieskThe rate of recall of the (c) is,

indicating labels as categorieskThe set of samples of (a) is,

label data representing the training data is stored in the memory,

indicating the sequence number of the training data.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention, and these changes and modifications are all within the scope of the invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. An eyeground blood vessel image segmentation method based on cavity convolution and semantic fusion is characterized by comprising the following steps:

the expression of the first loss function is:

wherein,

a first loss function is represented as a function of,

the total number of categories of the data is represented,

representing categorieskThe true positive value of (A) is,

representing categorieskThe false negative value of (a) is,

indicating that a pixel belongs to a classkThe number of the (c) component(s),

，

representing categorieskThe geometric mean confidence of (2) is calculated,

representing categorieskThe recall rate of the (c),

indicating labels as categorieskThe set of samples of (a) is,

label data representing the training data is stored in the memory,

indicating the sequence number of the training data.

2. A fundus blood vessel image segmentation method based on cavity convolution and semantic fusion according to claim 1, characterized in that in the first step, the fundus blood vessel image data set comprises a STARE data set, a DRIVE data set and a CHASEDB1 data set;

3. A fundus blood vessel image segmentation method based on hole convolution and semantic fusion as claimed in claim 2 wherein in the third step, the multi-layer multi-scale hole convolution structure comprises an upper layer image feature, a middle layer image feature and a lower layer image feature;

4. A fundus blood vessel image segmentation method based on hole convolution and semantic fusion as claimed in claim 3 wherein the method of performing cascade hole convolution containing different scale convolution rates on the upper layer image feature, the middle layer image feature and the lower layer image feature comprises the steps of:

segmenting the fundus blood vessel image corresponding to the middle-layer image characteristic into image blocks with fixed sizes, vectorizing the image blocks through a flattening operation, and executing linear mapping to convert the vectorized image blocks into low-dimensional linear embedded characteristics; inputting the low-dimensional linear embedded features into 12 consecutive transform layers in a transform module to continuously perform linear mapping, and performing self-attention weighting; adding position codes to the low-dimensional linear embedded features to obtain a long-distance dependency relationship in the fundus blood vessel image through modeling, and extracting to obtain image global features;

5. A fundus blood vessel image segmentation method based on cavity convolution and semantic fusion according to claim 4, characterized in that the step of processing the middle layer image features has the following formula:

wherein,

a low-dimensional linear embedded feature is represented,

a matrix for performing a linear mapping is represented,

it is shown that the position code is,

is shown as

Each of the image blocks is a block of an image,

；

The formula corresponding to the feature transformation of each transform layer is expressed as follows:

wherein,

is shown passing through

The image characteristics obtained after the transform layer coding,

representing image features weighted by multi-head attention,

a linear embedded feature representing a sequence of input images,

it is meant that the multi-layer perceptron operates,

a multi-head self-attention operation is shown,

representing the normalization layer operations.

6. A fundus blood vessel image segmentation method based on hole convolution and semantic fusion according to claim 5, characterized in that in the fourth step, the method for splicing the decoded multi-scale image features by the semantic fusion structure comprises the following steps:

Second up-sampling feature map

A third upsampling profile

And a fourth upsampled feature map

；

Feature maps incorporating coding portions of the improved neural network

And the first up-sampling feature map

Second up-sampling feature map

Third upsampled feature map

And a fourth upsampled feature map

Splicing to obtain a new characteristic diagram

；

Wherein the new characteristic diagram

。

7. A fundus blood vessel image segmentation method based on hole convolution and semantic fusion according to claim 6, characterized in that said new feature map is obtained

Thereafter, the method further comprises:

the new feature map is mapped

Key information in (1);

wherein the SE operation comprises a global average pooling operation, a non-linear activation operation and a feature channel weighting operation;

the global average pooling operation comprises the steps of:

for the new feature map

Has the attribute of

；

wherein,

showing new characteristics

The height of (a) of (b),

showing new characteristics

The width of (a) is greater than the width of (b),

diagram showing new characteristics

The number of the channels of (a) is,

diagram showing new characteristics

To middle

A compressed representation of the global information for each channel,

representing against a new feature graph

The value of (a) is selected,

representing against a new feature graph

The value of (a) is selected from,

。

8. a fundus blood vessel image segmentation method based on hole convolution and semantic fusion according to claim 7, characterized in that said nonlinear activation operation comprises the following steps:

modeling the correlation of the new characteristic diagram between different channels by utilizing two full-connection layers; wherein the first fully-connected layer converts the feature vector after nonlinear activationmIs reduced to 1/r, the second fully connected layer reduces the feature vectorm method Human beingAdding dimensionality to original dimensionality, and using sigmoid function to make feature vectormThe feature weight of (A) is normalized to [0,1 ]]；

The formula for the nonlinear activation operation is expressed as:

wherein,

the weight vector of the feature is represented,

a sigmoid function is represented as a function,

the parameters representing the first fully-connected layer,

a parameter indicative of a second fully connected layer,

representing the feature vector after global pooling,

representing the ReLU nonlinear activation function.

9. A fundus blood vessel image segmentation method based on hole convolution and semantic fusion according to claim 8, wherein said characteristic channel weighting operation comprises the following steps:

using feature weight vectors

For the new feature map

wherein,

showing new characteristics

To middle

And feature weight vectors corresponding to the channels.

10. A fundus blood vessel image segmentation system based on void convolution and semantic fusion, the system comprising:

an image acquisition module to:

acquiring a fundus blood vessel image data set, acquiring a fundus blood vessel image from the fundus blood vessel image data set, and preprocessing the fundus blood vessel image;

a model improvement module to:

a first design module to:

a second design module to:

an image segmentation module to:

wherein the expression of the first penalty function is: