CN116740076A - Network model and method for pigment segmentation in retinal pigment degeneration fundus image - Google Patents

Network model and method for pigment segmentation in retinal pigment degeneration fundus image

Info

Publication number
CN116740076A
Authority
CN
China
Prior art keywords
segmentation
pigment
encoder
network model
fundus image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310544818.9A
Other languages
Chinese (zh)
Inventor
陈新建
许景程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN202310544818.9A priority Critical patent/CN116740076A/en
Publication of CN116740076A publication Critical patent/CN116740076A/en
Pending legal-status Critical Current


Landscapes

  • Image Analysis (AREA)

Abstract

The application provides a network model and a method for pigment segmentation in a retinal pigment degeneration fundus image. The network model comprises an encoder for extracting a feature map from an original image; a multi-scale global attention module, connected with the encoder, for fusing multi-scale global context information based on the features extracted by the encoder; a channel and spatial joint attention module for extracting contextual semantic features based on the features extracted by the encoder; and a decoder, connected with the multi-scale global attention module, for recovering the extracted features from the multi-scale global context information and the contextual semantic features to obtain a prediction result. A channel and spatial joint attention module is used to connect each layer between the encoder and decoder blocks. The application realizes automatic segmentation of pigment deposits in fundus images and improves segmentation accuracy.

Description

Network model and method for pigment segmentation in retinal pigment degeneration fundus image
Technical Field
The application relates to the technical field of image segmentation, in particular to a network model and a method for pigment segmentation in a retinal pigment degeneration fundus image.
Background
Retinitis pigmentosa (RP) is a hereditary retinal dystrophy caused by the loss of photoreceptors, with a global prevalence of about 0.025%.
Fundus photography is an important way to examine whether a patient has retinitis pigmentosa. Pigment deposits are a typical feature of the disease, and their accurate segmentation is of great importance for the diagnosis and treatment of RP patients.
Currently, most methods for segmenting pigment deposits are based on conventional machine learning. Brancati et al. [1] proposed a method for detecting pigment deposits in fundus images that uses a random forest and an adaptive boosting (AdaBoost) ensemble classifier to classify each region as a normal region or a pigment region. They also proposed another segmentation method [2] based on the relationships between features of adjacent regions: local preprocessing is first applied to attenuate image distortion, a watershed transform is then applied to produce homogeneous regions, outlier detection is performed on the extracted features, and finally pigment regions are separated from normal regions. With the continued development of deep learning, neural-network-based approaches have also begun to emerge. Brancati et al. [3] were the first to apply a U-Net-based network to segment pigment deposits on the retina, enabling end-to-end automatic segmentation of fundus images of RP patients. Arsalan et al. [4] proposed an automatic RP segmentation network (RPS-Net) that applies a feature enhancement strategy through multiple dense connections between convolutional layers, enabling the network to distinguish normal from diseased eyes and to accurately segment diseased areas from the background.
Although much progress has been made in segmenting pigment deposits, current methods for pigment segmentation in fundus images still have many limitations. Machine-learning-based methods rely on hand-crafted features and cannot achieve automatic end-to-end segmentation, which increases the workload of clinicians. Deep-learning-based methods do not consider the importance of multi-scale global context information and attention modules; they focus on overall features while ignoring semantic features, which may introduce redundant information during feature extraction and negatively affect the final segmentation result.
Therefore, there is a need for a network model and method that accurately segments pigmentation in retinal pigment-modified fundus images.
References:
[1] Brancati, N., Frucci, M., Gragnaniello, D., Riccio, D., Di Iorio, V., Di Perna, L.: Automatic segmentation of pigment deposits in retinal fundus images of retinitis pigmentosa. Computerized Medical Imaging and Graphics 66, 73–81 (2018).
[2] Brancati, N., Frucci, M., Gragnaniello, D., Riccio, D., Di Iorio, V., Di Perna, L., Simonelli, F.: Learning-based approach to segment pigment signs in fundus images for retinitis pigmentosa analysis. Neurocomputing 308, 159–171 (2018).
[3] Brancati, N., Frucci, M., Riccio, D., Di Perna, L., Simonelli, F.: Segmentation of pigment signs in fundus images for retinitis pigmentosa analysis by using deep learning. In: Image Analysis and Processing – ICIAP 2019: 20th International Conference, Trento, Italy, September 9–13, 2019, Proceedings, Part II, pp. 437–445. Springer (2019).
[4] Arsalan, M., Baek, N.R., Owais, M., Mahmood, T., Park, K.R.: Deep learning-based detection of pigment signs for analysis and diagnosis of retinitis pigmentosa. Sensors 20(12), 3454 (2020).
Disclosure of Invention
Therefore, embodiments of the present application provide a network model and a method for pigment segmentation in a retinal pigment degeneration fundus image, which address the following problems of the prior art: machine-learning-based methods rely on hand-crafted features and cannot achieve automatic end-to-end segmentation; deep-learning-based methods do not consider the importance of multi-scale global context information and attention modules, emphasize overall features while ignoring semantic features, and may therefore introduce redundant information during feature extraction that negatively affects the final segmentation result.
In order to solve the above-described problems, an embodiment of the present application provides a network model of pigment segmentation in a retinal pigment-denatured fundus image, the network model including:
an encoder for extracting a feature map from an original picture;
the multi-scale global attention module is connected with the encoder and used for fusing multi-scale global context information according to the characteristics of the characteristic diagram extracted by the encoder;
the channel and space joint attention module is used for extracting context semantic features according to the features of the feature map extracted by the encoder;
the decoder is connected with the multi-scale global attention module and used for recovering the extracted characteristics according to the multi-scale global context information and the context semantic characteristics to obtain a prediction result;
wherein a joint attention module of channel and space is adopted between each layer of the encoder and the decoder block for connection;
after the output of each layer of the encoder passes through the channel and space joint attention module, the obtained output is directly added with the output of the upper layer of the decoder to be used as the input of the lower layer of the decoder.
Preferably, the encoder includes five layers in total, and performs two 3x3 convolutions, batch normalization and ReLU activation after each downsampling; after each downsampling, the number of channels is doubled and the resolution is halved, and the numbers of output channels are 32, 64, 128, 256 and 512, respectively.
Preferably, a squeeze-and-excitation module is added after the ReLU activation in the first four layers of the encoder; the squeeze-and-excitation module applies global average pooling to the input feature map, and the output vector obtained through two fully connected layers is multiplied channel-wise with the original feature map as channel weights to obtain the output feature map.
Preferably, the structure of the multi-scale global attention module is as follows:
the multi-scale global attention module has three inputs, a K end, a Q end and a V end; the input feature X_in passes through a 1x1 convolution and is used as the input of the Q end;
the input feature X_in passes through a multi-scale information fusion module to obtain the feature X_m; the channel information obtained by passing X_m through a channel attention mechanism is multiplied with the input feature X_in and used as the input of the K end, wherein the channel attention mechanism comprises average pooling, two 1x1 convolutions, a ReLU activation and a Sigmoid activation function;
the input feature X_in passes through the multi-scale information fusion module to obtain the feature X_m; average pooling and maximum pooling are applied to X_m for global feature refinement, the two resulting global features are concatenated to obtain a spatial information fusion feature, and this feature, after a 7x7 convolution, is multiplied with the feature X_m to obtain the feature map used as the input of the V end;
the output of the Q end is multiplied by the transposed output of the K end, a channel fusion result is obtained through a normalized exponential function, the channel fusion result is multiplied by the output of the V end after transposition, and the final output feature X_out is obtained after reshaping.
Preferably, the multi-scale information fusion module is composed of parallel 1x1, 3x3 and 5x5 convolutions for extracting multi-scale features.
Preferably, the structure of the channel and spatial joint attention module is as follows:
the feature T_h obtained by the encoder and the decoder upsampled feature T_up are combined; the combined feature passes through a 1x1 convolution and a spatial attention module and is normalized to obtain weight information, and the weight information is then multiplied with the feature T_h and fused with T_up.
Preferably, the network model employs a joint segmentation loss function based on the Dice loss and the cross-entropy loss, expressed as follows:
wherein L_Dice denotes the Dice loss, L_BCE denotes the cross-entropy loss, L_joint denotes the joint segmentation loss of the Dice loss and cross-entropy loss, g (0 ≤ g ≤ 1) denotes the target pixel value in the gold standard, p (0 ≤ p ≤ 1) denotes the pixel value of the network output map, C denotes the total number of pixels in the image, and i denotes the i-th pixel.
The embodiment of the application also provides a pigment segmentation method for the retinal pigment degeneration fundus image, which comprises the following steps:
S1: collecting different retinal pigment degeneration fundus images to form a dataset, and randomly dividing the dataset into a training set, a validation set and a test set;
S2: constructing a network model for pigment segmentation in the retinal pigment degeneration fundus image;
S3: training the network model using the training set and the validation set, and saving the model weights when the performance on the validation set is best;
S4: loading the saved model weights, and predicting fundus images with the trained network model to realize the segmentation of pigment deposits.
Preferably, the segmentation results are evaluated using the Dice coefficient, intersection over union, accuracy and specificity.
The embodiment of the application also provides an electronic device, which comprises a processor, a memory and a bus system, wherein the processor and the memory are connected through the bus system, the memory is used for storing instructions, and the processor is used for executing the instructions stored in the memory so as to implement the above method for pigment segmentation in the retinal pigment degeneration fundus image.
From the above technical scheme, the application has the following advantages:
the embodiment of the application provides a network model and a method for pigment segmentation in a retinal pigment degeneration fundus image, wherein an MsGAM module is added between an encoder and a decoder to guide the model to learn multi-scale global context information; a CSAM module is provided for capturing the space and channel characteristics from the encoder and obtaining more critical characteristic information; the network model for pigment segmentation in the retinal pigment degeneration fundus image based on the U-shaped structure is provided by combining the MsGAM module and the CSAM module, and compared with other networks, the network provided by the application has obvious improvement on pigment segmentation performance.
Drawings
In order to more clearly illustrate the embodiments of the application or the solutions in the prior art, the accompanying drawings used in the embodiments are briefly introduced below. The drawings are intended to provide a clearer understanding of the features and advantages of the application by way of illustration and are not to be interpreted as limiting the application in any way; from these drawings, a person skilled in the art can obtain other figures without inventive effort. In the drawings:
fig. 1 is a schematic diagram of a network model of pigment segmentation in a retinal pigment-modified fundus image according to an embodiment;
FIG. 2 is a schematic diagram of the structure of the extrusion and excitation module in an embodiment;
FIG. 3 is a schematic diagram of a multi-scale global attention module according to an embodiment;
FIG. 4 is a schematic diagram of a channel and spatial joint attention module in an embodiment;
fig. 5 is a flowchart of a method of pigment segmentation in a retinal pigment-modified fundus image provided in accordance with an embodiment;
fig. 6 is a schematic diagram showing the segmentation result of pigmentation in fundus images using different methods in the embodiment;
fig. 7 is a schematic diagram showing a visual detection result of a pigment region in a fundus image in the embodiment.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Example 1
As shown in fig. 1, an embodiment of the present application proposes a network model of pigment segmentation in a retinal pigment-denatured fundus image, the network model including:
an encoder for extracting a feature map from an original picture;
the multi-scale global attention module is connected with the encoder and used for fusing multi-scale global context information according to the characteristics of the characteristic diagram extracted by the encoder;
the channel and space joint attention module is used for extracting context semantic features according to the features of the feature map extracted by the encoder;
the decoder is connected with the multi-scale global attention module and used for recovering the extracted characteristics according to the multi-scale global context information and the context semantic characteristics to obtain a prediction result;
wherein a joint attention module of channel and space is adopted between each layer of the encoder and the decoder block for connection;
after the output of each layer of the encoder passes through the channel and space joint attention module, the obtained output is directly added with the output of the upper layer of the decoder to be used as the input of the lower layer of the decoder.
According to the network model for pigment segmentation in the retinal pigment degeneration fundus image, an MsGAM module is added between the encoder and the decoder to guide the model to learn multi-scale global context information; a CSAM module is proposed to capture the spatial and channel features from the encoder and obtain more critical feature information; combining the MsGAM module and the CSAM module, a network model for pigment segmentation in the retinal pigment degeneration fundus image based on a U-shaped structure is obtained, and compared with other networks, the proposed network achieves a clear improvement in pigment segmentation performance.
Further, in the network model for pigment segmentation in the retinal pigment degeneration fundus image provided by the application, the backbone of the proposed segmentation network is a U-shaped network based on U-Net, consisting of an encoder, a multi-scale global attention module (MsGAM), a channel and spatial joint attention module (CSAM) and a decoder. MsGAM fuses multi-scale global context information, and CSAM extracts richer contextual semantic features. By feeding fundus images into the network, an end-to-end pigment segmentation model is obtained.
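To make the overall wiring concrete, the following is a minimal PyTorch-style sketch of such a U-shaped network. It is illustrative only: the class and parameter names are assumptions, and MsGAM and CSAM are represented by simple placeholder modules that occupy the positions described above (a full implementation would follow figs. 3 and 4).

```python
import torch
import torch.nn as nn


class PlaceholderAttention(nn.Module):
    """Stand-in for MsGAM/CSAM; real implementations would follow figs. 3 and 4."""
    def __init__(self, channels):
        super().__init__()
        self.refine = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, *feats):
        # One input: bottleneck position (MsGAM). Two inputs: skip-connection position
        # (CSAM), where the skip feature is fused with the upsampled decoder feature.
        x = feats[0] if len(feats) == 1 else feats[0] + feats[1]
        return self.refine(x)


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions, each followed by batch normalization and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )


class UShapedSegNet(nn.Module):
    """Hypothetical skeleton: encoder -> attention at the deepest layer (MsGAM position) ->
    decoder, with attention-refined skip connections (CSAM position) whose outputs are added
    to the upsampled decoder features, and a 1x1 prediction head."""
    def __init__(self, in_ch=3, num_classes=1, widths=(32, 64, 128, 256, 512)):
        super().__init__()
        self.enc = nn.ModuleList()
        prev = in_ch
        for w in widths:
            self.enc.append(conv_block(prev, w))
            prev = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck_att = PlaceholderAttention(widths[-1])                          # MsGAM position
        self.skip_att = nn.ModuleList([PlaceholderAttention(w) for w in widths[:-1]])   # CSAM position
        self.up = nn.ModuleList([nn.ConvTranspose2d(widths[i], widths[i - 1], 2, stride=2)
                                 for i in range(len(widths) - 1, 0, -1)])
        self.dec = nn.ModuleList([conv_block(widths[i - 1], widths[i - 1])
                                  for i in range(len(widths) - 1, 0, -1)])
        self.head = nn.Conv2d(widths[0], num_classes, 1)

    def forward(self, x):
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x)
            if i < len(self.enc) - 1:
                skips.append(x)
                x = self.pool(x)
        x = self.bottleneck_att(x)                           # fuse multi-scale global context here
        for level, (up, dec) in enumerate(zip(self.up, self.dec)):
            skip = skips[-(level + 1)]
            x = up(x)
            x = dec(self.skip_att[-(level + 1)](skip, x))    # refined skip added to decoder feature
        return self.head(x)                                  # logits; apply sigmoid for a binary mask


# Example forward pass on one 3-channel 512x512 patch:
# net = UShapedSegNet()
# out = net(torch.randn(1, 3, 512, 512))   # -> shape (1, 1, 512, 512)
```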
Further, the encoder includes five layers in total; after each downsampling, two 3x3 convolutions, batch normalization and ReLU activation are performed. After each downsampling, the number of channels is doubled and the resolution is halved, and the numbers of output channels are 32, 64, 128, 256 and 512, respectively. A squeeze-and-excitation (SE) module is added after the ReLU activation in the first four layers of the encoder; compared with the U-Net encoder, adding the SE modules improves the feature extraction capability of the encoder.
As shown in fig. 2, the SE module applies global average pooling to the input feature map, and the output vector obtained through two fully connected layers is then multiplied channel-wise with the original feature map as channel weights to obtain the output feature map.
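A minimal PyTorch sketch of the SE-augmented encoder stage described above follows. The reduction ratio of the two fully connected layers is an assumption; the disclosure only specifies global average pooling, two fully connected layers and channel-wise multiplication.

```python
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-excitation: global average pooling, two fully connected layers,
    then channel-wise multiplication with the original feature map."""
    def __init__(self, channels, reduction=16):   # reduction ratio is an assumed value
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))            # squeeze: global average pooling -> (B, C)
        w = self.fc(w).view(b, c, 1, 1)   # excitation: per-channel weights in [0, 1]
        return x * w                      # reweight the input feature map channel-wise


class EncoderStage(nn.Module):
    """One encoder layer: two 3x3 conv + BN + ReLU, followed by SE (first four layers only)."""
    def __init__(self, in_ch, out_ch, use_se=True):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
        self.se = SEBlock(out_ch) if use_se else nn.Identity()

    def forward(self, x):
        return self.se(self.conv(x))
```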
Further, considering the complex pathological appearance of pigment deposits in fundus images, with large size variation, wide distribution and blurred edges, the segmentation network needs a stronger capability to extract multi-scale global features. The present application therefore designs a novel multi-scale global attention module, embedded at the top layer of the encoder path. The MsGAM structure is shown in fig. 3; it differs from the conventional self-attention module in that the inputs to the three branches, the key (K), the query (Q) and the value (V), are each different.
Specifically, the input feature X_in passes through a 1x1 convolution and is used as the input of the Q end. The input feature X_in passes through a multi-scale information fusion module to obtain the feature X_m, and the channel information obtained by passing X_m through a channel attention mechanism is multiplied with the input feature X_in and used as the input of the K end, where the channel attention mechanism consists of average pooling, two 1x1 convolutions with a ReLU activation, and a Sigmoid activation function. The input feature X_in passes through the multi-scale information fusion module to obtain the feature X_m; average pooling and maximum pooling are applied to X_m for global feature refinement, the two resulting global features are concatenated to obtain a spatial information fusion feature, and this feature, after a 7x7 convolution, is multiplied with X_m to obtain the feature map used as the input of the V end. The output of the Q end is multiplied by the transposed output of the K end, a channel fusion result is obtained through a normalized exponential function (softmax), the channel fusion result is multiplied by the output of the V end after transposition, and the final output feature X_out is obtained after reshaping. In this way, the network can capture multi-scale local spatial and channel information while acquiring global long-range information, so that similar features are correlated with each other regardless of their distance; this gives the model stronger discrimination capability and makes it well suited to widely distributed lesions such as pigment deposits.
The multi-scale information fusion module consists of parallel 1x1, 3x3 and 5x5 convolutions and is used to extract multi-scale features.
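The following is an illustrative PyTorch sketch of the multi-scale information fusion module and the MsGAM branches described above. Several details are assumptions where the text does not pin them down: the fusion rule for the parallel convolutions (summation here), the reduction ratio of the channel attention, the sigmoid gating after the 7x7 convolution, and the exact handling of the transposition in the final matrix products.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleFusion(nn.Module):
    """Parallel 1x1, 3x3 and 5x5 convolutions; outputs are summed (fusion rule assumed)."""
    def __init__(self, channels):
        super().__init__()
        self.c1 = nn.Conv2d(channels, channels, 1)
        self.c3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.c5 = nn.Conv2d(channels, channels, 5, padding=2)

    def forward(self, x):
        return self.c1(x) + self.c3(x) + self.c5(x)


class MsGAM(nn.Module):
    """Multi-scale global attention (sketch): 1x1-projected Q, channel-attention-weighted K,
    spatial-attention-weighted V, followed by channel-wise self-attention."""
    def __init__(self, channels, reduction=8):   # reduction ratio is an assumption
        super().__init__()
        self.q_proj = nn.Conv2d(channels, channels, 1)
        self.fusion = MultiScaleFusion(channels)
        # channel attention branch: average pooling -> two 1x1 convs (ReLU between) -> Sigmoid
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        # spatial branch: average- and max-pooled maps -> concatenate -> 7x7 convolution
        self.spatial_conv = nn.Conv2d(2, 1, 7, padding=3)

    def forward(self, x_in):
        b, c, h, w = x_in.shape
        x_m = self.fusion(x_in)                                     # multi-scale feature X_m
        q = self.q_proj(x_in).flatten(2)                            # (B, C, N), N = H*W
        k = (self.channel_att(x_m) * x_in).flatten(2)               # channel-weighted K branch
        sp = torch.cat([x_m.mean(1, keepdim=True),
                        x_m.max(1, keepdim=True).values], dim=1)    # (B, 2, H, W)
        sp = torch.sigmoid(self.spatial_conv(sp))                   # sigmoid gating is assumed
        v = (sp * x_m).flatten(2)                                   # spatially weighted V branch
        att = F.softmax(torch.bmm(q, k.transpose(1, 2)), dim=-1)    # (B, C, C) channel affinity
        return torch.bmm(att, v).view(b, c, h, w)                   # reshape back to X_out
```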
Furthermore, the simple skip connections in U-Net combine local information from different layers indiscriminately, ignore semantic information, and easily introduce redundant information that degrades the final pigment segmentation performance. In view of this, the present application replaces the skip connections in U-Net with CSAM modules; in one embodiment, each layer between the encoder and decoder blocks is connected by a channel and spatial joint attention module.
The CSAM module structure is shown in fig. 4. The feature T_h obtained by the encoder and the decoder upsampled feature T_up are combined; the combined feature passes through a 1x1 convolution and a spatial attention module and is normalized to obtain weight information, and the weight information is then multiplied with the feature T_h and fused with T_up. This allows the network to fully take global context information into account and retain more effective information.
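A minimal sketch of this module is given below, assuming element-wise addition for the combination and fusion steps and a 7x7 convolution followed by a sigmoid for the spatial attention; these details are assumptions where the text leaves them open.

```python
import torch.nn as nn


class CSAM(nn.Module):
    """Channel and spatial joint attention (sketch): the encoder feature T_h and the upsampled
    decoder feature T_up are combined, reduced by a 1x1 convolution, turned into a normalized
    spatial weight map, multiplied with T_h and fused with T_up."""
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.Conv2d(channels, channels, 1)
        self.spatial = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, t_h, t_up):
        fused = self.reduce(t_h + t_up)     # combine encoder and decoder features (sum assumed)
        weights = self.spatial(fused)       # normalized spatial weight map in [0, 1]
        return weights * t_h + t_up         # reweight T_h, then fuse with T_up (addition assumed)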
Further, the present application uses a joint loss function of the Dice loss (L_Dice) and the cross-entropy loss (L_BCE). The Dice loss effectively alleviates the class-imbalance problem in image segmentation, which benefits pigment segmentation, while adding the cross-entropy loss effectively alleviates the training instability that the Dice loss alone can cause. The joint loss can be expressed as:
wherein L_Dice denotes the Dice loss, L_BCE denotes the cross-entropy loss, L_joint denotes the joint segmentation loss of the Dice loss and cross-entropy loss, g (0 ≤ g ≤ 1) denotes the target pixel value in the gold standard, p (0 ≤ p ≤ 1) denotes the pixel value of the network output map, C denotes the total number of pixels in the image, and i denotes the i-th pixel.
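The explicit formulas are not reproduced in the text above. A plausible reconstruction consistent with the stated variables, assuming the standard soft Dice and binary cross-entropy forms and an unweighted sum, is:

L_{Dice} = 1 - \frac{2\sum_{i=1}^{C} p_i g_i}{\sum_{i=1}^{C} p_i + \sum_{i=1}^{C} g_i}

L_{BCE} = -\frac{1}{C}\sum_{i=1}^{C}\left[ g_i \log p_i + (1 - g_i)\log(1 - p_i) \right]

L_{joint} = L_{Dice} + L_{BCE}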
Example two
As shown in fig. 5, the present application provides a method of pigment segmentation in a retinal pigment degeneration fundus image, the method comprising:
S1: collecting different retinal pigment degeneration fundus images to form a dataset, and randomly dividing the dataset into a training set, a validation set and a test set;
S2: constructing a network model for pigment segmentation in the retinal pigment degeneration fundus image;
S3: training the network model using the training set and the validation set, and saving the model weights when the performance on the validation set is best;
S4: loading the saved model weights, and predicting fundus images with the trained network model to realize the segmentation of pigment deposits.
The method adopts the above network model for pigment segmentation in the retinal pigment degeneration fundus image to realize the segmentation of pigment deposits in fundus images; to avoid redundancy, the details are not repeated here.
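For illustration only, steps S1 to S4 can be organized as in the following minimal PyTorch training/inference skeleton. The dataset objects, the loss function, the checkpoint file name and the model constructor are assumed to be supplied by the caller and are not part of the disclosure; interpreting the 350 iterations as training epochs is likewise an assumption.

```python
import torch
from torch.utils.data import DataLoader


def train_and_predict(model, train_set, val_set, test_set, loss_fn, epochs=350, device="cuda"):
    """S3: train, keeping the weights that perform best on the validation set;
    S4: reload those weights and predict pigment masks for unseen fundus images."""
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    train_loader = DataLoader(train_set, batch_size=4, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=1)
    best_val = float("inf")

    for _ in range(epochs):
        model.train()
        for img, mask in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(img.to(device)), mask.to(device))
            loss.backward()
            opt.step()

        model.eval()                                          # validation pass
        with torch.no_grad():
            val_loss = sum(loss_fn(model(i.to(device)), m.to(device)).item()
                           for i, m in val_loader)
        if val_loss < best_val:                               # S3: keep the best weights
            best_val = val_loss
            torch.save(model.state_dict(), "best_model.pth")

    model.load_state_dict(torch.load("best_model.pth"))       # S4: load the saved weights
    model.eval()
    with torch.no_grad():
        return [(torch.sigmoid(model(img.to(device))) > 0.5).cpu()
                for img, _ in DataLoader(test_set, batch_size=1)]
```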
The advantages of the method according to the application are illustrated below by means of specific experiments.
1. Experimental data set
The present application uses a dataset consisting of 215 fundus images, with gold standards annotated by a team of doctors. For each original image, the minimum bounding rectangle of the target area is cropped out to remove part of the noise, reducing the influence of noise on the final segmentation result. To obtain the best segmentation effect, the dataset is randomly divided into 132 training images, 42 validation images and 41 test images, approximately a 6:2:2 ratio.
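The minimum-bounding-rectangle crop mentioned above can be performed, for example, as in the following NumPy sketch; where the target mask comes from (for example, a fundus field-of-view mask) is not specified in the text and is left to the caller.

```python
import numpy as np


def crop_to_min_bounding_rect(image: np.ndarray, target_mask: np.ndarray) -> np.ndarray:
    """Crop the image to the minimum bounding rectangle of the non-zero target region,
    discarding the surrounding background before segmentation."""
    ys, xs = np.nonzero(target_mask)
    if ys.size == 0:                       # no target region found: return the image unchanged
        return image
    return image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```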
2. Experimental setup
The application uses an Adam optimizer with an initial learning rate of 1.0×10^-4 and a momentum of 0.9. The batch size is set to 4 and the number of iterations is set to 350. All images are first resized to 1024 x 1024 and then divided into four 512 x 512 blocks for data enhancement before being input into the model. The application is based on a PyTorch environment and uses an NVIDIA RTX 3060 GPU with 12 GB of memory for model training, validation and testing.
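A sketch of the resize-and-patch step described above; the interpolation mode and the (C, H, W) tensor layout are assumptions.

```python
import torch
import torch.nn.functional as F


def resize_and_split(image: torch.Tensor, size: int = 1024, patch: int = 512):
    """Resize a (C, H, W) image to size x size, then split it into patch x patch blocks,
    matching the resize-and-patch scheme described above."""
    image = F.interpolate(image.unsqueeze(0), size=(size, size),
                          mode="bilinear", align_corners=False).squeeze(0)
    return [image[:, i:i + patch, j:j + patch]
            for i in range(0, size, patch)
            for j in range(0, size, patch)]   # four 512 x 512 patches for a 1024 x 1024 input
```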
3. Training and verification of network models
In order to verify the effectiveness of the CSAM and MsGAM proposed by the present application, four evaluation metrics are adopted: the Dice coefficient, Intersection over Union (IoU), accuracy (Acc) and specificity (Spec). Corresponding ablation experiments are performed, and Table 1 shows the results of the ablation experiments and comparisons.
TABLE 1
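For reference, the four evaluation metrics above can be computed from binary prediction and gold-standard masks as in the following illustrative sketch; the smoothing constant eps is an assumption.

```python
import numpy as np


def segmentation_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7):
    """Dice coefficient, IoU, accuracy and specificity for binary masks (values in {0, 1})."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    acc = (tp + tn) / (tp + tn + fp + fn + eps)
    spec = tn / (tn + fp + eps)
    return dice, iou, acc, spec
```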
(1) The SE module is added to the first four encoder layers of U-Net to form the baseline network, denoted "Baseline" in Table 1; (2) CSAM is added to the proposed baseline network, denoted "Baseline+CSAM" in Table 1; (3) MsGAM is added to the proposed baseline network, denoted "Baseline+MsGAM" in Table 1. As can be seen from Table 1, both Baseline+CSAM and Baseline+MsGAM improve the Dice coefficient and IoU, and UAU-Net, which combines all modules, achieves a Dice coefficient of 60.25%, an improvement of 1.38% over the baseline network, with IoU improved by 1.27%.
Fig. 6 shows the segmentation results of pigment deposits in fundus images using different methods: (a) the original image, (b) the gold standard, and (c)-(f) the results of the baseline network, baseline+MsGAM, baseline+CSAM and the proposed UAU-Net, in that order. In the figure, white regions are correctly segmented, dark gray regions are under-segmented (missed), and light gray regions are over-segmented. The results show that the proposed UAU-Net performs best.
For an objective evaluation of the performance of the method of the present application, the proposed method is compared with other excellent CNN-based segmentation networks, including U-Net, the context encoder network (CE-Net), the context pyramid fusion network (CPFNet), the attention U-shaped network (Att-UNet) and the curvilinear structure segmentation network (CS2-Net). In these experiments, the parameter settings were the same as those of UAU-Net.
Fig. 7 shows the visual detection results of pigment regions in fundus images: (a) the original image, (b) the gold standard, and (c)-(g) the segmentation results of U-Net, CE-Net, CPFNet, CS2-Net and the proposed UAU-Net, in that order. The method proposed by the application achieves the best results.
The results of the quantitative analysis are shown in Table 2 below; the proposed method achieves good results on all of Dice, IoU, Acc and Spec. Compared with CE-Net, the best-performing network among the compared methods, Dice is improved by 2.14% and IoU by 1.81%. Except for CS2-Net, these networks do not employ global attention and do not consider long-range dependencies between features; for targets such as pigment deposits, which are widely distributed and vary greatly in size, global attention correlates long-range features and makes it easier to determine whether features at different distances belong to the same target. Compared with CS2-Net, Dice is improved by 8.17%; although CS2-Net also uses self-attention-related modules, like the networks above it is not well suited to small targets such as pigment deposits and fails to capture the most critical information for them. More notably, the method provided by the application achieves the smallest standard deviation on both the Dice and IoU metrics, indicating that its segmentation results are the most stable.
TABLE 2
In summary, the method for pigment segmentation in the retinal pigment degeneration fundus image provided by the application has been implemented and verified. The multi-scale global attention module MsGAM and the channel and spatial joint attention module CSAM proposed by the application overcome shortcomings of traditional models such as insufficient extraction of multi-scale global context feature information and excessive redundant information. Experimental results show that the UAU-Net designed by the application can effectively segment pigment deposits in fundus images, which is helpful for ophthalmologists in the diagnosis and treatment of retinitis pigmentosa patients.
The embodiment of the application also provides an electronic device, which comprises a processor, a memory and a bus system, wherein the processor and the memory are connected through the bus system, the memory is used for storing instructions, and the processor is used for executing the instructions stored in the memory so as to implement the above method for pigment segmentation in the retinal pigment degeneration fundus image.
Note that the above is only a preferred embodiment of the present application and the technical principle applied. It will be understood by those skilled in the art that the present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the application. Therefore, while the application has been described in connection with the above embodiments, the application is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the application, which is set forth in the following claims.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is apparent that the above examples are given by way of illustration only and are not limiting of the embodiments. Other variations and modifications of the present application will be apparent to those of ordinary skill in the art in light of the foregoing description. It is not necessary here nor is it exhaustive of all embodiments. And obvious variations or modifications thereof are contemplated as falling within the scope of the present application.

Claims (10)

1. A network model of pigment segmentation in a retinal pigment-modified fundus image, comprising:
an encoder for extracting a feature map from an original picture;
the multi-scale global attention module is connected with the encoder and used for fusing multi-scale global context information according to the characteristics of the characteristic diagram extracted by the encoder;
the channel and space joint attention module is used for extracting context semantic features according to the features of the feature map extracted by the encoder;
the decoder is connected with the multi-scale global attention module and used for recovering the extracted characteristics according to the multi-scale global context information and the context semantic characteristics to obtain a prediction result;
wherein a joint attention module of channel and space is adopted between each layer of the encoder and the decoder block for connection;
after the output of each layer of the encoder passes through the channel and space joint attention module, the obtained output is directly added with the output of the upper layer of the decoder to be used as the input of the lower layer of the decoder.
2. The network model of pigment segmentation in a retinitis pigmentosa fundus image according to claim 1, wherein the encoder comprises a total of five layers, with two 3x3 convolutions, batch normalization and ReLU activation performed after each downsampling; after each downsampling, the number of channels is doubled and the resolution is halved, and the numbers of output channels are 32, 64, 128, 256 and 512, respectively.
3. The network model for pigment segmentation in a retinal pigment degeneration fundus image according to claim 2, wherein a squeeze-and-excitation module is added after the ReLU activation in the first four layers of the encoder; the squeeze-and-excitation module applies global average pooling to the input feature map, and the output vector obtained through two fully connected layers is multiplied channel-wise with the original feature map as channel weights to obtain the output feature map.
4. The network model of pigment segmentation in a retinitis pigmentosa fundus image of claim 1, wherein the multi-scale global attention module structure is:
the multi-scale global attention module comprises three inputs, a K end, a Q end and a V end; the input feature X_in passes through a 1x1 convolution and is used as the input of the Q end;
the input feature X_in passes through a multi-scale information fusion module to obtain the feature X_m; the channel information obtained by passing X_m through a channel attention mechanism is multiplied with the input feature X_in and used as the input of the K end, wherein the channel attention mechanism comprises average pooling, two 1x1 convolutions, a ReLU activation and a Sigmoid activation function;
the input feature X_in passes through the multi-scale information fusion module to obtain the feature X_m; average pooling and maximum pooling are applied to X_m for global feature refinement, the two resulting global features are concatenated to obtain a spatial information fusion feature, and this feature, after a 7x7 convolution, is multiplied with the feature X_m to obtain the feature map used as the input of the V end;
the output of the Q end is multiplied by the transposed output of the K end, a channel fusion result is obtained through a normalized exponential function, the channel fusion result is multiplied by the output of the V end after transposition, and the final output feature X_out is obtained after reshaping.
5. The network model of pigment segmentation in a retinitis pigmentosa fundus image of claim 4, wherein the multi-scale information fusion module consists of parallel 1x1, 3x3 and 5x5 convolutions for extracting multi-scale features.
6. The network model of pigment segmentation in a retinitis pigmentosa fundus image of claim 1, wherein the channel and spatial joint attention module is structured as:
feature T obtained by the encoder h And decoder upsampled feature T up Combining, normalizing the obtained characteristics by a 1x1 convolution and spatial attention module to obtain weight information, and then combining the weight information with the characteristics T h Multiplying by T up And (5) information fusion is carried out.
7. A network model of pigment segmentation in a retinitis pigmentosa fundus image according to claim 1, characterized in that the network model employs a joint segmentation loss function based on a Dice loss and a cross entropy loss, expressed as follows:
wherein L_Dice denotes the Dice loss, L_BCE denotes the cross-entropy loss, L_joint denotes the joint segmentation loss of the Dice loss and cross-entropy loss, g (0 ≤ g ≤ 1) denotes the target pixel value in the gold standard, p (0 ≤ p ≤ 1) denotes the pixel value of the network output map, C denotes the total number of pixels in the image, and i denotes the i-th pixel.
8. A method of pigment segmentation in a retinitis pigmentosa fundus image, comprising:
S1: collecting different retinal pigment degeneration fundus images to form a dataset, and randomly dividing the dataset into a training set, a validation set and a test set;
S2: constructing a network model of pigment segmentation in the retinal pigment-modified fundus image of any one of claims 1 to 7;
S3: training the network model using the training set and the validation set, and saving the model weights when the performance on the validation set is best;
S4: loading the saved model weights, and predicting fundus images with the trained network model to realize the segmentation of pigment deposits.
9. The method of pigment segmentation in a retinitis pigmentosa fundus image of claim 8, wherein the segmentation results are evaluated using the Dice coefficient, intersection over union, accuracy and specificity.
10. An electronic device comprising a processor, a memory and a bus system, the processor and the memory being connected by the bus system, the memory being for storing instructions, the processor being for executing the instructions stored by the memory to implement a method of pigment segmentation in a retinitis pigmentosa fundus image according to any of claims 8 to 9.
CN202310544818.9A 2023-05-15 2023-05-15 Network model and method for pigment segmentation in retinal pigment degeneration fundus image Pending CN116740076A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310544818.9A CN116740076A (en) 2023-05-15 2023-05-15 Network model and method for pigment segmentation in retinal pigment degeneration fundus image

Publications (1)

Publication Number Publication Date
CN116740076A true CN116740076A (en) 2023-09-12

Family

ID=87905314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310544818.9A Pending CN116740076A (en) 2023-05-15 2023-05-15 Network model and method for pigment segmentation in retinal pigment degeneration fundus image

Country Status (1)

Country Link
CN (1) CN116740076A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117522881A (en) * 2023-11-06 2024-02-06 山东省人工智能研究院 Cardiac image segmentation method based on attention mechanism and multi-level feature fusion


Similar Documents

Publication Publication Date Title
CN114612479B (en) Medical image segmentation method and device based on global and local feature reconstruction network
Lin et al. Automatic retinal vessel segmentation via deeply supervised and smoothly regularized network
CN111259982A (en) Premature infant retina image classification method and device based on attention mechanism
Wang et al. Frnet: an end-to-end feature refinement neural network for medical image segmentation
Balakrishna et al. Automatic detection of lumen and media in the IVUS images using U-Net with VGG16 Encoder
JP6446374B2 (en) Improvements in image processing or improvements related to image processing
CN113298718A (en) Single image super-resolution reconstruction method and system
Hu et al. Automatic artery/vein classification using a vessel-constraint network for multicenter fundus images
CN112884788B (en) Cup optic disk segmentation method and imaging method based on rich context network
CN116309648A (en) Medical image segmentation model construction method based on multi-attention fusion
CN114140651A (en) Stomach focus recognition model training method and stomach focus recognition method
Yamanakkanavar et al. MF2-Net: A multipath feature fusion network for medical image segmentation
CN116740076A (en) Network model and method for pigment segmentation in retinal pigment degeneration fundus image
Yang et al. RADCU-Net: Residual attention and dual-supervision cascaded U-Net for retinal blood vessel segmentation
Jian et al. Dual-Branch-UNet: A Dual-Branch Convolutional Neural Network for Medical Image Segmentation.
Qin et al. A review of retinal vessel segmentation for fundus image analysis
CN112869704B (en) Diabetic retinopathy area automatic segmentation method based on circulation self-adaptive multi-target weighting network
CN114418987A (en) Retinal vessel segmentation method and system based on multi-stage feature fusion
CN113610842A (en) OCT image retina detachment and splitting automatic segmentation method based on CAS-Net
CN113052849A (en) Automatic segmentation method and system for abdominal tissue image
CN116091458A (en) Pancreas image segmentation method based on complementary attention
Wang et al. Ensemble of deep learning cascades for segmentation of blood vessels in confocal microscopy images
CN113379770B (en) Construction method of nasopharyngeal carcinoma MR image segmentation network, image segmentation method and device
Go et al. Combined Deep Learning of Fundus Images and Fluorescein Angiography for Retinal Artery/Vein Classification
He et al. Ultrasonic image diagnosis of liver and spleen injury based on a double-channel convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination