CN115587979B - Three-stage attention network-based diabetic retinopathy grading method - Google Patents

Publication number: CN115587979B
Authority: CN (China)
Legal status: Active
Application number: CN202211233514.2A
Other languages: Chinese (zh)
Other versions: CN115587979A
Inventor
蹇木伟
陈鸿瑜
王芮
举雅琨
杨成东
武玉增
Current Assignee: Shandong Jiude Intelligent Technology Co ltd; Linyi University; Shandong University of Finance and Economics
Application filed by Shandong Jiude Intelligent Technology Co ltd, Linyi University, Shandong University of Finance and Economics
Priority to CN202211233514.2A
Application granted; legal status: Active

Classifications

    • G06T 7/0002, 7/0012 — image analysis; inspection of images; biomedical image inspection
    • G06N 3/02, 3/08 — neural networks; learning methods
    • G06V 10/764, 10/82 — image/video recognition using classification; using neural networks
    • G06T 2207/20081, 2207/20084 — training/learning; artificial neural networks [ANN]
    • G06T 2207/30041 — biomedical image processing; eye; retina; ophthalmic
    • Y02A 90/10 — ICT supporting adaptation to climate change

Abstract

The application provides a three-stage attention network-based diabetic retinopathy grading method comprising two parts. First, a three-stage network model is constructed for the diabetic retinopathy grading task; the model splits the complex five-class diabetic retinopathy grading task into specific two-class and three-class sub-tasks, so that each specific lesion category is classified by a dedicated model, improving the accuracy of the overall grading. Second, attention modules are added to each of the three stages, including a newly designed non-local spatial attention module that gives the model both contextual information and sensitivity to the spatial position of lesions; it can classify easily confused categories effectively and accurately, improving the accuracy of each stage model and the classification efficiency of the whole model.

Description

Three-stage attention network-based diabetic retinopathy grading method
Technical Field
The application relates to the fields of computer vision and medical technology, and in particular to a method for grading diabetic retinopathy based on a three-stage attention network.
Background
Diabetic retinopathy (DR) is a complication of diabetes mellitus and the most common cause of blindness and visual disability. According to the 2003 international clinical DR classification system, DR can be divided into five stages according to fundus lesions: no DR, mild, moderate and severe non-proliferative DR (NPDR), and proliferative DR (PDR), of which PDR is the most severe. The main treatments in current medicine are anti-vascular endothelial growth factor (VEGF) injection and laser photocoagulation therapy. However, although treatments exist, the probability of blindness depends largely on early diagnosis.
Deep-learning-based disease classification has achieved great success in solving clinical problems; predicting the disease category can assist doctors in completing early diagnosis. Models for grading the severity of diabetic retinopathy have been proposed, but their grading accuracy still needs to be improved, possibly for the following reasons: 1) the convolution kernel in a convolutional neural network is a local feature-extraction operation that pays insufficient attention to context information, which is a limitation in DR grading because similar lesions, such as microaneurysms, can appear far apart in a fundus image; 2) different DR lesions have different shapes, for example microaneurysms appear punctate in fundus images while haemorrhages and exudates tend to form patches, so the model must properly extract not only texture features but also shape features; 3) most existing models perform well on with/without-DR grading but poorly on the multi-class task of mild/moderate/severe and proliferative DR.
In summary, for the DR severity grading task, efforts to improve model performance may focus on model design, on combining the contextual information in fundus images, and on extracting more lesion shape features.
Disclosure of Invention
To remedy the defects of the prior art, improve the efficiency of grading diabetic retinopathy severity, and better assist doctors in the early diagnosis of patients, the application designs a three-stage network model based on non-local attention and provides a method for grading diabetic retinopathy based on a three-stage attention network.
The application is realized by the following technical scheme: a method for grading diabetic retinopathy based on a three-stage attention network, comprising the following steps:
S1, constructing a data set: firstly, remapping the original five-class labels of the data set into corresponding two-class labels and three-class labels; the data set is then divided into a training set and a test set, and the original training set is divided into three specific training sets D1, D2 and D3 according to the three types of labels;
S2, designing a non-local attention module: including a non-local channel attention module NLCA and a non-local spatial attention module NLSA; each module comprises two parts, feature extraction and non-local attention calculation, designed in parallel; after image I enters the module, the output is a non-local channel attention map Fc or a non-local spatial attention map Fs;
S3: constructing a multi-label-based multi-classification network model and training it; according to the three types of labels redefined in S1, the model divides the network into three stages, Stage1, Stage2 and Stage3, where Stage1 and Stage2 are two-class tasks and Stage3 is a three-class task; the three stages run independently during training and in series during testing, i.e. the training sets D1, D2 and D3 described in S1 are input into the three models respectively for training, yielding three trained models;
S4: inputting the test set into the three-stage network model in turn for prediction to obtain the final prediction result y_pred;
S5: model evaluation: and comprehensively evaluating the network grading effect by using an Accuracy index (Accuracy).
Preferably, in step S1 the data set is the public APTOS 2019 Blindness Detection data set on Kaggle, whose original labels are y ∈ {0, 1, 2, 3, 4}, where 0 represents no DR, 1 represents mild DR, 2 represents moderate DR, 3 represents severe DR, and 4 represents proliferative DR; the step comprises the following sub-steps:
S1-1, dividing the data set into an original training set and a test set in a ratio of 8:2;
S1-2, reconstructing the divided original training set into multiple labels according to the five-class labels of the data set, namely y1 ∈ {0, 1}, y2 ∈ {0, 1} and y3 ∈ {1, 2, 3}, where in y1, 0 represents no DR and 1 represents DR; in y2, 0 represents non-proliferative DR and 1 represents proliferative DR; and in y3, 1 represents mild DR, 2 represents moderate DR, and 3 represents severe DR;
S1-3, constructing the original training set into three specific training sets D1, D2 and D3 according to the new labels, where D1 comprises all samples of the original training set, D2 contains only samples with DR, and D3 contains only non-proliferative DR samples;
Preferably, step S2 specifically includes the following steps:
S2-1, constructing the non-local channel attention module NLCA;
S2-1-1, feature extraction adopts three convolution layers with a residual shortcut in the style of the residual block of the ResNet network; extracting features from image I gives the feature map F:

F = δ( f1( δ( f3( δ( f1(I) ) ) ) ) ) ⊕ I   (1);

where f1 and f3 represent the 1x1 and 3x3 convolutions respectively, δ represents the ReLU activation function, and ⊕ I is the residual shortcut;
S2-1-2, the non-local channel attention part is divided into two sub-parts, global perception and channel attention; first, global information modeling is performed on image I, with the mathematical representation:

G_i = (1/C(I)) · Σ_j f(I_i, I_j) · g(I_j)   (2);

where i is the index of the output position whose response is to be calculated, j is the index of all possible positions, g represents a 1x1 convolution, and C(I) is a normalization parameter; f(I_i, I_j) is represented by formula (3):

f(I_i, I_j) = Softmax( θ(I_i)ᵀ φ(I_j) )   (3);

where θ and φ respectively represent two different 1x1 convolutions;
S2-1-3, the globally modeled feature G with context information further enters the channel attention part to generate the channel attention vector Mc; the mathematical expression is:

Mc = σ( W1( W0( AvgPool(G) ) ) ⊕ W1( W0( MaxPool(G) ) ) )   (4);

where AvgPool and MaxPool represent the average pooling and maximum pooling operations respectively, the fully connected layers W0 and W1 are used to learn the dependency between channels, σ represents the Sigmoid activation function, and ⊕ represents element-wise addition;
S2-1-4, the channel attention vector Mc is fused with the feature map F to obtain a feature map with channel attention, and the final output Fc of the non-local channel attention module NLCA is then obtained through a shortcut operation; the mathematical expression is:

Fc = ( Mc ⊗ F ) ⊕ F   (5);

where ⊗ represents element-wise multiplication and ⊕ represents element-wise addition;
S2-2, constructing the non-local spatial attention module NLSA;
S2-2-1, the feature extraction operation is similar to that of S2-1-1, except that the 3x3 convolution is replaced by a 13x13 large-kernel convolution, which increases the effective receptive field and gives the feature map more shape bias; extracting features from image I gives the feature map F':

F' = δ( f1( δ( f13( δ( f1(I) ) ) ) ) ) ⊕ I   (6)

where f13 represents a 13x13 convolution;
S2-2-2, the non-local spatial attention part is divided into two sub-parts, global modeling and spatial attention generation; the global modeling operation is the same as formulas (2) and (3) in S2-1, yielding the globally modeled feature G with context information;
S2-2-3, G further enters the spatial attention part to generate the spatial attention vector Ms; the mathematical expression is:

Ms = σ( f7( [ AvgPool(G) ; MaxPool(G) ] ) )   (7)

where f7 represents a 7x7 convolution, σ represents the Sigmoid activation function, and AvgPool and MaxPool represent the average pooling and maximum pooling operations respectively;
S2-2-4, the same operation as formula (5) in S2-1 gives the final output Fs of the non-local spatial attention module NLSA: Fs = ( Ms ⊗ F' ) ⊕ F'.
Preferably, in step S3 the three-stage network models Stage1, Stage2 and Stage3 are constructed, specifically comprising the following steps:
S3-1, constructing Stage1; Stage1 takes the NLCA module as its basic structure: by stacking NLCA_1, NLCA_2, ..., NLCA_n and adding two fully connected layers FC1 and FC2 as a classification head, the Stage1 network model is obtained. Image I entering Stage1 yields a two-class result, expressed mathematically as follows:

F_1 = NLCA_n( ... NLCA_2( NLCA_1(I) ) ... )   (8)

[ p0, p1 ] = Softmax( FC2( FC1( F_1 ) ) )   (9)

where p0 and p1 respectively represent the probability values of the model predicting sample categories y1 = 0 and y1 = 1;
S3-2, constructing Stage2 and Stage3; both take the NLSA module as their basic structure: by stacking NLSA_1, NLSA_2, ..., NLSA_n and adding two fully connected layers FC1 and FC2 as a classification head, the Stage2 and Stage3 network models are obtained. Image I entering Stage2 yields a two-class result, expressed mathematically as follows:

F_2 = NLSA_n( ... NLSA_2( NLSA_1(I) ) ... )   (10)

[ q0, q1 ] = Softmax( FC2( FC1( F_2 ) ) )   (11)

where q0 and q1 respectively represent the probability values of the model predicting sample categories y2 = 0 and y2 = 1.
Image I entering Stage3 yields a three-class result, expressed mathematically as follows:

F_3 = NLSA_n( ... NLSA_2( NLSA_1(I) ) ... )   (12)

[ r1, r2, r3 ] = Softmax( FC2( FC1( F_3 ) ) )   (13)

where r1, r2 and r3 respectively represent the probability values of the model predicting sample categories y3 = 1, 2 and 3.
S3-3, constructing the model optimization targets of the different stages; Stage1 and Stage2 are two-class networks and adopt the cross-entropy loss function:

L_CE = −(1/N) · Σ_{i=1..N} [ y_i · log(p_i) + (1 − y_i) · log(1 − p_i) ]   (14)

where N represents the number of samples, y_i represents the label of the ith sample, and p_i represents the probability that the ith sample is predicted positive; in Stage1 y_i is the label y1, and in Stage2 y_i is the label y2.
The training of Stage3 uses Focal Loss as its loss function:

L_FL = −(1/N) · Σ_{i=1..N} α · (1 − p_i)^γ · log(p_i)   (15)

where α and γ are two hyper-parameters, here set to 0.25 and 2; α represents the weight of the ith sample, the factor (1 − p_i)^γ controls the weighting of easy and hard samples, and p_i represents the predicted probability of the true class of the ith sample, p_i ∈ (0, 1);
S3-4, the training sets D1, D2 and D3 described in S1 are input into the three models respectively for training, yielding three trained models.
Preferably, the step S4 specifically includes the following steps:
S4-1, the test set sample is input into the trained Stage1 network to obtain a two-class result c1. If c1 = 0, it is taken as the final prediction result, y_pred = 0; if c1 = 1, a DR lesion exists in the image, and the image enters Stage2 for further prediction;
S4-2, the image enters the Stage2 network to obtain a two-class result c2; if c2 = 1, the final prediction result is y_pred = 4; if c2 = 0, the DR lesion in the image is non-proliferative, and the image enters Stage3 for further prediction;
S4-3, the image enters the Stage3 network to obtain a three-class result c3; the Stage3 prediction is taken as the final prediction result, y_pred = c3.
Preferably, the step S5 specifically includes the following steps:
S5-1, the predicted value y_pred and the label y of every sample in the test set are compared to calculate the number of correctly classified samples T:

T = Σ_{i=1..N} 1( y_pred,i = y_i )   (16);

where N is the total number of samples in the test set, i represents the ith sample in the test set, and 1(·) is the indicator function that equals 1 when the prediction is correct and 0 otherwise.
S5-2, the accuracy is calculated according to the formula to judge model performance:

Accuracy = T / N   (17).
The application adopts the above technical scheme and, compared with the prior art, has the following beneficial effects. The application mainly comprises two parts. First, a three-stage network model is constructed for the diabetic retinopathy grading task; the model splits the complex five-class diabetic retinopathy grading task into specific two-class and three-class sub-tasks, so that each specific lesion category is classified by a dedicated model, improving the accuracy of the overall grading. Second, attention modules are added to each of the three stage models, including a newly designed non-local spatial attention module that gives the model both contextual information and sensitivity to the spatial position of lesions; it can classify easily confused categories effectively and accurately, improving the accuracy of each stage model and the classification efficiency of the whole model.
Additional aspects and advantages of the application will be set forth in part in the description which follows, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a general flow chart of the present application;
FIG. 2 is a schematic diagram of a non-local channel attention module;
FIG. 3 is a schematic diagram of a non-local spatial attention module;
fig. 4 is a network training-testing flow chart.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will be more clearly understood, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present application and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced otherwise than as described herein, and therefore the scope of the present application is not limited to the specific embodiments disclosed below.
In order to improve the efficiency of grading the severity of diabetic retinopathy and better assist doctors in the early diagnosis of patients, the application designs a three-stage network model based on non-local attention. In a specific implementation, the technical scheme of the application can use computer software to realize an automatic operation flow. A method of three-stage attention network-based diabetic retinopathy grading according to an embodiment of the present application is described in detail below with reference to figs. 1-3.
The application provides a method for grading diabetic retinopathy based on a three-stage attention network, which specifically comprises the following steps:
S1, constructing a data set: firstly, remapping the original five-class labels of the data set into corresponding two-class labels and three-class labels; the data set is then divided into a training set and a test set, and the original training set is divided into three specific training sets D1, D2 and D3 according to the three types of labels. The data set is the public APTOS 2019 Blindness Detection data set on Kaggle, whose original labels are y ∈ {0, 1, 2, 3, 4}, where 0 represents no DR, 1 represents mild DR, 2 represents moderate DR, 3 represents severe DR, and 4 represents proliferative DR; as shown in fig. 1, the step specifically comprises:
S1-1, dividing the data set into an original training set and a test set in a ratio of 8:2;
S1-2, reconstructing the divided original training set into multiple labels according to the five-class labels of the data set, namely y1 ∈ {0, 1}, y2 ∈ {0, 1} and y3 ∈ {1, 2, 3}, where in y1, 0 represents no DR and 1 represents DR; in y2, 0 represents non-proliferative DR and 1 represents proliferative DR; and in y3, 1 represents mild DR, 2 represents moderate DR, and 3 represents severe DR;
S1-3, constructing the original training set into three specific training sets D1, D2 and D3 according to the new labels, where D1 comprises all samples of the original training set, D2 contains only samples with DR, and D3 contains only non-proliferative DR samples.
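As an illustration of the label remapping and training-set construction in S1-2 and S1-3, the following Python sketch maps each original APTOS label to the three stage-specific labels and builds the three training sets; the function names and the D1/D2/D3 list layout are illustrative, not part of the patent text.

```python
# Sketch of the label remapping in S1-2/S1-3.
# Original APTOS labels: 0 no DR, 1 mild, 2 moderate, 3 severe, 4 proliferative.

def remap_labels(y):
    """Map one five-class label to the three stage-specific labels.

    Returns (y1, y2, y3); y2/y3 are None when the sample does not belong
    to that stage's training set.
    """
    y1 = 0 if y == 0 else 1                          # Stage1: DR present?
    y2 = (1 if y == 4 else 0) if y != 0 else None    # Stage2: proliferative?
    y3 = y if y in (1, 2, 3) else None               # Stage3: NPDR severity
    return y1, y2, y3

def build_stage_sets(samples):
    """Split (image, label) pairs into the stage training sets D1, D2, D3."""
    d1, d2, d3 = [], [], []
    for img, y in samples:
        y1, y2, y3 = remap_labels(y)
        d1.append((img, y1))          # D1: all samples
        if y2 is not None:
            d2.append((img, y2))      # D2: samples with DR only
        if y3 is not None:
            d3.append((img, y3))      # D3: non-proliferative DR only
    return d1, d2, d3
```

Applied to a mixed batch, D1 keeps every sample while D2 and D3 shrink to the subsets described in S1-3.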
S2, designing a non-local attention module: including a non-local channel attention module NLCA and a non-local spatial attention module NLSA; each module comprises two parts, feature extraction and non-local attention calculation, designed in parallel; after image I enters the module, the output is a non-local channel attention map Fc or a non-local spatial attention map Fs. As shown in figs. 2 and 3, step S2 specifically includes the following steps:
S2-1, constructing the non-local channel attention module NLCA;
S2-1-1, feature extraction adopts three convolution layers with a residual shortcut in the style of the residual block of the ResNet network; extracting features from image I gives the feature map F:

F = δ( f1( δ( f3( δ( f1(I) ) ) ) ) ) ⊕ I   (1);

where f1 and f3 represent the 1x1 and 3x3 convolutions respectively, δ represents the ReLU activation function, and ⊕ I is the residual shortcut.
S2-1-2, the non-local channel attention part is divided into two sub-parts of global perception and channel attention; first, for an imageGlobal information modeling is performed, and mathematical representation is as follows:
(2);
where i is the index of the output location for which the response is to be calculated, j is the index of all possible locations,representing a 1x1 convolution,/->Is a normalization parameter; />Expressed by formula (3), the objective is to calculate the similarity between the pixels of the image I by means of dot multiplication and activate it with the Softmax function:
(3);
wherein and />Respectively representing two different 1x1 convolutions;
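The global modeling of formulas (2) and (3) can be sketched in NumPy as dot-product similarity followed by Softmax normalization and aggregation of a value projection over all positions; since the 1x1 convolutions θ, φ and g act per position, they are modeled here as plain linear maps with random illustrative weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable Softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def non_local(x, w_theta, w_phi, w_g):
    """x: (HW, C) flattened feature map; w_*: (C, C') per-position projections."""
    theta, phi, g = x @ w_theta, x @ w_phi, x @ w_g
    attn = softmax(theta @ phi.T, axis=-1)   # f(x_i, x_j): rows sum to 1
    return attn @ g                          # y_i = sum_j f(x_i, x_j) g(x_j)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))             # a 4x4 map with 8 channels, flattened
w = [rng.standard_normal((8, 8)) for _ in range(3)]
y = non_local(x, *w)                         # globally modeled feature, (16, 8)
```

Each output position is a similarity-weighted mixture of every input position, which is what gives the module its context sensitivity to long-distance lesions.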
S2-1-3, the globally modeled feature G with context information further enters the channel attention part to generate the channel attention vector Mc; the mathematical expression is:

Mc = σ( W1( W0( AvgPool(G) ) ) ⊕ W1( W0( MaxPool(G) ) ) )   (4);

where AvgPool and MaxPool represent the average pooling and maximum pooling operations respectively, the fully connected layers W0 and W1 are used to learn the dependency between channels, σ represents the Sigmoid activation function, whose purpose is to limit the weight value of each channel to (0, 1), and ⊕ represents element-wise addition;
S2-1-4, the channel attention vector Mc is fused with the feature map F to obtain a feature map with channel attention, and the final output Fc of the non-local channel attention module NLCA is then obtained through a shortcut operation; the mathematical expression is:

Fc = ( Mc ⊗ F ) ⊕ F   (5);

where ⊗ represents element-wise multiplication and ⊕ represents element-wise addition;
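A minimal NumPy sketch of the channel-attention generation and fusion in formulas (4) and (5), assuming a shared two-layer bottleneck W0, W1 with a ReLU hidden layer (a common design; the hidden activation is an assumption, not stated in the text) followed by element-wise rescaling and the shortcut addition:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(g, w0, w1):
    """g: (C, H, W) feature; w0: (C, C//r); w1: (C//r, C). Returns Mc, shape (C,)."""
    avg = g.mean(axis=(1, 2))                     # AvgPool over spatial positions
    mx = g.max(axis=(1, 2))                       # MaxPool over spatial positions
    mc = sigmoid(np.maximum(avg @ w0, 0) @ w1 +   # shared bottleneck, ReLU hidden
                 np.maximum(mx @ w0, 0) @ w1)     # element-wise add, formula (4)
    return mc

def fuse(f, mc):
    """Formula (5): per-channel rescale, then shortcut add."""
    return f * mc[:, None, None] + f

rng = np.random.default_rng(1)
f = rng.standard_normal((8, 4, 4))                # feature map F
w0, w1 = rng.standard_normal((8, 2)), rng.standard_normal((2, 8))
fc = fuse(f, channel_attention(f, w0, w1))        # module output Fc
```

The Sigmoid keeps every channel weight strictly inside (0, 1), so the shortcut guarantees the output never suppresses a channel entirely.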
S2-2, constructing the non-local spatial attention module NLSA;
S2-2-1, the feature extraction operation is similar to that of S2-1-1, except that the 3x3 convolution is replaced by a 13x13 large-kernel convolution, which increases the effective receptive field and gives the feature map more shape bias; extracting features from image I gives the feature map F':

F' = δ( f1( δ( f13( δ( f1(I) ) ) ) ) ) ⊕ I   (6)

where f13 represents a 13x13 convolution;
S2-2-2, the non-local spatial attention part is divided into two sub-parts, global modeling and spatial attention generation; the global modeling operation is the same as formulas (2) and (3) in S2-1, yielding the globally modeled feature G with context information;
S2-2-3, G further enters the spatial attention part to generate the spatial attention vector Ms; the mathematical expression is:

Ms = σ( f7( [ AvgPool(G) ; MaxPool(G) ] ) )   (7)

where f7 represents a 7x7 convolution, σ represents the Sigmoid activation function, and AvgPool and MaxPool represent the average pooling and maximum pooling operations respectively;
S2-2-4, the same operation as formula (5) in S2-1 gives the final output Fs of the non-local spatial attention module NLSA: Fs = ( Ms ⊗ F' ) ⊕ F'.
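The spatial-attention generation of formula (7) can be sketched as follows; a simple box filter stands in for the learned 7x7 convolution and the concatenation of the two pooled maps is approximated by their sum, so the sketch illustrates only the shapes and the (0, 1) squashing, not the learned kernel:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(g, k=7):
    """g: (C, H, W) -> per-position weights Ms of shape (H, W) in (0, 1)."""
    avg = g.mean(axis=0)                 # channel-wise average pooling
    mx = g.max(axis=0)                   # channel-wise max pooling
    pooled = avg + mx                    # stand-in for conv over [avg ; mx]
    h, w = pooled.shape
    pad = k // 2
    p = np.pad(pooled, pad)              # zero padding keeps the output size
    # naive k x k box filter as a stand-in for the learned f^{7x7}
    mixed = np.array([[p[i:i + k, j:j + k].mean()
                       for j in range(w)] for i in range(h)])
    return sigmoid(mixed)

rng = np.random.default_rng(2)
g = rng.standard_normal((8, 10, 10))     # globally modeled feature G
ms = spatial_attention(g)                # spatial attention vector Ms
```

Every spatial position gets its own weight in (0, 1), which is what makes the module sensitive to the spatial location of lesions.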
S3: constructing a multi-label-based multi-classification network model and training it; according to the three types of labels redefined in S1, the model divides the network into three stages, Stage1, Stage2 and Stage3, where Stage1 and Stage2 are two-class tasks and Stage3 is a three-class task; the three stages run independently during training and in series during testing, i.e. the training sets D1, D2 and D3 described in S1 are input into the three models respectively for training, yielding three trained models. The three-stage network models Stage1, Stage2 and Stage3 are constructed so that Stage1 classifies whether DR exists in the fundus image, Stage2 classifies whether the fundus image shows proliferative DR, and Stage3 further subdivides the severity of NPDR fundus images; the step comprises the following sub-steps:
S3-1, constructing Stage1; Stage1 takes the NLCA module as its basic structure: by stacking NLCA_1, NLCA_2, ..., NLCA_n and adding two fully connected layers FC1 and FC2 as a classification head, the Stage1 network model is obtained. Image I entering Stage1 yields a two-class result, expressed mathematically as follows:

F_1 = NLCA_n( ... NLCA_2( NLCA_1(I) ) ... )   (8)

[ p0, p1 ] = Softmax( FC2( FC1( F_1 ) ) )   (9)

where p0 and p1 respectively represent the probability values of the model predicting sample categories y1 = 0 and y1 = 1;
S3-2, constructing Stage2 and Stage3; both take the NLSA module as their basic structure: by stacking NLSA_1, NLSA_2, ..., NLSA_n and adding two fully connected layers FC1 and FC2 as a classification head, the Stage2 and Stage3 network models are obtained. Image I entering Stage2 yields a two-class result, expressed mathematically as follows:

F_2 = NLSA_n( ... NLSA_2( NLSA_1(I) ) ... )   (10)

[ q0, q1 ] = Softmax( FC2( FC1( F_2 ) ) )   (11)

where q0 and q1 respectively represent the probability values of the model predicting sample categories y2 = 0 and y2 = 1.
Image I entering Stage3 yields a three-class result, expressed mathematically as follows:

F_3 = NLSA_n( ... NLSA_2( NLSA_1(I) ) ... )   (12)

[ r1, r2, r3 ] = Softmax( FC2( FC1( F_3 ) ) )   (13)

where r1, r2 and r3 respectively represent the probability values of the model predicting sample categories y3 = 1, 2 and 3.
S3-3, constructing the model optimization targets of the different stages; Stage1 and Stage2 are two-class networks and adopt the cross-entropy loss function:

L_CE = −(1/N) · Σ_{i=1..N} [ y_i · log(p_i) + (1 − y_i) · log(1 − p_i) ]   (14)

where N represents the number of samples, y_i represents the label of the ith sample, and p_i represents the probability that the ith sample is predicted positive; in Stage1 y_i is the label y1, and in Stage2 y_i is the label y2.
The training set of Stage3 suffers from sample imbalance; to alleviate this problem, Focal Loss is used as its loss function:

L_FL = −(1/N) · Σ_{i=1..N} α · (1 − p_i)^γ · log(p_i)   (15)

where α and γ are two hyper-parameters, here set to 0.25 and 2; α represents the weight of the ith sample, the factor (1 − p_i)^γ controls the weighting of easy and hard samples, and p_i represents the predicted probability of the true class of the ith sample, p_i ∈ (0, 1);
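A NumPy sketch of the Focal Loss of formula (15) with α = 0.25 and γ = 2; it shows how the factor (1 − p_i)^γ down-weights easy samples relative to hard ones, which is why it mitigates the class imbalance of the Stage3 training set:

```python
import numpy as np

def focal_loss(probs, labels, alpha=0.25, gamma=2.0):
    """probs: (N, K) softmax outputs; labels: (N,) integer class indices."""
    p_t = probs[np.arange(len(labels)), labels]        # probability of true class
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t)))

probs = np.array([[0.90, 0.05, 0.05],   # easy sample: confident and correct
                  [0.34, 0.33, 0.33]])  # hard sample: near-uniform prediction
easy = focal_loss(probs[:1], np.array([0]))
hard = focal_loss(probs[1:], np.array([0]))
```

With γ = 2 the easy sample's loss is scaled by (1 − 0.9)² = 0.01, so the gradient signal is dominated by hard, ambiguous samples.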
S3-4, the training sets D1, D2 and D3 described in S1 are input into the three models respectively for training, yielding three trained models.
S4: inputting the test set into the three-stage network model in turn for prediction to obtain the final prediction result y_pred; the step specifically comprises the following sub-steps:
S4-1, the test set sample is input into the trained Stage1 network to obtain a two-class result c1. If c1 = 0, it is taken as the final prediction result, y_pred = 0; if c1 = 1, a DR lesion exists in the image, and the image enters Stage2 for further prediction;
S4-2, the image enters the Stage2 network to obtain a two-class result c2; if c2 = 1, the final prediction result is y_pred = 4; if c2 = 0, the DR lesion in the image is non-proliferative, and the image enters Stage3 for further prediction;
S4-3, the image enters the Stage3 network to obtain a three-class result c3; the Stage3 prediction is taken as the final prediction result, y_pred = c3.
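The serial test-time cascade of S4-1 to S4-3 can be sketched as follows; the three stage predictors are stand-in callables here, whereas in the method they are the trained Stage1/Stage2/Stage3 networks:

```python
# Sketch of the serial test-time cascade in S4-1..S4-3.

def cascade_predict(image, stage1, stage2, stage3):
    """Return the final five-class prediction y_pred in {0, 1, 2, 3, 4}."""
    if stage1(image) == 0:       # Stage1: no DR detected
        return 0
    if stage2(image) == 1:       # Stage2: proliferative DR
        return 4
    return stage3(image)         # Stage3: mild/moderate/severe -> 1/2/3

# toy stand-ins that classify by a string tag instead of a fundus image
s1 = lambda img: 0 if img == "healthy" else 1
s2 = lambda img: 1 if img == "pdr" else 0
s3 = lambda img: {"mild": 1, "moderate": 2, "severe": 3}[img]
```

Only uncertain images reach the later stages, which is how the cascade lets each specialised model handle the categories it was trained on.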
S5: model evaluation: comprehensively evaluating the network grading effect with the Accuracy metric; the step specifically comprises the following sub-steps:
S5-1, the predicted value y_pred and the label y of every sample in the test set are compared to calculate the number of correctly classified samples T:

T = Σ_{i=1..N} 1( y_pred,i = y_i )   (16);

where N is the total number of samples in the test set, i represents the ith sample in the test set, and 1(·) is the indicator function that equals 1 when the prediction is correct and 0 otherwise.
S5-2, the accuracy is calculated according to the formula to judge model performance:

Accuracy = T / N   (17).
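Formulas (16) and (17) amount to counting correct predictions and dividing by the test-set size; a minimal sketch:

```python
# Sketch of the evaluation in formulas (16)-(17).

def accuracy(preds, labels):
    """preds, labels: equal-length sequences of class indices."""
    t = sum(1 for p, y in zip(preds, labels) if p == y)   # formula (16)
    return t / len(labels)                                # formula (17)
```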
in the description of the present application, the term "plurality" means two or more, unless explicitly defined otherwise, the orientation or positional relationship indicated by the terms "upper", "lower", etc. are based on the orientation or positional relationship shown in the drawings, merely for convenience of description of the present application and to simplify the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore should not be construed as limiting the present application; the terms "coupled," "mounted," "secured," and the like are to be construed broadly, and may be fixedly coupled, detachably coupled, or integrally connected, for example; can be directly connected or indirectly connected through an intermediate medium. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
In the description of the present specification, the terms "one embodiment," "some embodiments," "particular embodiments," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above is only a preferred embodiment of the present application, and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (3)

1. A method for grading diabetic retinopathy based on a three-stage attention network, comprising the steps of:
s1, constructing a data set: firstly, remapping the original five-class labels of the data set into corresponding two-class labels and three-class labels; the data set is then divided into a training set and a test set, and the original training set is divided into three specific training sets D₁, D₂, D₃ according to the three types of labels;
The data set is the public APTOS 2019 Blindness Detection data set on Kaggle; its original label is y ∈ {0, 1, 2, 3, 4}, wherein 0 represents no DR, 1 represents mild DR, 2 represents moderate DR, 3 represents severe DR, and 4 represents proliferative DR; the construction comprises the steps of:
s1-1, dividing the data set into an original training set and a test set at a ratio of 8:2;
s1-2, reconstructing the divided original training set into multiple labels according to the five-class labels of the data set, namely y₁ ∈ {0, 1}, y₂ ∈ {0, 1}, y₃ ∈ {1, 2, 3}, wherein in y₁, 0 represents no DR and 1 represents DR; in y₂, 0 represents non-proliferative DR and 1 represents proliferative DR; in y₃, 1 represents mild DR, 2 represents moderate DR, and 3 represents severe DR;
s1-3, constructing three specific training sets D₁, D₂, D₃ from the original training set according to the new labels, wherein D₁ comprises all samples of the original training set, D₂ contains only DR samples, and D₃ contains only non-proliferative DR samples;
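As an editorial illustration of steps S1-2 and S1-3 (the function name, the `None` convention for undefined labels, and the toy label list are mine, not part of the patent), the label remapping and subset construction can be sketched as:

```python
def remap(y):
    """Map an original APTOS label y in {0,1,2,3,4} to (y1, y2, y3).

    y1: 0 = no DR, 1 = DR                          (defined for all samples)
    y2: 0 = non-proliferative, 1 = proliferative   (defined when y1 == 1)
    y3: 1/2/3 = mild/moderate/severe               (defined for non-proliferative DR)
    """
    y1 = 0 if y == 0 else 1
    y2 = None if y == 0 else (1 if y == 4 else 0)
    y3 = y if y in (1, 2, 3) else None
    return y1, y2, y3

train = [0, 1, 2, 3, 4, 0, 2]  # toy "original training set" of labels
D1 = [(y, remap(y)[0]) for y in train]                    # all samples, label y1
D2 = [(y, remap(y)[1]) for y in train if y != 0]          # DR samples only, label y2
D3 = [(y, remap(y)[2]) for y in train if y in (1, 2, 3)]  # non-proliferative DR, label y3
print(len(D1), len(D2), len(D3))  # 7 5 4
```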
s2, designing a non-local attention module: including a non-local channel attention module NCA and a non-local spatial attention module NSA; each non-local attention module comprises two parts, feature extraction and non-local attention calculation, which are designed in a parallel manner; after the image I enters the module, the output is a non-local channel attention map F_nc or a non-local spatial attention map F_ns;
The method specifically comprises the following steps:
s2-1, constructing the non-local channel attention module NCA;
S2-1-1, performing feature extraction with three convolution layers, adding a residual block as in the ResNet network; extracting features from the image I yields the feature map F:

F = δ(W_1×1(δ(W_3×3(δ(W_1×1(I))))))   (1);

wherein W_1×1 and W_3×3 represent the 1×1 convolution and the 3×3 convolution, respectively, and δ represents the ReLU activation function;
s2-1-2, the non-local channel attention part is divided into two sub-parts, global perception and channel attention; first, global information modeling is performed on the image, with the mathematical representation:

G_i = (1 / C(x)) Σ_{∀j} f(x_i, x_j) W_g(x_j)   (2);

where i is the index of the output location whose response is to be calculated, j is the index over all possible locations, W_g represents a 1×1 convolution, and C(x) is a normalization parameter; f(x_i, x_j) is represented by formula (3):

f(x_i, x_j) = θ(x_i)ᵀ φ(x_j)   (3);

wherein θ and φ respectively represent two different 1×1 convolutions;
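A numpy sketch of the global-perception step of formulas (2) and (3). This is my own toy re-implementation, not the patent's code: the feature map is flattened to (HW, C), the 1×1 convolutions θ, φ, W_g are reduced to per-position linear maps, and C(x) is taken to be the number of positions:

```python
import numpy as np

def nonlocal_block(x, theta_w, phi_w, g_w):
    """x: (HW, C) flattened feature map; *_w: (C, C) 1x1-conv weight matrices.

    Implements G_i = (1/C(x)) * sum_j f(x_i, x_j) * g(x_j),
    with f(x_i, x_j) = theta(x_i)^T phi(x_j)  (Eqs. 2-3).
    """
    theta = x @ theta_w          # (HW, C)
    phi = x @ phi_w              # (HW, C)
    g = x @ g_w                  # (HW, C)
    f = theta @ phi.T            # (HW, HW) pairwise similarities
    return (f @ g) / x.shape[0]  # normalize by number of positions C(x) = HW

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))  # a 4x4 map with 8 channels, flattened
out = nonlocal_block(x, np.eye(8), np.eye(8), np.eye(8))
print(out.shape)  # (16, 8)
```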
s2-1-3, the feature map G with context information thus obtained further enters the channel attention part to generate the channel attention vector V_c, with the mathematical expression:

V_c = σ(W₁(W₀(AvgPool(G))) ⊕ W₁(W₀(MaxPool(G))))   (4);

wherein AvgPool and MaxPool represent the mean pooling and maximum pooling operations, respectively, followed by the fully connected layers W₀ and W₁ used to learn the dependency between channels; σ represents the Sigmoid activation function, and ⊕ represents element-wise addition;
S2-1-4, the channel attention vector V_c is fused with the feature map F to obtain a feature map with channel attention, and the final output F_nc of the non-local channel attention module NCA is obtained through a shortcut operation, with the mathematical expression:

F_nc = (F ⊗ V_c) ⊕ I   (5);

wherein ⊗ represents element-wise multiplication and ⊕ represents element-wise addition;
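The channel-attention step of formulas (4) and (5) can be sketched in numpy as follows. This is a toy illustration under my own shape conventions (channel-first (C, H, W) arrays, a reduction ratio r for the shared layers W₀/W₁, scalar shortcut to the module input I), not the patent's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(G, F, I, W0, W1):
    """G, F, I: (C, H, W); W0: (C, C//r); W1: (C//r, C).

    V_c  = sigmoid(W1(W0(AvgPool(G))) + W1(W0(MaxPool(G))))   (Eq. 4)
    F_nc = F * V_c (per channel) + I                          (Eq. 5)
    """
    avg = G.mean(axis=(1, 2))  # (C,) mean pooling over spatial positions
    mx = G.max(axis=(1, 2))    # (C,) max pooling over spatial positions
    v_c = sigmoid((avg @ W0) @ W1 + (mx @ W0) @ W1)
    return F * v_c[:, None, None] + I

rng = np.random.default_rng(1)
C, H, W, r = 8, 4, 4, 2
G = rng.standard_normal((C, H, W))
F = rng.standard_normal((C, H, W))
I = rng.standard_normal((C, H, W))
out = channel_attention(G, F, I,
                        rng.standard_normal((C, C // r)),
                        rng.standard_normal((C // r, C)))
print(out.shape)  # (8, 4, 4)
```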
s2-2, constructing the non-local spatial attention module NSA;
S2-2-1, the feature extraction operation is similar to that of S2-1-1, except that the 3×3 convolution is replaced by a large-kernel 13×13 convolution, which increases the effective receptive field and gives the feature map more shape bias; extracting features from the image I yields the feature map F:

F = δ(W_1×1(δ(W_13×13(δ(W_1×1(I))))))   (6)
S2-2-2, dividing the non-local spatial attention part into two sub-parts of global modeling and spatial attention generation; the global modeling operation is the same as equations (2) and (3) in S2-1 to obtain an image with context information
S2-2-3、Further enter the spatial attention part to generate a spatial attention vector +.>The mathematical expression is:
(7)
wherein Represents a 7x7 convolution, ">Representing Sigmoid activation functions, avgPool and MaxPool represent average pooling operations and maximum pooling operations, respectively;
s2-2-4, and operating with the formula (5) in S2-1 to obtain the final non-local spatial attention moduleOutput of +.>
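The spatial-attention step of formula (7) can be sketched in numpy as follows. Again a toy illustration under my own assumptions, not the patent's code: channel-wise average/max pooling produces a 2-channel map, the k×k convolution (the patent specifies 7×7) is written as a naive same-padded loop, and the fusion with F and the shortcut to I is assumed to mirror formula (5):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(G, F, I, kernel):
    """G, F, I: (C, H, W); kernel: (2, k, k) weights of a single k x k conv filter.

    V_s  = sigmoid(Conv_kxk([AvgPool(G); MaxPool(G)]))   (Eq. 7)
    F_ns = F * V_s + I                                   (fusion assumed as in Eq. 5)
    """
    pooled = np.stack([G.mean(axis=0), G.max(axis=0)])  # (2, H, W), channel-wise pooling
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    H, W = pooled.shape[1:]
    v_s = np.empty((H, W))
    for i in range(H):          # naive same-padded 2-D convolution
        for j in range(W):
            v_s[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return F * sigmoid(v_s) + I

rng = np.random.default_rng(2)
G = rng.standard_normal((8, 6, 6))
F = rng.standard_normal((8, 6, 6))
I = rng.standard_normal((8, 6, 6))
out = spatial_attention(G, F, I, rng.standard_normal((2, 7, 7)) * 0.1)
print(out.shape)  # (8, 6, 6)
```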
S3: constructing a multi-label-based multi-classification network model and training; the model divides the network into three stages of Stage1, stage2 and Stage3 according to the redefined three types of labels in the S1, wherein Stage1 and Stage2 are classified tasks, stage3 is a three-classified task, the three stages are independently operated in a training Stage of the network model, and the testing stages are serial; i.e. the training set described in S1、/>、/>Respectively inputting the three models to train to obtain three models after training;
the method specifically comprises the following steps:
s3-1, constructing Stage1; Stage1 takes the non-local attention module designed in S2 as its basic structure: by stacking attention modules M₁, M₂, ..., Mₙ and adding the two layers GAP (global average pooling) and FC (fully connected) as a classification head, the Stage1 network model is obtained; an image I entering Stage1 yields a binary classification result, expressed mathematically as follows:

F₁ = Mₙ(Mₙ₋₁(...M₁(I)))   (8)
[p₀, p₁] = Softmax(FC(GAP(F₁)))   (9)

wherein p₀ and p₁ respectively represent the probability values with which the model predicts the sample category y₁ = 0 or y₁ = 1;
s3-2, constructing Stage2 and Stage3; Stage2 and Stage3 likewise take the non-local attention module as the basic structure: by stacking M₁, M₂, ..., Mₙ and adding the two layers GAP and FC as a classification head, the Stage2 and Stage3 network models are obtained; an image I entering Stage2 yields a binary classification result, expressed mathematically as follows:

F₂ = Mₙ(Mₙ₋₁(...M₁(I)))   (10)
[q₀, q₁] = Softmax(FC(GAP(F₂)))   (11)

wherein q₀ and q₁ respectively represent the probability values with which the model predicts the sample category y₂ = 0 or y₂ = 1;
An image I entering Stage3 yields a three-class classification result, expressed mathematically as follows:

F₃ = Mₙ(Mₙ₋₁(...M₁(I)))   (12)
[r₁, r₂, r₃] = Softmax(FC(GAP(F₃)))   (13)

wherein r₁, r₂, and r₃ respectively represent the probability values with which the model predicts the sample category y₃ = 1, 2, or 3;
s3-3, constructing the model optimization targets of the different stages; Stage1 and Stage2 are binary classification networks,
so the cross entropy loss function is adopted:

L_CE = -(1/N) Σᵢ₌₁ᴺ [yᵢ log(pᵢ) + (1 - yᵢ) log(1 - pᵢ)]   (14)

where N represents the number of samples, yᵢ represents the label of the i-th sample, and pᵢ represents the probability that the i-th sample is predicted as positive; in Stage1, yᵢ = y₁; in Stage2, yᵢ = y₂;
The training of Stage3 uses Focal Loss as its loss function:

L_FL = -(1/N) Σᵢ₌₁ᴺ αᵢ (1 - pᵢ)^γ log(pᵢ)   (15)

wherein α and γ are two hyperparameters, here set to 0.25 and 2; αᵢ represents the weight of the i-th sample, γ is used to control the contribution of hard and easy samples, and pᵢ represents the predicted probability of the true class of the i-th sample, pᵢ ∈ (0, 1);
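The two loss functions of formulas (14) and (15) can be sketched as follows. A minimal illustration with my own conventions (a constant per-sample weight α, a small eps for numerical stability; neither is specified by the patent beyond α = 0.25, γ = 2):

```python
import math

def cross_entropy(y, p, eps=1e-12):
    """Binary cross entropy, Eq. (14): y are 0/1 labels, p are positive-class probabilities."""
    n = len(y)
    return -sum(yi * math.log(pi + eps) + (1 - yi) * math.log(1 - pi + eps)
                for yi, pi in zip(y, p)) / n

def focal_loss(p_true, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss, Eq. (15): p_true[i] is the predicted probability of sample i's
    true class; the factor (1 - p)^gamma down-weights easy, well-classified samples."""
    n = len(p_true)
    return -sum(alpha * (1 - pi) ** gamma * math.log(pi + eps) for pi in p_true) / n

# An easy sample (p = 0.9) contributes far less to the focal loss than a hard one (p = 0.1):
print(focal_loss([0.9]) < focal_loss([0.1]))  # True
```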
S3-4, training set described in S1、/>、/>Respectively inputting the three models to train to obtain three models after training;
s4: inputting the test set into a three-stage network model in turn for prediction to obtain a final prediction result
S5: model evaluation: and comprehensively evaluating the network grading effect by using the accuracy index.
2. The method for grading diabetic retinopathy based on a three-stage attention network according to claim 1, wherein step S4 specifically comprises the steps of:
S4-1, inputting a test set sample into the trained Stage1 network to obtain the binary classification result ŷ₁; if ŷ₁ = 0, it is taken as the final prediction result ŷ = 0; if ŷ₁ = 1, a DR lesion exists in the image, and the image enters Stage2 for further prediction;
S4-2, the image enters the Stage2 network to obtain the binary classification result ŷ₂; if ŷ₂ = 1, the final prediction result is ŷ = 4; if ŷ₂ = 0, the DR lesion in the image is non-proliferative, and the image enters Stage3 for further prediction;
S4-3, the image enters the Stage3 network to obtain the three-class classification result ŷ₃; the result predicted by the Stage3 network is taken as the final prediction result ŷ = ŷ₃.
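The serial test-time logic of steps S4-1 to S4-3 can be sketched with stub per-stage predictors (the lambda predictors below are placeholders standing in for the trained Stage1/2/3 networks, not part of the patent):

```python
def grade(image, stage1, stage2, stage3):
    """Three-stage cascaded prediction (steps S4-1 to S4-3).

    stage1(image) -> 0/1   (no DR / DR)
    stage2(image) -> 0/1   (non-proliferative / proliferative)
    stage3(image) -> 1/2/3 (mild / moderate / severe)
    """
    if stage1(image) == 0:
        return 0              # S4-1: no DR, final grade 0
    if stage2(image) == 1:
        return 4              # S4-2: proliferative DR, final grade 4
    return stage3(image)      # S4-3: mild/moderate/severe, final grade 1-3

# Stub predictors illustrating the three paths through the cascade:
print(grade("img", lambda x: 0, lambda x: 1, lambda x: 3))  # 0
print(grade("img", lambda x: 1, lambda x: 1, lambda x: 3))  # 4
print(grade("img", lambda x: 1, lambda x: 0, lambda x: 2))  # 2
```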
3. The method for grading diabetic retinopathy based on a three-stage attention network according to claim 1, wherein step S5 specifically comprises the steps of:
S5-1, comparing the predicted value ŷᵢ of every sample in the test set with its label yᵢ, and counting the number of correctly classified samples N_c;
S5-2, calculating the accuracy according to the formula Accuracy = N_c / N (17), where N is the total number of samples in the test set, to judge model performance.
CN202211233514.2A 2022-10-10 2022-10-10 Three-stage attention network-based diabetic retinopathy grading method Active CN115587979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211233514.2A CN115587979B (en) 2022-10-10 2022-10-10 Three-stage attention network-based diabetic retinopathy grading method


Publications (2)

Publication Number Publication Date
CN115587979A CN115587979A (en) 2023-01-10
CN115587979B true CN115587979B (en) 2023-08-15

Family

ID=84780924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211233514.2A Active CN115587979B (en) 2022-10-10 2022-10-10 Three-stage attention network-based diabetic retinopathy grading method

Country Status (1)

Country Link
CN (1) CN115587979B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116384448B (en) * 2023-04-10 2023-09-12 中国人民解放军陆军军医大学 CD severity grading system based on hybrid high-order asymmetric convolution network

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021916A (en) * 2017-12-31 2018-05-11 南京航空航天大学 Deep learning diabetic retinopathy sorting technique based on notice mechanism
US10430946B1 (en) * 2019-03-14 2019-10-01 Inception Institute of Artificial Intelligence, Ltd. Medical image segmentation and severity grading using neural network architectures with semi-supervised learning techniques
CN110837803A (en) * 2019-11-07 2020-02-25 复旦大学 Diabetic retinopathy grading method based on depth map network
CN111259982A (en) * 2020-02-13 2020-06-09 苏州大学 Premature infant retina image classification method and device based on attention mechanism
CN111461218A (en) * 2020-04-01 2020-07-28 复旦大学 Sample data labeling system for fundus image of diabetes mellitus
CN111639564A (en) * 2020-05-18 2020-09-08 华中科技大学 Video pedestrian re-identification method based on multi-attention heterogeneous network
AU2020103938A4 (en) * 2020-12-07 2021-02-11 Capital Medical University A classification method of diabetic retinopathy grade based on deep learning
CN112733961A (en) * 2021-01-26 2021-04-30 苏州大学 Method and system for classifying diabetic retinopathy based on attention mechanism
CN112819797A (en) * 2021-02-06 2021-05-18 国药集团基因科技有限公司 Diabetic retinopathy analysis method, device, system and storage medium
CN113537375A (en) * 2021-07-26 2021-10-22 深圳大学 Diabetic retinopathy grading method based on multi-scale cascade
CN113723451A (en) * 2021-07-20 2021-11-30 山东师范大学 Retinal image classification model training method, system, storage medium and device
CN113888412A (en) * 2021-11-23 2022-01-04 钟家兴 Image super-resolution reconstruction method for diabetic retinopathy classification
CN114219807A (en) * 2022-02-22 2022-03-22 成都爱迦飞诗特科技有限公司 Mammary gland ultrasonic examination image grading method, device, equipment and storage medium
CN114266757A (en) * 2021-12-25 2022-04-01 北京工业大学 Diabetic retinopathy classification method based on multi-scale fusion attention mechanism
CN114287878A (en) * 2021-10-18 2022-04-08 江西财经大学 Diabetic retinopathy focus image identification method based on attention model
CN115049898A (en) * 2022-07-05 2022-09-13 西安电子科技大学 Automatic grading method for lumbar intervertebral disc degeneration based on region block characteristic enhancement and inhibition

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10719936B2 (en) * 2017-04-27 2020-07-21 Retinascan Limited System and method for automated funduscopic image analysis
CN109753978B (en) * 2017-11-01 2023-02-17 腾讯科技(深圳)有限公司 Image classification method, device and computer readable storage medium
CN109686444A (en) * 2018-12-27 2019-04-26 上海联影智能医疗科技有限公司 System and method for medical image classification
EP3937753A4 (en) * 2019-03-13 2023-03-29 The Board Of Trustees Of The University Of Illinois Supervised machine learning based multi-task artificial intelligence classification of retinopathies
CN112906623A (en) * 2021-03-11 2021-06-04 同济大学 Reverse attention model based on multi-scale depth supervision


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Video saliency detection based on a 3D full-temporal-sequence convolutional neural network; Wang Jiaojin et al.; Computer Science; Vol. 47, No. 8, pp. 195-201 *

Also Published As

Publication number Publication date
CN115587979A (en) 2023-01-10

Similar Documents

Publication Publication Date Title
Li et al. CANet: cross-disease attention network for joint diabetic retinopathy and diabetic macular edema grading
Wang et al. Automatic image-based plant disease severity estimation using deep learning
Li et al. Lesion-attention pyramid network for diabetic retinopathy grading
Jiang et al. White blood cells classification with deep convolutional neural networks
CN113421652A (en) Method for analyzing medical data, method for training model and analyzer
Lim et al. The adoption of deep learning interpretability techniques on diabetic retinopathy analysis: a review
CN115587979B (en) Three-stage attention network-based diabetic retinopathy grading method
Sikder et al. Supervised learning-based cancer detection
CN113380413A (en) Method and device for constructing invalid re-circulation (FR) prediction model
CN108765374B (en) Method for screening abnormal nuclear area in cervical smear image
Agarwal et al. Mobile application based cataract detection system
CN114822823B (en) Tumor fine classification system based on cloud computing and artificial intelligence fusion multi-dimensional medical data
Pakzad et al. CIRCLe: Color invariant representation learning for unbiased classification of skin lesions
Chhabra et al. An advanced VGG16 architecture-based deep learning model to detect pneumonia from medical images
Orchi et al. Real-time detection of crop leaf diseases using enhanced YOLOv8 algorithm
Singh et al. A novel hybridized feature selection strategy for the effective prediction of glaucoma in retinal fundus images
Anandhakrishnan et al. Identification of tomato leaf disease detection using pretrained deep convolutional neural network models
Tariq et al. Towards counterfactual and contrastive explainability and transparency of DCNN image classifiers
Ali et al. COVID-19 pneumonia level detection using deep learning algorithm and transfer learning
CN112488996A (en) Inhomogeneous three-dimensional esophageal cancer energy spectrum CT (computed tomography) weak supervision automatic labeling method and system
Mercy Bai et al. Optimized deep neuro-fuzzy network with MapPeduce architecture for acute lymphoblastic leukemia classification and severity analysis
Al-khuzaie et al. Developing an efficient VGG19-based model and transfer learning for detecting acute lymphoblastic leukemia (ALL)
Gong et al. Evolutionary neural network and visualization for CNN-based pulmonary textures classification
Wang et al. Diabetic retinopathy detection based on weakly supervised object localization and knowledge driven attribute mining
KR102636461B1 (en) Automated labeling method, device, and system for learning artificial intelligence models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant