CN114330484A - Method and system for classification and focus identification of diabetic retinopathy through weak supervision learning - Google Patents


Info

Publication number
CN114330484A
Authority
CN
China
Prior art keywords
focus
module
feature
map
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111356999.XA
Other languages
Chinese (zh)
Inventor
潘俊君
谷云超
王心亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202111356999.XA
Publication of CN114330484A
Legal status: Pending

Abstract

The invention relates to a method and a system for diabetic retinopathy (DR) classification and lesion identification by weakly supervised learning, wherein the method comprises the following steps: S1: input the fundus color image into a CNN to obtain a feature map F; input F into a lesion point capture module based on weakly supervised object localization, select and enter one branch module, compute a score map, and multiply the score map with the feature map F to obtain a new feature map F'; S2: use a reinforcement-learning-based optimal network structure search module to compute the insertion position of the weakly supervised lesion point capture module in the CNN, the branch selection probabilities, and the feature dropping and keeping thresholds, finally outputting a feature map F_NA; S3: obtain a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module; S4: based on the result of S3, obtain the final lesion identification result through multiple iterative erasure computations. The proposed method improves the lesion capture effect and determines lesion types with a lesion mining method, thereby providing a basis for disease grading.

Description

Method and system for classification and focus identification of diabetic retinopathy through weak supervision learning
Technical Field
The invention relates to the field of image processing, and in particular to a method and a system for diabetic retinopathy classification and lesion identification through weakly supervised learning.
Background
In recent years, the incidence of diabetes has risen year by year, and diabetic retinopathy (DR), a major complication of diabetes, has become the leading cause of blindness among working-age people worldwide. According to statistics, about 500 million people in China are pre-diabetic, about 110 million are diabetic, and about 37 million suffer from diabetic retinopathy. DR diagnosis requires a professional ophthalmologist to examine lesion points on fundus color images and determine whether the patient is ill according to the DR grading standard. This process is time-consuming and labor-intensive, and remote or under-developed areas have few professional ophthalmologists, so patients cannot be diagnosed in time and treatment is delayed. According to statistics from the relevant departments, 87% of diabetic patients are currently diagnosed at county-level or lower medical institutions, yet basic DR diagnosis and treatment measures and appropriate techniques are deployed in tertiary medical institutions. An intelligent, automatic computer-aided DR diagnosis technique can therefore effectively relieve doctors' workload and improve diagnostic efficiency.
At present, computer-aided DR diagnosis based on deep learning is an active research topic. Computer vision techniques can extract features from DR fundus color images, abstractly understand the pathological changes, and complete DR diagnosis using the powerful generalization and induction capability of neural networks. Deep-learning-based disease grading and lesion identification have achieved great success in solving clinical problems. Among the many techniques, the basic tasks of disease grading and lesion detection provide great help for clinical diagnosis: disease grading prediction helps doctors complete early diagnosis, while lesion detection directly localizes lesion points and thus provides an interpretable diagnostic basis. A great deal of work exists on disease grading and lesion localization, but it usually treats the two tasks separately. The two tasks in fact complement each other: lesion detection comprehensively localizes potential lesions and thereby provides a reliable basis for grading, while the classification model used for disease grading learns inter-class differences that help improve detection performance. In addition, accurately labeled data is a precondition for applying deep learning in medical imaging: the disease grading task requires image-level labels, but lesion detection requires fine-grained bounding-box labels, which are time-consuming and laborious for professional doctors to produce. Therefore, how to accomplish disease grading and lesion detection with limited annotation is a challenge in medical image processing.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a method and a system for diabetic retinopathy classification and lesion identification by weakly supervised learning.
The technical solution of the invention is as follows. A method for diabetic retinopathy classification and lesion identification by weakly supervised learning comprises the following steps:
Step S1: input the fundus color image into a convolutional neural network to obtain a feature map F; input F into a lesion point capture module based on weakly supervised object localization, reduce it to a single-channel attention map, and according to the selection probabilities enter one of three branch modules, namely the attention branch module, the salient feature erasing module, or the salient feature keeping module, to compute a score map; then multiply the score map with the feature map F to obtain a new feature map F';
Step S2: use a reinforcement-learning-based optimal network structure search module to compute the insertion position of the weakly supervised lesion point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules, finally outputting a feature map F_NA;
Step S3: according to the diabetic retinopathy grading standard, obtain the lesion categories that distinguish DR grades and use them as prior knowledge labels for weakly supervised multi-label classification; obtain a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module;
Step S4: based on the lesion attribute prediction result and the disease grading result, obtain a visualization map of lesion positions using the class activation map method, and obtain the final lesion identification result through multiple iterative erasure computations.
Compared with the prior art, the invention has the following advantages:
1. The method uses a lesion point capture module based on weakly supervised object localization, which captures more potential lesion information during disease grading, i.e., it obtains potential lesion positions using class-specific information; it determines lesion types with a lesion mining method and can provide an interpretable diagnostic basis for disease grading through the visualization results.
2. The method optimizes the configuration of the lesion point capture module through a reinforcement-learning-based optimal network structure search module, improving the lesion capture effect.
3. The method takes the lesion grading criteria as prior-knowledge attribute labels, proposes an attribute mining method that lets the model fit these labels, further improves disease grading performance, and takes the attribute categories obtained during attribute mining as the lesion detection result.
Drawings
FIG. 1 is a flowchart of the method for diabetic retinopathy classification and lesion identification by weakly supervised learning according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the lesion point capture module based on weakly supervised object localization according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the calculation process of the reinforcement-learning-based optimal network structure search module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the lesion attribute mining process based on prior knowledge and iterative attribute mining according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating the lesion identification effect according to an embodiment of the present invention;
FIG. 6 is a block diagram of the system for diabetic retinopathy classification and lesion identification by weakly supervised learning according to an embodiment of the present invention.
Detailed Description
The invention provides a method for diabetic retinopathy classification and lesion identification by weakly supervised learning, which captures more potential lesion information during disease grading, improves the lesion capture effect, determines lesion types with a lesion mining method, and provides an interpretable diagnostic basis for disease grading through visualization results.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings.
Example one
As shown in fig. 1, a method for diabetic retinopathy classification and lesion identification by weakly supervised learning according to an embodiment of the present invention includes the following steps:
Step S1: input the fundus color image into a convolutional neural network to obtain a feature map F; input F into a lesion point capture module based on weakly supervised object localization, reduce it to a single-channel attention map, and according to the selection probabilities enter one of three branch modules, namely the attention branch module, the salient feature erasing module, or the salient feature keeping module, to compute a score map; then multiply the score map with the feature map F to obtain a new feature map F';
Step S2: use a reinforcement-learning-based optimal network structure search module to compute the insertion position of the weakly supervised lesion point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules, finally outputting a feature map F_NA;
Step S3: according to the diabetic retinopathy grading standard, obtain the lesion categories that distinguish DR grades and use them as prior knowledge labels for weakly supervised multi-label classification; obtain a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module;
Step S4: based on the lesion attribute prediction result and the disease grading result, obtain a visualization map of lesion positions using the class activation map method, and obtain the final lesion identification result through multiple iterative erasure computations.
In one embodiment, the step S1 (inputting the fundus color image into a convolutional neural network to obtain a feature map F; inputting F into the lesion point capture module based on weakly supervised object localization, reducing it to a single-channel attention map, entering one of the three branch modules, namely the attention branch module, the salient feature erasing module, or the salient feature keeping module, according to the selection probabilities to compute a score map, and multiplying the score map with the feature map F to obtain a new feature map F') specifically includes:
Step S11: input the fundus color image into a convolutional neural network (CNN) to obtain a feature map F, where F ∈ R^{H×W×D}, and H, W and D are the height, width and depth of the feature map F, respectively;
During computation, convolutional neural networks tend to rely only on the most discriminative features to perform image classification, so other potentially useful information is ignored. In medical images, however, disease grading often requires comprehensive evaluation of multiple lesions, and DR diagnosis requires comprehensive evaluation of multiple lesion attributes. Therefore, weakly supervised lesion localization is first performed on the feature map F extracted by the convolutional neural network, so that fine lesions in the medical image can be extracted more comprehensively, the disease grading accuracy is improved, and the feature map F serves as the basis for lesion capture and visualization;
step S12: f obtaining an attention feature map Att after average pooling on a depth dimension, wherein the Att belongs to RH×WThe Att maps the value to [0,1] after Sigmoid operation]Obtaining a one-dimensional characteristic diagram;
the magnitude of the value after Att is mapped to [0,1] reflects the discriminativity of the features. The salient feature erasure structure suppresses the most discriminative feature, and the part of Att with a value greater than the parameter discarding threshold is set to 0, which indicates that the most discriminative feature is discarded in this part of the spatial dimension. The salient feature retention structure avoids that the network pays more attention to the possibly invalid features to cause the classification performance to be reduced, and the part of the Att with the value smaller than the parameter discarding threshold value is set to be 0, which indicates that the features which are possibly useless for the classification result in the part of the spatial dimension are discarded.
Step S13: input the single-channel attention map into one of the attention branch module, the salient feature erasing module, and the salient feature keeping module according to the selection probability p to obtain a score map, where the selection probabilities of the attention branch module, the salient feature erasing module, and the salient feature keeping module are P_A, P_D and P_H respectively, and P_A + P_D + P_H = 1;
Step S14: multiplying the score chart with the feature chart F to obtain a new feature chart F', wherein a formula (1) is shown as follows;
Figure BDA0003357833160000041
where p is the probability of selection, x is the point-by-point multiplication, a is the output of the attention-branching module, D is the output of the salient feature-erasing module, and H is the output of the salient feature-preserving module.
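As a rough illustration of steps S11 to S14, the following NumPy sketch builds the attention map by depth-wise average pooling and Sigmoid, picks one of the three branches at random, and multiplies the resulting score map with F as in eq. (1). All names, shapes, and threshold values here are illustrative assumptions, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def capture_module(F, p=(0.5, 0.25, 0.25), t_drop=0.8, t_keep=0.2):
    """One training-time pass of the (hypothetical) lesion point capture module.

    F      : feature map of shape (H, W, D)
    p      : (P_A, P_D, P_H) branch selection probabilities, summing to 1
    t_drop : feature dropping threshold (erase branch)
    t_keep : feature keeping threshold (keep branch)
    """
    # S12: average-pool over the depth dimension, then Sigmoid -> Att in [0, 1]
    att = sigmoid(F.mean(axis=2))                  # shape (H, W)

    branch = rng.choice(3, p=p)
    if branch == 0:                                # attention branch -> A
        score = att
    elif branch == 1:                              # erase branch -> D
        score = np.where(att > t_drop, 0.0, att)   # suppress most discriminative
    else:                                          # keep branch -> H
        score = np.where(att < t_keep, 0.0, att)   # drop likely-useless features

    # S14 / eq. (1): broadcast the (H, W) score map over the depth dimension
    return F * score[:, :, None]

F = rng.standard_normal((8, 8, 16))
F_prime = capture_module(F)
```

Because the score map lies in [0, 1], every element of F' is attenuated relative to F, regardless of which branch is sampled.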
Fig. 2 shows the schematic structure of the lesion point capture module based on weakly supervised object localization. This module is used only during network training; it is not used in testing or practical application.
In one embodiment, the step S2 (using the reinforcement-learning-based optimal network structure search module to compute the insertion position of the weakly supervised lesion point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules, finally outputting a feature map F_NA) specifically includes:
An LSTM is used as a controller to generate the searched parameter values, and ResNet50 is used as the backbone network for feature extraction. Each parameter of the lesion point capture module based on weakly supervised object localization is obtained through the selection probability p, and the gradient of p is scaled by the validation-set performance of the module so as to update the controller. In this way the insertion position of the weakly supervised lesion point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules are determined.
In the lesion point capture module based on weakly supervised object localization, many parameters and settings are directly related to disease grading and lesion capture performance, including the insertion position of the module in the CNN, the selection probabilities of the three branches, and the feature dropping and keeping thresholds. When the module is inserted at a shallow layer, the CNN there extracts generic features, and dropping features would lose those generic features and impair feature abstraction in later layers. When the module is inserted at a deep layer, the features are highly class-related, and improper feature dropping would lose important information or extract wrong information. Likewise, the selection probabilities of the three branch modules and the dropping and keeping thresholds matter: the erasing-branch selection probability and the dropping threshold affect the network's ability to capture non-discriminative lesions, while the keeping-branch selection probability and the keeping threshold relate to the network's generalization performance. A neural architecture search method can therefore maximize the effectiveness of the weakly supervised lesion point capture structure.
Therefore, the embodiment of the invention constructs a reinforcement-learning-based optimal network structure search module. A recurrent LSTM controller samples micro-structures of the network from a given search set and splices them into an end-to-end architecture; the architecture is trained to convergence, and the controller is updated with a reinforcement learning method according to the model's performance on the validation set, so that the controller generates better network structures. The LSTM serves as the controller generating the searched parameter values, and ResNet50, widely used in other DR grading research, serves as the backbone network for feature extraction, to compute the insertion position of the weakly supervised lesion point capture module in the CNN, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules.
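The controller update loop can be illustrated with a minimal REINFORCE-style sketch. This replaces the LSTM controller with a plain logits vector and the validation performance with a stub reward; the candidate configurations (insertion layer, dropping threshold, keeping threshold) and all numbers are hypothetical, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical candidate configurations: (insertion layer, t_drop, t_keep).
candidates = [(2, 0.7, 0.2), (3, 0.8, 0.3), (4, 0.9, 0.4)]

logits = np.zeros(len(candidates))  # stand-in for the LSTM controller state

def validation_reward(cfg):
    # Stub: in the real system this would be the validation-set performance
    # of the network built with configuration cfg.
    layer, t_drop, t_keep = cfg
    return 1.0 - abs(t_drop - 0.8) - 0.1 * abs(layer - 3)

lr = 0.3
for _ in range(300):
    probs = np.exp(logits) / np.exp(logits).sum()
    i = rng.choice(len(candidates), p=probs)
    # Advantage = sampled reward minus the expected reward under the policy.
    advantage = validation_reward(candidates[i]) - sum(
        p_ * validation_reward(c) for p_, c in zip(probs, candidates))
    grad = -probs               # gradient of log-softmax w.r.t. logits ...
    grad[i] += 1.0              # ... for the sampled index i
    logits = logits + lr * advantage * grad

best = candidates[int(np.argmax(logits))]
```

The configuration with the highest stub reward accumulates probability mass, mirroring how the controller is nudged toward structures that score well on the validation set.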
Fig. 3 shows the calculation process of the reinforcement-learning-based optimal network structure search module.
After steps S1 and S2 are completed, the CNN trained with the weakly supervised lesion point capture module and the reinforcement-learning-based optimal network structure search module already has a comprehensive and accurate lesion capture capability. On this basis, the attribute mining and lesion identification module can further improve disease grading accuracy and complete lesion identification.
In one embodiment, the step S3 (according to the diabetic retinopathy grading standard, obtaining the lesion categories that distinguish DR grades, using them as prior knowledge labels for weakly supervised multi-label classification, and obtaining a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module) specifically includes:
Step S31: take the lesion categories of each lesion grade in diabetic retinopathy as prior knowledge to obtain six prior-knowledge attributes: microaneurysm, hemorrhage, venous beading, intraretinal microvascular abnormality, neovascularization, and pre-retinal hemorrhage, which serve as prior knowledge labels for the weakly supervised multi-label classification;
although a multi-classification label can be obtained by using the attributes, the lesion features represented by the same type of fundus images are not completely the same, and therefore the a priori knowledge attribute labels may not be completely correct. Therefore, different lesion attributes have different influences on the disease classification of the current image, and the method solves the problem by utilizing an attribute mining method and further improves the disease classification performance;
Step S32: with the feature map F_NA as input, the attribute mining and lesion identification module performs disease grading using a cross-entropy loss, and the attribute prediction q is computed by formula (2):
q = f(F_NA) (2)
where f is a function comprising a fully connected layer and a sigmoid layer, and q is a vector of dimension 6 representing the six predicted lesion attributes;
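A minimal sketch of the attribute head f in eq. (2). A global average pooling step before the fully connected layer is an assumption (the patent only states that f contains a fully connected layer and a sigmoid layer), and the weights are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

H, W, D, N_ATTR = 8, 8, 16, 6                   # six prior-knowledge attributes

F_NA = rng.standard_normal((H, W, D))
W_fc = 0.1 * rng.standard_normal((D, N_ATTR))   # hypothetical FC weights
b_fc = np.zeros(N_ATTR)

def f(feature_map):
    """f in eq. (2): (assumed) global average pooling, then FC, then sigmoid."""
    pooled = feature_map.mean(axis=(0, 1))      # shape (D,)
    return sigmoid(pooled @ W_fc + b_fc)        # shape (6,)

q = f(F_NA)
```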
Step S33: obtain a heat map with the class activation map method to obtain lesion information. A threshold t yields K activated regions on the heat map, forming K connected components, each taken as a lesion region. The heat map then undergoes K iterative computations: in each iteration one lesion region is covered on F_NA, the result is multiplied with F_NA and passed through f with shared weights to obtain a new attribute estimate q_k, q_k ∈ R^6. The lesion attribute weight of each iteration is obtained through a softmax layer, and finally the average is taken as the attribute weight, computed by formula (3):
weight = (1/(K+1)) Σ_{k=0..K} softmax(q_k) (3)
where softmax(·) is the softmax layer, k = 0 is computed on the original feature map, and weight is the lesion attribute weight, used to handle the not-entirely-correct attribute labels;
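Eq. (3) itself is just an average of per-iteration softmax outputs. A small sketch, with random stand-ins for the q_k (in the pipeline each q_k would come from f with shared weights):

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

K = 4                                   # number of covered lesion regions
# q_k for k = 0..K; k = 0 is the prediction on the original (uncovered) map.
qs = [rng.random(6) for _ in range(K + 1)]

# Eq. (3): average of the per-iteration softmax outputs
weight = sum(softmax(qk) for qk in qs) / (K + 1)
```

Since each softmax output sums to 1, the averaged weight vector also sums to 1.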
Step S34: multiply the attribute weight with the attribute prediction q to obtain the weighted attribute prediction, computed by formula (4):
q_weight = q × weight (4)
q_weight then passes through a fully connected layer and a sigmoid layer, and the binary cross-entropy loss against the attribute labels is computed, yielding the lesion attribute prediction and the disease grading results.
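A sketch of step S34 with hypothetical weights and labels; the final fully connected layer is reduced to a diagonal matrix purely for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(pred, target, eps=1e-7):
    # Binary cross-entropy over the six attribute outputs.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

q = np.array([0.9, 0.1, 0.2, 0.8, 0.3, 0.05])      # attribute prediction
weight = np.array([0.3, 0.1, 0.1, 0.3, 0.1, 0.1])  # mined attribute weights

q_weight = q * weight                               # eq. (4)

W_out = 4.0 * np.eye(6)                # hypothetical final FC layer
pred = sigmoid(q_weight @ W_out)

label = np.array([1, 0, 0, 1, 0, 0], dtype=float)  # prior-knowledge attribute label
loss = bce(pred, label)
```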
FIG. 4 is a schematic diagram illustrating a lesion attribute mining process based on prior knowledge and iterative attribute mining;
In one embodiment, the step S4 (based on the lesion attribute prediction result and the disease grading result, obtaining a visualization map of lesion positions using the class activation map method, and obtaining the final lesion identification result through multiple iterative erasure computations) specifically includes:
Based on the K iterative computations in step S33, the lesion attribute of the currently covered region can be computed by formula (5), and the K results are combined as the lesion identification result of the whole image:
attr_k = argmax(q_0 - q_k) (5)
where q_0 is the attribute prediction on the uncovered feature map and q_k is the prediction with the k-th lesion region covered.
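Assuming a difference-based reading of formula (5), a region's lesion category can be picked out as follows; the attribute vectors here are made-up examples.

```python
import numpy as np

ATTRS = ["microaneurysm", "hemorrhage", "venous beading",
         "intraretinal microvascular abnormality",
         "neovascularization", "pre-retinal hemorrhage"]

q0 = np.array([0.9, 0.2, 0.1, 0.1, 0.1, 0.1])  # prediction, nothing covered
qk = np.array([0.3, 0.2, 0.1, 0.1, 0.1, 0.1])  # prediction, region k covered

# The attribute whose score drops most when region k is covered is taken
# as that region's lesion category (assumed interpretation of formula (5)).
region_attr = ATTRS[int(np.argmax(q0 - qk))]
```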
the final lesion recognition effect map is shown in fig. 5, where the input image is a color image of the fundus, lesion recognition is a lesion recognition effect, and the potential lesion is a visual thermodynamic map of the potential lesion, where the highlighted portion indicates the potential lesion.
The method uses a lesion point capture module for weakly supervised object localization and captures more potential lesion information during disease grading, i.e., it obtains potential lesion positions using class-specific information, determines lesion types with a lesion mining method, and can provide an interpretable diagnostic basis for disease grading through the visualization results. The method optimizes the configuration of the lesion point capture module through the reinforcement-learning-based optimal network structure search module, improving the lesion capture effect. The method takes the lesion grading criteria as prior-knowledge attribute labels, proposes an attribute mining method that lets the model fit these labels, further improves disease grading performance, and takes the attribute categories obtained during attribute mining as the lesion detection result.
Example two
As shown in fig. 6, an embodiment of the present invention provides a system for diabetic retinopathy classification and lesion identification with weakly supervised learning, including the following modules:
a feature map obtaining module 51, configured to input the fundus color image into a convolutional neural network to obtain a feature map F; input F into the lesion point capture module based on weakly supervised object localization, reduce it to a single-channel attention map, enter one of the three branch modules (the attention branch module, the salient feature erasing module, or the salient feature keeping module) according to the selection probabilities to compute a score map, and multiply the score map with the feature map F to obtain a new feature map F';
an optimal network structure search module 52, configured to compute, using reinforcement learning, the insertion position of the weakly supervised lesion point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules, finally outputting a feature map F_NA;
a lesion attribute mining module 53, configured to obtain, according to the diabetic retinopathy grading standard, the lesion categories that distinguish DR grades, use them as prior knowledge labels for weakly supervised multi-label classification, and obtain a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module;
a lesion identification module 54, configured to obtain a visualization map of lesion positions using the class activation map method based on the lesion attribute prediction result and the disease grading result, and obtain the final lesion identification result through multiple iterative erasure computations.
The above examples are provided only for the purpose of describing the present invention, and are not intended to limit the scope of the present invention. The scope of the invention is defined by the appended claims. Various equivalent substitutions and modifications can be made without departing from the spirit and principles of the invention, and are intended to be within the scope of the invention.

Claims (6)

1. A method for diabetic retinopathy classification and lesion identification by weakly supervised learning, characterized by comprising the following steps:
Step S1: inputting the fundus color image into a convolutional neural network to obtain a feature map F; inputting F into a lesion point capture module based on weakly supervised object localization, reducing it to a single-channel attention map, entering one of three branch modules, namely the attention branch module, the salient feature erasing module, or the salient feature keeping module, according to the selection probabilities to compute a score map, and multiplying the score map with the feature map F to obtain a new feature map F';
Step S2: calculating, using a reinforcement-learning-based optimal network structure search module, the insertion position of the weakly supervised lesion point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature dropping and keeping thresholds in the salient feature erasing and keeping modules, and finally outputting a feature map F_NA;
Step S3: according to the diabetic retinopathy grading standard, obtaining the lesion categories that distinguish DR grades, using them as prior knowledge labels for weakly supervised multi-label classification, and obtaining a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module;
Step S4: based on the lesion attribute prediction result and the disease grading result, obtaining a visualization map of lesion positions using the class activation map method, and obtaining the final lesion identification result through multiple iterative erasure computations.
2. The method for diabetic retinopathy classification and lesion identification by weakly supervised learning according to claim 1, wherein step S1: inputting a color fundus image of diabetic retinopathy into a convolutional neural network to obtain a feature map F; inputting F into a focal point capture module based on weakly supervised object localization, reducing it to a one-dimensional feature map, selecting one of the attention branch module, the salient feature erasing module, and the salient feature retaining module according to the selection probability to compute a score map, and multiplying the score map with the feature map F to obtain a new feature map F', specifically comprises:
Step S11: input the color fundus image of diabetic retinopathy into the convolutional neural network (CNN) to obtain a feature map F, where F ∈ R^(H×W×D), and H, W, and D are the height, width, and depth of the feature map F, respectively;
Step S12: apply average pooling to F along the depth dimension to obtain an attention feature map Att, where Att ∈ R^(H×W); map the values of Att to [0, 1] with a Sigmoid operation to obtain the one-dimensional feature map;
Step S13: input the one-dimensional feature map into one of the attention branch module, the salient feature erasing module, and the salient feature retaining module according to the selection probability p to obtain a score map, where the selection probabilities of the attention branch module, the salient feature erasing module, and the salient feature retaining module are P_A, P_D, and P_H respectively, and P_A + P_D + P_H = 1;
Step S14: multiply the score map with the feature map F to obtain a new feature map F', as shown in formula (1):
F' = A ⊗ F, if the attention branch is selected (probability P_A);
F' = D ⊗ F, if the salient feature erasing branch is selected (probability P_D);
F' = H ⊗ F, if the salient feature retaining branch is selected (probability P_H)    (1)

where p is the selection probability, ⊗ is point-wise multiplication, A is the output of the attention branch module, D is the output of the salient feature erasing module, and H is the output of the salient feature retaining module.
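Steps S11–S14 can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the branch probabilities and the two thresholds are placeholder values that the invention instead learns via the search module of claim 3, and the thresholds are interpreted here as fractions of the attention map's maximum.

```python
import numpy as np

def focal_point_capture(F, p_a=0.6, p_d=0.2, p_h=0.2,
                        drop_thresh=0.8, keep_thresh=0.5, rng=None):
    """Sketch of the weakly supervised focal point capture module (claim 2).

    F: feature map of shape (H, W, D). Probabilities p_a/p_d/p_h and the
    thresholds are illustrative; the patent searches for them (claim 3).
    """
    rng = rng if rng is not None else np.random.default_rng()
    att = F.mean(axis=-1)                      # S12: average pool over depth -> (H, W)
    att = 1.0 / (1.0 + np.exp(-att))           # Sigmoid maps values to [0, 1]

    branch = rng.choice(["A", "D", "H"], p=[p_a, p_d, p_h])  # S13
    if branch == "A":                          # attention branch: score = Att itself
        score = att
    elif branch == "D":                        # erasing branch: zero out salient features
        score = np.where(att > drop_thresh * att.max(), 0.0, 1.0)
    else:                                      # retaining branch: keep only salient features
        score = np.where(att > keep_thresh * att.max(), 1.0, 0.0)

    return F * score[..., None]                # S14: point-wise multiply -> F'
```

Whichever branch is drawn, the output keeps the shape of F, so the module can be inserted between any two convolutional stages.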
3. The method for diabetic retinopathy classification and lesion identification by weakly supervised learning according to claim 1, wherein step S2: using the optimal network structure search module based on reinforcement learning to compute the insertion position of the weakly supervised focal point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature discarding and retaining thresholds in the salient feature erasing module and the salient feature retaining module, and finally outputting a feature map F_NA, specifically comprises:
using an LSTM as the controller to generate the searched parameter values, with ResNet50 as the backbone network for feature extraction; obtaining each parameter of the focal point capture module based on weakly supervised object localization through the selection probability p, and scaling the gradient of p by the validation performance of the focal point capture module so as to update the controller; thereby determining the insertion position of the weakly supervised focal point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature discarding and retaining thresholds of the salient feature erasing module and the salient feature retaining module.
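The controller update described in claim 3 follows the usual REINFORCE recipe: the validation performance of the sampled configuration scales the policy gradient used to update the controller. Below is a minimal single-decision sketch; the actual controller is an LSTM emitting several parameters jointly, and the learning rate, baseline, and number of candidate positions here are illustrative assumptions.

```python
import numpy as np

def reinforce_update(logits, action, reward, baseline, lr=0.1):
    """One REINFORCE step for the structure-search controller (claim 3).

    A single softmax over candidate insertion positions stands in for one
    controller decision. `reward` is the validation performance of the
    network built with `action`; the gradient of log p(action) is scaled
    by the advantage (reward - baseline).
    """
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    grad = -probs                  # d log p(a) / d logits = onehot(a) - probs
    grad[action] += 1.0
    return logits + lr * (reward - baseline) * grad
```

With a positive advantage the sampled action becomes more likely on the next draw; with a negative advantage it is suppressed, which is how poor insertion positions or thresholds are searched away.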
4. The method for diabetic retinopathy classification and lesion identification by weakly supervised learning according to claim 1, wherein step S3: according to the diabetic retinopathy grading standard, obtaining the lesion categories that distinguish diabetic retinopathy grades, using them as prior knowledge for labeling in weakly supervised multi-label classification, and obtaining a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module, specifically comprises:
Step S31: take the lesion types of each lesion grade of diabetic retinopathy as prior knowledge, yielding six prior-knowledge attributes: microaneurysm, hemorrhage, venous beading, intraretinal microvascular abnormality, neovascularization, and preretinal hemorrhage, which serve as the prior knowledge for labeling in weakly supervised multi-label classification;
Step S32: based on the input feature map F_NA, the attribute mining and lesion identification module performs disease grading using cross-entropy loss, and then obtains an attribute prediction q computed according to formula (2):
q = f(F_NA)    (2)
where f is a function comprising a fully connected layer and a sigmoid layer, and q is a vector of dimension 6 representing the six predicted lesion attributes;
Step S33: obtain a heatmap using the class activation map method to extract lesion information; apply a threshold t to the heatmap to obtain K activation regions, forming K connected components, each of which serves as one lesion region; perform K iterative computations on the heatmap, each time covering one lesion region on F_NA, multiplying with F_NA and sharing the weights of f to obtain a new attribute estimate q_k, q_k ∈ R^6; the lesion attribute weight of each iteration is obtained through the softmax layer, and finally the average is taken as the attribute weight of the lesions, computed by formula (3):
weight = (1 / (K + 1)) · Σ_{k=0}^{K} softmax(q_k)    (3)
where softmax(·) is the softmax layer, k = 0 denotes the computation on the original feature map F, and weight is the attribute weight of the lesions, used to mitigate attribute labels that are not entirely correct;
Step S34: multiply the attribute weight with the attribute prediction q to obtain the weighted attribute prediction, computed by formula (4):
q_weight = q × weight    (4).
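Formulas (2)–(4) combine into a short computation. The sketch below assumes the K occluded predictions q_k of step S33 have already been produced by the shared head f; it only implements the averaging of formula (3) and the reweighting of formula (4).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def weighted_attribute_prediction(q, q_occluded):
    """Formulas (3)-(4) of claim 4, steps S33-S34.

    q: attribute prediction on the full feature map (k = 0), shape (6,).
    q_occluded: list of K predictions q_k, each with one lesion region covered.
    Returns q_weight = q * weight, where weight averages the softmax of all
    K + 1 predictions.
    """
    all_preds = [q] + list(q_occluded)                          # k = 0 is un-occluded
    weight = np.mean([softmax(qk) for qk in all_preds], axis=0)  # formula (3)
    return q * weight                                            # formula (4)
```

Because weight sums to 1 across the six attributes, attributes whose evidence survives occlusion consistently receive a larger share of the final prediction, which is how the method tolerates partially incorrect attribute labels.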
5. The method for diabetic retinopathy classification and lesion identification by weakly supervised learning according to claim 1, wherein step S4: based on the lesion attribute prediction result and the disease grading result, obtaining a visualization map of lesion positions using the class activation map method, and obtaining the final lesion identification result through multiple iterative erasure calculations, specifically comprises:
based on the K iterative computations in step S33, the lesion attribute of the currently occluded region can be computed by formula (5), and the K results are combined as the lesion identification result of the whole image;
attr_k = argmax( q_weight − weight × q_k )    (5)
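The K lesion regions referenced in claims 4 and 5 come from thresholding the class-activation heatmap into connected components (step S33). A minimal sketch of that extraction using a plain flood fill with 4-connectivity follows; the threshold value is illustrative, and the patent treats t as a tunable parameter.

```python
import numpy as np

def lesion_regions(heatmap, t=0.5):
    """Threshold the CAM heatmap at t and label the K connected components
    (4-connectivity); each component is treated as one lesion region.

    Returns (labels, K): labels is an int array, 0 = background,
    1..K = region indices.
    """
    mask = heatmap > t
    labels = np.zeros(mask.shape, dtype=int)
    k = 0
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                k += 1
                stack = [(i, j)]                    # flood-fill one component
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and mask[y, x] and labels[y, x] == 0:
                        labels[y, x] = k
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, k
```

Each labeled component then defines one occlusion mask for the iterative erasure of steps S33 and S4.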
6. A system for diabetic retinopathy grading and lesion identification by weakly supervised learning, characterized by comprising the following modules:
a feature map acquisition module, configured to input a color fundus image of diabetic retinopathy into a convolutional neural network to obtain a feature map F; input F into a focal point capture module based on weakly supervised object localization, reduce it to a one-dimensional feature map, and select one of three branch modules according to the selection probability: the attention branch module, the salient feature erasing module, or the salient feature retaining module; compute a score map and multiply the score map with the feature map F to obtain a new feature map F';
an optimal network structure search module, configured to compute, based on reinforcement learning, the insertion position of the weakly supervised focal point capture module in the convolutional neural network model, the selection probabilities of the three branch modules, and the feature discarding and retaining thresholds in the salient feature erasing module and the salient feature retaining module, and finally output a feature map F_NA;
a lesion attribute mining module, configured to obtain the lesion categories that distinguish diabetic retinopathy grades according to the diabetic retinopathy grading standard, use them as prior knowledge for labeling in weakly supervised multi-label classification, and obtain a lesion attribute prediction result and a disease grading result through the attribute mining and lesion identification module; and
a lesion identification module, configured to obtain a visualization map of lesion positions using the class activation map method based on the lesion attribute prediction result and the disease grading result, and obtain the final lesion identification result through multiple iterative erasure calculations.
CN202111356999.XA 2021-11-16 2021-11-16 Method and system for classification and focus identification of diabetic retinopathy through weak supervision learning Pending CN114330484A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111356999.XA CN114330484A (en) 2021-11-16 2021-11-16 Method and system for classification and focus identification of diabetic retinopathy through weak supervision learning


Publications (1)

Publication Number Publication Date
CN114330484A true CN114330484A (en) 2022-04-12

Family

ID=81044532

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111356999.XA Pending CN114330484A (en) 2021-11-16 2021-11-16 Method and system for classification and focus identification of diabetic retinopathy through weak supervision learning

Country Status (1)

Country Link
CN (1) CN114330484A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116934709A (en) * 2023-07-20 2023-10-24 北京长木谷医疗科技股份有限公司 Intelligent spine slippage recognition method and device based on weak supervised learning
CN116934709B (en) * 2023-07-20 2024-04-02 北京长木谷医疗科技股份有限公司 Intelligent spine slippage recognition method and device based on weak supervised learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination