CN116152229B - Method for constructing diabetic retinopathy diagnosis model and diagnosis model - Google Patents
Method for constructing diabetic retinopathy diagnosis model and diagnosis model
- Publication number: CN116152229B (application CN202310397924.9A)
- Authority: CN (China)
- Prior art keywords: loss function, relative position, model, image, prediction result
- Prior art date
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06T7/0012 — Biomedical image inspection
- G06N3/08 — Neural networks; learning methods
- G06V10/764 — Image or video recognition using machine-learning classification, e.g. of video objects
- G06V10/82 — Image or video recognition using neural networks
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30041 — Eye; retina; ophthalmic
- Y02A90/10 — Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides a method for constructing a diabetic retinopathy diagnosis model, and the diagnosis model itself, belonging to the technical field of deep learning. The construction method comprises the following steps: acquiring a fundus image and preprocessing it to obtain an input image; inputting the input image into a backbone network for feature extraction to obtain a feature map containing high-level features; designing an improved classifier, feeding the feature map into the improved classifier, and obtaining a classification prediction result and a regression prediction result through calculation; designing a relative position loss function additional term, where the classification prediction result of the previous step is used to calculate a cross entropy loss function and the regression prediction result is used to calculate the relative position loss function additional term; and combining the cross entropy loss function and the relative position loss function additional term into a final joint loss function, thereby constructing the diagnosis model. The model constructed by the invention can learn the continuity information of the diabetic retinopathy process, and model performance is significantly improved.
Description
Technical Field
The invention relates to the technical field of deep learning, in particular to a method for constructing a diabetic retinopathy diagnosis model and the diagnosis model.
Background
In recent years, diabetic retinopathy (Diabetic Retinopathy, abbreviated DR) has become one of the most common ophthalmic complications of diabetes, and it is a preventable disease. In clinical practice, determining the disease stage relies mainly on highly experienced ophthalmologists observing and evaluating color fundus images, with pathological staging following the international five-stage DR standard, in order to formulate an optimized diagnosis and treatment plan. Deep learning can effectively extract pathological features from annotated image data and achieve good diagnostic results, but it faces many challenges in practice.
A significant feature of the DR staging problem is that disease progression is continuous. However, existing deep-learning-based DR staging methods generally treat the problem as a classification task with discrete categories, ignoring the continuity of the lesion process. This amounts to discarding a significant portion of prior knowledge during modeling, which inevitably degrades model performance. The main reason is that, owing to its structural limitations, the classifier at the end of the model cannot directly model and learn the continuity information carried by the labels in the training data.
Disclosure of Invention
In view of the above problems, the invention aims to provide a method for constructing a diabetic retinopathy diagnosis model, and the diagnosis model itself, with the goal of improving model performance by designing a brand-new classifier structure and loss function so that the model can learn the continuity information of the DR lesion process.
In order to solve the technical problems, the invention provides the following scheme:
in one aspect, a method for constructing a diabetic retinopathy diagnostic model is provided, comprising the steps of:
acquiring a fundus image and preprocessing to obtain an input image;
inputting the input image into a backbone network for feature extraction to obtain a feature map containing high-level features;
designing an improved classifier, inputting the feature map into the improved classifier, and obtaining a classification prediction result and a regression prediction result through calculation;
designing a relative position loss function additional term, wherein the classification prediction result in the previous step is used for calculating a cross entropy loss function, and the regression prediction result in the previous step is used for calculating the relative position loss function additional term;
and combining the cross entropy loss function and the relative position loss function additional term to obtain a final joint loss function so as to construct a diagnosis model.
Preferably, the step of preprocessing the fundus image specifically includes:
firstly, removing black areas around fundus images;
secondly, the fundus images are uniformly resized to a resolution of 256 × 256;
finally, fundus image enhancement is performed using the following formula to improve the brightness and contrast differences of fundus images:
$$I_g(x) = G(x;\sigma) * I(x) \quad (1)$$
$$I'(x) = \alpha I(x) + \beta I_g(x) + \gamma \quad (2)$$

In the formulas, $I'(x)$ is the preprocessed fundus image, $G(x;\sigma)$ represents a Gaussian convolution with standard deviation $\sigma$, and $x$ represents a pixel point in the fundus image. Formula (2) is a weighted sum of the fundus image before and after enhancement, where the parameters $\alpha$, $\beta$, $\sigma$ and $\gamma$ are 4, -4, 10 and 128, respectively.
Preferably, a ResNet50 network with ImageNet pre-trained parameters is selected as the encoder backbone network of the diagnostic model.
Preferably, the process of designing an improved classifier is as follows:
The feature map obtained via the backbone network is fed into a fully connected layer, which outputs a vector representation $Z$ of dimension 6 with components $z_0, z_1, \dots, z_5$. Applying Softmax to $Z$ yields the probability $p_i$ that the label of the input sample is $i$, $i \in \{0, 1, \dots, 5\}$. The Softmax calculation is:

$$p_i = \frac{e^{z_i}}{\sum_{j=0}^{5} e^{z_j}} \quad (3)$$
The classification result $c$ for an input sample is obtained as the index of the largest element in $P = (p_0, p_1, \dots, p_5)$:

$$c = \underset{i}{\arg\max} \; p_i \quad (4)$$
Here, $r$ is regarded as the regression prediction of the input sample; the larger its value, the greater the degree of lesion of the sample. Considering that the labels of the input samples take values in the range 0-4, for convenience of calculation the predicted value $r$ is amplified to obtain $\hat{y}$:

$$\hat{y} = 4r \quad (5)$$

$\hat{y}$ is regarded as the regression prediction of the lesion degree of the input sample.
Preferably, the process of designing the relative position loss function additional term is as follows:
Assume any two input samples $x_m$ and $x_n$ of the training set, with labels $y_m$ and $y_n$ and regression predictions of lesion degree $\hat{y}_m$ and $\hat{y}_n$; their label distance and predicted-value distance are $d_y = y_m - y_n$ and $d_{\hat{y}} = \hat{y}_m - \hat{y}_n$. To avoid the signs interfering with the model training process, the absolute values of the two distances are taken when calculating the distance difference:

$$\Delta d = \left| d_y \right| - \left| d_{\hat{y}} \right| \quad (6)$$

Likewise, to eliminate the influence of the sign of $\Delta d$ on the model, and as verified by experiments, $\Delta d$ is finally squared to give the relative position loss.

For the model training process, when one batch contains $B$ input samples, the distance difference is calculated between every pair of input samples, $B(B-1)/2$ times in total; the results are then averaged to give the relative position loss function additional term $L_{RP}$ for the batch:

$$L_{RP} = \frac{2}{B(B-1)} \sum_{m < n} \left( \left| y_m - y_n \right| - \left| \hat{y}_m - \hat{y}_n \right| \right)^2 \quad (7)$$
The final joint loss function is composed of the cross entropy loss function and the relative position loss function additional term together, as shown below:

$$L = L_{CE} + \lambda L_{RP} \quad (8)$$

where $\lambda$ is a hyperparameter adjusting the ratio of the two loss functions, and $L_{CE}$ is the cross entropy loss function.
In another aspect, a diagnostic model of diabetic retinopathy is provided, comprising:
the preprocessing unit is used for preprocessing the acquired fundus image to obtain an input image;
the backbone network unit is used for extracting features from the input image to obtain a feature map containing high-level features;
the improved classifier unit is used for calculating the feature map by utilizing the improved classifier to respectively obtain a classification prediction result and a regression prediction result;
and the loss function unit is used for calculating a cross entropy loss function by using the classification prediction result, calculating a relative position loss function additional term by using the regression prediction result, and combining the two to obtain a final joint loss function.
Preferably, the preprocessing unit is specifically configured to:
firstly, removing black areas around fundus images;
secondly, the fundus images are uniformly resized to a resolution of 256 × 256;
finally, fundus image enhancement is performed using the following formula to improve the brightness and contrast differences of fundus images:
$$I_g(x) = G(x;\sigma) * I(x) \quad (1)$$
$$I'(x) = \alpha I(x) + \beta I_g(x) + \gamma \quad (2)$$

In the formulas, $I'(x)$ is the preprocessed fundus image, $G(x;\sigma)$ represents a Gaussian convolution with standard deviation $\sigma$, and $x$ represents a pixel point in the fundus image. Formula (2) is a weighted sum of the fundus image before and after enhancement, where the parameters $\alpha$, $\beta$, $\sigma$ and $\gamma$ are 4, -4, 10 and 128, respectively.
Preferably, in the backbone network unit, a ResNet50 network with ImageNet pre-trained parameters is selected as the encoder backbone network of the diagnostic model.
Preferably, in the improved classifier unit, the feature map obtained via the backbone network is fed into a fully connected layer, which outputs a vector representation $Z$ of dimension 6 with components $z_0, z_1, \dots, z_5$. Applying Softmax to $Z$ yields the probability $p_i$ that the label of the input sample is $i$, $i \in \{0, 1, \dots, 5\}$. The Softmax calculation is:

$$p_i = \frac{e^{z_i}}{\sum_{j=0}^{5} e^{z_j}} \quad (3)$$

The classification result $c$ for an input sample is obtained as the index of the largest element in $P = (p_0, p_1, \dots, p_5)$:

$$c = \underset{i}{\arg\max} \; p_i \quad (4)$$

Here, $r$ is regarded as the regression prediction of the input sample; the larger its value, the greater the degree of lesion of the sample. Considering that the labels of the input samples take values in the range 0-4, for convenience of calculation the predicted value $r$ is amplified to obtain $\hat{y}$:

$$\hat{y} = 4r \quad (5)$$

$\hat{y}$ is regarded as the regression prediction of the lesion degree of the input sample.
Preferably, in the loss function unit, the relative position loss function additional term is designed as follows:
Assume any two input samples $x_m$ and $x_n$ of the training set, with labels $y_m$ and $y_n$ and regression predictions of lesion degree $\hat{y}_m$ and $\hat{y}_n$; their label distance and predicted-value distance are $d_y = y_m - y_n$ and $d_{\hat{y}} = \hat{y}_m - \hat{y}_n$. To avoid the signs interfering with the model training process, the absolute values of the two distances are taken when calculating the distance difference:

$$\Delta d = \left| d_y \right| - \left| d_{\hat{y}} \right| \quad (6)$$

Likewise, to eliminate the influence of the sign of $\Delta d$ on the model, and as verified by experiments, $\Delta d$ is finally squared to give the relative position loss.

For the model training process, when one batch contains $B$ input samples, the distance difference is calculated between every pair of input samples, $B(B-1)/2$ times in total; the results are then averaged to give the relative position loss function additional term $L_{RP}$ for the batch:

$$L_{RP} = \frac{2}{B(B-1)} \sum_{m < n} \left( \left| y_m - y_n \right| - \left| \hat{y}_m - \hat{y}_n \right| \right)^2 \quad (7)$$
The final joint loss function is composed of the cross entropy loss function and the relative position loss function additional term together, as shown below:

$$L = L_{CE} + \lambda L_{RP} \quad (8)$$

where $\lambda$ is a hyperparameter adjusting the ratio of the two loss functions, and $L_{CE}$ is the cross entropy loss function.
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:
according to the method for constructing the diabetic retinopathy diagnosis model and the diagnosis model, provided by the invention, the model can learn the continuity information of the DR pathological change process by designing a brand-new classifier structure and a loss function, so that the performance of the model is obviously improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a general flow chart of a method of constructing a diagnostic model of diabetic retinopathy;
FIGS. 2a-2c are comparison plots of fundus images before and after preprocessing;
fig. 3 is a schematic diagram of the structure of a double-ended classifier.
While particular structures have been shown for purposes of clarity and understanding of the embodiments of the invention, it is not intended to limit the invention to the particular structures and environments disclosed, and modifications within the scope of the invention will occur to those skilled in the art.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without creative efforts, based on the described embodiments of the present invention fall within the protection scope of the present invention.
The embodiment of the invention firstly provides a method for constructing a diabetic retinopathy diagnosis model, and referring to fig. 1, the method comprises the following steps:
s1, acquiring a fundus image and preprocessing to obtain an input image;
s2, inputting the input image into a backbone network for feature extraction to obtain a feature image containing advanced features;
s3, designing an improved classifier, inputting the feature map into the improved classifier, and obtaining a classification prediction result and a regression prediction result through calculation;
s4, designing a relative position loss function additional item, wherein the classification prediction result in the previous step is used for calculating a cross entropy loss function, and the regression prediction result in the previous step is used for calculating the relative position loss function additional item;
s5, combining the cross entropy loss function and the relative position loss function additional term to obtain a final joint loss function so as to construct a diagnosis model.
In the present embodiment, the newly constructed diagnostic model is designated RL-ResNet. RL-ResNet mainly consists of a backbone network and an improved double-ended classifier. After the preprocessed fundus image is input into the backbone network, features are extracted to form a set of feature maps containing high-level features, which are fed into the improved classifier. In the classifier, the feature maps are processed to obtain a classification prediction result and a regression prediction result. These two results are used to calculate the cross entropy loss function and the relative position loss function additional term designed by the invention, the latter of which models the continuity information in the labels. The two are combined into the final joint loss function used to train the model parameters.
In particular, since fundus images are acquired under varying illumination conditions and with different photographing equipment, there are large differences in size and color between pictures. Meanwhile, some pictures suffer from overexposure, heavy noise and other problems. In order to train the network with samples of a consistent color range, the acquired fundus images need to be preprocessed and enhanced.
In step S1, the step of preprocessing the fundus image specifically includes:
firstly, removing black areas around fundus images; black areas around the fundus image have a great negative influence on the classification result, and therefore need to be removed;
secondly, the fundus images are uniformly resized; in the embodiment of the invention, the resolution of the resized images is 256 × 256;
finally, fundus image enhancement is performed using the following formula to improve the brightness and contrast differences of fundus images:
$$I_g(x) = G(x;\sigma) * I(x) \quad (1)$$
$$I'(x) = \alpha I(x) + \beta I_g(x) + \gamma \quad (2)$$

In the formulas, $I'(x)$ is the preprocessed fundus image, $G(x;\sigma)$ represents a Gaussian convolution with standard deviation $\sigma$, and $x$ represents a pixel point in the fundus image. Formula (2) is a weighted sum of the fundus image before and after enhancement, where the parameters $\alpha$, $\beta$, $\sigma$ and $\gamma$ are 4, -4, 10 and 128, respectively.
Figs. 2a-2c show the preprocessing results for three different fundus images. As can be seen from the figures, after the above preprocessing the black areas around the fundus images become gray, reducing their influence on the classification result, and the sharpness of blood vessels and lesion areas, as well as their contrast with surrounding regions, is significantly improved.
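As a concrete illustration, the enhancement of formulas (1) and (2) can be sketched in pure Python as below. This is a minimal single-channel sketch with illustrative function names, assuming border replication for the convolution and clipping of the result to the displayable range [0, 255]; a production pipeline would typically use OpenCV's `cv2.GaussianBlur` and `cv2.addWeighted` instead.

```python
import math

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel with standard deviation sigma."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(t * t) / (2.0 * sigma * sigma))
              for t in range(-radius, radius + 1)]
    total = sum(kernel)
    return [k / total for k in kernel]

def _conv_rows(src, kernel):
    """Convolve each row of a 2-D list with a 1-D kernel (replicated borders)."""
    radius = len(kernel) // 2
    height, width = len(src), len(src[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for t, k in enumerate(kernel):
                xx = min(max(x + t - radius, 0), width - 1)  # replicate border
                acc += k * src[y][xx]
            out[y][x] = acc
    return out

def gaussian_blur(img, sigma):
    """Separable 2-D Gaussian convolution, i.e. G(x; sigma) * I(x) of formula (1)."""
    kernel = gaussian_kernel_1d(sigma)
    horizontal = _conv_rows(img, kernel)
    transposed = [list(col) for col in zip(*horizontal)]
    vertical = _conv_rows(transposed, kernel)
    return [list(col) for col in zip(*vertical)]

def enhance(img, alpha=4.0, beta=-4.0, sigma=10.0, gamma=128.0):
    """Formula (2): weighted sum of the image and its Gaussian blur,
    clipped to [0, 255]; default parameters follow the patent text."""
    smoothed = gaussian_blur(img, sigma)
    return [[min(max(alpha * p + beta * s + gamma, 0.0), 255.0)
             for p, s in zip(img_row, sm_row)]
            for img_row, sm_row in zip(img, smoothed)]
```

One useful sanity check of the parameter choice: on a perfectly uniform image the blur equals the image itself, so the weighted sum collapses to the constant gamma = 128, the gray value that the removed black borders also converge to.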
Further, in the embodiment of the invention, a ResNet50 network with ImageNet pre-trained parameters is selected as the encoder backbone network of the diagnostic model.

The ResNet50 network has proven very effective at extracting high-level semantic information from images, which is critical for medical imaging tasks. The invention therefore selects a ResNet50 network with ImageNet pre-trained parameters as the encoder backbone of the diagnostic model, giving the backbone better initialization parameters and accelerating its convergence during training.
The improved classifier designed in the invention is a double-end classifier, and the structure of the classifier is shown in figure 3.
The feature map obtained via the backbone network is fed into a fully connected layer, which outputs a vector representation $Z$ of dimension 6 with components $z_0, z_1, \dots, z_5$. Applying Softmax to $Z$ yields the probability $p_i$ that the label of the input sample is $i$, $i \in \{0, 1, \dots, 5\}$. The Softmax calculation is:

$$p_i = \frac{e^{z_i}}{\sum_{j=0}^{5} e^{z_j}} \quad (3)$$

The classification result $c$ for an input sample is obtained as the index of the largest element in $P = (p_0, p_1, \dots, p_5)$:

$$c = \underset{i}{\arg\max} \; p_i \quad (4)$$
existing deep learning-based DR staging methods typically use P-dependent cross entropy loss functions for network training, a process that does not require the use of。/>Calculation is only required when using a trained model. If one wants to model tag continuity information, one needs to use +.>Related information. But the gradient information of the corresponding term of the loss function cannot be transferred to the shallow layer of the model through the process shown in the formula (4). That is, if use +.>Modeling tag continuity information and constructing corresponding loss function additional items, wherein the part of information cannot influence the training process of the model, namely, the newly added additional items are invalid. The invention thus envisages +_ in figure 3>Corresponding structure.
Here, $r$ can be regarded as the regression prediction of the input sample; the larger its value, the greater the degree of lesion of the sample, with the smallest value corresponding to stage DR0 and the largest to stage DR4.

Considering that the labels of the input samples take values in the range 0-4, for convenience of calculation the predicted value $r$ is amplified to obtain $\hat{y}$:

$$\hat{y} = 4r \quad (5)$$

$\hat{y}$ is regarded as the regression prediction of the lesion degree of the input sample. For example, $\hat{y} = 2.3$ indicates that the lesion degree of the input sample lies between DR2 and DR3, closer to DR2.
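The double-ended classifier is specified only at the level of figure 3, so the following Python sketch is one plausible realization rather than the patent's exact design: the Softmax and argmax of the classification branch follow the description, while the source of the regression value $r$ is an assumption here (a sigmoid-squashed scalar output, amplified to the 0-4 label range). All function names are illustrative.

```python
import math

def softmax(z):
    """Numerically stable Softmax over the 6-dimensional classifier output Z."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def classify(z):
    """Classification branch: c = index of the largest probability in P.
    This argmax step is non-differentiable, which is why a separate
    regression branch is needed for the continuity loss."""
    p = softmax(z)
    return max(range(len(p)), key=lambda i: p[i])

def regression_head(r_logit):
    """Assumed regression branch: squash a scalar network output to r in (0, 1)
    with a sigmoid, then amplify to the DR label range [0, 4]."""
    r = 1.0 / (1.0 + math.exp(-r_logit))
    return 4.0 * r
```

Unlike `classify`, `regression_head` is differentiable end to end, so a loss built on its output can propagate gradients back into the shallow layers of the network.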
In the embodiment of the invention, the relative distance between the regression predictions $\hat{y}$ of paired samples is used to model lesion continuity and to compute a loss function additional term, namely the relative position loss function additional term (Relative Position Loss, abbreviated RePLoss), denoted $L_{RP}$.
The process of designing the relative position loss function additional term is as follows:
Assume any two input samples $x_m$ and $x_n$ of the training set, with labels $y_m$ and $y_n$ and regression predictions of lesion degree $\hat{y}_m$ and $\hat{y}_n$; their label distance and predicted-value distance are $d_y = y_m - y_n$ and $d_{\hat{y}} = \hat{y}_m - \hat{y}_n$. To avoid the signs interfering with the model training process, the absolute values of the two distances are taken when calculating the distance difference:

$$\Delta d = \left| d_y \right| - \left| d_{\hat{y}} \right| \quad (6)$$

Likewise, to eliminate the influence of the sign of $\Delta d$ on the model, and as verified by experiments, $\Delta d$ is finally squared to give the relative position loss.

For the model training process, when one batch (Batch) contains $B$ input samples, the distance difference is calculated between every pair of input samples, $B(B-1)/2$ times in total; the results are then averaged to give the relative position loss function additional term $L_{RP}$ for the batch:

$$L_{RP} = \frac{2}{B(B-1)} \sum_{m < n} \left( \left| y_m - y_n \right| - \left| \hat{y}_m - \hat{y}_n \right| \right)^2 \quad (7)$$
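The batch computation described above can be sketched as follows. This is a plain-Python illustration of the pairwise distance differences; an actual training loop would implement the same logic with differentiable tensor operations (e.g. in PyTorch) so that gradients flow to the regression branch.

```python
from itertools import combinations

def reploss(labels, preds):
    """Relative position loss: mean squared difference between pairwise
    label distances and pairwise prediction distances over one batch.
    A batch of B samples yields B*(B-1)/2 pairs."""
    pairs = list(combinations(range(len(labels)), 2))
    total = 0.0
    for m, n in pairs:
        d_label = abs(labels[m] - labels[n])   # |y_m - y_n|
        d_pred = abs(preds[m] - preds[n])      # |y_hat_m - y_hat_n|
        total += (d_label - d_pred) ** 2
    return total / len(pairs)
```

Note the loss is zero whenever the predictions preserve all pairwise label distances, even if every prediction is shifted by a constant; the term therefore only enforces the relative positions, while the cross entropy term anchors the absolute stage.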
The final joint loss function is composed of the cross entropy loss function and the relative position loss function additional term together, as shown below:

$$L = L_{CE} + \lambda L_{RP} \quad (8)$$

where $\lambda$ is a hyperparameter adjusting the ratio of the two loss functions, and $L_{CE}$ is the cross entropy loss function.
The calculation procedure of the cross entropy loss for a single sample is as follows:

$$L_{CE}^{(n)} = -\sum_{i} q_i \log p_i \quad (9)$$

where $q_i$ is the one-hot encoding of the sample label: when the sample belongs to category $i$, i.e. the sample label is $i$, $q_i = 1$, otherwise $q_i = 0$; $p_i$ is the probability that the sample is classified as the $i$-th category. For a batch containing $B$ samples, $L_{CE}$ is the average of $L_{CE}^{(n)}$ over all samples in the batch:

$$L_{CE} = \frac{1}{B} \sum_{n=1}^{B} L_{CE}^{(n)} \quad (10)$$
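A minimal sketch of the per-sample cross entropy and the joint combination follows, assuming a one-hot label so the sum reduces to `-log p[label]`; the balancing hyperparameter default is illustrative, not a value fixed by the patent.

```python
import math

def cross_entropy(p, label):
    """Single-sample cross entropy with a one-hot label:
    -sum_i q_i * log p_i reduces to -log p[label]."""
    return -math.log(p[label])

def joint_loss(ce_per_sample, rp, lam=1.0):
    """Joint loss: batch-averaged cross entropy plus lam times the
    relative position loss additional term."""
    ce = sum(ce_per_sample) / len(ce_per_sample)
    return ce + lam * rp
```

In practice `lam` would be tuned on a validation set, since it trades off absolute stage accuracy (cross entropy) against preservation of relative lesion ordering (RePLoss).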
Aiming at the defects of the prior art, the invention designs a brand-new classifier structure and loss function so that the model can learn the continuity information of the DR lesion process, thereby significantly improving model performance.
Accordingly, embodiments of the present invention also provide a diagnostic model of diabetic retinopathy, the diagnostic model comprising:
the preprocessing unit is used for preprocessing the acquired fundus image to obtain an input image;
the backbone network unit is used for extracting features from the input image to obtain a feature map containing high-level features;
the improved classifier unit is used for calculating the feature map by utilizing the improved classifier to respectively obtain a classification prediction result and a regression prediction result;
and the loss function unit is used for calculating a cross entropy loss function by using the classification prediction result, calculating a relative position loss function additional term by using the regression prediction result, and combining the two to obtain a final joint loss function.
Further, the preprocessing unit is specifically configured to:
firstly, removing black areas around fundus images;
secondly, the fundus images are uniformly resized to a resolution of 256 × 256;
finally, fundus image enhancement is performed using the following formula to improve the brightness and contrast differences of fundus images:
$$I_g(x) = G(x;\sigma) * I(x) \quad (1)$$
$$I'(x) = \alpha I(x) + \beta I_g(x) + \gamma \quad (2)$$

In the formulas, $I'(x)$ is the preprocessed fundus image, $G(x;\sigma)$ represents a Gaussian convolution with standard deviation $\sigma$, and $x$ represents a pixel point in the fundus image. Formula (2) is a weighted sum of the fundus image before and after enhancement, where the parameters $\alpha$, $\beta$, $\sigma$ and $\gamma$ are 4, -4, 10 and 128, respectively.
Further, in the backbone network unit, a ResNet50 network with ImageNet pre-trained parameters is selected as the encoder backbone network of the diagnostic model.
Further, in the improved classifier unit, the feature map obtained via the backbone network is fed into a fully connected layer, which outputs a vector representation $Z$ of dimension 6 with components $z_0, z_1, \dots, z_5$. Applying Softmax to $Z$ yields the probability $p_i$ that the label of the input sample is $i$, $i \in \{0, 1, \dots, 5\}$. The Softmax calculation is:

$$p_i = \frac{e^{z_i}}{\sum_{j=0}^{5} e^{z_j}} \quad (3)$$

The classification result $c$ for an input sample is obtained as the index of the largest element in $P = (p_0, p_1, \dots, p_5)$:

$$c = \underset{i}{\arg\max} \; p_i \quad (4)$$

Here, $r$ is regarded as the regression prediction of the input sample; the larger its value, the greater the degree of lesion of the sample. Considering that the labels of the input samples take values in the range 0-4, for convenience of calculation the predicted value $r$ is amplified to obtain $\hat{y}$:

$$\hat{y} = 4r \quad (5)$$

$\hat{y}$ is regarded as the regression prediction of the lesion degree of the input sample.
Further, in the loss function unit, the relative position loss function additional term is designed as follows:
Assume any two input samples $x_m$ and $x_n$ of the training set, with labels $y_m$ and $y_n$ and regression predictions of lesion degree $\hat{y}_m$ and $\hat{y}_n$; their label distance and predicted-value distance are $d_y = y_m - y_n$ and $d_{\hat{y}} = \hat{y}_m - \hat{y}_n$. To avoid the signs interfering with the model training process, the absolute values of the two distances are taken when calculating the distance difference:

$$\Delta d = \left| d_y \right| - \left| d_{\hat{y}} \right| \quad (6)$$

Likewise, to eliminate the influence of the sign of $\Delta d$ on the model, and as verified by experiments, $\Delta d$ is finally squared to give the relative position loss.

For the model training process, when one batch contains $B$ input samples, the distance difference is calculated between every pair of input samples, $B(B-1)/2$ times in total; the results are then averaged to give the relative position loss function additional term $L_{RP}$ for the batch:

$$L_{RP} = \frac{2}{B(B-1)} \sum_{m < n} \left( \left| y_m - y_n \right| - \left| \hat{y}_m - \hat{y}_n \right| \right)^2 \quad (7)$$
The final joint loss function is composed of the cross entropy loss function and the relative position loss function additional term together, as shown below:

$$L = L_{CE} + \lambda L_{RP} \quad (8)$$

where $\lambda$ is a hyperparameter adjusting the ratio of the two loss functions, and $L_{CE}$ is the cross entropy loss function.
The calculation procedure of the cross entropy loss for a single sample is as follows:

$$L_{CE}^{(n)} = -\sum_{i} q_i \log p_i \quad (9)$$

where $q_i$ is the one-hot encoding of the sample label: when the sample belongs to category $i$, i.e. the sample label is $i$, $q_i = 1$, otherwise $q_i = 0$; $p_i$ is the probability that the sample is classified as the $i$-th category. For a batch containing $B$ samples, $L_{CE}$ is the average of $L_{CE}^{(n)}$ over all samples in the batch:

$$L_{CE} = \frac{1}{B} \sum_{n=1}^{B} L_{CE}^{(n)} \quad (10)$$
By designing a brand-new classifier structure and loss function, the diagnosis model provided by the invention can learn the continuity information of the DR lesion process, so that model performance is significantly improved.
Embodiments of the present invention also provide an electronic device that may vary greatly in configuration or performance, and may include one or more processors (central processing units, CPU) and one or more memories, where the memories store at least one instruction that is loaded and executed by the processors to implement the steps of the method for constructing a diabetic retinopathy diagnostic model described above.
In an exemplary embodiment, a computer readable storage medium, such as a memory comprising instructions executable by a processor in a terminal to perform the method of constructing a diabetic retinopathy diagnostic model described above, is also provided. For example, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or terminal device comprising the element.
References in the specification to "one embodiment," "an example embodiment," "some embodiments," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the relevant art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The invention is intended to cover any alternatives, modifications, equivalents, and variations that fall within its spirit and scope. In the following description of preferred embodiments of the invention, specific details are set forth in order to provide a thorough understanding of the invention; however, the invention may be fully understood by those skilled in the art without some of these details. In other instances, well-known methods, procedures, flows, components, circuits, and the like have not been described in detail so as not to unnecessarily obscure aspects of the invention.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in implementing the methods of the embodiments described above may be implemented by a program that instructs associated hardware, and the program may be stored on a computer readable storage medium, such as: ROM/RAM, magnetic disks, optical disks, etc.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed; any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.
Claims (6)
1. A method for constructing a diabetic retinopathy diagnosis model, characterized by comprising the following steps:
acquiring a fundus image and preprocessing to obtain an input image;
inputting the input image into a backbone network for feature extraction to obtain a feature map containing high-level features;
designing an improved classifier, inputting the feature map into the improved classifier, and obtaining a classification prediction result and a regression prediction result through calculation;
designing a relative position loss function additional term, wherein the classification prediction result in the previous step is used for calculating a cross entropy loss function, and the regression prediction result in the previous step is used for calculating the relative position loss function additional term;
the cross entropy loss function and the relative position loss function additional term are combined to obtain a final joint loss function so as to realize the construction of a diagnosis model;
the process of designing the improved classifier is as follows:
the feature map obtained via the backbone network is fed into a fully connected layer, which outputs a vector representation z of dimension 6 with components z_0, z_1, …, z_5; the components z_0 to z_4 pass through Softmax to obtain the probability p_i that the input sample label is i, i = 0, 1, …, 4; the Softmax calculation is:

p_i = exp(z_i) / Σ_{j=0}^{4} exp(z_j)
the classification result c given to the input sample is obtained as the subscript of the largest element of P, with the formula:

c = argmax_i p_i
here, z_5 is regarded as the regression score of the input sample; the greater its value, the greater the extent of pathology of the sample; considering that the input sample labels range over 0 to 4, for convenience of calculation the predicted value is amplified to this range, yielding the regression prediction r;
this amplified value r is regarded as the regression prediction of the lesion degree of the input sample;
the process of designing the relative position loss function additional term is as follows:
assume any two input samples x_i and x_j of the training set, whose labels are y_i and y_j and whose regression predictions of the lesion degree are r_i and r_j; their label distance and predicted-value distance are d_y = |y_i − y_j| and d_r = |r_i − r_j|, respectively; to avoid the sign interfering with the model training process, the absolute values of the two distances are taken when the distance difference is calculated;
likewise, to cancel the influence of the sign of (d_y − d_r) on the model, it was verified by experiment that squaring the difference is the best choice, so the relative position loss for the pair is taken as (d_y − d_r)²;
for the model training process, when one batch contains n input samples, the distance difference is calculated for every pair of input samples, i.e. n(n − 1)/2 times in total; the results are then averaged to give the relative position loss function additional term of the batch:

L_RP = (2 / (n(n − 1))) · Σ_{i<j} (|y_i − y_j| − |r_i − r_j|)²
the final joint loss function is composed of the cross entropy loss function and the relative position loss function additional term, specifically:

L = L_CE + λ · L_RP

where λ is a hyperparameter adjusting the ratio of the two loss functions.
2. The method for constructing a diabetic retinopathy diagnostic model according to claim 1, wherein the step of preprocessing the fundus image specifically includes:
firstly, removing black areas around fundus images;
secondly, the fundus images are uniformly resized, the resolution of the adjusted images being 256 × 256;
finally, fundus image enhancement is performed using the following formulas to correct the brightness and contrast differences of the fundus images:
in formula (1), I(x) denotes the preprocessed fundus image, G(σ) denotes a Gaussian convolution with standard deviation σ, and x denotes a pixel point in the fundus image; formula (2) is the weighted sum of the fundus images before and after enhancement,

I'(x) = α · I(x) + β · (G(σ) * I)(x) + γ

where the parameters α, β, σ and γ used are 4, −4, 10 and 128, respectively.
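The enhancement in claim 2 corresponds to a widely used fundus preprocessing scheme: subtract a Gaussian-blurred copy of the image and re-center the intensities. A pure-NumPy sketch under the stated parameters (α = 4, β = −4, σ = 10, γ = 128); the helper names are ours:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian convolution (pure NumPy, reflect padding)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()                            # normalize so a flat image is unchanged
    pad = np.pad(img, radius, mode="reflect")
    # blur along rows, then along columns
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)

def enhance_fundus(img, alpha=4.0, beta=-4.0, sigma=10.0, gamma=128.0):
    """Weighted sum of the image and its Gaussian-blurred copy:
    I'(x) = alpha*I(x) + beta*(G(sigma)*I)(x) + gamma."""
    img = np.asarray(img, dtype=float)
    return alpha * img + beta * gaussian_blur(img, sigma) + gamma
```

On a perfectly flat image the blur term cancels the identity term exactly, so the output is the constant γ = 128; only local deviations from the neighborhood mean survive, which is what normalizes brightness across fundus images.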
3. The method for constructing a diabetic retinopathy diagnostic model according to claim 1, wherein a ResNet50 network containing ImageNet pre-training parameters is selected as the backbone network of the diagnostic model.
4. A diagnostic model of diabetic retinopathy, comprising:
the preprocessing unit is used for preprocessing the acquired fundus image to obtain an input image;
the backbone network unit is used for extracting features of the input image to obtain a feature map containing high-level features;
the improved classifier unit is used for calculating the feature map by utilizing the improved classifier to respectively obtain a classification prediction result and a regression prediction result;
the loss function unit is used for calculating a cross entropy loss function by using the classification prediction result, calculating a relative position loss function additional term by using the regression prediction result, and combining the two to obtain a final joint loss function;
in the improved classifier unit, the feature map obtained through the backbone network is sent to a fully connected layer, which outputs a vector representation z of dimension 6 with components z_0, z_1, …, z_5; the components z_0 to z_4 pass through Softmax to obtain the probability p_i that the input sample label is i, i = 0, 1, …, 4; the Softmax calculation is:

p_i = exp(z_i) / Σ_{j=0}^{4} exp(z_j)
the classification result c given to the input sample is obtained as the subscript of the largest element of P, with the formula:

c = argmax_i p_i
here, z_5 is regarded as the regression score of the input sample; the greater its value, the greater the extent of pathology of the sample; considering that the input sample labels range over 0 to 4, for convenience of calculation the predicted value is amplified to this range, yielding the regression prediction r;
this amplified value r is regarded as the regression prediction of the lesion degree of the input sample;
in the loss function unit, the relative position loss function additional terms are designed as follows:
assume any two input samples x_i and x_j of the training set, whose labels are y_i and y_j and whose regression predictions of the lesion degree are r_i and r_j; their label distance and predicted-value distance are d_y = |y_i − y_j| and d_r = |r_i − r_j|, respectively; to avoid the sign interfering with the model training process, the absolute values of the two distances are taken when the distance difference is calculated;
likewise, to cancel the influence of the sign of (d_y − d_r) on the model, it was verified by experiment that squaring the difference is the best choice, so the relative position loss for the pair is taken as (d_y − d_r)²;
for the model training process, when one batch contains n input samples, the distance difference is calculated for every pair of input samples, i.e. n(n − 1)/2 times in total; the results are then averaged to give the relative position loss function additional term of the batch:

L_RP = (2 / (n(n − 1))) · Σ_{i<j} (|y_i − y_j| − |r_i − r_j|)²
the final joint loss function is composed of the cross entropy loss function and the relative position loss function additional term, specifically:

L = L_CE + λ · L_RP

where λ is a hyperparameter adjusting the ratio of the two loss functions.
5. The diabetic retinopathy diagnostic model according to claim 4, wherein the preprocessing unit is specifically configured to:
firstly, removing black areas around fundus images;
secondly, the fundus images are uniformly resized, the resolution of the adjusted images being 256 × 256;
finally, fundus image enhancement is performed using the following formulas to correct the brightness and contrast differences of the fundus images:
in formula (1), I(x) denotes the preprocessed fundus image, G(σ) denotes a Gaussian convolution with standard deviation σ, and x denotes a pixel point in the fundus image; formula (2) is the weighted sum of the fundus images before and after enhancement,

I'(x) = α · I(x) + β · (G(σ) * I)(x) + γ

where the parameters α, β, σ and γ used are 4, −4, 10 and 128, respectively.
6. The diabetic retinopathy diagnostic model according to claim 4, wherein the backbone network unit selects a ResNet50 network containing ImageNet pre-training parameters as a backbone network of the diagnostic model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310397924.9A CN116152229B (en) | 2023-04-14 | 2023-04-14 | Method for constructing diabetic retinopathy diagnosis model and diagnosis model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116152229A CN116152229A (en) | 2023-05-23 |
CN116152229B true CN116152229B (en) | 2023-07-11 |
Family
ID=86350944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310397924.9A Active CN116152229B (en) | 2023-04-14 | 2023-04-14 | Method for constructing diabetic retinopathy diagnosis model and diagnosis model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116152229B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114271763A (en) * | 2021-12-20 | 2022-04-05 | 合肥中纳医学仪器有限公司 | Mask RCNN-based gastric cancer early identification method, system and device |
CN114445620A (en) * | 2022-01-13 | 2022-05-06 | 国网江苏省电力有限公司苏州供电分公司 | Target segmentation method for improving Mask R-CNN |
CN114492574A (en) * | 2021-12-22 | 2022-05-13 | 中国矿业大学 | Pseudo label loss unsupervised countermeasure domain adaptive picture classification method based on Gaussian uniform mixing model |
CN114529819A (en) * | 2022-02-23 | 2022-05-24 | 合肥学院 | Household garbage image recognition method based on knowledge distillation learning |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019075410A1 (en) * | 2017-10-13 | 2019-04-18 | Ai Technologies Inc. | Deep learning-based diagnosis and referral of ophthalmic diseases and disorders |
CN108960257A (en) * | 2018-07-06 | 2018-12-07 | 东北大学 | A kind of diabetic retinopathy grade stage division based on deep learning |
CN112651938B (en) * | 2020-12-24 | 2023-12-19 | 平安科技(深圳)有限公司 | Training method, device, equipment and storage medium for video disc image classification model |
CN114387201B (en) * | 2021-04-08 | 2023-01-17 | 透彻影像科技(南京)有限公司 | Cytopathic image auxiliary diagnosis system based on deep learning and reinforcement learning |
CN113902743A (en) * | 2021-12-08 | 2022-01-07 | 武汉爱眼帮科技有限公司 | Method and device for identifying diabetic retinopathy based on cloud computing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Elangovan et al. | Glaucoma assessment from color fundus images using convolutional neural network | |
CN108021916B (en) | Deep learning diabetic retinopathy sorting technique based on attention mechanism | |
Kou et al. | Microaneurysms segmentation with a U-Net based on recurrent residual convolutional neural network | |
Oyelade et al. | CovFrameNet: An enhanced deep learning framework for COVID-19 detection | |
CN111938569A (en) | Eye ground multi-disease classification detection method based on deep learning | |
CN110689025A (en) | Image recognition method, device and system, and endoscope image recognition method and device | |
Panda et al. | Deep convolutional neural network-based patch classification for retinal nerve fiber layer defect detection in early glaucoma | |
Lian et al. | Deblurring retinal optical coherence tomography via a convolutional neural network with anisotropic and double convolution layer | |
Hussein et al. | Retinex theory for color image enhancement: A systematic review | |
CN115601299A (en) | Intelligent liver cirrhosis state evaluation system and method based on images | |
CN110660048B (en) | Leather surface defect detection method based on shape characteristics | |
CN113240655A (en) | Method, storage medium and device for automatically detecting type of fundus image | |
Wan et al. | Optimized-Unet: novel algorithm for parapapillary atrophy segmentation | |
CN110827963A (en) | Semantic segmentation method for pathological image and electronic equipment | |
Singh et al. | Optimized convolutional neural network for glaucoma detection with improved optic-cup segmentation | |
CN116152229B (en) | Method for constructing diabetic retinopathy diagnosis model and diagnosis model | |
CN113643261A (en) | Lung disease diagnosis method based on frequency attention network | |
CN110503636B (en) | Parameter adjustment method, focus prediction method, parameter adjustment device and electronic equipment | |
CN111931544B (en) | Living body detection method, living body detection device, computing equipment and computer storage medium | |
CN115578783B (en) | Device and method for identifying eye diseases based on eye images and related products | |
Wang et al. | A r-cnn based approach for microaneurysm detection in retinal fundus images | |
Valério et al. | Lesions multiclass classification in endoscopic capsule frames | |
Anjugam et al. | Study of Deep Learning Approaches for Diagnosing Covid-19 Disease using Chest CT Images | |
Puthren et al. | Automated glaucoma detection using global statistical parameters of retina fundus images | |
Sharma et al. | Advancement in Diabetic Retinopathy Diagnosis Techniques: Automation and Assistive Tools |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||