CN112132782B - Method and terminal for processing DME (diabetic macular edema) typing based on deep neural network - Google Patents


Info

Publication number
CN112132782B
CN112132782B (application CN202010851829.8A; published as CN112132782A)
Authority
CN
China
Prior art keywords
dme
model
deep
neural network
oct image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010851829.8A
Other languages
Chinese (zh)
Other versions
CN112132782A (en)
Inventor
余洪华
蔡宏民
吴乔伟
张滨
刘宝怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong General Hospital
Original Assignee
Guangdong General Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong General Hospital filed Critical Guangdong General Hospital
Priority to CN202010851829.8A priority Critical patent/CN112132782B/en
Publication of CN112132782A publication Critical patent/CN112132782A/en
Application granted granted Critical
Publication of CN112132782B publication Critical patent/CN112132782B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10101Optical tomography; Optical coherence tomography [OCT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a terminal for processing DME typing based on a deep neural network, wherein the method comprises the following steps: preprocessing an OCT image to be identified; performing image feature extraction on the preprocessed OCT image through a trained DME feature extraction model, the DME feature extraction model being obtained by deep learning network training; based on the extracted image features, obtaining a binary classification function value indicating whether each preset DME type appears in the OCT image, the preset DME types including DRT, CME and SRD; and obtaining a binary classification task result based on the binary classification function values and a preset threshold value. The scheme achieves rapid and accurate identification of the different types of DME.

Description

Method and terminal for processing DME (diabetic macular edema) typing based on deep neural network
Technical Field
The invention relates to the fields of clinical ophthalmology and computer engineering, and in particular to a method and a terminal for processing DME typing based on a deep neural network.
Background
DME (diabetic macular edema) is the leading cause of vision loss in diabetic patients. According to its morphology on OCT (optical coherence tomography) examination, it can be classified into diffuse retinal thickening (DRT), cystoid macular edema (CME), serous retinal detachment (SRD), and mixed DME combining two or three of these types. On OCT images, DRT appears as increased thickness of the retinal neuroepithelial layer with decreased reflectivity within the layer and an enlarged low-reflection region. CME appears as cyst-like cavities forming between retinal layers: milder cases show several small honeycomb-like cystoid cavities, which, when edema is pronounced, can merge into one larger cavity, sometimes leaving only a thin internal limiting membrane at the fovea. SRD appears as detachment of the neuroepithelial layer in the macular region above a dark fluid space, beneath which the highly reflective band of the pigment epithelium layer remains clearly visible.
Studies have shown that the systemic risk factors differ between DME types, suggesting that the pathogenesis of the different types may also differ. Previous studies have further indicated that DRT and CME are mainly associated, respectively, with intracellular edema and with necrosis of Müller cells caused by disruption of the blood-retinal barrier, whereas SRD is associated with blood-subretinal barrier dysfunction causing accumulation of subretinal fluid. Intravitreal injection of anti-VEGF drugs is currently the main means of treating DME, but because of these differing mechanisms, the different morphological types of DME seen on OCT images may respond differently to anti-VEGF treatment.
Given the above, accurately classifying DME from OCT images and formulating personalized treatment strategies for DME patients remains challenging for ophthalmologists. The main reasons are that traditional prediction methods are subjective, must comprehensively weigh many complex factors, and depend largely on the clinical experience and knowledge of the ophthalmologist. With the increasing worldwide prevalence of diabetes mellitus, this places a great burden on the clinical diagnosis and management of DME patients, and accurate DME classification is especially difficult for young doctors lacking clinical experience and for doctors in community or primary hospitals where the level of medical care is comparatively low. At present, no clinical means exists for automatic DME classification that is rapid, accurate, highly general and widely applicable.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a method and a terminal for processing DME typing based on a deep neural network.
Specifically, the present invention proposes the following embodiments:
the embodiment of the invention provides a method for processing DME typing based on a deep neural network, which comprises the following steps:
preprocessing an OCT image to be identified;
performing image feature extraction on the preprocessed OCT image through a trained DME feature extraction model; the DME feature extraction model is obtained by deep learning network training;
based on the extracted image features, obtaining a binary classification function value corresponding to whether preset DME appears in the OCT image; the preset DME includes: DRT, CME and SRD;
and obtaining a binary classification task result based on the binary classification function value and a preset threshold value.
In a specific embodiment, before "preprocessing the OCT image to be identified", it further includes:
acquiring OCT images of a DME-confirmed patient;
setting the OCT image of the patient as the OCT image to be identified.
In a specific embodiment, the "preprocessing the OCT image to be identified" includes:
removing saturated pixel values, reducing noise and cutting edges of the OCT image to be identified;
and reading the processed OCT images, and uniformly adjusting the read OCT images to a preset size.
In a specific embodiment, the DME feature extraction model is constructed on the basis of a VGG16 convolutional neural network.
In a specific embodiment, constructing the DME feature extraction model on the basis of a VGG16 convolutional neural network comprises the following steps:
constructing a deep DME model on the basis of a VGG16 convolutional neural network;
training the deep DME model based on a training set in the preprocessed preset OCT image dataset, and performing internal verification;
and carrying out external verification on the trained deep DME model based on a test set in a preset OCT image data set so as to obtain a DME feature extraction model.
In a specific embodiment, the "constructing a deep DME model on the basis of the VGG16 convolutional neural network" includes:
estimating the degree of difference between the output values of the VGG16 convolutional neural network and the true values with a cross-entropy loss function;
adopting the ReLU activation function to increase the nonlinearity and the sparsity of the VGG16 convolutional neural network;
applying a Dropout layer to the fully-connected layers of the VGG16 convolutional neural network;
applying a BatchNorm layer after each convolution layer of the VGG16 convolutional neural network, pulling the distribution of each layer's neuron input values back toward a standard normal distribution;
adjusting the VGG16 convolutional neural network pre-trained on the ImageNet dataset to construct an initial model;
replacing the last fully-connected layer, the SoftMax layer and the classification layer of the VGG16 convolutional neural network with a fully-connected layer of 3 neurons and a regression layer, converting the VGG16 convolutional neural network into a regression network;
mapping the OCT image to the OCT typing of DME based on the initial model, and adjusting the outputs of the last fully-connected layer into the binary classification function values of DRT, CME and SRD respectively, to obtain the deep DME model.
In a specific embodiment, the "training the deep DME model based on a training set in the preprocessed preset OCT image dataset, and performing internal verification" includes:
based on a training set in the preprocessed preset OCT image data set, performing internal inspection on the deep DME model by using a five-fold cross validation method; the training set is randomly divided into five parts, wherein 4 parts are used for training the deep DME model, the remaining 1 part is used for evaluating the performance of the deep DME model, and each part participates in the training of the deep DME model;
determining, with a confusion matrix, the consistency between the predicted labels and the real labels obtained in each round of internal verification of the deep DME model, and calculating the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve for each run of the model;
the performance of the deep DME model is evaluated by recording the average results of its five runs.
In a specific embodiment, the "externally validating the deep DME model after training based on a test set in a preset OCT image dataset to obtain a DME feature extraction model" includes:
externally verifying the trained deep DME model by using a test set in a preset OCT image data set;
determining consistency between a predicted tag and a real tag obtained by the deep DME model by using a confusion matrix;
based on the confusion matrix, calculating the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve of the deep DME model in detecting each DME type.
In a specific embodiment, the "obtaining a DME typing result based on the binary classification function value and a preset threshold value" includes:
for OCT images of the hybrid DME, binary classification function values of the DRT, the CME and the SRD and a preset threshold value are converted into the proportion of 3 types of DME in the hybrid DME by a normalization method.
The embodiment of the invention also provides a terminal which comprises a memory and a processor, wherein the processor executes the method when running the program stored in the memory.
In this way, the embodiment of the invention provides a method and a terminal for processing DME typing based on a deep neural network, wherein the method comprises the following steps: preprocessing an OCT image to be identified; performing image feature extraction on the preprocessed OCT image through a trained DME feature extraction model, the DME feature extraction model being obtained by deep learning network training; based on the extracted image features, obtaining a binary classification function value indicating whether each preset DME type appears in the OCT image, the preset DME types including DRT, CME and SRD; and obtaining a binary classification task result based on the binary classification function values and a preset threshold value. The scheme achieves rapid and accurate identification of the different types of DME.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered limiting of its scope; a person skilled in the art may obtain other related drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method for processing DME typing based on a deep neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system corresponding to a method for processing DME typing based on a deep neural network according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a method for processing DME typing based on a deep neural network according to an embodiment of the present invention;
fig. 4 is a schematic flow chart of a specific implementation process of a method for processing DME typing based on a deep neural network according to an embodiment of the present invention.
Detailed Description
Hereinafter, various embodiments of the present disclosure will be more fully described. The present disclosure is capable of various embodiments and of modifications and variations therein. However, it should be understood that: there is no intention to limit the various embodiments of the disclosure to the specific embodiments disclosed herein, but rather the disclosure is to be interpreted to cover all modifications, equivalents, and/or alternatives falling within the spirit and scope of the various embodiments of the disclosure.
The terminology used in the various embodiments of the disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the various embodiments of the disclosure. As used herein, the singular is intended to include the plural as well, unless the context clearly indicates otherwise. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of this disclosure belong. The terms (such as those defined in commonly used dictionaries) will be interpreted as having a meaning that is the same as the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in the various embodiments of the disclosure.
Example 1
The embodiment 1 of the invention discloses a method for processing DME typing based on a deep neural network, which comprises the following steps as shown in figure 1:
step 101, preprocessing an OCT image to be identified;
specifically, as shown in fig. 2, the step 101 corresponds to a preprocessing module, where the preprocessing module performs preprocessing on the labeled OCT image, and sends the result to a feature extraction module;
specifically, the OCT image may be in TIFF format.
Step 102: performing image feature extraction on the preprocessed OCT image through the trained DME feature extraction model; the DME feature extraction model is obtained by deep learning network training;
specifically, step 102 corresponds to the feature extraction module in fig. 2, and the feature extraction module performs image feature extraction on the OCT image after preprocessing by using the deep neural network.
Step 103: based on the extracted image features, obtaining binary classification function values indicating whether each preset DME type appears in the OCT image; the preset DME types include DRT, CME and SRD;
step 103 corresponds to the data processing module in fig. 2, where the data processing module processes the image features obtained by the feature extraction module, and outputs whether the respective binary classification function values of the DRT, CME or SRD appear in the image.
Step 104: obtaining a DME typing result based on the binary classification function values and a preset threshold value.
Specifically, step 104 corresponds to the decision module in fig. 2. According to each binary classification function value obtained by the data processing module, the decision module assigns a binary task result of 0 or 1 using, for example, a threshold of 0.5 (as shown in figs. 3 and 4: values less than 0.5 are taken as 0, values not less than 0.5 as 1).
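The 0.5-threshold decision rule above can be sketched in a few lines; the scores below are made-up example values, and `THRESHOLD` stands for the preset threshold of the decision module:

```python
import numpy as np

# Made-up sigmoid outputs of the network for one OCT image: one binary
# classification function value each for DRT, CME and SRD.
scores = np.array([0.82, 0.31, 0.67])

THRESHOLD = 0.5  # the preset threshold used by the decision module

# Values not less than 0.5 map to 1 (type present); values below 0.5 map to 0.
labels = (scores >= THRESHOLD).astype(int)

print(labels)  # -> [1 0 1]: DRT and SRD detected, CME absent
```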
Further, before the "preprocessing the OCT image to be identified", it further includes:
acquiring OCT images of a DME-confirmed patient;
setting the OCT image of the patient as the OCT image to be identified.
Specifically, the OCT image to be identified is an image obtained from a patient diagnosed with DME.
In a specific embodiment, the "preprocessing the OCT image to be identified" includes:
removing saturated pixel values, reducing noise and cutting edges of the OCT image to be identified;
and reading the processed OCT images, and uniformly adjusting the read OCT images to a preset size.
Specifically, the OCT images are all resized, for example, to 224×224, or to another size; the specific size can be set and flexibly adjusted according to the actual situation.
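A minimal numpy-only sketch of the three preprocessing steps (removing saturated pixel values, reducing noise, cutting edges) followed by resizing to 224×224. The percentile cap, the 3×3 mean filter and the 10-pixel margin are illustrative choices, not values given in the patent:

```python
import numpy as np

def preprocess_oct(img: np.ndarray, size: int = 224,
                   sat_percentile: float = 99.5, margin: int = 10) -> np.ndarray:
    """Sketch of the preprocessing steps for a grayscale OCT image."""
    # 1. Remove saturated pixel values by clipping to a high percentile.
    cap = np.percentile(img, sat_percentile)
    img = np.clip(img.astype(np.float32), 0, cap)
    # 2. Reduce speckle noise with a 3x3 mean filter (a placeholder for
    #    whatever denoiser is actually used).
    pad = np.pad(img, 1, mode="edge")
    img = sum(pad[i:i + img.shape[0], j:j + img.shape[1]]
              for i in range(3) for j in range(3)) / 9.0
    # 3. Cut the edges, then resize to size x size via nearest neighbour.
    img = img[margin:-margin, margin:-margin]
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

out = preprocess_oct(np.random.rand(496, 768) * 255)
print(out.shape)  # -> (224, 224)
```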
In a specific embodiment, the DME feature extraction model is constructed on the basis of a VGG16 convolutional neural network.
Further, constructing the DME feature extraction model on the basis of a VGG16 convolutional neural network comprises the following steps:
constructing a deep DME model on the basis of a VGG16 convolutional neural network;
training the deep DME model based on a training set in the preprocessed preset OCT image dataset, and performing internal verification;
and carrying out external verification on the trained deep DME model based on a test set in a preset OCT image data set so as to obtain a DME feature extraction model.
Specifically, the "constructing a deep DME model on the basis of the VGG16 convolutional neural network" includes:
estimating the degree of difference between the output values of the VGG16 convolutional neural network and the true values with a cross-entropy loss function; in general, the values of the loss function are non-negative, and a more robust model corresponds to a smaller loss value, making the obtained result more convincing;
adopting the ReLU activation function to increase the nonlinearity and the sparsity of the VGG16 convolutional neural network; this effectively avoids model overfitting and vanishing gradients, and improves computation speed;
applying a Dropout layer to the fully-connected layers of the VGG16 convolutional neural network, which effectively reduces the interaction between hidden-layer nodes and the complex co-adaptation among neurons, thereby avoiding overfitting;
applying a BatchNorm layer after each convolution layer of the VGG16 convolutional neural network, pulling the distribution of each layer's neuron input values back toward a standard normal distribution; this improves training speed, effectively prevents gradient explosion and vanishing gradients, and avoids overfitting;
adjusting the VGG16 convolutional neural network pre-trained on the ImageNet dataset to construct an initial model;
replacing the last fully-connected layer, the SoftMax layer and the classification layer of the VGG16 convolutional neural network with a fully-connected layer of 3 neurons and a regression layer, converting the VGG16 convolutional neural network into a regression network; specifically, for example, the last fully-connected layer of 1,000 neurons, the SoftMax layer and the classification layer may be replaced with the 3-neuron fully-connected layer and a regression layer, thereby converting the classification network model into a regression network model;
mapping the OCT image to the OCT typing of DME based on the initial model, and adjusting the outputs of the last fully-connected layer into the binary classification function values of DRT, CME and SRD respectively, to obtain the deep DME model.
In a specific embodiment, the preset OCT image dataset is a sample dataset: professional graders annotate each OCT image with 0 or 1 for each of the 3 DME types according to whether that type is present, so each OCT image carries a multi-label annotation consisting of 3 labels. For example, an OCT image in which only DRT is present is labeled 1/0/0, while an OCT image in which CME and SRD coexist is labeled 0/1/1. Thus, the "training the deep DME model based on a training set in the preprocessed preset OCT image dataset" includes:
based on a training set in the preprocessed preset OCT image data set, performing internal inspection on the deep DME model by using a five-fold cross validation method; the training set is randomly divided into five parts, wherein 4 parts are used for training the deep DME model, the remaining 1 part is used for evaluating the performance of the deep DME model, and each part participates in the training of the deep DME model;
determining, with a confusion matrix, the consistency between the predicted labels and the real labels obtained in each round of internal verification of the deep DME model, and calculating the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve for each run of the model;
the performance of the deep DME model is evaluated by recording the average results of its five runs.
Specifically, the deep DME model is internally checked using five-fold cross-validation, in which the OCT images of the training set are randomly divided into five independent parts; in each run of the model, four parts are used for training and the remaining part for evaluating performance, until every part has participated in training. A confusion matrix describes the consistency between the predicted labels and the real labels obtained in each round of internal verification, and the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve of each run are calculated.
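The five-fold split and the per-run confusion-matrix metrics can be sketched with numpy alone; `binary_metrics` and `five_fold_indices` are illustrative helper names, and the AUC computation is omitted for brevity:

```python
import numpy as np

def binary_metrics(y_true: np.ndarray, y_pred: np.ndarray):
    """Accuracy, sensitivity and specificity from a 2x2 confusion matrix."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    return accuracy, sensitivity, specificity

def five_fold_indices(n_samples: int, seed: int = 0):
    """Randomly split n_samples indices into five disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), 5)

folds = five_fold_indices(100)
for k, val_fold in enumerate(folds):
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    # ... train the model on train_idx, evaluate on val_fold, record metrics ...
    assert len(train_idx) + len(val_fold) == 100
```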
In a specific embodiment, the "externally validating the deep DME model after training based on a test set in a preset OCT image dataset to obtain a DME feature extraction model" includes:
externally verifying the trained deep DME model by using a test set in a preset OCT image data set;
determining consistency between a predicted tag and a real tag obtained by the deep DME model by using a confusion matrix;
based on the confusion matrix, calculating the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve of the deep DME model in detecting each DME type.
In a specific embodiment, the "obtaining a DME typing result based on the binary classification function value and a preset threshold value" includes:
for an OCT image of mixed DME, converting the binary classification function values of DRT, CME and SRD, together with the preset threshold, into the proportions of the 3 DME types within the mixed DME by a normalization method.
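One plausible reading of this normalization step, sketched with made-up scores (the patent does not spell out the exact formula, so sum-normalization here is an assumption):

```python
import numpy as np

# Made-up binary classification function values for a mixed-DME image in
# which all three scores exceed the 0.5 threshold.
scores = np.array([0.9, 0.6, 0.75])   # DRT, CME, SRD

# Assumed normalization: divide each score by the sum of the three scores,
# so the resulting proportions of the 3 DME types add up to 1.
proportions = scores / scores.sum()   # approx. [0.4, 0.267, 0.333]

print(round(float(proportions.sum()), 6))  # -> 1.0
```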
Specifically, in one embodiment, 11,599 OCT images of patients with different types of DME were collected; the deep DME model automatically diagnoses the morphological typing of DME based on an image multi-labeling technique and quantitatively analyzes the proportions of the 3 DME types within mixed DME. This overcomes the drawbacks of the conventional subjective diagnosis method, which relies on the time-consuming participation of ophthalmologists, and helps ophthalmologists formulate personalized treatment strategies according to each patient's DME morphology.
The deep DME model can automatically detect the morphological typing of DME, reducing the workload of ophthalmologists, helping them formulate personalized treatment strategies for DME patients, improving the clinical efficacy of DME treatment, enhancing patients' understanding of the disease, and relieving the psychological pressure of seeking treatment. For DME patients expected to respond well to anti-VEGF treatment, the promising outcome of the treatment should be emphasized and patients encouraged to follow the standard treatment regimen. For DME patients predicted to respond poorly to anti-VEGF treatment, additional treatment regimens should be recommended in time when the standard regimen is not producing optimal results.
Compared with the traditional subjective diagnosis method requiring the participation of ophthalmologists, the deep DME model can automatically extract OCT image features to objectively and accurately detect the morphological typing of DME and quantitatively analyze the proportion of each DME type within mixed DME, so that it can be used by clinicians in different regions and at different levels of care. The deep DME model of the present invention achieves excellent performance in the task of detecting different types of DME. In internal verification, the accuracy of detecting DRT typing is 93.0%, the sensitivity 93.5% and the specificity 92.3%; for CME typing, accuracy is 95.1%, sensitivity 94.5% and specificity 95.6%; for SRD typing, accuracy is 98.8%, sensitivity 96.7% and specificity 99.3%; the areas under the receiver operating characteristic curves for detecting DRT, CME and SRD typing are 0.971, 0.974 and 0.994, respectively. In external verification, the accuracy of detecting DRT typing is 90.2%, the sensitivity 80.1% and the specificity 97.6%; for CME typing, accuracy is 95.4%, sensitivity 93.4% and specificity 97.2%; for SRD typing, accuracy is 95.9%, sensitivity 94.9% and specificity 96.5%; the areas under the receiver operating characteristic curves for detecting DRT, CME and SRD typing are 0.970, 0.997 and 0.997, respectively. In the future, after the sample is continuously enlarged and repeated training is carried out, the method is expected to be applied to the automatic diagnosis of other multifactorial ocular or systemic diseases. In addition, the deep DME model of the present invention can improve the level of ophthalmic diagnosis and treatment in remote areas lacking medical resources by reducing the time and expertise required for retinal image classification.
Example 2
The embodiment 2 of the invention also discloses a terminal, which comprises a memory and a processor, wherein the processor executes the method in the embodiment 1 when running the program stored in the memory. In particular, embodiment 2 of the present invention also discloses other relevant features, and for brevity, reference is made to the description in embodiment 1.
In this way, the embodiment of the invention provides a method and a terminal for processing DME typing based on a deep neural network, wherein the method comprises the following steps: preprocessing an OCT image to be identified; performing image feature extraction on the preprocessed OCT image through a trained DME feature extraction model, the DME feature extraction model being obtained by deep learning network training; based on the extracted image features, obtaining a binary classification function value indicating whether each preset DME type appears in the OCT image, the preset DME types including DRT, CME and SRD; and obtaining a binary classification task result based on the binary classification function values and a preset threshold value. The scheme achieves rapid and accurate identification of the different types of DME.
Those skilled in the art will appreciate that the drawing is merely a schematic illustration of a preferred implementation scenario and that the modules or flows in the drawing are not necessarily required to practice the invention.
Those skilled in the art will appreciate that the modules of an apparatus in an implementation scenario may be distributed across the apparatus as described for that scenario, or may, with corresponding changes, be located in one or more apparatuses different from that of the present scenario. The modules of an implementation scenario may be combined into one module, or further split into a plurality of sub-modules.
The sequence numbers above are merely for description and do not imply any ranking of the implementation scenarios.
The foregoing disclosure is merely illustrative of some embodiments of the invention, and the invention is not limited thereto, as modifications may be made by those skilled in the art without departing from the scope of the invention.

Claims (8)

1. A method for processing DME typing based on a deep neural network, comprising:
preprocessing an OCT image to be identified;
performing image feature extraction on the preprocessed OCT image through a trained DME feature extraction model; the DME feature extraction model is obtained based on deep learning network training;
based on the extracted image features, obtaining a binary classification function value corresponding to whether preset DME appears in the OCT image; the preset DME includes: DRT, CME and SRD;
obtaining a DME typing result based on the binary classification function value and a preset threshold value;
the deep DME model is constructed on the basis of a VGG16 convolutional neural network and specifically comprises the following steps:
estimating the degree of difference between the VGG16 convolutional neural network output value and the true value by adopting a cross entropy loss function;
the ReLU activation function is adopted, so that the nonlinearity of the VGG16 convolutional neural network is improved, and the sparsity of the VGG16 convolutional neural network is increased;
the Dropout layer is adopted to act on a full connection layer of the VGG16 convolutional neural network;
after the Batchnorm layer is adopted to act on each convolution layer in the VGG16 convolution neural network, the distribution of the neuron input values of each layer of the neural network layer is pulled back to the standard normal distribution;
adjusting the VGG16 convolutional neural network which is pre-trained on the ImageNet data set, and constructing an initial model;
replacing the last fully-connected layer, the SoftMax layer and the classification layer of the VGG16 convolutional neural network with a fully-connected layer containing 3 neurons and a regression layer, so that the VGG16 convolutional neural network is converted into a regression network;
mapping the OCT image to the OCT typing of DME based on the initial model, and adjusting the outputs of the last fully-connected layer to be the binary classification function values of DRT, CME and SRD respectively, so as to obtain the deep DME model.
2. The method of claim 1, further comprising, prior to "preprocessing the OCT image to be identified":
acquiring OCT images of a DME-confirmed patient;
setting the OCT image of the patient as the OCT image to be identified.
3. The method of claim 1, wherein "preprocessing the OCT image to be identified" comprises:
removing saturated pixel values, reducing noise and cutting edges of the OCT image to be identified;
and reading the processed OCT images, and uniformly adjusting the read OCT images to a preset size.
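The preprocessing chain of claim 3 can be sketched in NumPy as below. The filter size, saturation cutoff, crop margin, and target size of 224x224 are illustrative assumptions; the claim does not fix these numbers.

```python
import numpy as np

# Minimal sketch of the claimed preprocessing: clip saturated pixel
# values, suppress noise (a 3x3 mean filter stands in for the
# unspecified denoising step), cut the image edges, and resize every
# image to a uniform preset size.

def preprocess(oct_image, sat_max=250, margin=8, size=224):
    img = np.clip(oct_image.astype(np.float32), 0, sat_max)  # remove saturated values
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    img = sum(padded[dy:dy + h, dx:dx + w]                   # 3x3 mean filter
              for dy in range(3) for dx in range(3)) / 9.0
    img = img[margin:-margin, margin:-margin]                # cut edges
    # nearest-neighbour resize to the uniform preset size
    ys = np.linspace(0, img.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, img.shape[1] - 1, size).astype(int)
    return img[np.ix_(ys, xs)]

out = preprocess(np.random.randint(0, 256, (496, 768)))
```

In practice an OCT-specific denoiser (e.g. median or bilateral filtering) would replace the mean filter; the point is only the order of the claimed steps.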
4. The method of claim 1, wherein constructing the deep DME model based on the VGG16 convolutional neural network further comprises:
training the deep DME model based on a training set in the preprocessed preset OCT image dataset, and performing internal verification;
and carrying out external verification on the trained deep DME model based on a test set in a preset OCT image data set so as to obtain a DME feature extraction model.
5. The method of claim 4, wherein training the deep DME model based on the training set in the preprocessed preset OCT image dataset and performing internal verification comprises:
based on a training set in the preprocessed preset OCT image data set, performing internal inspection on the deep DME model by using a five-fold cross validation method; the training set is randomly divided into five parts, wherein 4 parts are used for training the deep DME model, the remaining 1 part is used for evaluating the performance of the deep DME model, and each part participates in the training of the deep DME model;
determining the consistency between the predicted labels and the true labels obtained in each run of the internal verification of the deep DME model by using a confusion matrix, and calculating the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve for each run of the model;
verifying the performance of the deep DME model by recording the average results of the five runs of the deep DME model.
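The five-fold scheme of claim 5 can be sketched as follows. The scoring callback `train_and_score` is a placeholder standing in for training and evaluating the deep DME model; it is not from the patent.

```python
import numpy as np

# Sketch of the claimed five-fold internal validation: the training set
# is randomly split into five parts; each part serves once as the
# held-out fold while the other four train the model, so every part
# participates in training, and the five per-fold results are averaged.

def five_fold_indices(n_samples, seed=0):
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), 5)

def cross_validate(n_samples, train_and_score):
    folds = five_fold_indices(n_samples)
    scores = []
    for k in range(5):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(5) if j != k])
        scores.append(train_and_score(train_idx, val_idx))
    return float(np.mean(scores))  # average of the five runs

# Dummy scorer just to show the flow: reports the fraction of samples trained on.
avg = cross_validate(100, lambda tr, va: len(tr) / 100)
```

With 100 samples each fold trains on 80, so the dummy average is 0.8; in the claimed method the per-fold score would instead be the model's validation metric.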
6. The method of claim 4, wherein the externally validating the deep DME model after training based on a test set in a preset OCT image dataset to obtain a DME feature extraction model comprises:
externally verifying the trained deep DME model by using a test set in a preset OCT image data set;
determining the consistency between the predicted labels and the true labels obtained by the deep DME model by using a confusion matrix;
based on the confusion matrix, calculating the accuracy, sensitivity, specificity and area under the receiver operating characteristic curve of the deep DME model in detecting DME typing.
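The per-type metrics derived from the confusion matrix can be computed as below; the AUC additionally requires the continuous scores, so only accuracy, sensitivity and specificity are shown. The counts used in the example are invented for illustration.

```python
# From a per-type 2x2 confusion matrix (true positives, false positives,
# false negatives, true negatives) compute the metrics named in claim 6.

def binary_metrics(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Hypothetical counts, not figures from the patent:
m = binary_metrics(tp=90, fp=5, fn=10, tn=95)
```

Each of DRT, CME and SRD gets its own confusion matrix and metric triple, since the model makes three independent binary decisions.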
7. The method of claim 1, wherein the obtaining a DME-typed result based on the binary classification function value and a preset threshold value comprises:
for OCT images of the hybrid DME, binary classification function values of the DRT, the CME and the SRD and a preset threshold value are converted into the proportion of 3 types of DME in the hybrid DME by a normalization method.
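The conversion in claim 7 can be sketched as follows. Zeroing sub-threshold scores before normalizing is an assumption about how the preset threshold enters the conversion; the claim only states that the function values and threshold are converted into proportions by a normalization method.

```python
# Sketch of claim 7: for a mixed-DME OCT image, the three binary
# classification function values are normalized into the proportions of
# the 3 DME types (DRT, CME, SRD) present in the mixed DME.

def mixed_dme_proportions(scores, threshold=0.5):
    kept = [s if s >= threshold else 0.0 for s in scores]  # drop absent types (assumption)
    total = sum(kept)
    if total == 0:
        return [0.0, 0.0, 0.0]  # no type clears the threshold
    return [s / total for s in kept]  # proportions sum to 1

props = mixed_dme_proportions([0.8, 0.2, 0.2])  # only DRT clears the threshold
```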
8. A terminal comprising a memory and a processor, the processor executing the method of any of claims 1-7 when running a program stored in the memory.
CN202010851829.8A 2020-08-21 2020-08-21 Method and terminal for processing DME (DME) typing based on deep neural network Active CN112132782B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010851829.8A CN112132782B (en) 2020-08-21 2020-08-21 Method and terminal for processing DME (DME) typing based on deep neural network


Publications (2)

Publication Number Publication Date
CN112132782A CN112132782A (en) 2020-12-25
CN112132782B true CN112132782B (en) 2023-09-05

Family

ID=73851033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010851829.8A Active CN112132782B (en) 2020-08-21 2020-08-21 Method and terminal for processing DME (DME) typing based on deep neural network

Country Status (1)

Country Link
CN (1) CN112132782B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115083005B (en) * 2022-06-13 2023-07-04 广东省人民医院 ROP image classification system and method based on deep learning

Citations (7)

Publication number Priority date Publication date Assignee Title
EP3065086A1 (en) * 2015-03-02 2016-09-07 Medizinische Universität Wien Computerized device and method for processing image data
CN109376767A (en) * 2018-09-20 2019-02-22 中国科学技术大学 Retina OCT image classification method based on deep learning
CN109726743A (en) * 2018-12-12 2019-05-07 苏州大学 A kind of retina OCT image classification method based on Three dimensional convolution neural network
CN110110600A (en) * 2019-04-04 2019-08-09 平安科技(深圳)有限公司 The recognition methods of eye OCT image lesion, device and storage medium
CN110309862A (en) * 2019-06-11 2019-10-08 广东省人民医院(广东省医学科学院) DME prognosis information forecasting system and its application method based on ensemble machine learning
CN111198959A (en) * 2019-12-30 2020-05-26 郑州轻工业大学 Two-stage image retrieval method based on convolutional neural network
CN111476283A (en) * 2020-03-31 2020-07-31 上海海事大学 Glaucoma fundus image identification method based on transfer learning

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10660576B2 (en) * 2017-01-30 2020-05-26 Cognizant Technology Solutions India Pvt. Ltd. System and method for detecting retinopathy



Similar Documents

Publication Publication Date Title
Gargeya et al. Automated identification of diabetic retinopathy using deep learning
Sarki et al. Convolutional neural network for multi-class classification of diabetic eye disease
CN112133441B (en) Method and terminal for establishing MH postoperative crack state prediction model
Ardiyanto et al. Deep learning-based diabetic retinopathy assessment on embedded system
Akter et al. Glaucoma diagnosis using multi-feature analysis and a deep learning technique
Jayakumari et al. Automated diabetic retinopathy detection and classification using imagenet convolution neural network using fundus images
Al-Timemy et al. A hybrid deep learning construct for detecting keratoconus from corneal maps
Rahim et al. Detection of diabetic retinopathy and maculopathy in eye fundus images using deep learning and image augmentation
Nasir et al. Deep DR: detection of diabetic retinopathy using a convolutional neural network
Sharma et al. Machine learning approach for detection of diabetic retinopathy with improved pre-processing
CN112132782B (en) Method and terminal for processing DME (DME) typing based on deep neural network
CN110403611B (en) Method and apparatus for predicting glycated hemoglobin component value in blood, computer device, and storage medium
Asirvatham et al. Hybrid deep learning network to classify eye diseases
Varma et al. A reliable automatic cataract detection using deep learning
Rakhmetulayeva et al. IMPLEMENTATION OF CONVOLUTIONAL NEURAL NETWORK FOR PREDICTING GLAUCOMA FROM FUNDUS IMAGES.
Romero-Aroca et al. Validation of a deep learning algorithm for diabetic retinopathy
Latha et al. Automated macular disease detection using retinal optical coherence tomography images by fusion of deep learning networks
Quellec et al. Instant automatic diagnosis of diabetic retinopathy
Rathi et al. A review paper on prediction of diabetic retinopathy using data mining techniques
Mazar Pasha et al. Diabetic Retinopathy Severity Categorization in Retinal Images Using Convolution Neural Network.
Fayemiwo et al. Identification and Classification of Diabetic Retinopathy Using Machine Learning
Mani et al. An automated hybrid decoupled convolutional network for laceration segmentation and grading of retinal diseases using optical coherence tomography (OCT) images
Bhulakshmi et al. A systematic review on diabetic retinopathy detection and classification based on deep learning techniques using fundus images
Sharma et al. Optimised Machine Learning Algorithm for Detection And Diagnosis of Diabetes
Venneti et al. Amdnet: Age-related macular degeneration diagnosis through retinal fundus images using lightweight convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant