CN116758038A - Infant retina disease information identification method and system based on training network - Google Patents
- Publication number
- CN116758038A CN116758038A CN202310747947.8A CN202310747947A CN116758038A CN 116758038 A CN116758038 A CN 116758038A CN 202310747947 A CN202310747947 A CN 202310747947A CN 116758038 A CN116758038 A CN 116758038A
- Authority
- CN
- China
- Prior art keywords
- features
- network
- module
- attention
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30041—Eye; Retina; Ophthalmic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Abstract
The application relates to a training network-based infant retinal disease information identification method and system, wherein the method comprises the following steps: extracting local-global features of the retinal image to be detected with a hybrid CNN-Transformer network; fusing the local-global features with a deep attention fusion module; and training ROP data in the fused features with a training network to extract information with deep feature expression and realize ROP severity grading. By combining the complementary strengths of CNN and Transformer networks, the dual tasks of automatically detecting various infant retinopathies and grading ROP severity are realized. The system can detect various common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy of automatic detection of common infant fundus lesions. It reduces, to a certain extent, the discomfort caused to child patients by repeated examinations when diagnosis is difficult, lowers the rates of misdiagnosis and missed diagnosis, and improves doctors' diagnosis and treatment efficiency.
Description
Technical Field
The application relates to the technical field of ophthalmic disease recognition and image recognition, in particular to a training network-based infant retinal disease information recognition method and system.
Background
Deep Learning (DL) is a mature but still rapidly evolving technology, especially in the context of computer-aided diagnosis of human diseases. On the algorithm side, He et al. proposed ResNet, a model with a residual structure, so that network depth can be continuously increased without overfitting, and shallow-to-deep features are extracted to improve network recognition accuracy. Dosovitskiy et al. proposed a multi-scale Transformer framework, i.e., designing input patch sizes at different scales to train on large-scale data and obtain higher classification accuracy. Chen et al. employed a pyramid structure and selected a new region-local attention mechanism instead of global self-attention to obtain more spatial information, thereby improving classification accuracy. Tu et al. described an efficient and scalable attention model called multi-axis attention (MaxViT), which consists of two components: blocked local attention and dilated global attention. These design choices allow global-local spatial interaction at any input resolution with only linear complexity. Valanarasu et al. proposed a gated axial attention model that extends existing architectures by introducing an additional control mechanism in the self-attention module, together with a local-global training strategy that further improves performance when training models efficiently on medical images. Zhang et al. proposed TransFuse, a new parallel-branch network that combines Transformers and CNNs in parallel, can effectively capture global dependencies and low-level spatial detail in a shallower manner, and fuses the features extracted at different levels of the two branches with a bidirectional fusion module.
Deep learning techniques are widely used in the field of medical image analysis. As a representative deep learning framework, the convolutional neural network (Convolutional Neural Network, CNN) is often used as a backbone to extract deep features from medical images by virtue of its strong feature extraction capability. Thanks to its distinctive residual skip connections, the residual network proposed in 2015 can attend to shallow feature information while extracting deep features. The residual network therefore attends to both deep and shallow feature information, so the extracted features are more complete and network performance is better; it is also selected here as one of the branches for feature extraction. However, the features extracted by a pure CNN lack the expression of global feature information, which places a certain limit on network performance. For this reason, Transformer networks were developed, which use a multi-head self-attention mechanism to learn global feature information with long-distance dependencies.
Among the many ocular diseases, congenital abnormalities and early-onset diseases are particularly important. Common fundus diseases in infants include retinopathy of prematurity (ROP), Coats disease, retinoblastoma (RB), retinitis pigmentosa (RP), choroidal defects, congenital retinal folds, and familial exudative vitreoretinopathy. Most of these diseases have a long-term impact on the structure and function of the eye, including refractive error and night blindness, and may also lead to ocular misalignment (strabismus) and neovascular glaucoma. Among them, ROP is a major cause of vision impairment and blindness in children, and as many as 8,000 newborns with RB worldwide may need enucleation surgery to save their lives.
From a clinical point of view, these infant conditions often lead to severe, lifelong vision impairment or even blindness in the child, with long-term consequences for society, especially future employment prospects. Because infant fundus diseases are uncommon in general hospitals, they are often overlooked, and even when encountered an accurate diagnosis may not be made. At the same time, specialized ophthalmologists are in short supply worldwide.
In the field of fundus diseases, several studies on automatic examination methods for retinopathy of prematurity have been carried out, most of which detect a single ROP lesion. To date, we have found little research focused on detecting more than one type of infant fundus disease, let alone a variety of them. In practice, particularly in remote areas lacking specialized ophthalmologists, it is necessary to detect the various types of fundus disease effectively.
Therefore, a more efficient auxiliary detection system is needed to solve this problem.
Disclosure of Invention
Aiming at the defects in the prior art, the application provides a training network-based infant retina disease information identification method and a training network-based infant retina disease information identification system.
The technical scheme adopted for solving the technical problems is as follows:
A training network-based infant retinal disease information identification method is constructed, comprising the following steps:
extracting local-global features of the retinal image to be detected with a hybrid CNN-Transformer network;
fusing the local-global features with a deep attention fusion module to obtain fused features with local-global feature expression capability;
and training ROP data in the fused features with a training network, extracting information with deep feature expression, and realizing ROP severity grading.
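The patent does not disclose code; the following is a minimal PyTorch sketch of the two-stage flow just described, where a stage-1 multi-disease classifier routes images predicted as ROP into a stage-2 severity grader. The class name `TwoStagePipeline`, the stand-in backbones, and the class index are all illustrative assumptions, not the patented networks.

```python
import torch
import torch.nn as nn

class TwoStagePipeline(nn.Module):
    """Stage 1: multi-disease classification; stage 2: ROP severity grading
    applied only to images stage 1 labels as ROP (illustrative sketch)."""

    def __init__(self, stage1: nn.Module, stage2: nn.Module, rop_class: int):
        super().__init__()
        self.stage1, self.stage2 = stage1, stage2
        self.rop_class = rop_class  # index of the ROP class in stage-1 output

    def forward(self, x: torch.Tensor):
        disease_logits = self.stage1(x)              # multi-disease classification
        pred = disease_logits.argmax(dim=1)
        rop_mask = pred == self.rop_class
        # Grade severity only for the images predicted as ROP.
        severity_logits = self.stage2(x[rop_mask]) if rop_mask.any() else None
        return disease_logits, severity_logits
```

Any classifiers with matching input/output shapes can serve as the two stages; in the patent these would be the hybrid CNN-Transformer networks described below.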
In the training network-based infant retinal disease information identification method of the application, extracting the local-global features of the retinal image to be detected with the hybrid CNN-Transformer network comprises the following steps:
extracting deep semantic feature information from the retinal image to be detected with residual network modules of different scales in the CNN branch to obtain local features;
and extracting global features with long-distance dependencies with a four-stage Transformer network module.
In the training network-based infant retinal disease information identification method of the application, fusing the local-global features with the deep attention fusion module comprises the following steps:
taking the features extracted by the residual network and the Transformer network module as the input of the deep attention fusion module, and performing element-level addition after a point (1×1) convolution operation;
inputting the processed features into a ReLU activation function, then performing feature extraction with a point convolution, and then activating with a Sigmoid function to obtain the attention features;
and performing element-level multiplication of the obtained attention features with the features from the Transformer network module to obtain the deep attention fusion features.
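The fusion steps above can be sketched in PyTorch as follows. This is an assumption-laden reconstruction: the channel counts, the projection of the Transformer features to a common width before the final multiplication, and the module name `DeepAttentionFusion` are illustrative choices, not details disclosed by the patent.

```python
import torch
import torch.nn as nn

class DeepAttentionFusion(nn.Module):
    """Illustrative sketch: point-conv each branch, add element-wise,
    ReLU -> point conv -> Sigmoid to form an attention map, then multiply
    element-wise with the Transformer-branch features."""

    def __init__(self, cnn_ch: int, trans_ch: int, out_ch: int):
        super().__init__()
        self.pw_cnn = nn.Conv2d(cnn_ch, out_ch, kernel_size=1)    # point convolution
        self.pw_trans = nn.Conv2d(trans_ch, out_ch, kernel_size=1)
        self.pw_attn = nn.Conv2d(out_ch, out_ch, kernel_size=1)

    def forward(self, f_cnn: torch.Tensor, f_trans: torch.Tensor) -> torch.Tensor:
        t = self.pw_trans(f_trans)
        s = self.pw_cnn(f_cnn) + t                          # element-level addition
        attn = torch.sigmoid(self.pw_attn(torch.relu(s)))   # ReLU -> point conv -> Sigmoid
        return attn * t                                     # multiply with Transformer features
```

Both branch features are assumed to share the same spatial resolution at the fusion point.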
In the training network-based infant retinal disease information identification method of the application, the algorithm flow of the Transformer network module comprises the following steps:
first partitioning the input image into blocks, processing the blocks with a patch embedding operation, applying layer normalization to the resulting patch features, inputting them into a multi-head attention module, and adding the output features to the patch embedding features to obtain the attention features;
and inputting the obtained attention features into a layer normalization module to obtain normalized features, inputting these into a multi-layer perceptron to obtain processed features, and adding these element-wise to the previous attention features to obtain the output features of the Transformer network module.
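A minimal PyTorch sketch of the Transformer block described above (pre-norm attention with two residual additions). The head count, MLP ratio, and GELU activation are common defaults assumed for illustration; patch partitioning and embedding are assumed to happen upstream.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """LayerNorm -> multi-head self-attention -> residual add,
    then LayerNorm -> MLP -> second residual add (illustrative sketch)."""

    def __init__(self, dim: int, heads: int = 4, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, dim) -- patch embedding happens upstream
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)     # multi-head self-attention
        x = x + attn_out                     # add back to the patch-embedding features
        x = x + self.mlp(self.norm2(x))      # LayerNorm -> MLP -> second residual add
        return x
```

The embedding dimension must be divisible by the number of heads, as `nn.MultiheadAttention` requires.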
In the training network-based infant retinal disease information identification method of the application, the residual network module algorithm flow comprises the following steps:
first passing the input image through a 3×3 convolution to obtain convolutional features, then through normalization and an activation function to obtain standardized features, then through another 3×3 convolution followed by normalization to obtain normalized features;
and adding the features obtained by the above operations element-wise to the input features to obtain processed features, which pass through an activation function to give the residual module features.
An infant retinal disease information identification system based on a training network, comprising:
the hybrid network module, composed of a CNN and a Transformer, for extracting the local-global features of the retinal image to be detected;
the depth attention fusion module is used for fusing the local-global characteristics to obtain fused characteristics with the local-global characteristic expression capability;
and the training network module is used for training ROP data in the fused features, extracting information with depth feature expression and realizing ROP severity classification.
In the training network-based infant retinal disease information identification system of the application, the hybrid network module extracts the local-global features of the retinal image to be detected with the following steps:
extracting deep semantic feature information from the retinal image to be detected with residual network modules of different scales in the CNN branch to obtain local features;
and extracting global features with long-distance dependencies with a four-stage Transformer network module.
In the training network-based infant retinal disease information identification system of the application, the deep attention fusion module fuses the local-global features with the following method:
taking the features extracted by the residual network and the Transformer network module as the input of the deep attention fusion module, and performing element-level addition after a point (1×1) convolution operation;
inputting the processed features into a ReLU activation function, then performing feature extraction with a point convolution, and then activating with a Sigmoid function to obtain the attention features;
and performing element-level multiplication of the obtained attention features with the features from the Transformer network module to obtain the deep attention fusion features.
In the training network-based infant retinal disease information identification system of the application, the algorithm flow of the Transformer network module comprises the following steps:
first partitioning the input image into blocks, processing the blocks with a patch embedding operation, applying layer normalization to the resulting patch features, inputting them into a multi-head attention module, and adding the output features to the patch embedding features to obtain the attention features;
and inputting the obtained attention features into a layer normalization module to obtain normalized features, inputting these into a multi-layer perceptron to obtain processed features, and adding these element-wise to the previous attention features to obtain the output features of the Transformer network module.
In the training network-based infant retinal disease information identification system of the application, the residual network module algorithm flow comprises the following steps:
first passing the input image through a 3×3 convolution to obtain convolutional features, then through normalization and an activation function to obtain standardized features, then through another 3×3 convolution followed by normalization to obtain normalized features;
and adding the features obtained by the above operations element-wise to the input features to obtain processed features, which pass through an activation function to give the residual module features.
The application has the following beneficial effects: combining the complementary strengths of CNN and Transformer networks, a two-stage deep learning network combining the two is provided, realizing the dual tasks of automatically detecting various infant retinopathies and grading ROP severity; the system can detect various common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy of automatic detection of common infant fundus lesions; it reduces, to a certain extent, the discomfort caused to child patients by repeated examinations when diagnosis is difficult, lowers the rates of misdiagnosis and missed diagnosis, and improves doctors' diagnosis and treatment efficiency, thereby reducing complications in child patients to a certain extent.
Drawings
In order to more clearly illustrate the embodiments of the present application and the technical solutions in the prior art, the application is further described below with reference to the accompanying drawings and embodiments. The drawings in the following description show only some embodiments of the present application; other drawings may be obtained from them by those skilled in the art without inventive effort:
FIG. 1 is a flowchart of a training network-based infant retinal disease information identification method in accordance with a preferred embodiment of the present application;
FIG. 2 is a block diagram of the two-stage training network-based infant retinal disease identification method in accordance with a preferred embodiment of the present application;
FIG. 3a is a diagram of the residual structure of the two-stage training network-based multiple infant retinal disease identification method according to a preferred embodiment of the present application;
FIG. 3b is a schematic diagram of the Transformer module of the two-stage training network-based multiple infant retinal disease identification method according to the preferred embodiment of the present application;
FIG. 3c shows the deep attention fusion module of the multiple infant retinal disease identification method based on the two-stage training network in accordance with a preferred embodiment of the present application;
FIGS. 4a-d are schematic diagrams of the four confusion matrices for Res-18, MaxViT, Res-18+MaxViT, and Res-18+MaxViT+DA, in accordance with a preferred embodiment of the present application;
fig. 5 is a schematic block diagram of an infant retinal disease information identification system based on a training network in accordance with a preferred embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present application more apparent, the technical solutions in the embodiments are described in detail below. It is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by a person skilled in the art without inventive effort, based on the embodiments of the present application, are intended to be within the scope of the present application.
The training network-based infant retinal disease information identification method according to the preferred embodiment of the present application, as shown in fig. 1, with reference to fig. 2, 3a, 3b, 3c and 4a-d, comprises the following steps:
S01: extracting local-global features of the retinal image to be detected with a hybrid CNN-Transformer network;
S02: fusing the local-global features with a deep attention fusion module to obtain fused features with local-global feature expression capability;
S03: training ROP data in the fused features with a training network, extracting information with deep feature expression, and realizing ROP severity grading.
Combining the complementary strengths of CNN and Transformer networks, a two-stage deep learning network combining the two is provided, realizing the dual tasks of automatically detecting various infant retinopathies and grading ROP severity. The system can detect various common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy of automatic detection of common infant fundus lesions. It reduces, to a certain extent, the discomfort caused to child patients by repeated examinations when diagnosis is difficult, lowers the rates of misdiagnosis and missed diagnosis, and improves doctors' diagnosis and treatment efficiency, thereby reducing complications in child patients to a certain extent.
For the dual tasks of automatically detecting various infant retinopathies and grading ROP severity, a two-stage training network is designed, as shown in figure 2. For training network 1, the application uses a hybrid CNN-Transformer network to extract local-global features: a four-stage Transformer network extracts global features with long-distance dependencies, and residual networks of different scales extract deep semantic feature information. The features extracted by the CNN and Transformer branches are fused by the deep attention fusion module, so that the fused features have local-global feature expression capability.
The residual network module, Transformer network module, and deep attention fusion module used in the application are shown in figs. 3a, 3b, and 3c.
Preferably, the residual network module algorithm flow includes:
first passing the input image through a 3×3 convolution to obtain convolutional features, then through normalization and an activation function to obtain standardized features, then through another 3×3 convolution followed by normalization to obtain normalized features;
and adding the features obtained by the above operations element-wise to the input features to obtain processed features, which pass through an activation function to give the residual module features.
Preferably, fusing the local-global features with the deep attention fusion module comprises the following steps:
taking the features extracted by the residual network and the Transformer network module as the input of the deep attention fusion module, and performing element-level addition after a point (1×1) convolution operation;
inputting the processed features into a ReLU activation function, then performing feature extraction with a point convolution, and then activating with a Sigmoid function to obtain the attention features;
and performing element-level multiplication of the obtained attention features with the features from the Transformer network module to obtain the deep attention fusion features.
Preferably, the algorithm flow of the Transformer network module comprises:
first partitioning the input image into blocks, processing the blocks with a patch embedding operation, applying layer normalization to the resulting patch features, inputting them into a multi-head attention module, and adding the output features to the patch embedding features to obtain the attention features;
and inputting the obtained attention features into a layer normalization module to obtain normalized features, inputting these into a multi-layer perceptron to obtain processed features, and adding these element-wise to the previous attention features to obtain the output features of the Transformer network module.
For training network 1 of the application, results of different baseline networks were compared in order to select a suitable baseline framework. The comparison of different baseline networks on retinal multi-disease classification and identification is shown in Table 1:
TABLE 1 Comparison of different baseline networks on retinal multi-disease identification
As can be seen from Table 1, among the CNN networks the ResNet-18 model achieves the best classification performance with the lowest complexity, so the application selects the ResNet-18 model as the backbone network of the CNN branch. Among the Transformer networks, MaxViT achieves the best classification performance because its multi-axis attention mechanism enables interaction of local and global spatial information; MaxViT is therefore selected as the backbone network of the Transformer branch.
In summary, the application selects ResNet-18 and MaxViT as the backbones of a hybrid network that extracts local-global features to realize the classification and identification of multiple retinal diseases.
To verify the effectiveness of this design, the application also conducted ablation experiments under identical experimental settings: first, complete experiments were run with the ResNet-18 network and the MaxViT network individually; the two networks were then combined; finally, the deep attention fusion module was added on top of the combination. The results are shown in Table 2.
TABLE 2 ablation experiment results for different network modules
As can be seen from Table 2, combining the ResNet-18 and MaxViT network models yields the best classification results, and overall performance improves further after the deep attention fusion module is added.
In addition, the application computes confusion matrices of the different methods over the disease categories as supporting evidence that the proposed network framework is effective for retinal multi-disease classification and identification. As shown in figs. 4a-4d, panels (a)-(d) represent the confusion matrices obtained by Res-18, MaxViT, Res-18+MaxViT, and Res-18+MaxViT+DA (the method of the application), respectively.
For training network 2, the retinal multi-disease identification result is first obtained by training network 1, and training network 2 then performs severity grading on the identified ROP category data.
In the application, considering both accuracy and complexity, the ResNet-34 model is selected as the framework to realize the ROP severity classification task; supporting evidence is shown in Table 3.
TABLE 3 comparison of results of different network models on ROP severity classification tasks
Methods | Accuracy | Precision | Recall | F1 | Kappa |
---|---|---|---|---|---|
ResNet-18 | 91.82(0.49) | 93.01(0.26) | 92.83(0.98) | 92.89(0.52) | 86.64(1.41) |
ResNet-34 | 92.41(0.02) | 93.19(0.19) | 93.47(0.02) | 93.21(0.05) | 87.61(0.19) |
ResNet-50 | 93.14(0.13) | 94.23(0.14) | 94.05(0.07) | 94.13(0.09) | 88.78(0.32) |
An infant retinal disease information identification system based on a training network, as shown in fig. 5, includes:
the hybrid network module 100 is composed of a CNN and a Transformer and is used for extracting local-global features of the retinal image to be examined;
the deep attention fusion module 101 is configured to fuse the local-global features to obtain fused features with local-global feature expression capability;
the training network module 102 is configured to train on the ROP data in the fused features, extract information with deep feature expression, and realize ROP severity grading.
For details of the system, refer to the method section above; they are not repeated here.
In summary, the application addresses the following drawbacks of the prior art:
1. Prediction of only a single disease.
2. Use of a single neural network in most cases.
3. Adding modules inside the Transformer and CNN, which increases network complexity.
4. Designs aimed at a single task, which cannot satisfy the dual-task structure of the application.
the method and the system have the following beneficial effects:
1. By combining domain knowledge, the application can detect multiple common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy (including sensitivity, specificity, F1, and other indexes) of automatic screening systems for common infant fundus lesions.
2. Common infant fundus images have complex features and small differences between certain lesions, so the manual judgment relied on in the prior art is prone to misdiagnosis or cannot give an accurate disease judgment. Because the application analyzes all images in the same way, no subjectivity is involved. Although this work is the job of pediatric ophthalmologists, misdiagnosis and missed-diagnosis rates for common infant fundus diseases remain high due to limited specialist expertise. Through fast and efficient learning, the application can accurately and rapidly identify multiple common infant fundus diseases, reducing the infant's discomfort from difficult diagnosis and repeated examination, lowering misdiagnosis and missed-diagnosis rates, improving doctors' diagnosis and treatment efficiency, and thereby reducing complications for the infant.
3. Simple operation and universal applicability. Once properly trained, even a non-ophthalmologist can make a preliminary diagnosis, so that the child patient does not miss the optimal treatment window.
4. Advantages of the algorithm model: a network structure based on a hybrid CNN-Transformer framework that extracts feature information with local-global expression; a deep attention module that fuses the feature information extracted from the CNN and Transformer branches so that the extracted features have complete expressivity; and, for the dual tasks of automatically detecting multiple infant retinopathies and grading ROP severity, a two-stage data feature training network, in which the CNN-Transformer hybrid network realizes identification of multiple retinal diseases and a ResNet-34 model realizes the ROP severity grading task.
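The two-stage design described above can be sketched as a simple dispatch pipeline. The classifier functions here are hypothetical stand-ins for the trained hybrid network (training network 1) and the ResNet-34 grader (training network 2); only the control flow reflects the patent.

```python
from typing import Callable, List, Tuple

def two_stage_pipeline(
    images: List[dict],
    classify_disease: Callable[[dict], str],  # stage 1: hybrid CNN-Transformer network
    grade_severity: Callable[[dict], str],    # stage 2: ResNet-34 severity grader
) -> List[Tuple[str, str]]:
    """Stage 1 identifies the disease; stage 2 grades severity only for ROP cases."""
    results = []
    for img in images:
        disease = classify_disease(img)
        severity = grade_severity(img) if disease == "ROP" else "n/a"
        results.append((disease, severity))
    return results

# Hypothetical stand-in classifiers, for illustration only.
def classify_disease(img):
    return "ROP" if img.get("rop") else "normal"

def grade_severity(img):
    return "severe" if img.get("stage", 0) >= 3 else "mild"

images = [{"rop": True, "stage": 3}, {"rop": False}, {"rop": True, "stage": 1}]
print(two_stage_pipeline(images, classify_disease, grade_severity))
# [('ROP', 'severe'), ('normal', 'n/a'), ('ROP', 'mild')]
```

The key design point is that the severity grader only ever sees data already identified as ROP by stage 1, matching the two-stage training described in the application.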
It will be understood that modifications and variations will be apparent to those skilled in the art from the foregoing description, and it is intended that all such modifications and variations be included within the scope of the following claims.
Claims (10)
1. The infant retina disease information identification method based on the training network is characterized by comprising the following steps of:
extracting local-global characteristics of the retina image to be detected by adopting a CNN and Transformer mixed network;
the deep attention fusion module is adopted to fuse the local-global features to obtain fused features with local-global feature expression capability;
and training ROP data in the fused features by adopting a training network, extracting information with depth feature expression, and realizing ROP severity grading.
2. The training network-based infant retinal disease information identification method according to claim 1, wherein the extracting of local-global features of the retinal image to be measured using the CNN and Transformer mixed network comprises the steps of:
extracting depth semantic feature information in the retina image to be detected by using residual error network modules with different scales of the CNN network to obtain local features;
and extracting global features with long-distance dependency relations by using a 4-stage Transformer network module.
3. The training network-based infant retinal disease information identification method according to claim 2, wherein the fusing of local-global features using the deep attention fusion module comprises the steps of:
the features extracted by the residual network and the Transformer network module serve as the inputs of the deep attention fusion module, and element-wise addition is performed after a pointwise convolution on each branch;
the summed features are passed through a ReLU activation function, then a pointwise convolution for feature extraction, and then a Sigmoid activation function, yielding the attention features;
and the obtained attention features are multiplied element-wise with the features from the Transformer network module to obtain the deep attention fusion features.
4. The training network-based infant retinal disease information identification method according to claim 2, wherein the Transformer network module algorithm flow comprises:
firstly, the input image is partitioned into blocks; the blocks are processed by a patch embedding operation; the embedded patch features are layer-normalized and input into a multi-head attention module, and the output features are added to the patch embedding features to obtain the attention features;
and the obtained attention features are input into a layer normalization module to obtain regularized features, which are input into a multi-layer perceptron; the processed features are added element-wise to the previous attention features to obtain the output features of the Transformer network module.
5. The training network-based infant retinal disease information identification method according to claim 2, wherein the residual network module algorithm flow comprises:
firstly, the input image is passed through a 3×3 convolution to obtain convolution features, which are normalized and passed through an activation function to obtain standardized features; these features are passed through a second 3×3 convolution and a normalization function to obtain regularized features;
and the features obtained by these operations are added element-wise to the input features, and the result is passed through an activation function to obtain the residual module features.
6. An infant retinal disease information identification system based on a training network, comprising:
the hybrid network module is composed of a CNN and a Transformer and is used for extracting local-global features of the retinal image to be examined;
the deep attention fusion module is used for fusing the local-global features to obtain fused features with local-global feature expression capability;
and the training network module is used for training ROP data in the fused features, extracting information with depth feature expression and realizing ROP severity classification.
7. The training network-based infant retinal disease information identification system of claim 6, wherein the hybrid network module extracts local-global features of the retinal image to be tested using the method of:
extracting depth semantic feature information in the retina image to be detected by using residual error network modules with different scales of the CNN network to obtain local features;
and extracting global features with long-distance dependency relations by using a 4-stage Transformer network module.
8. The training network-based infant retinal disease information identification system of claim 7, wherein the deep attention fusion module fuses local-global features using the method of:
the features extracted by the residual network and the Transformer network module serve as the inputs of the deep attention fusion module, and element-wise addition is performed after a pointwise convolution on each branch;
the summed features are passed through a ReLU activation function, then a pointwise convolution for feature extraction, and then a Sigmoid activation function, yielding the attention features;
and the obtained attention features are multiplied element-wise with the features from the Transformer network module to obtain the deep attention fusion features.
9. The training network-based infant retinal disease information identification system of claim 7, wherein the Transformer network module algorithm flow comprises:
firstly, the input image is partitioned into blocks; the blocks are processed by a patch embedding operation; the embedded patch features are layer-normalized and input into a multi-head attention module, and the output features are added to the patch embedding features to obtain the attention features;
and the obtained attention features are input into a layer normalization module to obtain regularized features, which are input into a multi-layer perceptron; the processed features are added element-wise to the previous attention features to obtain the output features of the Transformer network module.
10. The training network-based infant retinal disease information identification system of claim 7, wherein the residual network module algorithm flow comprises:
firstly, the input image is passed through a 3×3 convolution to obtain convolution features, which are normalized and passed through an activation function to obtain standardized features; these features are passed through a second 3×3 convolution and a normalization function to obtain regularized features;
and the features obtained by these operations are added element-wise to the input features, and the result is passed through an activation function to obtain the residual module features.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310747947.8A CN116758038A (en) | 2023-06-25 | 2023-06-25 | Infant retina disease information identification method and system based on training network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310747947.8A CN116758038A (en) | 2023-06-25 | 2023-06-25 | Infant retina disease information identification method and system based on training network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116758038A true CN116758038A (en) | 2023-09-15 |
Family
ID=87951148
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310747947.8A Pending CN116758038A (en) | 2023-06-25 | 2023-06-25 | Infant retina disease information identification method and system based on training network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116758038A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274228A (en) * | 2023-10-24 | 2023-12-22 | 脉得智能科技(无锡)有限公司 | Ultrasonic image risk classification system based on deep learning of schistosome liver diseases |
CN117789284A (en) * | 2024-02-28 | 2024-03-29 | 中日友好医院(中日友好临床医学研究所) | Identification method and device for ischemic retinal vein occlusion |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180214087A1 (en) * | 2017-01-30 | 2018-08-02 | Cognizant Technology Solutions India Pvt. Ltd. | System and method for detecting retinopathy |
CN111259982A (en) * | 2020-02-13 | 2020-06-09 | 苏州大学 | Premature infant retina image classification method and device based on attention mechanism |
CN114998210A (en) * | 2022-04-29 | 2022-09-02 | 华南理工大学 | Premature infant retinopathy detection system based on deep learning target detection |
CN115690479A (en) * | 2022-05-23 | 2023-02-03 | 安徽理工大学 | Remote sensing image classification method and system based on convolution Transformer |
- 2023-06-25 CN CN202310747947.8A patent/CN116758038A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180214087A1 (en) * | 2017-01-30 | 2018-08-02 | Cognizant Technology Solutions India Pvt. Ltd. | System and method for detecting retinopathy |
CN111259982A (en) * | 2020-02-13 | 2020-06-09 | 苏州大学 | Premature infant retina image classification method and device based on attention mechanism |
CN114998210A (en) * | 2022-04-29 | 2022-09-02 | 华南理工大学 | Premature infant retinopathy detection system based on deep learning target detection |
CN115690479A (en) * | 2022-05-23 | 2023-02-03 | 安徽理工大学 | Remote sensing image classification method and system based on convolution Transformer |
Non-Patent Citations (2)
Title |
---|
WEIMING LI等: "ConvTransNet: A CNN–Transformer Network for Change Detection With Multiscale Global–Local Representations", IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, vol. 61, 3 March 2023 (2023-03-03), pages 1 - 15 * |
HAO Wenqiang et al.: "A Low-Dose CT Image Denoising Network Based on Transformer and CNN", Journal of Hainan Normal University (Natural Science), vol. 36, no. 02, 15 June 2023 (2023-06-15), pages 176 - 182 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117274228A (en) * | 2023-10-24 | 2023-12-22 | 脉得智能科技(无锡)有限公司 | Ultrasonic image risk classification system based on deep learning of schistosome liver diseases |
CN117789284A (en) * | 2024-02-28 | 2024-03-29 | 中日友好医院(中日友好临床医学研究所) | Identification method and device for ischemic retinal vein occlusion |
CN117789284B (en) * | 2024-02-28 | 2024-05-14 | 中日友好医院(中日友好临床医学研究所) | Identification method and device for ischemic retinal vein occlusion |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108771530B (en) | Fundus lesion screening system based on deep neural network | |
CN110570421B (en) | Multitask fundus image classification method and apparatus | |
CN109635862B (en) | Sorting method for retinopathy of prematurity plus lesion | |
CN116758038A (en) | Infant retina disease information identification method and system based on training network | |
CN109948719B (en) | Automatic fundus image quality classification method based on residual dense module network structure | |
CN108553079A (en) | Lesion identifying system based on eye fundus image | |
CN112101424B (en) | Method, device and equipment for generating retinopathy identification model | |
KR102313143B1 (en) | Diabetic retinopathy detection and severity classification apparatus Based on Deep Learning and method thereof | |
Chen et al. | Detection of diabetic retinopathy using deep neural network | |
CN110599480A (en) | Multi-source input fundus image classification method and device | |
CN112957005A (en) | Automatic identification and laser photocoagulation region recommendation algorithm for fundus contrast image non-perfusion region | |
Kajan et al. | Detection of diabetic retinopathy using pretrained deep neural networks | |
Sharma et al. | Harnessing the Strength of ResNet50 to Improve the Ocular Disease Recognition | |
Mohamed et al. | Improved automatic grading of diabetic retinopathy using deep learning and principal component analysis | |
AU2021100684A4 (en) | DEPCADDX - A MATLAB App for Caries Detection and Diagnosis from Dental X-rays | |
CN112741651B (en) | Method and system for processing ultrasonic image of endoscope | |
Tian et al. | Learning discriminative representations for fine-grained diabetic retinopathy grading | |
Venkatalakshmi et al. | Graphical user interface for enhanced retinal image analysis for diagnosing diabetic retinopathy | |
Himami et al. | Deep learning in image classification using dense networks and residual networks for pathologic myopia detection | |
Ou et al. | M 2 LC-Net: A multi-modal multi-disease long-tailed classification network for real clinical scenes | |
Sadhukhan et al. | Optic disc localization in retinal fundus images using faster R-CNN | |
Sengar et al. | An efficient artificial intelligence-based approach for diagnosis of media haze disease | |
Latha et al. | Automated macular disease detection using retinal optical coherence tomography images by fusion of deep learning networks | |
Zhou et al. | Computer aided diagnosis for diabetic retinopathy based on fundus image | |
CN113273959B (en) | Portable diabetic retinopathy diagnosis and treatment instrument |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||