CN116758038A - Infant retina disease information identification method and system based on training network - Google Patents

Infant retina disease information identification method and system based on training network

Info

Publication number
CN116758038A
CN116758038A (application CN202310747947.8A)
Authority
CN
China
Prior art keywords
features
network
module
attention
feature
Prior art date
Legal status
Pending
Application number
CN202310747947.8A
Other languages
Chinese (zh)
Inventor
张国明
刘亚玲
谢海
赵欣予
吴祯泉
唐建楠
郑棉瑩
陈妙虹
雷柏英
汪天富
Current Assignee
Shenzhen Eye Hospital (Shenzhen Institute of Eye Disease Prevention and Control)
Original Assignee
Shenzhen Eye Hospital (Shenzhen Institute of Eye Disease Prevention and Control)
Priority date
Filing date
Publication date
Application filed by Shenzhen Eye Hospital (Shenzhen Institute of Eye Disease Prevention and Control)
Priority to CN202310747947.8A
Publication of CN116758038A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The application relates to a training-network-based infant retinal disease information identification method and system. The method comprises the following steps: extracting local-global features of a retinal image to be detected with a hybrid CNN-Transformer network; fusing the local-global features with a deep attention fusion module; and training on the ROP data within the fused features with a training network, extracting deeply expressive feature information, and grading ROP severity. By combining the respective strengths and weaknesses of CNN and Transformer networks, the dual tasks of automatically detecting multiple infant retinopathies and grading ROP severity are realized. The system can detect various common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy of automatic detection of common infant fundus lesions. To a certain extent, it reduces the discomfort caused to child patients by repeated examinations when diagnosis is difficult, lowers the misdiagnosis and missed-diagnosis rates, and improves physicians' diagnosis and treatment efficiency.

Description

Infant retina disease information identification method and system based on training network
Technical Field
The application relates to the technical field of ophthalmic disease recognition and image recognition, and in particular to a training-network-based infant retinal disease information identification method and system.
Background
Deep Learning (DL) is a mature but still rapidly evolving technology, especially in the context of computer-aided diagnosis of human diseases. On the algorithmic side, He et al. proposed ResNet, a model with a residual structure that allows network depth to keep increasing without overfitting and extracts shallow-to-deep features to improve recognition accuracy. Dosovitskiy et al. proposed a Transformer framework with inputs of different scales, i.e., input patch sizes of different scales are designed so that training on large-scale data yields higher classification accuracy. Chen et al. adopted a pyramid structure and a new region-local attention mechanism in place of global self-attention to capture more spatial information and thereby improve classification accuracy. Tu et al. described an efficient and scalable attention model called multi-axis attention (MaxViT), which consists of two components: blocked local attention and dilated global attention. These design choices allow global-local spatial interaction at any input resolution with only linear complexity. Valanarasu et al. proposed a gated axial attention model that extends existing architectures by introducing an additional control mechanism in the self-attention module, together with a local-global training strategy that trains the model efficiently on medical images and further improves performance. Zhang et al. proposed TransFuse, a new parallel-branch network that combines Transformers and CNNs in parallel, can effectively capture global dependencies and low-level spatial detail in a shallower manner, and fuses features extracted at different levels of the two branches with a bidirectional fusion module.
Deep learning techniques are widely used in the field of medical image analysis. As a representative deep-learning framework, the convolutional neural network (Convolutional Neural Network, CNN) is often used as the backbone to extract deep features from medical images thanks to its strong feature-extraction capability. The residual network proposed in 2015, owing to its distinctive residual skip connections, can attend to shallow-feature information while extracting deep features. The residual network therefore attends to both deep and shallow feature information, so the extracted features are more complete and network performance is better; it is also selected here as one of the feature-extraction branches. However, features extracted by a pure CNN lack the expression of global feature information, which limits further performance gains. For this reason, the Transformer network was developed, which uses a multi-head self-attention mechanism to continually learn global feature information with long-range dependencies.
Among the many ocular diseases, congenital abnormalities and early-onset diseases are particularly important. Common infant fundus diseases generally include retinopathy of prematurity (ROP), Coats disease, retinoblastoma (RB), retinitis pigmentosa (RP), choroidal defects, congenital retinal folds, and familial exudative vitreoretinopathy. Most of these diseases have a long-term impact on the structure and function of the eye, including refractive error and night blindness, and may increase the risk of ocular misalignment (strabismus) and neovascular glaucoma. Among them, ROP is a major cause of visual impairment and blindness in children, and as many as 8,000 newborns with RB worldwide may need enucleation surgery to save their lives.
From a clinical point of view, these infant conditions often lead to severe visual impairment and even lifelong blindness, with long-term consequences for society, especially future employment pressure. Because infant fundus diseases are uncommon in some general hospitals, they are often overlooked, and even when encountered, an accurate diagnosis may not be made. At the same time, specialized ophthalmologists are in short supply worldwide.
In the field of fundus disease, several studies on automatic screening methods for retinopathy of prematurity have been carried out, most of which detect a single ROP lesion. To date, little research has focused on detecting more than one type of infant fundus disease, let alone many. In real life, particularly in remote areas lacking specialized ophthalmologists, effectively detecting various types of fundus disease is necessary.
Therefore, a more efficient auxiliary detection system is needed to solve this problem.
Disclosure of Invention
Aiming at the defects in the prior art, the application provides a training-network-based infant retinal disease information identification method and system.
The technical scheme adopted for solving the technical problems is as follows:
A training-network-based infant retinal disease information identification method is constructed, comprising the following steps:
extracting local-global features of the retinal image to be detected with a hybrid CNN-Transformer network;
fusing the local-global features with a deep attention fusion module to obtain fused features with local-global expression capability;
and training on the ROP data within the fused features with a training network, extracting deeply expressive feature information, and grading ROP severity.
In the training-network-based infant retinal disease information identification method of the application, extracting local-global features of the retinal image to be detected with the hybrid CNN-Transformer network comprises:
extracting deep semantic feature information from the retinal image to be detected with residual network modules of different scales in the CNN branch to obtain local features;
and extracting global features with long-range dependencies with a 4-stage Transformer network module.
In the training-network-based infant retinal disease information identification method of the application, fusing the local-global features with the deep attention fusion module comprises the following steps:
taking the features extracted by the residual network and the Transformer network module as the input of the deep attention fusion module, and performing element-wise addition after a point convolution operation;
feeding the processed features into a ReLU activation function, extracting features with a point convolution, and passing the result through a Sigmoid activation to obtain attention features;
and performing element-wise multiplication of the obtained attention features with the features from the Transformer network module to obtain the deep attention fusion features.
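The fusion steps above can be sketched in PyTorch as follows. This is a minimal illustrative implementation, not the patent's exact configuration: channel counts and the number of point convolutions are assumptions, and the module simply follows the described sequence of point convolution, element-wise addition, ReLU, point convolution, Sigmoid, and element-wise multiplication with the Transformer-branch features.

```python
import torch
import torch.nn as nn

class DeepAttentionFusion(nn.Module):
    """Sketch of the described deep attention fusion: point (1x1) convolutions
    on each branch, element-wise addition, ReLU, another point convolution,
    Sigmoid to form an attention map, then element-wise multiplication with
    the Transformer-branch features. Layer sizes are illustrative."""
    def __init__(self, channels: int):
        super().__init__()
        self.pw_cnn = nn.Conv2d(channels, channels, kernel_size=1)    # point conv, CNN branch
        self.pw_trans = nn.Conv2d(channels, channels, kernel_size=1)  # point conv, Transformer branch
        self.pw_attn = nn.Conv2d(channels, channels, kernel_size=1)   # point conv after ReLU
        self.relu = nn.ReLU(inplace=True)

    def forward(self, f_cnn: torch.Tensor, f_trans: torch.Tensor) -> torch.Tensor:
        fused = self.pw_cnn(f_cnn) + self.pw_trans(f_trans)   # element-wise addition
        attn = torch.sigmoid(self.pw_attn(self.relu(fused)))  # attention map in (0, 1)
        return attn * f_trans                                 # gate the Transformer features

f_cnn = torch.randn(1, 32, 14, 14)
f_trans = torch.randn(1, 32, 14, 14)
out = DeepAttentionFusion(32)(f_cnn, f_trans)
```

Because the attention map passes through a Sigmoid, each spatial location of the Transformer features is re-weighted into the (0, 1) range by information from both branches.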
In the training-network-based infant retinal disease information identification method of the application, the algorithm flow of the Transformer network module comprises:
first partitioning the input image into patches, processing the patches with a patch embedding operation, applying layer normalization to the embedded patch features, feeding them into a multi-head attention module, and adding the output features to the patch-embedding features to obtain attention features;
and feeding the obtained attention features into a layer normalization module to obtain normalized features, feeding those into a multi-layer perceptron to obtain processed features, and adding the processed features element-wise to the previous attention features to obtain the output features of the Transformer network module.
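The Transformer module flow above corresponds to a standard pre-norm Transformer encoder block preceded by patch embedding. The sketch below is an illustrative assumption of that flow (patch size 16, embedding dimension 64, 4 heads are all invented parameters, not taken from the patent):

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Sketch of the described flow: layer normalization -> multi-head
    self-attention -> skip addition, then layer normalization -> MLP ->
    element-wise skip addition."""
    def __init__(self, dim: int, heads: int = 4, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        h, _ = self.attn(h, h, h)        # multi-head self-attention
        x = x + h                        # add back to the patch-embedding features
        x = x + self.mlp(self.norm2(x))  # MLP branch with element-wise skip
        return x

# Patch partition + embedding implemented as one strided convolution
patch_embed = nn.Conv2d(3, 64, kernel_size=16, stride=16)
img = torch.randn(1, 3, 224, 224)
tokens = patch_embed(img).flatten(2).transpose(1, 2)  # (1, 196, 64) patch tokens
out = TransformerBlock(64)(tokens)
```

A 224x224 image with 16x16 patches yields a 14x14 grid, i.e., 196 tokens; the block preserves the token shape so such blocks can be stacked over the 4 stages mentioned in the text.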
In the training-network-based infant retinal disease information identification method of the application, the residual network module algorithm flow comprises:
first feeding the input image into a 3x3 convolution to obtain convolutional features, applying normalization and an activation function to obtain standardized features, feeding these into another 3x3 convolution to obtain convolutional features, and normalizing the result to obtain normalized features;
and adding the features obtained above element-wise to the input features to obtain processed features, which pass through an activation function to yield the residual module features.
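This residual flow matches the classic basic block of ResNet. A minimal sketch, assuming batch normalization as the normalization and ReLU as the activation (the text does not name them explicitly):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block per the described flow: 3x3 conv -> norm -> ReLU
    -> 3x3 conv -> norm, element-wise skip addition, final activation."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))  # first conv + normalization + activation
        out = self.bn2(self.conv2(out))           # second conv + normalization
        out = out + x                             # element-wise skip connection
        return self.relu(out)                     # final activation

x = torch.randn(1, 64, 56, 56)
y = ResidualBlock(64)(x)
```

The skip addition requires the output to keep the input's shape, which is why both convolutions here use padding 1 and the same channel count; ResNet's downsampling variants add a projection on the skip path instead.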
An infant retinal disease information identification system based on a training network, comprising:
the hybrid network module, composed of a CNN and a Transformer, for extracting local-global features of the retinal image to be detected;
the deep attention fusion module, for fusing the local-global features to obtain fused features with local-global expression capability;
and the training network module, for training on the ROP data in the fused features, extracting deeply expressive feature information, and grading ROP severity.
In the training-network-based infant retinal disease information identification system of the application, the hybrid network module extracts local-global features of the retinal image to be detected by:
extracting deep semantic feature information from the retinal image to be detected with residual network modules of different scales in the CNN branch to obtain local features;
and extracting global features with long-range dependencies with a 4-stage Transformer network module.
In the training-network-based infant retinal disease information identification system of the application, the deep attention fusion module fuses the local-global features by:
taking the features extracted by the residual network and the Transformer network module as the input of the deep attention fusion module, and performing element-wise addition after a point convolution operation;
feeding the processed features into a ReLU activation function, extracting features with a point convolution, and passing the result through a Sigmoid activation to obtain attention features;
and performing element-wise multiplication of the obtained attention features with the features from the Transformer network module to obtain the deep attention fusion features.
In the training-network-based infant retinal disease information identification system of the application, the algorithm flow of the Transformer network module comprises:
first partitioning the input image into patches, processing the patches with a patch embedding operation, applying layer normalization to the embedded patch features, feeding them into a multi-head attention module, and adding the output features to the patch-embedding features to obtain attention features;
and feeding the obtained attention features into a layer normalization module to obtain normalized features, feeding those into a multi-layer perceptron to obtain processed features, and adding the processed features element-wise to the previous attention features to obtain the output features of the Transformer network module.
In the training-network-based infant retinal disease information identification system of the application, the residual network module algorithm flow comprises:
first feeding the input image into a 3x3 convolution to obtain convolutional features, applying normalization and an activation function to obtain standardized features, feeding these into another 3x3 convolution to obtain convolutional features, and normalizing the result to obtain normalized features;
and adding the features obtained above element-wise to the input features to obtain processed features, which pass through an activation function to yield the residual module features.
The beneficial effects of the application are as follows: by combining the respective strengths and weaknesses of CNN and Transformer networks, a dual-stage deep learning network combining the two is provided, realizing the dual tasks of automatically detecting multiple infant retinopathies and grading ROP severity; the system can detect various common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy of automatic detection of common infant fundus lesions; to a certain extent, it reduces the discomfort caused to child patients by repeated examinations when diagnosis is difficult, lowers the misdiagnosis and missed-diagnosis rates, improves physicians' diagnosis and treatment efficiency, and thereby reduces patients' complications to a certain extent.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the present application will be further described with reference to the accompanying drawings and embodiments, in which the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained by those skilled in the art without inventive effort:
FIG. 1 is a flowchart of a training network-based infant retinal disease information identification method in accordance with a preferred embodiment of the present application;
FIG. 2 is a block diagram of the dual-stage training-network-based infant retinal disease identification method in accordance with a preferred embodiment of the present application;
FIG. 3a is a block diagram of the residual structure in the dual-stage training-network-based multiple infant retinal disease identification method according to a preferred embodiment of the present application;
FIG. 3b is a schematic diagram of the Transformer module in the dual-stage training-network-based multiple infant retinal disease identification method according to the preferred embodiment of the present application;
FIG. 3c is a diagram of the deep attention fusion module in the dual-stage training-network-based multiple infant retinal disease identification method in accordance with a preferred embodiment of the present application;
FIGS. 4a-d are schematic diagrams of the four confusion matrices obtained by Res-18, MaxViT, Res-18+MaxViT, and Res-18+MaxViT+DA in accordance with a preferred embodiment of the present application;
fig. 5 is a schematic block diagram of an infant retinal disease information identification system based on a training network in accordance with a preferred embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more apparent, the following description will be made in detail with reference to the technical solutions in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by a person skilled in the art without any inventive effort, are intended to be within the scope of the present application, based on the embodiments of the present application.
The training-network-based infant retinal disease information identification method according to the preferred embodiment of the present application, as shown in FIG. 1 and with reference to FIGS. 2, 3a, 3b, 3c and 4a-d, comprises the following steps:
S01: extracting local-global features of the retinal image to be detected with a hybrid CNN-Transformer network;
S02: fusing the local-global features with a deep attention fusion module to obtain fused features with local-global expression capability;
S03: training on the ROP data within the fused features with a training network, extracting deeply expressive feature information, and grading ROP severity;
by combining the respective strengths and weaknesses of CNN and Transformer networks, a dual-stage deep learning network combining the two is provided, realizing the dual tasks of automatically detecting multiple infant retinopathies and grading ROP severity; the system can detect various common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy of automatic detection of common infant fundus lesions; to a certain extent, it reduces the discomfort caused to child patients by repeated examinations when diagnosis is difficult, lowers the misdiagnosis and missed-diagnosis rates, improves physicians' diagnosis and treatment efficiency, and thereby reduces patients' complications to a certain extent.
For the dual tasks of automatically detecting multiple infant retinopathies and grading ROP severity, a dual-stage training network is designed, as shown in FIG. 2. For training network 1, the application uses a hybrid CNN-Transformer network to extract local-global features: a 4-stage Transformer network extracts global features with long-range dependencies, and residual networks of different scales extract deep semantic feature information. The features extracted by the CNN and Transformer branches are fused by a deep attention fusion module, so that the fused features have local-global expression capability.
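The two-branch structure of training network 1 can be sketched end to end as follows. This is a deliberately miniaturized stand-in, not the patent's network: the CNN branch substitutes a small strided-convolution stack for the ResNet-18 backbone, the Transformer branch substitutes a single encoder layer for MaxViT, and the channel width, patch size, and 5-class output are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class HybridNet(nn.Module):
    """Minimal sketch of training network 1: parallel CNN and Transformer
    branches whose feature maps are fused by a deep-attention-style gate
    and then classified."""
    def __init__(self, dim: int = 32, num_classes: int = 5):
        super().__init__()
        # CNN branch: strided convs producing a 14x14 local-feature map
        self.cnn = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1),
        )
        # Transformer branch: patch embedding + one encoder layer (global features)
        self.patch = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        self.encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        # Deep-attention-style fusion gate and classifier head
        self.gate = nn.Sequential(nn.Conv2d(dim, dim, 1), nn.ReLU(),
                                  nn.Conv2d(dim, dim, 1), nn.Sigmoid())
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local = self.cnn(x)                                # (B, dim, 14, 14) local features
        t = self.patch(x).flatten(2).transpose(1, 2)       # (B, 196, dim) patch tokens
        t = self.encoder(t)                                # global features
        b, n, d = t.shape
        side = int(n ** 0.5)
        glob = t.transpose(1, 2).reshape(b, d, side, side) # back to (B, dim, 14, 14)
        fused = self.gate(local + glob) * glob             # attention-gated fusion
        return self.head(fused.mean(dim=(2, 3)))           # global average pool + classify

logits = HybridNet()(torch.randn(1, 3, 224, 224))
```

The key point the sketch shows is that both branches are brought to the same spatial resolution so the fusion gate can combine them element-wise before classification.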
The residual network module, Transformer network module, and deep attention fusion module used in the application are shown in FIGS. 3a, 3b and 3c;
preferably, the residual network module algorithm flow includes:
first feeding the input image into a 3x3 convolution to obtain convolutional features, applying normalization and an activation function to obtain standardized features, feeding these into another 3x3 convolution to obtain convolutional features, and normalizing the result to obtain normalized features;
and adding the features obtained above element-wise to the input features to obtain processed features, which pass through an activation function to yield the residual module features.
Preferably, the method for fusing the local-global features by using the deep attention fusion module comprises the following steps:
taking the features extracted by the residual network and the Transformer network module as the input of the deep attention fusion module, and performing element-wise addition after a point convolution operation;
feeding the processed features into a ReLU activation function, extracting features with a point convolution, and passing the result through a Sigmoid activation to obtain attention features;
and performing element-wise multiplication of the obtained attention features with the features from the Transformer network module to obtain the deep attention fusion features.
Preferably, the algorithm flow of the Transformer network module comprises:
first partitioning the input image into patches, processing the patches with a patch embedding operation, applying layer normalization to the embedded patch features, feeding them into a multi-head attention module, and adding the output features to the patch-embedding features to obtain attention features;
and feeding the obtained attention features into a layer normalization module to obtain normalized features, feeding those into a multi-layer perceptron to obtain processed features, and adding the processed features element-wise to the previous attention features to obtain the output features of the Transformer network module.
For training network 1 of the application, results of different baseline networks are compared to select a suitable baseline backbone for training network 1; the comparison of different baseline networks on retinal multi-disease classification and identification is shown in Table 1:
TABLE 1 Comparison of different baseline networks on retinal multi-disease identification
As can be seen from Table 1, among the CNN networks the ResNet-18 model achieves the best classification with the lowest complexity, so the application selects ResNet-18 as the backbone of the CNN branch. Among the Transformer networks, MaxViT, thanks to its multi-axis attention mechanism, can perform local and global spatial information interaction and obtains the best classification performance, so MaxViT is selected as the backbone of the Transformer branch.
In summary, the application selects ResNet-18 and MaxViT as backbones for the hybrid network that extracts local-global features to realize the retinal multi-disease classification and identification task.
To verify the effectiveness of this design, several ablation experiments were also carried out under completely consistent experimental settings: first, complete experiments were run with the ResNet-18 network and the MaxViT network separately; then the two networks were combined; finally, the deep attention fusion module was added on top of the combination. The results are shown in Table 2.
TABLE 2 Ablation experiment results for different network modules
As can be seen from Table 2, the classification results obtained by combining the ResNet-18 and MaxViT network models are the best, and overall performance improves further after the deep attention fusion module is added.
In addition, confusion matrices of the different methods across disease types are computed as supporting evidence that the proposed network framework is effective on the retinal multi-disease classification and identification task, as shown in FIGS. 4a-d, which respectively show the confusion matrices obtained by Res-18, MaxViT, Res-18+MaxViT, and Res-18+MaxViT+DA (the method of the application).
For training network 2, the retinal multi-disease identification result is first obtained by training network 1, and the obtained ROP-category data is then graded for severity by training network 2.
Considering both accuracy and complexity, the application selects the ResNet-34 model as the framework to realize the ROP severity classification task; supporting evidence is shown in Table 3.
TABLE 3 comparison of results of different network models on ROP severity classification tasks
Methods Accuracy Precision Recall F1 Kappa
ResNet-18 91.82(0.49) 93.01(0.26) 92.83(0.98) 92.89(0.52) 86.64(1.41)
ResNet-34 92.41(0.02) 93.19(0.19) 93.47(0.02) 93.21(0.05) 87.61(0.19)
ResNet-50 93.14(0.13) 94.23(0.14) 94.05(0.07) 94.13(0.09) 88.78(0.32)
An infant retinal disease information identification system based on a training network, as shown in fig. 5, includes:
the hybrid network module 100 is composed of a CNN and a Transformer and is used for extracting local-global features of the retinal image to be examined;
the deep attention fusion module 101 is configured to fuse the local-global features to obtain fused features with local-global feature expression capability;
the training network module 102 is configured to train on the ROP data among the fused features, extract information with deep feature expression, and realize ROP severity grading.
For a detailed description of the system, refer to the method section above; it is not repeated here.
In summary, the following drawbacks in the prior art are addressed:
1. Only a single disease is predicted.
2. Mostly a single neural network is used.
3. Extra modules are added inside the Transformer and the CNN, which increases network complexity.
4. Existing designs target a single task and cannot satisfy the dual-task structure of the present application.
the method and the system have the following beneficial effects:
1. By combining domain knowledge, the application can detect multiple common infant fundus lesions while shielding the influence of other lesion features, greatly improving the accuracy (including sensitivity, specificity, F1 and other indexes) of the automatic screening system for common infant fundus lesions.
2. For common infant fundus images, certain lesions have complex features and small inter-class differences, so the manual judgment mainly relied on in the prior art is prone to misdiagnosis or cannot give an accurate disease judgment. Because the application analyzes all images in the same way, subjectivity is removed. Although this work belongs to pediatric ophthalmologists, the misdiagnosis and missed-diagnosis rates of common infant fundus diseases remain high due to limited professional expertise. Through fast and efficient learning, the device can accurately and rapidly identify multiple common infant fundus diseases, reduces the discomfort caused to infants by difficult diagnoses and repeated examinations, lowers the misdiagnosis and missed-diagnosis rates to a certain extent, improves the diagnosis and treatment efficiency of doctors, and thereby reduces complications in infants.
3. The operation is simple and has universal adaptability. Once properly trained, even a non-ophthalmologist can make a preliminary diagnosis, so that the child patient does not miss the optimal treatment time.
4. The algorithm model has the following advantages: a network structure with local-global feature information expression based on a hybrid CNN-Transformer framework; a deep attention module that fuses the feature information extracted by the CNN and Transformer branches so that the extracted features have complete expressivity; and, for the dual tasks of automatically detecting multiple infant retinopathies and grading ROP severity, a dual-stage data feature training network, i.e., multi-disease retinal identification is realized with the CNN-Transformer hybrid network and the ROP severity grading task is realized with the ResNet-34 model.
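The dual-stage flow described above can be sketched as a simple dispatch pipeline, assuming `multi_disease_net` and `severity_net` stand in for the trained stage-1 and stage-2 models (all names and the class layout are illustrative placeholders):

```python
import numpy as np

def two_stage_diagnosis(image, multi_disease_net, severity_net, rop_class_id):
    """Stage 1: identify the retinal disease; Stage 2: grade severity only for ROP."""
    disease = int(multi_disease_net(image).argmax())   # multi-disease identification
    if disease == rop_class_id:
        severity = int(severity_net(image).argmax())   # ROP severity grading
        return disease, severity
    return disease, None                               # non-ROP: no severity grade

# Placeholder stand-ins for the trained networks (illustration only)
multi_disease_net = lambda img: np.array([0.1, 0.8, 0.1])  # always predicts class 1
severity_net = lambda img: np.array([0.2, 0.7, 0.1])       # always predicts grade 1

disease, severity = two_stage_diagnosis(None, multi_disease_net, severity_net, rop_class_id=1)
```

Only images identified as ROP by stage 1 are passed to the stage-2 grading network, matching the two-network division of labor described above.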
It will be understood that modifications and variations will be apparent to those skilled in the art from the foregoing description, and it is intended that all such modifications and variations be included within the scope of the following claims.

Claims (10)

1. The infant retina disease information identification method based on the training network is characterized by comprising the following steps of:
extracting local-global characteristics of the retina image to be detected by adopting a CNN and Transformer mixed network;
the deep attention fusion module is adopted to fuse the local-global features to obtain fused features with local-global feature expression capability;
and training ROP data in the fused features by adopting a training network, extracting information with depth feature expression, and realizing ROP severity grading.
2. The training network-based infant retinal disease information identification method according to claim 1, wherein the extracting of local-global features of the retinal image to be examined using the CNN and Transformer mixed network comprises the steps of:
extracting deep semantic feature information in the retinal image to be examined by using residual network modules of different scales of the CNN network to obtain local features;
and extracting global features with long-distance dependency relations by using a 4-stage Transformer network module.
3. The training network-based infant retinal disease information identification method according to claim 2, wherein the fusing of local-global features using the deep attention fusion module comprises the steps of:
the features extracted by the residual network module and the Transformer network module are used as the input of the deep attention fusion module, and element-level addition is performed after a point convolution operation;
the processed features are input into a ReLU activation function for activation, features are then extracted by a point convolution, and the result is activated by a Sigmoid function to obtain attention features;
and performing an element-level multiplication operation on the obtained attention features and the features obtained by the Transformer network module to obtain deep attention fusion features.
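The fusion steps of this claim can be sketched as a small PyTorch module, assuming both branches produce feature maps of matching shape (the channel count and layer names are placeholders):

```python
import torch
import torch.nn as nn

class DeepAttentionFusion(nn.Module):
    """Fuse CNN and Transformer features via a point-convolution attention gate."""
    def __init__(self, channels):
        super().__init__()
        self.pw_cnn = nn.Conv2d(channels, channels, kernel_size=1)    # point conv, CNN branch
        self.pw_trans = nn.Conv2d(channels, channels, kernel_size=1)  # point conv, Transformer branch
        self.pw_attn = nn.Conv2d(channels, channels, kernel_size=1)   # point conv after ReLU

    def forward(self, f_cnn, f_trans):
        # element-level addition after the point convolutions
        s = self.pw_cnn(f_cnn) + self.pw_trans(f_trans)
        # ReLU -> point conv -> Sigmoid yields the attention features
        attn = torch.sigmoid(self.pw_attn(torch.relu(s)))
        # element-level multiplication with the Transformer features
        return attn * f_trans

fuse = DeepAttentionFusion(channels=8)
out = fuse(torch.randn(1, 8, 14, 14), torch.randn(1, 8, 14, 14))
```

The Sigmoid keeps the attention values in (0, 1), so the gate rescales rather than replaces the Transformer features.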
4. The training network-based infant retinal disease information identification method according to claim 2, wherein the Transformer network module algorithm flow comprises:
firstly, a patch partition operation is performed on the input image, the patches are then processed by a patch embedding operation, layer normalization is applied to the processed patch features, the result is input into a multi-head attention module, and the output features are added to the patch embedding features to obtain attention features;
and inputting the obtained attention features into a layer normalization module to obtain normalized features, inputting the normalized features into a multi-layer perceptron to obtain processed features, and performing element-level addition with the previous attention features to obtain the module processing features of the Transformer network module.
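The block structure in this claim (normalize, attend, residual add, normalize, MLP, residual add) can be sketched as follows, assuming the input is already a sequence of patch embeddings (dimensions and head count are placeholders):

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """LayerNorm -> multi-head attention -> residual add, then LayerNorm -> MLP -> residual add."""
    def __init__(self, dim, heads=4, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim))

    def forward(self, x):                    # x: (batch, num_patches, dim) patch embeddings
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)     # multi-head self-attention
        x = x + attn_out                     # add to the patch-embedding features
        x = x + self.mlp(self.norm2(x))      # MLP branch with element-level addition
        return x

blk = TransformerBlock(dim=16)
y = blk(torch.randn(2, 10, 16))
```

The two residual additions are what the claim calls element-level addition with the previous attention features.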
5. The training network-based infant retinal disease information identification method according to claim 2, wherein the residual network module algorithm flow comprises:
firstly, the input image is fed into a 3×3 convolution to obtain convolution features, which are normalized and passed through an activation function to obtain standardized features; these features are input into another 3×3 convolution, and the result is fed into a normalization function to obtain normalized features;
and performing element-level addition of the features obtained by the above operations with the input features to obtain processed features, and obtaining the residual module features through an activation function.
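Interpreting the claim's normalization as batch normalization (the standard choice in ResNet; an assumption, not stated in the claim), the residual module flow can be sketched as:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """3x3 conv -> BN -> ReLU -> 3x3 conv -> BN, element-level add with input, final ReLU."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        h = torch.relu(self.bn1(self.conv1(x)))  # conv -> normalize -> activate
        h = self.bn2(self.conv2(h))              # conv -> normalize
        return torch.relu(h + x)                 # add input features, then activate

blk = ResidualBlock(channels=4).eval()
y = blk(torch.randn(1, 4, 8, 8))
```

The identity shortcut (`h + x`) is what lets the block learn a residual on top of its input, stabilizing deep training.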
6. An infant retinal disease information identification system based on a training network, comprising:
the hybrid network module is composed of a CNN and a Transformer and is used for extracting local-global features of the retinal image to be examined;
the deep attention fusion module is used for fusing the local-global features to obtain fused features with local-global feature expression capability;
and the training network module is used for training on the ROP data among the fused features, extracting information with deep feature expression, and realizing ROP severity grading.
7. The training network-based infant retinal disease information identification system of claim 6, wherein the hybrid network module extracts local-global features of the retinal image to be tested using the method of:
extracting deep semantic feature information in the retinal image to be examined by using residual network modules of different scales of the CNN network to obtain local features;
and extracting global features with long-distance dependency relations by using a 4-stage Transformer network module.
8. The training network-based infant retinal disease information identification system of claim 7, wherein the deep attention fusion module fuses local-global features using the method of:
the features extracted by the residual network module and the Transformer network module are used as the input of the deep attention fusion module, and element-level addition is performed after a point convolution operation;
the processed features are input into a ReLU activation function for activation, features are then extracted by a point convolution, and the result is activated by a Sigmoid function to obtain attention features;
and performing an element-level multiplication operation on the obtained attention features and the features obtained by the Transformer network module to obtain deep attention fusion features.
9. The training network-based infant retinal disease information identification system of claim 7, wherein the Transformer network module algorithm flow comprises:
firstly, a patch partition operation is performed on the input image, the patches are then processed by a patch embedding operation, layer normalization is applied to the processed patch features, the result is input into a multi-head attention module, and the output features are added to the patch embedding features to obtain attention features;
and inputting the obtained attention features into a layer normalization module to obtain normalized features, inputting the normalized features into a multi-layer perceptron to obtain processed features, and performing element-level addition with the previous attention features to obtain the module processing features of the Transformer network module.
10. The training network-based infant retinal disease information identification system of claim 7, wherein the residual network module algorithm flow comprises:
firstly, the input image is fed into a 3×3 convolution to obtain convolution features, which are normalized and passed through an activation function to obtain standardized features; these features are input into another 3×3 convolution, and the result is fed into a normalization function to obtain normalized features;
and performing element-level addition of the features obtained by the above operations with the input features to obtain processed features, and obtaining the residual module features through an activation function.
CN202310747947.8A 2023-06-25 2023-06-25 Infant retina disease information identification method and system based on training network Pending CN116758038A (en)

Publications (1)

Publication Number Publication Date
CN116758038A true CN116758038A (en) 2023-09-15

Family

ID=87951148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310747947.8A Pending CN116758038A (en) 2023-06-25 2023-06-25 Infant retina disease information identification method and system based on training network

Country Status (1)

Country Link
CN (1) CN116758038A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274228A (en) * 2023-10-24 2023-12-22 脉得智能科技(无锡)有限公司 Ultrasonic image risk classification system based on deep learning of schistosome liver diseases
CN117789284A (en) * 2024-02-28 2024-03-29 中日友好医院(中日友好临床医学研究所) Identification method and device for ischemic retinal vein occlusion

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180214087A1 (en) * 2017-01-30 2018-08-02 Cognizant Technology Solutions India Pvt. Ltd. System and method for detecting retinopathy
CN111259982A (en) * 2020-02-13 2020-06-09 苏州大学 Premature infant retina image classification method and device based on attention mechanism
CN114998210A (en) * 2022-04-29 2022-09-02 华南理工大学 Premature infant retinopathy detection system based on deep learning target detection
CN115690479A (en) * 2022-05-23 2023-02-03 安徽理工大学 Remote sensing image classification method and system based on convolution Transformer


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WEIMING LI et al.: "ConvTransNet: A CNN-Transformer Network for Change Detection With Multiscale Global-Local Representations", IEEE Transactions on Geoscience and Remote Sensing, vol. 61, 3 March 2023, pages 1-15 *
HAO Wenqiang et al.: "Low-dose CT image denoising network based on Transformer and CNN", Journal of Hainan Normal University (Natural Science Edition), vol. 36, no. 02, 15 June 2023, pages 176-182 *


Similar Documents

Publication Publication Date Title
CN108771530B (en) Fundus lesion screening system based on deep neural network
CN110570421B (en) Multitask fundus image classification method and apparatus
CN109635862B (en) Sorting method for retinopathy of prematurity plus lesion
CN116758038A (en) Infant retina disease information identification method and system based on training network
CN109948719B (en) Automatic fundus image quality classification method based on residual dense module network structure
CN108553079A (en) Lesion identifying system based on eye fundus image
CN112101424B (en) Method, device and equipment for generating retinopathy identification model
KR102313143B1 (en) Diabetic retinopathy detection and severity classification apparatus Based on Deep Learning and method thereof
Chen et al. Detection of diabetic retinopathy using deep neural network
CN110599480A (en) Multi-source input fundus image classification method and device
CN112957005A (en) Automatic identification and laser photocoagulation region recommendation algorithm for fundus contrast image non-perfusion region
Kajan et al. Detection of diabetic retinopathy using pretrained deep neural networks
Sharma et al. Harnessing the Strength of ResNet50 to Improve the Ocular Disease Recognition
Mohamed et al. Improved automatic grading of diabetic retinopathy using deep learning and principal component analysis
AU2021100684A4 (en) DEPCADDX - A MATLAB App for Caries Detection and Diagnosis from Dental X-rays
CN112741651B (en) Method and system for processing ultrasonic image of endoscope
Tian et al. Learning discriminative representations for fine-grained diabetic retinopathy grading
Venkatalakshmi et al. Graphical user interface for enhanced retinal image analysis for diagnosing diabetic retinopathy
Himami et al. Deep learning in image classification using dense networks and residual networks for pathologic myopia detection
Ou et al. M 2 LC-Net: A multi-modal multi-disease long-tailed classification network for real clinical scenes
Sadhukhan et al. Optic disc localization in retinal fundus images using faster R-CNN
Sengar et al. An efficient artificial intelligence-based approach for diagnosis of media haze disease
Latha et al. Automated macular disease detection using retinal optical coherence tomography images by fusion of deep learning networks
Zhou et al. Computer aided diagnosis for diabetic retinopathy based on fundus image
CN113273959B (en) Portable diabetic retinopathy diagnosis and treatment instrument

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination