WO2022188489A1 - Training method and apparatus for multi-mode multi-disease long-tail distribution ophthalmic disease classification model - Google Patents

Info

Publication number
WO2022188489A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature information
image
image sample
image feature
dual
Prior art date
Application number
PCT/CN2021/137142
Other languages
French (fr)
Chinese (zh)
Inventor
欧中洪
王莉菲
柴文俊
宋美娜
鄂海红
何佳雯
张如如
李峻迪
袁立飞
贾鑫
黄儒剑
Original Assignee
北京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京邮电大学
Publication of WO2022188489A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10101 Optical tomography; Optical coherence tomography [OCT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03 Recognition of patterns in medical or anatomical images

Definitions

  • The present application relates to the technical field of deep learning, and in particular to a method and device for training an ophthalmic disease classification model under a multimodal, multi-disease long-tail distribution, and to a method and device for identification using the model.
  • OCT: optical coherence tomography.
  • Scheme 1: color fundus photos and their corresponding disease labels are input into a neural network for training, and fundus image features are extracted to give the disease classification result;
  • Scheme 2: OCT images and their corresponding disease labels are input into a neural network for training, and OCT image features are extracted to give the disease classification result;
  • Scheme 3: fundus images and OCT images, together with their corresponding disease labels, are input into a neural network simultaneously for training, and the combined features of the two modalities are extracted to give the disease classification result.
  • Schemes 1 and 2 can easily collect a large number of images, but using only a single image type for auxiliary diagnosis does not match the actual clinical process for diagnosing most eye diseases.
  • Doctors usually combine information from multiple modalities to make a comprehensive judgment; a deep learning model that classifies eye diseases from a single image type has a limited number of features, and its recognition accuracy is insufficient.
  • Scheme 3 combines the features of fundus images and OCT images, which matches clinical practice, but because it is difficult to collect large numbers of corresponding fundus and OCT images, little data is available, and existing research is limited to AMD.
  • the present application aims to solve one of the technical problems in the related art at least to a certain extent.
  • the first purpose of this application is to propose a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease.
  • the second objective of the present application is to propose an ophthalmic disease classification device under the long-tailed distribution of multi-modality and multi-disease.
  • the third object of the present application is to propose an electronic device.
  • a fourth object of the present application is to propose a computer-readable storage medium.
  • A fifth object of the present application is to propose a computer program product.
  • the embodiment of the first aspect of the present application proposes a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease, including:
  • the dual-modality image samples include an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and the dual-modality image samples are labeled with diagnostic labels;
  • the error value between the prediction result and the diagnostic label is calculated by a loss function, and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model.
  • In the method for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution, dual-modality image samples are acquired and labeled with diagnostic labels; the infrared macular region fundus image samples and the OCT image samples are respectively input into the first neural network for training to obtain the first image feature information and the second image feature information; the total image feature information is calculated from the first image feature information and a first weight and the second image feature information and a second weight, and is input into a fully connected network to obtain the prediction result; and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, generating an ophthalmic disease classification model.
  • In this way, a two-way convolutional neural network model learns the image features of the two modalities, yielding a deep learning model that resembles the clinical diagnosis process. This addresses several problems: insufficient accuracy when ophthalmic diseases that depend on multimodal features are classified from a single modality; the difficulty of collecting paired color fundus and OCT images; the small number of disease categories covered; and the long-tailed, class-imbalanced data distribution of real scenarios, under which diseases with few samples are classified poorly.
  • Collecting data from electronic medical records, acquiring the dual-modality image samples, and labeling the dual-modality image samples with diagnostic labels includes:
  • parsing the dual-modality images in the electronic medical record and the diagnosis information recorded at that time, and labeling the dual-modality image samples with diagnostic labels according to the diagnosis information.
  • the method further includes:
  • the loss function is shown in formula (1):
  • the method for identifying an ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution includes:
  • An embodiment of the second aspect of the present application proposes a device for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution, including:
  • an acquisition and annotation module, used to collect data from electronic medical records and acquire dual-modality image samples, wherein the dual-modality image samples include infrared macular region fundus image samples and optical coherence tomography (OCT) image samples, and the dual-modality image samples are labeled with diagnostic labels;
  • an extraction module, configured to input the infrared macular fundus image samples and the OCT image samples into the first neural network for training at the same time, and obtain the first image feature information and the second image feature information;
  • a prediction module, configured to calculate the total image feature information from the first image feature information and the first weight and the second image feature information and the second weight, and input it into a fully connected network to obtain a prediction result;
  • a generation module, used to calculate the error value between the prediction result and the diagnostic label through a loss function, and continuously adjust the parameters of the neural network through back-propagation until the error value is maintained at the preset threshold, generating an ophthalmic disease classification model.
  • In the device for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution, dual-modality image samples are acquired and labeled with diagnostic labels; the infrared macular region fundus image samples and the OCT image samples are respectively input into the first neural network for training to obtain the first image feature information and the second image feature information; the total image feature information is calculated from the first image feature information and the first weight and the second image feature information and the second weight, and is input into a fully connected network to obtain the prediction result; and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, generating an ophthalmic disease classification model.
  • In this way, a two-way convolutional neural network model learns the image features of the two modalities, yielding a deep learning model that resembles the clinical diagnosis process. This addresses several problems: insufficient accuracy when ophthalmic diseases that depend on multimodal features are classified from a single modality; the difficulty of collecting paired color fundus and OCT images; the small number of disease categories covered; and the long-tailed, class-imbalanced data distribution of real scenarios, under which diseases with few samples are classified poorly.
  • The acquisition and annotation module is specifically used for:
  • parsing the dual-modality images in the electronic medical record and the diagnosis information recorded at that time, and labeling the dual-modality image samples with diagnostic labels according to the diagnosis information.
  • The device further includes:
  • a preprocessing module, used to resize the infrared macular fundus image samples and the OCT image samples, and to perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement, and random horizontal flipping.
  • the loss function is shown in formula (1):
  • The identification device for the ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution includes:
  • an acquisition module, used to acquire the infrared macular region fundus image and the OCT image to be identified;
  • a diagnosis module, used to input the infrared macular region fundus image and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.
  • An embodiment of the third aspect of the present application provides an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the method for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution proposed in the embodiment of the first aspect of the present application.
  • An embodiment of the fourth aspect of the present application provides a computer-readable storage medium; when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the method for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution proposed in the embodiment of the first aspect of the present application.
  • An embodiment of the fifth aspect of the present application provides a computer program product, including a computer program that, when executed by a processor, implements the method for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution proposed in the embodiment of the first aspect of the present application.
  • FIG. 1 is a schematic flowchart of a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-diseases provided in Embodiment 1 of the present application;
  • FIG. 2 is an example diagram of a two-way model provided by Embodiment 1 of the present application.
  • FIG. 3 is a schematic flowchart of a method for classifying ophthalmic diseases under a multimodal multi-disease long-tail distribution provided by the second embodiment of the application;
  • FIG. 4 is a schematic structural diagram of a device for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease according to the first embodiment of the present application.
  • Existing technology adopts a loose-pair training method, that is, training is completed by pairing images of the same disease rather than pairing the multimodal images of the same eye. Although this effectively expands the number of samples, it weakens the correlation between the two images input to the model and reduces the interpretability of the model.
  • This application uses as dual-modality data the infrared macular fundus image and the OCT image of the same eye that doctors use when diagnosing with OCT equipment.
  • The infrared macular fundus image and the OCT image appear together in large numbers in electronic diagnosis reports. A large amount of effective multimodal data can therefore be obtained, which better matches the actual clinical diagnosis process and can improve classification performance.
  • The electronic case data collection module and data labeling module designed in this application can make effective use of this data.
  • Existing technology has few classification labels and only performs a three-class sub-classification within a single disease, AMD, so it cannot effectively handle the long-tailed distribution of multi-disease data in real scenarios.
  • This application uses a two-stage training model and designs a training scheme combined with a class-balanced loss to classify more than ten diseases, which can improve both the overall classification performance and the classification performance for diseases with few samples.
  • the current mainstream ophthalmic disease image classification research mainly includes lesion recognition based on fundus images, and lesion recognition based on OCT images, and the classification features are extracted by the convolutional neural network model to give prediction results.
  • Most existing schemes use images of a single modality, and their recognition accuracy is insufficient; existing methods also mostly assume that disease categories are uniformly distributed, which does not match the actual clinical data distribution, so they have difficulty handling the long-tailed distribution of data in real scenarios.
  • By collecting the infrared macular fundus images and OCT images available on OCT equipment, the present application gathers a large number of paired dual-modality images in a convenient way, and learns the image features of the two modalities through a dual-channel convolutional neural network model, yielding a deep learning model that resembles the clinical diagnosis process.
  • The method for classifying ophthalmic diseases under the multimodal, multi-disease long-tail distribution includes the following steps 101 to 104.
  • Step 101: Collect data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples include infrared macular fundus image samples and optical coherence tomography (OCT) image samples, and label the dual-modality image samples with diagnostic labels.
  • An electronic case parsing algorithm for the document format is designed to parse the dual-modality images in the electronic medical record and the diagnosis information recorded at that time, and the dual-modality image samples are labeled with diagnostic labels according to the diagnosis information.
  • The infrared macular fundus image samples and the OCT image samples are resized, and one or more of the following operations are performed: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement, and random horizontal flipping.
  • the generated electronic medical record contains the infrared macular fundus image, and also includes the corresponding OCT image slice.
  • The disease labels to be annotated are established according to the actual clinical situation; the parsed dual-modality images and case diagnosis information are selected and uploaded to the image labeling platform, where professional annotators (chief physicians, etc.) annotate the images.
  • Data augmentation is then performed on the data.
  • The data is cropped into fundus images and OCT images before being input into the model.
  • Each image is resized to 224 × 224 × 3, and random 30° rotation and random sharpening are applied to the training data.
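The resizing and augmentation described above can be sketched in a framework-free way. This is a minimal illustration under stated assumptions, not the applicants' implementation: images are nested lists of RGB tuples, the helper names (`resize_nearest`, `rotate_nearest`, `adjust_brightness`, `hflip`, `preprocess`) are hypothetical, the "random 30° rotation" is read as an angle drawn uniformly from [-30°, 30°], and the 0.5 probabilities and the brightness range are illustrative values not given in the text. Sharpness, chromaticity, and contrast enhancement are omitted for brevity.

```python
import math
import random

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an H x W image of RGB tuples."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def hflip(img):
    """Horizontal flip: reverse every row."""
    return [row[::-1] for row in img]

def rotate_nearest(img, degrees, fill=(0, 0, 0)):
    """Rotate about the image centre; pixels mapped from outside get `fill`."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            dy, dx = r - cy, c - cx
            # Inverse mapping: find the source pixel for each output pixel.
            sr = round(cy + dy * cos_t - dx * sin_t)
            sc = round(cx + dy * sin_t + dx * cos_t)
            row.append(img[sr][sc] if 0 <= sr < h and 0 <= sc < w else fill)
        out.append(row)
    return out

def adjust_brightness(img, factor):
    """Scale every channel by `factor`, clamped to [0, 255]."""
    return [[tuple(min(255, max(0, round(v * factor))) for v in px) for px in row]
            for row in img]

def preprocess(img, rng, size=224, max_angle=30.0, train=True):
    """Resize to size x size; apply random augmentation only to training data."""
    out = resize_nearest(img, size, size)
    if not train:
        return out
    out = rotate_nearest(out, rng.uniform(-max_angle, max_angle))
    if rng.random() < 0.5:  # assumed probability
        out = adjust_brightness(out, rng.uniform(0.8, 1.2))  # assumed range
    if rng.random() < 0.5:  # assumed probability
        out = hflip(out)
    return out

rng = random.Random(0)
sample = [[(r, c, 0) for c in range(12)] for r in range(10)]  # toy 10 x 12 image
augmented = preprocess(sample, rng)
```

In practice an image library would replace these helpers; the point of the sketch is only the order of operations: resize first, then apply the random operations to training samples only.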
  • Step 102 input the infrared macular region fundus image sample and the OCT image sample into the first neural network respectively for training, and obtain the first image feature information and the second image feature information.
  • Step 103 Calculate the total image feature information according to the first image feature information and the first weight, the second image feature information and the second weight, and input the total image feature information into a fully connected network to obtain a prediction result.
  • the network model consists of two symmetrical branches, one for processing fundus images and the other for processing OCT images, and the weights of the two branches are not shared.
  • Each branch uses ResNet18 with all fully connected layers removed as its backbone network (ResNet18-backbone in Figure 2), followed by a CBAM (Convolutional Block Attention Module) attention module, to extract image feature information. Finally, the features of the two branches are merged according to their weights and passed to a fully connected layer that gives the prediction result, which is one of: no obvious lesions, epiretinal membrane, central serous chorioretinopathy, macular hole, macular schisis, choroidal neovascularization, age-related macular degeneration, retinal detachment, branch vein occlusion, arterial occlusion, central vein occlusion, or Harada disease.
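The fusion in Step 103 can be illustrated with a small sketch. It assumes the total feature is a weighted element-wise sum of the two branch features; the text does not state whether the merge is a sum or a concatenation, so this is one possible reading. The function names, the 4-dimensional features, and all numeric values are hypothetical; only the 12 output classes come from the disease list above.

```python
import math

def fuse(f_fundus, w_fundus, f_oct, w_oct):
    """Weighted element-wise sum of two equal-length feature vectors."""
    if len(f_fundus) != len(f_oct):
        raise ValueError("feature vectors must have the same length")
    return [w_fundus * a + w_oct * b for a, b in zip(f_fundus, f_oct)]

def linear(x, weight, bias):
    """Fully connected layer: one output per row of `weight`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values only: 4-dimensional features, 12 output classes.
f1 = [0.2, 0.5, 0.1, 0.9]  # hypothetical fundus-branch features
f2 = [0.4, 0.3, 0.8, 0.1]  # hypothetical OCT-branch features
total_feature = fuse(f1, 0.5, f2, 0.5)
fc_weight = [[0.01 * (i + j) for j in range(4)] for i in range(12)]
fc_bias = [0.0] * 12
probs = softmax(linear(total_feature, fc_weight, fc_bias))
predicted_class = max(range(12), key=probs.__getitem__)
```

Because the two branches do not share weights, each feature vector would come from its own backbone; the two scalar weights control how much each modality contributes before classification.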
  • Step 104 Calculate the error value of the prediction result and the diagnostic label through the loss function, and continuously adjust the neural network parameters through the back-propagation technology until the error value is maintained at a preset threshold, and an ophthalmic disease classification model is generated.
  • the loss function is shown in formula (1):
  • The loss function is reweighted by the inverse of the effective number of samples of each class to balance the loss, which improves classification performance on classes with few samples.
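As a sketch of this reweighting, the snippet below assumes the commonly published class-balanced formulation, in which the effective number of samples for a class with n samples is (1 − β^n) / (1 − β) and the class weight is its inverse. Formula (1) itself is not reproduced in this text, so the function name, the value β = 0.999, and the normalisation (weights summing to the number of classes) are assumptions for illustration.

```python
def class_balanced_weights(counts, beta=0.999):
    """Per-class weights proportional to 1 / effective number of samples.

    counts: number of training samples per class (each >= 1).
    beta:   hyperparameter in [0, 1); beta = 0 gives uniform weights.
    """
    effective = [(1.0 - beta ** n) / (1.0 - beta) for n in counts]
    raw = [1.0 / e for e in effective]
    scale = len(counts) / sum(raw)  # normalise so the weights sum to the class count
    return [r * scale for r in raw]

# A long-tailed toy distribution: the rarest class receives the largest weight.
weights = class_balanced_weights([1000, 100, 10])
```

With these weights, an error on a rare disease contributes more to the loss than an error on a common one, which is the balancing effect the text describes.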
  • Focal loss is a loss function proposed to address the severe imbalance between positive and negative samples in one-stage object detection; Focal Loss is therefore selected as the loss function in this scheme. Focal loss is defined as follows:
  • $E = [E_1, E_2, \ldots, E_N]$, $E \in \mathbb{R}^{12}$.
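For reference, a minimal sketch of focal loss for a single sample, assuming the widely published form FL(p_t) = −α_t (1 − p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The values of γ and α_t used by the applicants are not stated in this text; γ = 2 and α_t = 1 below are illustrative defaults, and in a class-balanced variant α_t would be the per-class weight.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample.

    probs:  predicted class probabilities (summing to 1).
    target: index of the true class.
    gamma:  focusing parameter; gamma = 0 reduces to weighted cross-entropy.
    alpha:  weight for the true class (for example a class-balanced weight).
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss([0.9, 0.1], target=0)  # confident and correct: small loss
hard = focal_loss([0.1, 0.9], target=0)  # confident and wrong: large loss
```

The factor (1 − p_t)^γ shrinks the loss of well-classified samples, so training concentrates on the hard, often rare, cases.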
  • the infrared macular fundus image samples and OCT images to be identified are obtained; the infrared macular fundus image samples and OCT images are input into the ophthalmic disease classification model for processing to obtain diagnostic results.
  • The present application uses the infrared macular fundus image from the OCT device as an auxiliary image and combines it with the OCT image to construct a dual-modality image input; an efficient acquisition algorithm is designed to obtain the dual-modality data; and a two-stage model training method is adopted, in which the characteristics of the data distribution are learned first, and the convolutional layers are then frozen in the second stage while the model is retrained with a class-balanced loss weighted by the sample statistics of each disease category.
  • The model training scheme designed in this application can significantly improve the overall classification performance, especially for diseases with few samples.
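The effect of freezing the convolutional layers in the second stage can be sketched without a deep learning framework. The parameter groups, gradient values, and the `sgd_step` helper below are hypothetical placeholders; the sketch assumes that stage one updates every group and that stage two updates only the classifier, with the gradients in stage two coming from the class-balanced loss.

```python
def sgd_step(params, grads, trainable, lr=0.1):
    """Return updated parameters; groups not marked trainable are left unchanged."""
    updated = {}
    for name, values in params.items():
        if trainable.get(name, False):
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen group: copied as-is
    return updated

# Hypothetical parameter groups standing in for the two branches and the classifier.
params = {"fundus_conv": [0.5, -0.2], "oct_conv": [0.1, 0.3], "classifier": [0.0, 0.0]}
grads = {"fundus_conv": [0.1, 0.1], "oct_conv": [0.2, -0.1], "classifier": [0.4, -0.4]}

stage1 = {"fundus_conv": True, "oct_conv": True, "classifier": True}
stage2 = {"fundus_conv": False, "oct_conv": False, "classifier": True}

after_stage1 = sgd_step(params, grads, stage1)        # all groups move
after_stage2 = sgd_step(after_stage1, grads, stage2)  # only the classifier moves
```

Keeping the feature extractors fixed in the second stage means the reweighted loss only rebalances the decision layer, rather than distorting the features learned from the full data distribution.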
  • In the method for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution, dual-modality image samples are acquired and labeled with diagnostic labels; the infrared macular region fundus image samples and the OCT image samples are respectively input into the first neural network for training to obtain the first image feature information and the second image feature information; the total image feature information is calculated from the first image feature information and the first weight and the second image feature information and the second weight, and is input into a fully connected network to obtain the prediction result; and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, generating an ophthalmic disease classification model.
  • In this way, a two-way convolutional neural network model learns the image features of the two modalities, yielding a deep learning model that resembles the clinical diagnosis process. This addresses several problems: insufficient accuracy when ophthalmic diseases that depend on multimodal features are classified from a single modality; the difficulty of collecting paired color fundus and OCT images; the small number of disease categories covered; and the long-tailed, class-imbalanced data distribution of real scenarios, under which diseases with few samples are classified poorly.
  • the present application also proposes a device for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease.
  • FIG. 4 is a schematic structural diagram of a device for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease according to an embodiment of the present application.
  • The apparatus for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution includes: an acquisition and annotation module 410, an extraction module 420, a prediction module 430, and a generation module 440.
  • The acquisition and annotation module 410 is used to collect data from electronic medical records and acquire dual-modality image samples, wherein the dual-modality image samples include infrared macular region fundus image samples and optical coherence tomography (OCT) image samples, and the dual-modality image samples are labeled with diagnostic labels.
  • the extraction module 420 is configured to input the infrared macular fundus image samples and the OCT image samples into the first neural network for training at the same time, and obtain the first image feature information and the second image feature information.
  • the prediction module 430 is configured to calculate the total image feature information according to the first image feature information and the first weight, the second image feature information and the second weight, and input the total image feature information to a fully connected network to obtain a prediction result.
  • The generation module 440 is configured to calculate the error value between the prediction result and the diagnostic label through a loss function, and continuously adjust the parameters of the neural network through back-propagation until the error value is maintained at the preset threshold, generating an ophthalmic disease classification model.
  • The acquisition and annotation module is specifically used for:
  • parsing the dual-modality images in the electronic medical record and the diagnosis information recorded at that time, and labeling the dual-modality image samples with diagnostic labels according to the diagnosis information.
  • The device further includes:
  • a preprocessing module, used to resize the infrared macular fundus image samples and the OCT image samples, and to perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement, and random horizontal flipping.
  • the loss function is shown in formula (1):
  • The identification device for the ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution includes:
  • an acquisition module, used to acquire the infrared macular region fundus image and the OCT image to be identified;
  • a diagnosis module, used to input the infrared macular region fundus image and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.
  • In the device for classifying ophthalmic diseases under the multimodal, multi-disease long-tailed distribution, dual-modality image samples are acquired and labeled with diagnostic labels; the infrared macular region fundus image samples and the OCT image samples are input into the first neural network for training to obtain the first image feature information and the second image feature information; the total image feature information is calculated from the first image feature information and the first weight and the second image feature information and the second weight, and is input into a fully connected network to obtain the prediction result; and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, generating an ophthalmic disease classification model.
  • In this way, a two-way convolutional neural network model learns the image features of the two modalities, yielding a deep learning model that resembles the clinical diagnosis process. This addresses several problems: insufficient accuracy when ophthalmic diseases that depend on multimodal features are classified from a single modality; the difficulty of collecting paired color fundus and OCT images; the small number of disease categories covered; and the long-tailed, class-imbalanced data distribution of real scenarios, under which diseases with few samples are classified poorly.
  • The terms "first" and "second" are used only for descriptive purposes and should not be construed as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature defined with "first" or "second" may expressly or implicitly include at least one such feature.
  • "Plurality" means at least two, such as two or three, unless expressly and specifically defined otherwise.
  • A "computer-readable medium" can be any device that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device.
  • Examples of computer-readable media include: an electrical connection with one or more wires (electronic device), a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM).
  • The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the paper or other medium can be optically scanned, for example, and then edited, interpreted, or otherwise processed as necessary to obtain the program electronically and store it in computer memory.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, and the like.

Abstract

The present application relates to the technical field of deep learning. Provided are a training method and apparatus for a multi-mode multi-disease long-tail distribution ophthalmic disease classification model, and an identification method and apparatus for the model. A classification method comprises: acquiring a dual-mode image sample, and performing diagnosis label annotation on the dual-mode image sample; inputting an infrared macular region ophthalmoscopic image sample and an OCT image sample into a first neural network at the same time for training, so as to acquire first image feature information and second image feature information; calculating overall image feature information according to the first image feature information, a first weight, the second image feature information and a second weight, and inputting same into a fully-connected network to acquire a prediction result; and continuously adjusting parameters of the neural network by means of a backpropagation technique until an error value is kept at a preset threshold value, and generating an ophthalmic disease classification model.

Description

Training method and apparatus for a multimodal, multi-disease, long-tailed-distribution ophthalmic disease classification model
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202110270878.7, filed on March 12, 2021, the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the technical field of deep learning, and in particular to a method and apparatus for training an ophthalmic disease classification model under a multimodal, multi-disease, long-tailed distribution, and to a method and apparatus for recognition using the model.
Background
In recent years, deep learning has developed rapidly in the medical field owing to its efficiency and accuracy. Deep learning can analyze and quantify pathological features in medical images pixel by pixel and, to some extent, reduce the subjectivity of a doctor's judgment, making disease diagnosis more objective and stable. Optical coherence tomography (OCT) is a non-contact, non-invasive imaging technique that provides clear cross-sectional images of pathology in the macular region; fundus imaging provides clear planar images of the fundus. Intelligent auxiliary diagnosis of ophthalmic diseases using deep learning on single-modality data, either OCT or fundus images, has been studied fairly widely, but effective auxiliary diagnosis from ophthalmic images in a clinical setting still faces great challenges.
In the related art: (1) color fundus photographs and their corresponding disease labels are input into a neural network for training, and fundus image features are extracted to produce a disease classification result; (2) OCT images and their corresponding disease labels are input into a neural network for training, and OCT image features are extracted to produce a disease classification result; (3) fundus images and OCT images, together with their corresponding disease labels, are input into a neural network simultaneously for training, and the features of the two modalities are extracted and combined to produce a disease classification result.
However, although schemes 1 and 2 make it easy to collect a large number of images, using a single image for auxiliary diagnosis does not match the actual clinical workflow for most eye diseases, in which doctors usually combine information from several modalities to reach an overall judgment. Moreover, when a deep learning model bases its eye-disease classification on a single image, the number of features is limited and recognition accuracy is insufficient. Scheme 3 combines fundus and OCT features and matches clinical practice, but because it is difficult to collect large numbers of corresponding fundus and OCT images, little data is available, and existing studies are limited to age-related macular degeneration (AMD).
In addition, ophthalmic diseases are numerous and their incidence is severely imbalanced, with many rare diseases. The image data used in existing studies, however, mostly covers a small number of disease categories with a balanced distribution, and so cannot cope effectively with the long-tailed data distribution that may arise in real scenarios.
Summary of the Invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first objective of the present application is to propose a method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution. Infrared macular-region fundus images and OCT images are captured from an OCT device, so that a large number of paired dual-modality images can be collected conveniently, and a dual-branch convolutional neural network learns the features of both modalities, yielding a deep learning model that resembles the clinical diagnostic process. This addresses the following technical problems: accuracy is insufficient when ophthalmic images that depend on features of several modalities are classified from a single modality; paired color fundus and OCT images are difficult to collect, so few diseases are covered; and, in real scenarios, disease categories follow a long-tailed, imbalanced distribution, so diseases with few samples are classified poorly.
A second objective of the present application is to propose an apparatus for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution.
A third objective of the present application is to propose an electronic device.
A fourth objective of the present application is to propose a computer-readable storage medium.
A fifth objective of the present application is to propose a computer program product.
To achieve the above objectives, an embodiment of a first aspect of the present application proposes a method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution, comprising:
collecting data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples comprise an infrared macular-region fundus image sample and an optical coherence tomography (OCT) image sample, and annotating the dual-modality image samples with diagnostic labels;
inputting the infrared macular-region fundus image sample and the OCT image sample, separately and simultaneously, into a first neural network for training, to obtain first image feature information and second image feature information;
calculating total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result; and
calculating an error value between the prediction result and the diagnostic label using a loss function, and continuously adjusting the neural network parameters by backpropagation until the error value remains at a preset threshold, thereby generating an ophthalmic disease classification model.
In the method of this embodiment, dual-modality image samples are obtained and annotated with diagnostic labels; the infrared macular-region fundus image sample and the OCT image sample are input, separately and simultaneously, into the first neural network for training to obtain the first and second image feature information; total image feature information is calculated from the first image feature information and the first weight and from the second image feature information and the second weight, and is input into a fully connected network to obtain a prediction result; and the neural network parameters are continuously adjusted by backpropagation until the error value remains at the preset threshold, generating the ophthalmic disease classification model. A dual-branch convolutional neural network thus learns the features of both modalities, yielding a deep learning model that resembles the clinical diagnostic process. This addresses the following technical problems: accuracy is insufficient when ophthalmic images that depend on features of several modalities are classified from a single modality; paired color fundus and OCT images are difficult to collect, so few diseases are covered; and, in real scenarios, disease categories follow a long-tailed, imbalanced distribution, so diseases with few samples are classified poorly.
Optionally, in an embodiment of the present application, collecting data from electronic medical records to obtain dual-modality image samples, and annotating the dual-modality image samples with diagnostic labels, comprises:
designing an electronic medical record parsing algorithm for the document format, using it to extract the dual-modality images and the diagnosis information recorded at the time from the electronic medical record, and annotating the dual-modality image samples with diagnostic labels according to the diagnosis information.
Optionally, in an embodiment of the present application, the method further comprises:
resizing the infrared macular-region fundus image sample and the OCT image sample, and applying one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping.
Optionally, in an embodiment of the present application, the loss function is as shown in formula (1):
(formula image PCTCN2021137142-appb-000001)
where (formula image PCTCN2021137142-appb-000002), in which (formula image PCTCN2021137142-appb-000003) are the one-hot encoded forms of the diagnostic label $y$ and of the prediction result (formula image PCTCN2021137142-appb-000004), respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$, with (formula image PCTCN2021137142-appb-000005); $N = 12$ is the total number of labels; $i \in \{1, 2, \ldots, N\}$; and $n_i$ is the number of samples of the $i$-th label.
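Formula (1) and the definition of $E_i$ appear in the published text only as images, so the exact expressions are not available here. As a rough illustration only, the Python sketch below implements one widely used loss that is compatible with the listed symbols: a focal-style loss with focusing parameter `gamma`, reweighted per class by the inverse of an "effective number" of samples. The form $E_i = (1 - \beta^{n_i}) / (1 - \beta)$, the hyperparameter `beta`, the weight normalisation, and every function name are assumptions for the sake of the example, not taken from the application.

```python
import math

def effective_numbers(samples_per_class, beta=0.999):
    # ASSUMPTION: E_i = (1 - beta**n_i) / (1 - beta); beta is hypothetical.
    return [(1.0 - beta ** n) / (1.0 - beta) for n in samples_per_class]

def class_weights(samples_per_class, beta=0.999):
    # Weight each class by 1 / E_i, then rescale so the weights sum to N.
    inv = [1.0 / e for e in effective_numbers(samples_per_class, beta)]
    scale = len(inv) / sum(inv)
    return [w * scale for w in inv]

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_balanced_focal_loss(logits, label, weights, gamma=2.0):
    # label is the index of the true class (the position of the 1 in one-hot y).
    p = max(softmax(logits)[label], 1e-12)  # clamp to avoid log(0)
    return -weights[label] * (1.0 - p) ** gamma * math.log(p)
```

With `gamma = 0` and uniform weights this reduces to ordinary cross-entropy; a larger `gamma` down-weights samples that are already classified confidently, and the class weights raise the contribution of labels with small `n_i`.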
Optionally, in an embodiment of the present application, a recognition method using the ophthalmic disease classification model under the multimodal, multi-disease, long-tailed distribution comprises:
acquiring an infrared macular-region fundus image sample and an OCT image to be recognized; and
inputting the infrared macular-region fundus image sample and the OCT image into the ophthalmic disease classification model for processing, to obtain a diagnosis result.
To achieve the above objectives, an embodiment of a second aspect of the present application proposes an apparatus for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution, comprising:
an acquisition and annotation module, configured to collect data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples comprise an infrared macular-region fundus image sample and an optical coherence tomography (OCT) image sample, and to annotate the dual-modality image samples with diagnostic labels;
an extraction module, configured to input the infrared macular-region fundus image sample and the OCT image sample, separately and simultaneously, into a first neural network for training, to obtain first image feature information and second image feature information;
a prediction module, configured to calculate total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and to input the total image feature information into a fully connected network to obtain a prediction result; and
a generation module, configured to calculate an error value between the prediction result and the diagnostic label using a loss function, and to continuously adjust the neural network parameters by backpropagation until the error value remains at a preset threshold, thereby generating an ophthalmic disease classification model.
In the apparatus of this embodiment, dual-modality image samples are obtained and annotated with diagnostic labels; the infrared macular-region fundus image sample and the OCT image sample are input, separately and simultaneously, into the first neural network for training to obtain the first and second image feature information; total image feature information is calculated from the first image feature information and the first weight and from the second image feature information and the second weight, and is input into a fully connected network to obtain a prediction result; and the neural network parameters are continuously adjusted by backpropagation until the error value remains at the preset threshold, generating the ophthalmic disease classification model. A dual-branch convolutional neural network thus learns the features of both modalities, yielding a deep learning model that resembles the clinical diagnostic process. This addresses the following technical problems: accuracy is insufficient when ophthalmic images that depend on features of several modalities are classified from a single modality; paired color fundus and OCT images are difficult to collect, so few diseases are covered; and, in real scenarios, disease categories follow a long-tailed, imbalanced distribution, so diseases with few samples are classified poorly.
Optionally, in an embodiment of the present application, the acquisition and annotation module is specifically configured to:
use an electronic medical record parsing algorithm designed for the document format to extract the dual-modality images and the diagnosis information recorded at the time from the electronic medical record, and annotate the dual-modality image samples with diagnostic labels according to the diagnosis information.
Optionally, in an embodiment of the present application, the apparatus further comprises:
a preprocessing module, configured to resize the infrared macular-region fundus image sample and the OCT image sample, and to apply one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping.
Optionally, in an embodiment of the present application, the loss function is as shown in formula (1):
(formula image PCTCN2021137142-appb-000006)
where (formula image PCTCN2021137142-appb-000007), in which (formula image PCTCN2021137142-appb-000008) are the one-hot encoded forms of the diagnostic label $y$ and of the prediction result (formula image PCTCN2021137142-appb-000009), respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$, with (formula image PCTCN2021137142-appb-000010); $N = 12$ is the total number of labels; $i \in \{1, 2, \ldots, N\}$; and $n_i$ is the number of samples of the $i$-th label.
Optionally, in an embodiment of the present application, a recognition apparatus using the ophthalmic disease classification model under the multimodal, multi-disease, long-tailed distribution comprises:
an acquisition module, configured to acquire an infrared macular-region fundus image sample and an OCT image to be recognized; and
a diagnosis module, configured to input the infrared macular-region fundus image sample and the OCT image into the ophthalmic disease classification model for processing, to obtain a diagnosis result.
To achieve the above objectives, an embodiment of a third aspect of the present application proposes an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution proposed in the embodiment of the first aspect of the present application.
To achieve the above objectives, an embodiment of a fourth aspect of the present application proposes a computer-readable storage medium; when instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution proposed in the embodiment of the first aspect of the present application.
To achieve the above objectives, an embodiment of a fifth aspect of the present application proposes a computer program product, comprising a computer program which, when executed by a processor, implements the method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution proposed in the embodiment of the first aspect of the present application.
Additional aspects and advantages of the present application will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the present application.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of a method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution provided by Embodiment 1 of the present application;
FIG. 2 is an example diagram of the dual-branch model provided by Embodiment 1 of the present application;
FIG. 3 is a schematic flowchart of a method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution provided by Embodiment 2 of the present application;
FIG. 4 is a schematic structural diagram of an apparatus for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution provided by an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are shown in the accompanying drawings, where the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present application; they should not be construed as limiting it.
The method and apparatus for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution according to the embodiments of the present application are described below with reference to the accompanying drawings.
FIG. 1 is a schematic flowchart of a method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution provided by Embodiment 1 of the present application.
Data collection under the existing technology is very difficult: in most hospitals, the color fundus images and OCT images of the same eye belong to different departments, which hinders data sharing. The existing technology therefore uses a loose-pair training method, in which training combines images of the same disease rather than multimodal images of the same eye. Although this effectively enlarges the sample set, it reduces the correlation between the two images fed to the model and so reduces the model's interpretability.
The present application uses, as dual-modality data, the infrared macular-region fundus image that doctors use during diagnosis with an OCT device, together with the OCT image of the same eye. These images appear in pairs, in large numbers, in electronic diagnostic reports, and they retain some lesion information, so a large amount of valid multimodal data can be obtained. This is closer to the actual clinical diagnostic process and can improve classification. The electronic medical record data collection module and the data annotation module designed in the present application make effective use of this data.
In addition, the existing technology uses few classification labels: it performs only a three-way classification within a single disease, AMD, and cannot cope effectively with multi-disease data that follows a long-tailed distribution in real scenarios. The present proposal uses a two-stage training model and a training scheme designed around a class-balanced loss to classify more than ten diseases, which can improve both the overall classification performance and the performance on diseases with few samples.
In other words, current mainstream research on ophthalmic disease image classification consists mainly of lesion recognition based on fundus images and lesion recognition based on OCT images, in which a convolutional neural network extracts classification features and gives a prediction. Most existing schemes, however, use single-modality images; for eye diseases that require feature information from several modalities, the number of features is limited and recognition accuracy is insufficient. Most existing methods also assume that disease categories are uniformly distributed, which does not match real clinical data and makes it hard to handle the long-tailed distribution found in real scenarios. To solve these problems, the present application captures infrared macular-region fundus images and OCT images from an OCT device, conveniently collecting a large number of paired dual-modality images, and uses a dual-branch convolutional neural network to learn the features of both modalities, yielding a deep learning model that resembles the clinical diagnostic process.
As shown in FIG. 1, the method for classifying ophthalmic diseases under a multimodal, multi-disease, long-tailed distribution comprises the following steps 101 to 104.
Step 101: collect data from electronic medical records to obtain dual-modality image samples, where the dual-modality image samples comprise an infrared macular-region fundus image sample and an optical coherence tomography (OCT) image sample, and annotate the dual-modality image samples with diagnostic labels.
In this embodiment of the present application, an electronic medical record parsing algorithm is designed for the document format; it extracts the dual-modality images and the diagnosis information recorded at the time from the electronic medical record, and the dual-modality image samples are annotated with diagnostic labels according to that diagnosis information.
In this embodiment of the present application, the infrared macular-region fundus image sample and the OCT image sample are resized, and one or more of the following operations are applied: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping.
Specifically, because an OCT device in use locates the position of each OCT image slice on the infrared macular-region fundus image, the electronic medical record it produces contains the infrared macular-region fundus image together with the corresponding OCT image slices. An electronic medical record parsing algorithm designed for the PDF format extracts the dual-modality images and the diagnosis information recorded at the time, and the images undergo initial preprocessing.
Specifically, the disease labels to be annotated are established according to clinical practice. The parsed dual-modality images and the case diagnosis information are uploaded to an image annotation platform, where professional annotators (for example, chief physicians) annotate the multimodal images based on clinical experience and historical case information.
Further, data augmentation is applied. Before being input to the model, the data is cropped into a fundus image and an OCT image, and each image is resized to 224×224×3. Random 30° rotation, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping are applied to the training data.
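The augmentation step can be sketched in Python as follows. In practice an image library would perform these operations; this library-free sketch only shows how one random setting per training sample might be drawn and how two of the operations act on an image stored as rows of RGB tuples. Reading "random 30° rotation" as an angle drawn from [-30°, 30°], the enhancement range 0.8 to 1.2, the 50% flip probability, and all function names are assumptions for illustration.

```python
import random

def sample_augmentation(rng, max_angle=30.0, factor_range=(0.8, 1.2)):
    """Draw one random augmentation setting for a training sample."""
    lo, hi = factor_range
    return {
        "angle": rng.uniform(-max_angle, max_angle),  # rotation in degrees
        "sharpness": rng.uniform(lo, hi),
        "brightness": rng.uniform(lo, hi),
        "chroma": rng.uniform(lo, hi),
        "contrast": rng.uniform(lo, hi),
        "hflip": rng.random() < 0.5,
    }

def hflip(image):
    # Mirror the image left to right by reversing every row of pixels.
    return [row[::-1] for row in image]

def adjust_brightness(image, factor):
    # Scale every channel by `factor` and clip to the 8-bit range 0..255.
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
        for row in image
    ]

def augment(image, rng):
    # Applies only brightness and flipping; the other sampled settings
    # (angle, sharpness, chroma, contrast) are left to an image library.
    params = sample_augmentation(rng)
    out = adjust_brightness(image, params["brightness"])
    if params["hflip"]:
        out = hflip(out)
    return out, params
```

Passing an explicit `random.Random(seed)` instance keeps the augmentation reproducible. Augmentation of this kind is applied to training data only, so the images used for evaluation stay unchanged.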
Step 102: input the infrared macular-region fundus image sample and the OCT image sample, separately and simultaneously, into the first neural network for training, to obtain the first image feature information and the second image feature information.
Step 103: calculate the total image feature information from the first image feature information and the first weight and from the second image feature information and the second weight, and input the total image feature information into the fully connected network to obtain the prediction result.
Specifically, define a dataset $D = \{x_f, x_o \mid y\}$, where $x_f$ and $x_o$ are the fundus image and the OCT image obtained from the same eye, and $y$ is the diagnostic label of that image pair, covering 11 ophthalmic diseases plus "no obvious lesion". The model is denoted "OurModel"; it receives the paired input $\{x_f, x_o\}$ and outputs the diagnosis result for the eye
Figure PCTCN2021137142-appb-000011
as shown in the following formula:
Figure PCTCN2021137142-appb-000012
Specifically, as shown in FIG. 2, the network model consists of two symmetric branches, one processing the fundus image and the other processing the OCT image; the two branches do not share weights. Each branch uses ResNet18 with all fully connected layers removed as its backbone (ResNet18-backbone in FIG. 2), followed by a CBAM (Convolutional Block Attention Module) attention module, to extract image feature information. Finally, the two branches are merged with their weights and connected to a fully connected layer, which gives the prediction result, namely one of: no obvious lesion, epiretinal membrane, central serous chorioretinopathy, macular hole, macular schisis, choroidal neovascularization, age-related macular degeneration, retinal detachment, branch vein occlusion, arterial occlusion, central vein occlusion, and Harada disease.
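The fusion and classification head can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the two branch features are taken to be vectors of equal length, "merging with weights" is read as a weighted sum, and the label order in `LABELS` is hypothetical. The backbone and CBAM feature extractors are not modeled here; the function names are illustrative only.

```python
import math

# The 12 diagnostic labels named in the text; the index order is an assumption.
LABELS = [
    "no obvious lesion", "epiretinal membrane", "central serous chorioretinopathy",
    "macular hole", "macular schisis", "choroidal neovascularization",
    "age-related macular degeneration", "retinal detachment", "branch vein occlusion",
    "arterial occlusion", "central vein occlusion", "Harada disease",
]


def fuse(f_fundus, w1, f_oct, w2):
    """Total feature as a weighted sum of the two branch features (assumed form)."""
    return [w1 * a + w2 * b for a, b in zip(f_fundus, f_oct)]


def linear(x, weight, bias):
    """Fully connected layer: one output per row of `weight`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(f_fundus, f_oct, w1, w2, fc_weight, fc_bias):
    """Fuse both modalities, apply the fully connected layer, return (label, probabilities)."""
    probs = softmax(linear(fuse(f_fundus, w1, f_oct, w2), fc_weight, fc_bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


# Toy example: 4-dimensional features and a layer that only responds on row 3.
fc_weight = [[0.0] * 4 for _ in LABELS]
fc_weight[3] = [1.0] * 4
fc_bias = [0.0] * len(LABELS)
label, probs = predict([1.0] * 4, [1.0] * 4, 0.5, 0.5, fc_weight, fc_bias)
```

In a real system the feature vectors would come from the two ResNet18 and CBAM branches, and the fusion weights and layer parameters would be learned.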
Step 104: compute the error between the prediction result and the diagnostic label with a loss function, and repeatedly adjust the neural network parameters by backpropagation until the error value stays at a preset threshold, thereby generating the ophthalmic disease classification model.
In this embodiment of the present application, the loss function is shown in formula (1):
Figure PCTCN2021137142-appb-000013
where
Figure PCTCN2021137142-appb-000014
in which
Figure PCTCN2021137142-appb-000015
are the one-hot encoded forms of the diagnostic label $y$ and the prediction result
Figure PCTCN2021137142-appb-000016
respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$,
Figure PCTCN2021137142-appb-000017
Figure PCTCN2021137142-appb-000018
$N = 12$ is the total number of labels, $i \in \{1, 2, \ldots, N\}$, and $n_i$ is the number of samples of the $i$-th label.
Specifically, the whole model is first trained with the cross-entropy loss function. Once the validation loss has converged, all weights except those of the fully connected layer are frozen, and the fully connected layer is retrained with the class-balanced loss. Once the validation loss has converged again, the final output model shown in FIG. 3 is obtained.
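The control flow of this two-stage schedule might look like the sketch below. The convergence test (no improvement of at least `min_delta` over the last `patience` epochs) is an assumption, since the source does not say how convergence is judged. The model is a plain dictionary, and `ce_epoch` and `cb_epoch` are hypothetical callbacks that each run one epoch and return the validation loss.

```python
def has_converged(val_losses, patience=3, min_delta=1e-3):
    """True when the last `patience` epochs did not beat the earlier best by `min_delta`."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    return best_before - best_recent < min_delta


def train_until_converged(run_epoch, max_epochs=100):
    """Run epochs until the validation loss converges; return the loss history."""
    history = []
    for _ in range(max_epochs):
        history.append(run_epoch())
        if has_converged(history):
            break
    return history


def two_stage_training(model, ce_epoch, cb_epoch):
    """Stage 1: cross-entropy on all weights. Stage 2: class-balanced loss on the FC layer only."""
    stage1 = train_until_converged(lambda: ce_epoch(model))
    model["backbone_trainable"] = False  # freeze everything except the fully connected layer
    stage2 = train_until_converged(lambda: cb_epoch(model))
    return stage1, stage2


def make_fake_epoch(losses):
    """Stand-in for a real training epoch: replays a fixed list of validation losses."""
    it = iter(losses)
    return lambda model: next(it)


model = {"backbone_trainable": True}
stage1, stage2 = two_stage_training(
    model,
    make_fake_epoch([1.0, 0.5, 0.4, 0.4, 0.4, 0.4]),
    make_fake_epoch([0.6, 0.45, 0.45, 0.45, 0.45]),
)
```

In a deep learning framework, the freezing step would correspond to marking the backbone and attention parameters as non-trainable before the second stage.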
First, define the effective number of samples for each class:
Figure PCTCN2021137142-appb-000019
where $N = 12$ is the total number of labels, $i \in \{1, 2, \ldots, N\}$, $n_i$ is the number of samples of the $i$-th label, and $\beta \in [0, 1)$ is a hyperparameter. The loss function is reweighted by the inverse of each class's effective number of samples to balance the loss, which improves classification performance on classes with few samples.
Focal loss is a loss function proposed to address the severe imbalance between positive and negative samples in one-stage object detection, so this scheme adopts focal loss as the loss function. Focal loss is defined as follows:
Figure PCTCN2021137142-appb-000020
Figure PCTCN2021137142-appb-000021
where
Figure PCTCN2021137142-appb-000022
are the one-hot encoded forms of the label $y$ and the model prediction result
Figure PCTCN2021137142-appb-000023
respectively,
Figure PCTCN2021137142-appb-000024
and $\gamma \ge 0$ is a hyperparameter. The class-balanced loss of the present application is therefore defined as follows:
Figure PCTCN2021137142-appb-000025
where $E = [E_1, E_2, \ldots, E_N]$ and $E \in \mathbb{R}^{12}$.
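Because the focal loss and class-balanced loss formulas are given only as figures, the following sketch uses the standard published forms as an assumption: focal loss as $-\sum_i s_{y,i}\,(1-\hat{s}_i)^{\gamma}\log \hat{s}_i$ over a one-hot label and a probability vector, divided by the effective number $E_y$ of the true class. The function names and the example values are illustrative, and the patent's own formula may differ in detail.

```python
import math


def focal_loss(s_true, s_pred, gamma=2.0, eps=1e-12):
    """Focal loss for a one-hot label `s_true` and predicted probabilities `s_pred`."""
    return -sum(
        t * (1.0 - p) ** gamma * math.log(max(p, eps))
        for t, p in zip(s_true, s_pred)
    )


def class_balanced_focal_loss(s_true, s_pred, effective_nums, gamma=2.0):
    """Focal loss reweighted by the inverse effective number of the true class."""
    y = s_true.index(max(s_true))  # index of the true class in the one-hot vector
    return focal_loss(s_true, s_pred, gamma) / effective_nums[y]


s_true = [0.0, 1.0, 0.0]
s_pred = [0.2, 0.7, 0.1]
plain = focal_loss(s_true, s_pred, gamma=2.0)
# Hypothetical effective numbers: class 1 is rare, so its loss is scaled down the least.
balanced = class_balanced_focal_loss(s_true, s_pred, [900.0, 20.0, 300.0], gamma=2.0)
```

Setting `gamma=0` reduces the focal term to ordinary cross-entropy, which is a quick sanity check on any implementation.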
Further, in the embodiments of the present application, an infrared macular fundus image sample and an OCT image to be recognized are acquired and input into the ophthalmic disease classification model for processing to obtain a diagnosis result.
Specifically, as shown in FIG. 3, the model is loaded with TensorFlow Serving, with Docker as the service container, which completes model deployment and exposes the model through an HTTP interface. The basic back-end functions of the system are developed with the Django framework: the back end receives multimodal image requests, forwards them to TensorFlow Serving running in Docker, obtains the model's recognition result, and finally passes the information to the front end for display.
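The forwarding step from the back end to TensorFlow Serving could be sketched as below, using the TensorFlow Serving REST predict endpoint (`/v1/models/<name>:predict` with an `instances` list in the request and a `predictions` list in the response). The host, the model name `eye_model`, and the input keys `fundus` and `oct` are hypothetical; the real names depend on the exported model signature. No network call is made in the sketch.

```python
import json
from urllib import request


def build_predict_request(host, model_name, fundus, oct_image, port=8501):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # Row format: one instance holding both named inputs (key names are assumptions).
    body = json.dumps({"instances": [{"fundus": fundus, "oct": oct_image}]}).encode("utf-8")
    return request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def parse_predictions(response_bytes):
    """Extract the list of prediction rows from a TensorFlow Serving JSON response."""
    return json.loads(response_bytes)["predictions"]


# Tiny placeholder tensors stand in for the real 224x224x3 image arrays.
req = build_predict_request("localhost", "eye_model", [[0.0]], [[1.0]])
preds = parse_predictions(b'{"predictions": [[0.1, 0.9]]}')
```

A Django view would send `req` with `urllib.request.urlopen`, parse the response, and return the most probable label to the front end.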
Thus, the infrared macular fundus image from the OCT device is used as an auxiliary image and combined with the OCT image to form a dual-modality input; an efficient acquisition algorithm is designed to obtain the dual-modality data; and a two-stage training scheme is used, in which the first stage learns features from the original data distribution and the second stage freezes the convolutional layers and retrains with the class-balanced loss weighted by the per-disease statistics. The model training scheme designed in the present application can significantly improve overall classification performance, especially for diseases with few samples.
In the method for classifying ophthalmic diseases under multimodal, multi-disease long-tail distribution according to the embodiments of the present application, dual-modality image samples are acquired and annotated with diagnostic labels; the infrared macular fundus image sample and the OCT image sample are respectively and simultaneously input into the first neural network for training to obtain first image feature information and second image feature information; total image feature information is computed from the first image feature information and a first weight and from the second image feature information and a second weight, and is input into a fully connected network to obtain a prediction result; and the neural network parameters are repeatedly adjusted by backpropagation until the error value stays at a preset threshold, generating the ophthalmic disease classification model. In this way, a dual-branch convolutional neural network learns the features of both imaging modalities and yields a deep learning model that resembles the clinical diagnostic workflow. This addresses the following technical problems: insufficient accuracy when ophthalmic images that depend on features from several modalities are classified from a single modality; the difficulty of collecting paired color fundus and OCT images, which limits the diseases covered; and the long-tail distribution of disease categories in real settings, where class imbalance leads to poor classification of diseases with few samples.
To implement the embodiments of the present application, the present application further provides an apparatus for classifying ophthalmic diseases under multimodal, multi-disease long-tail distribution.
FIG. 4 is a schematic structural diagram of an apparatus for classifying ophthalmic diseases under multimodal, multi-disease long-tail distribution provided by an embodiment of the present application.
As shown in FIG. 4, the apparatus for classifying ophthalmic diseases under multimodal, multi-disease long-tail distribution includes an acquisition and annotation module 410, an extraction module 420, a prediction module 430, and a generation module 440.
The acquisition and annotation module 410 is configured to collect data from electronic medical records and acquire dual-modality image samples, where the dual-modality image samples include an infrared macular fundus image sample and an optical coherence tomography (OCT) image sample, and to annotate the dual-modality image samples with diagnostic labels.
The extraction module 420 is configured to input the infrared macular fundus image sample and the OCT image sample, respectively and simultaneously, into the first neural network for training, and to obtain first image feature information and second image feature information.
The prediction module 430 is configured to compute total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, input the total image feature information into a fully connected network, and obtain a prediction result.
The generation module 440 is configured to compute the error between the prediction result and the diagnostic label with a loss function, and to repeatedly adjust the neural network parameters by backpropagation until the error value stays at a preset threshold, generating the ophthalmic disease classification model.
In an embodiment of the present application, the acquisition and annotation module is specifically configured to:
parse, by means of a parsing algorithm designed for electronic medical records in document format, the dual-modality images of the electronic medical record and the diagnosis information recorded at the time, and annotate the dual-modality image samples with diagnostic labels according to the diagnosis information.
In an embodiment of the present application, the apparatus further includes:
a preprocessing module, configured to resize the infrared macular fundus image sample and the optical coherence tomography (OCT) image sample, and to perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping.
In an embodiment of the present application, the loss function is shown in formula (1):
Figure PCTCN2021137142-appb-000026
where
Figure PCTCN2021137142-appb-000027
in which
Figure PCTCN2021137142-appb-000028
are the one-hot encoded forms of the diagnostic label $y$ and the prediction result
Figure PCTCN2021137142-appb-000029
respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$,
Figure PCTCN2021137142-appb-000030
$N = 12$ is the total number of labels, $i \in \{1, 2, \ldots, N\}$, and $n_i$ is the number of samples of the $i$-th label.
In an embodiment of the present application, the recognition apparatus for the ophthalmic disease classification model under multimodal, multi-disease long-tail distribution includes:
an acquisition module, configured to acquire an infrared macular fundus image sample and an OCT image to be recognized; and
a diagnosis module, configured to input the infrared macular fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.
In the apparatus for classifying ophthalmic diseases under multimodal, multi-disease long-tail distribution according to the embodiments of the present application, dual-modality image samples are acquired and annotated with diagnostic labels; the infrared macular fundus image sample and the OCT image sample are input into the first neural network for training to obtain first image feature information and second image feature information; total image feature information is computed from the first image feature information and a first weight and from the second image feature information and a second weight, and is input into a fully connected network to obtain a prediction result; and the neural network parameters are repeatedly adjusted by backpropagation until the error value stays at a preset threshold, generating the ophthalmic disease classification model. In this way, a dual-branch convolutional neural network learns the features of both imaging modalities and yields a deep learning model that resembles the clinical diagnostic workflow. This addresses the following technical problems: insufficient accuracy when ophthalmic images that depend on features from several modalities are classified from a single modality; the difficulty of collecting paired color fundus and OCT images, which limits the diseases covered; and the long-tail distribution of disease categories in real settings, where class imbalance leads to poor classification of diseases with few samples.
It should be noted that the foregoing explanation of the method embodiments for classifying ophthalmic diseases under multimodal, multi-disease long-tail distribution also applies to the apparatus of this embodiment and is not repeated here.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, where no contradiction arises, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, for example two or three, unless expressly and specifically defined otherwise.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as will be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in a flowchart or otherwise described herein, which may for example be regarded as an ordered list of executable instructions for implementing logic functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer disk cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that the parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present application have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to them within the scope of the present application.

Claims (13)

  1. A method for training an ophthalmic disease classification model under multimodal, multi-disease long-tail distribution, characterized by comprising the following steps:
    collecting data from electronic medical records to acquire dual-modality image samples, wherein the dual-modality image samples include an infrared macular fundus image sample and an optical coherence tomography (OCT) image sample, and annotating the dual-modality image samples with diagnostic labels;
    inputting the infrared macular fundus image sample and the OCT image sample, respectively and simultaneously, into a first neural network for training to obtain first image feature information and second image feature information;
    computing total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result; and
    computing an error value between the prediction result and the diagnostic label with a loss function, and repeatedly adjusting neural network parameters by backpropagation until the error value stays at a preset threshold, to generate an ophthalmic disease classification model.
  2. The method according to claim 1, characterized in that collecting data from the electronic medical records, acquiring the dual-modality image samples, and annotating the dual-modality image samples with diagnostic labels comprises:
    parsing, by means of a parsing algorithm designed for electronic medical records in document format, the dual-modality images of the electronic medical record and the diagnosis information recorded at the time, and annotating the dual-modality image samples with diagnostic labels according to the diagnosis information.
  3. The method according to claim 1, characterized by further comprising:
    resizing the infrared macular fundus image sample and the optical coherence tomography (OCT) image sample, and performing one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping.
  4. The method according to claim 1, characterized in that the loss function is shown in formula (1):
    Figure PCTCN2021137142-appb-100001
    where
    Figure PCTCN2021137142-appb-100002
    in which $s_y$,
    Figure PCTCN2021137142-appb-100003
    are the one-hot encoded forms of the diagnostic label $y$ and the prediction result
    Figure PCTCN2021137142-appb-100004
    respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$,
    Figure PCTCN2021137142-appb-100005
    $N = 12$ is the total number of labels, $i \in \{1, 2, \ldots, N\}$, and $n_i$ is the number of samples of the $i$-th label.
  5. A recognition method for the ophthalmic disease classification model under multimodal, multi-disease long-tail distribution according to any one of claims 1 to 4, characterized by comprising:
    acquiring an infrared macular fundus image sample and an OCT image to be recognized; and
    inputting the infrared macular fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.
  6. An apparatus for training an ophthalmic disease classification model under multimodal, multi-disease long-tail distribution, characterized by comprising:
    an acquisition and annotation module, configured to collect data from electronic medical records and acquire dual-modality image samples, wherein the dual-modality image samples include an infrared macular fundus image sample and an optical coherence tomography (OCT) image sample, and to annotate the dual-modality image samples with diagnostic labels;
    an extraction module, configured to input the infrared macular fundus image sample and the OCT image sample, respectively and simultaneously, into a first neural network for training to obtain first image feature information and second image feature information;
    a prediction module, configured to compute total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and to input the total image feature information into a fully connected network to obtain a prediction result; and
    a generation module, configured to compute an error value between the prediction result and the diagnostic label with a loss function, and to repeatedly adjust neural network parameters by backpropagation until the error value stays at a preset threshold, to generate an ophthalmic disease classification model.
  7. The apparatus according to claim 6, characterized in that the acquisition and annotation module is specifically configured to:
    parse, by means of a parsing algorithm designed for electronic medical records in document format, the dual-modality images of the electronic medical record and the diagnosis information recorded at the time, and annotate the dual-modality image samples with diagnostic labels according to the diagnosis information.
  8. The apparatus according to claim 6, characterized by further comprising:
    a preprocessing module, configured to resize the infrared macular fundus image sample and the optical coherence tomography (OCT) image sample, and to perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chroma enhancement, random contrast enhancement, and random horizontal flipping.
  9. The apparatus according to claim 6, characterized in that the loss function is shown in formula (1):
    Figure PCTCN2021137142-appb-100006
    where
    Figure PCTCN2021137142-appb-100007
    in which $s_y$,
    Figure PCTCN2021137142-appb-100008
    are the one-hot encoded forms of the diagnostic label $y$ and the prediction result
    Figure PCTCN2021137142-appb-100009
    respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$,
    Figure PCTCN2021137142-appb-100010
    $N = 12$ is the total number of labels, $i \in \{1, 2, \ldots, N\}$, and $n_i$ is the number of samples of the $i$-th label.
  10. A recognition apparatus using the ophthalmic disease classification model under a multi-modal, multi-disease long-tailed distribution according to any one of claims 6 to 9, comprising:
    an acquisition module, configured to acquire an infrared macular-region fundus image sample and an OCT image to be identified;
    a diagnosis module, configured to input the infrared macular-region fundus image sample and the OCT image into the ophthalmic disease classification model for processing, to obtain a diagnosis result.
  11. An electronic device, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following steps:
    collecting data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples comprise infrared macular-region fundus image samples and optical coherence tomography (OCT) image samples, and annotating the dual-modality image samples with diagnostic labels;
    inputting the infrared macular-region fundus image samples and the OCT image samples, respectively and simultaneously, into a first neural network for training, to obtain first image feature information and second image feature information;
    computing total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;
    calculating an error value between the prediction result and the diagnostic label using a loss function, and iteratively adjusting the neural network parameters by backpropagation until the error value remains at a preset threshold, thereby generating an ophthalmic disease classification model.
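    The fusion and prediction steps recited above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes that the "total image feature information" is an element-wise weighted sum of the two feature vectors (the claim does not fix the combination rule), and that the fully connected network is a single linear layer followed by softmax. All names, weights, and numeric values are hypothetical.

```python
import math

def fuse(features_a, weight_a, features_b, weight_b):
    """Assumed fusion rule: element-wise weighted sum of two feature vectors."""
    return [weight_a * a + weight_b * b
            for a, b in zip(features_a, features_b)]

def fully_connected(features, weight_matrix, bias):
    """One linear layer: one output logit per row of weight_matrix."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weight_matrix, bias)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

fundus_features = [0.2, 0.8]   # stands in for the first image feature information
oct_features = [0.6, 0.4]      # stands in for the second image feature information
total_features = fuse(fundus_features, 0.4, oct_features, 0.6)

layer_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 toy classes
layer_bias = [0.0, 0.0, -1.0]
prediction = softmax(fully_connected(total_features, layer_weights, layer_bias))
predicted_class = prediction.index(max(prediction))
```

    During training, the prediction would be compared with the diagnostic label by the loss function, and the network parameters would be updated by backpropagation until the error value meets the preset threshold.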
  12. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the following steps:
    collecting data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples comprise infrared macular-region fundus image samples and optical coherence tomography (OCT) image samples, and annotating the dual-modality image samples with diagnostic labels;
    inputting the infrared macular-region fundus image samples and the OCT image samples, respectively and simultaneously, into a first neural network for training, to obtain first image feature information and second image feature information;
    computing total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;
    calculating an error value between the prediction result and the diagnostic label using a loss function, and iteratively adjusting the neural network parameters by backpropagation until the error value remains at a preset threshold, thereby generating an ophthalmic disease classification model.
  13. A computer program product, comprising a computer program, wherein the computer program, when executed by a processor, implements the following steps:
    collecting data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples comprise infrared macular-region fundus image samples and optical coherence tomography (OCT) image samples, and annotating the dual-modality image samples with diagnostic labels;
    inputting the infrared macular-region fundus image samples and the OCT image samples, respectively and simultaneously, into a first neural network for training, to obtain first image feature information and second image feature information;
    computing total image feature information from the first image feature information and a first weight and from the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;
    calculating an error value between the prediction result and the diagnostic label using a loss function, and iteratively adjusting the neural network parameters by backpropagation until the error value remains at a preset threshold, thereby generating an ophthalmic disease classification model.
PCT/CN2021/137142 2021-03-12 2021-12-10 Training method and apparatus for multi-mode multi-disease long-tail distribution ophthalmic disease classification model WO2022188489A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110270878.7 2021-03-12
CN202110270878.7A CN113011485B (en) 2021-03-12 2021-03-12 Multi-mode multi-disease long-tail distribution ophthalmic disease classification model training method and device

Publications (1)

Publication Number Publication Date
WO2022188489A1 (en) 2022-09-15

Family

ID=76406248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137142 WO2022188489A1 (en) 2021-03-12 2021-12-10 Training method and apparatus for multi-mode multi-disease long-tail distribution ophthalmic disease classification model

Country Status (2)

Country Link
CN (1) CN113011485B (en)
WO (1) WO2022188489A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115631367A (en) * 2022-09-30 2023-01-20 中国医学科学院生物医学工程研究所 Neural network model-based ophthalmic ultrasonic image classification method and device
CN116416235A (en) * 2023-04-12 2023-07-11 北京建筑大学 Feature region prediction method and device based on multi-mode ultrasonic data
CN116681958A (en) * 2023-08-04 2023-09-01 首都医科大学附属北京妇产医院 Fetal lung ultrasonic image maturity prediction method based on machine learning
CN116977810A (en) * 2023-09-25 2023-10-31 之江实验室 Multi-mode post-fusion long tail category detection method and system
CN116203929B (en) * 2023-03-01 2024-01-05 中国矿业大学 Industrial process fault diagnosis method for long tail distribution data
CN117372416A (en) * 2023-11-13 2024-01-09 北京透彻未来科技有限公司 High-robustness digital pathological section diagnosis system and method for countermeasure training
CN117789284A (en) * 2024-02-28 2024-03-29 中日友好医院(中日友好临床医学研究所) Identification method and device for ischemic retinal vein occlusion
CN117789284B (en) * 2024-02-28 2024-05-14 中日友好医院(中日友好临床医学研究所) Identification method and device for ischemic retinal vein occlusion

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011485B (en) * 2021-03-12 2023-04-07 北京邮电大学 Multi-mode multi-disease long-tail distribution ophthalmic disease classification model training method and device
CN113256636B (en) * 2021-07-15 2021-11-05 北京小蝇科技有限责任公司 Bottom-up parasite species development stage and image pixel classification method
CN113496489B (en) * 2021-09-06 2021-12-24 北京字节跳动网络技术有限公司 Training method of endoscope image classification model, image classification method and device
CN113989519B (en) * 2021-12-28 2022-03-22 中科视语(北京)科技有限公司 Long-tail target detection method and system
CN114494734A (en) * 2022-01-21 2022-05-13 平安科技(深圳)有限公司 Method, device and equipment for detecting pathological changes based on fundus image and storage medium
CN115019891B (en) * 2022-06-08 2023-07-07 郑州大学 Individual driving gene prediction method based on semi-supervised graph neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110242306A1 (en) * 2008-12-19 2011-10-06 The Johns Hopkins University System and method for automated detection of age related macular degeneration and other retinal abnormalities
CN109583569A (en) * 2018-11-30 2019-04-05 中控智慧科技股份有限公司 A kind of multi-modal Feature fusion and device based on convolutional neural networks
CN111428072A (en) * 2020-03-31 2020-07-17 南方科技大学 Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium
CN113011485A (en) * 2021-03-12 2021-06-22 北京邮电大学 Multi-mode multi-disease long-tail distribution ophthalmic disease classification model training method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108172291B (en) * 2017-05-04 2020-01-07 深圳硅基智能科技有限公司 Diabetic retinopathy recognition system based on fundus images
CN111784665B (en) * 2020-06-30 2024-05-07 平安科技(深圳)有限公司 OCT image quality evaluation method, system and device based on Fourier transform
CN111938569A (en) * 2020-09-17 2020-11-17 南京航空航天大学 Eye ground multi-disease classification detection method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LIN TSUNG-YI; GOYAL PRIYA; GIRSHICK ROSS; HE KAIMING; DOLLAR PIOTR: "Focal Loss for Dense Object Detection", 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), IEEE, 22 October 2017 (2017-10-22), pages 2999 - 3007, XP033283167, DOI: 10.1109/ICCV.2017.324 *
OU ZHONGHONG; CHAI WENJUN; WANG LIFEI; ZHANG RURU; HE JIAWEN; SONG MEINA; YUAN LIFEI; ZHANG SHENGJUAN; WANG YANHUI; LI HUAN; JIA X: "M2LC-Net: A multi-modal multi-disease long-tailed classification network for real clinical scenes", CHINA COMMUNICATIONS, CHINA INSTITUTE OF COMMUNICATIONS, PISCATAWAY, NJ, USA, vol. 18, no. 9, 4 October 2021 (2021-10-04), Piscataway, NJ, USA , pages 210 - 220, XP011881784, ISSN: 1673-5447, DOI: 10.23919/JCC.2021.09.016 *
XIAO PENG, DUAN ZHENGYU, WANG GENGYUAN, DENG YUQING, WANG QIAN, ZHANG JUN, LIANG SHANSHAN, YUAN JIN: "Multi-modal Anterior Eye Imager Combining Ultra-High Resolution OCT and Microvascular Imaging for Structural and Functional Evaluation of the Human Eye", APPLIED SCIENCES, MDPI SWITZERLAND, vol. 10, no. 7, 1 April 2020 (2020-04-01), pages 2545 - 12, XP055966284, ISSN: 2076-3417, DOI: 10.3390/app10072545 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115631367A (en) * 2022-09-30 2023-01-20 中国医学科学院生物医学工程研究所 Neural network model-based ophthalmic ultrasonic image classification method and device
CN115631367B (en) * 2022-09-30 2024-03-05 中国医学科学院生物医学工程研究所 Ophthalmic ultrasonic image classification method and device based on neural network model
CN116203929B (en) * 2023-03-01 2024-01-05 中国矿业大学 Industrial process fault diagnosis method for long tail distribution data
CN116416235A (en) * 2023-04-12 2023-07-11 北京建筑大学 Feature region prediction method and device based on multi-mode ultrasonic data
CN116416235B (en) * 2023-04-12 2023-12-05 北京建筑大学 Feature region prediction method and device based on multi-mode ultrasonic data
CN116681958A (en) * 2023-08-04 2023-09-01 首都医科大学附属北京妇产医院 Fetal lung ultrasonic image maturity prediction method based on machine learning
CN116681958B (en) * 2023-08-04 2023-10-20 首都医科大学附属北京妇产医院 Fetal lung ultrasonic image maturity prediction method based on machine learning
CN116977810A (en) * 2023-09-25 2023-10-31 之江实验室 Multi-mode post-fusion long tail category detection method and system
CN116977810B (en) * 2023-09-25 2024-01-09 之江实验室 Multi-mode post-fusion long tail category detection method and system
CN117372416A (en) * 2023-11-13 2024-01-09 北京透彻未来科技有限公司 High-robustness digital pathological section diagnosis system and method for countermeasure training
CN117789284A (en) * 2024-02-28 2024-03-29 中日友好医院(中日友好临床医学研究所) Identification method and device for ischemic retinal vein occlusion
CN117789284B (en) * 2024-02-28 2024-05-14 中日友好医院(中日友好临床医学研究所) Identification method and device for ischemic retinal vein occlusion

Also Published As

Publication number Publication date
CN113011485B (en) 2023-04-07
CN113011485A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
WO2022188489A1 (en) Training method and apparatus for multi-mode multi-disease long-tail distribution ophthalmic disease classification model
CN110197493B (en) Fundus image blood vessel segmentation method
Bernard et al. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: is the problem solved?
CN109325942B (en) Fundus image structure segmentation method based on full convolution neural network
Kou et al. Microaneurysms segmentation with a U-Net based on recurrent residual convolutional neural network
CN111223085A (en) Head medical image auxiliary interpretation report generation method based on neural network
CN111681219A (en) New coronary pneumonia CT image classification method, system and equipment based on deep learning
CN111080643A (en) Method and device for classifying diabetes and related diseases based on fundus images
EP3964136A1 (en) System and method for guiding a user in ultrasound assessment of a fetal organ
WO2022166399A1 (en) Fundus oculi disease auxiliary diagnosis method and apparatus based on bimodal deep learning
CN111275706A (en) Shear wave elastic imaging-based ultrasound omics depth analysis method and system
Gamage et al. Instance-based segmentation for boundary detection of neuropathic ulcers through Mask-RCNN
Yao et al. Holistic segmentation of intermuscular adipose tissues on thigh MRI
CN111028232A (en) Diabetes classification method and equipment based on fundus images
Sofian et al. Calcification detection using convolutional neural network architectures in intravascular ultrasound images
CN111047590A (en) Hypertension classification method and device based on fundus images
Moradi et al. Feasibility of the soft attention-based models for automatic segmentation of OCT kidney images
AU2021425940A1 (en) System and method of using right and left eardrum otoscopy images for automated otoscopy image analysis to diagnose ear pathology
CN117352164A (en) Multi-mode tumor detection and diagnosis platform based on artificial intelligence and processing method thereof
CN116703837B (en) MRI image-based rotator cuff injury intelligent identification method and device
Wen et al. A-PSPNet: A novel segmentation method of renal ultrasound image
Mu et al. Improved model of eye disease recognition based on VGG model
CN112967246A (en) X-ray image auxiliary device and method for clinical decision support system
Oliveira et al. Automatic segmentation of posterior fossa structures in pediatric brain mris
Das et al. Attention-UNet architectures with pretrained backbones for multi-class cardiac MR image segmentation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21929951

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21929951

Country of ref document: EP

Kind code of ref document: A1