WO2022166399A1 - 基于双模态深度学习的眼底疾病辅助诊断方法和装置 (Method and device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning) - Google Patents

基于双模态深度学习的眼底疾病辅助诊断方法和装置 (Method and device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning)

Info

Publication number
WO2022166399A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature vector
feature
neural network
target
color fundus
Prior art date
Application number
PCT/CN2021/137145
Other languages
English (en)
French (fr)
Inventor
宋美娜
鄂海红
何佳雯
张胜娟
王艳辉
李欢
张如如
Original Assignee
北京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京邮电大学
Publication of WO2022166399A1 publication Critical patent/WO2022166399A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10101Optical tomography; Optical coherence tomography [OCT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30041Eye; Retina; Ophthalmic
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • the present application relates to the technical field of data processing, and in particular, to a method and device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning.
  • fundus diseases include inflammation, tumors and various vascular lesions of the vitreous, optic nerve, choroid and retina, as well as ocular tissue lesions caused by various multi-system and degenerative diseases.
  • China is one of the countries with the largest number of blind and visually impaired patients in the world. At present, there are about 27 million patients with diabetic retinopathy, 16 million with glaucoma and 30 million with macular disease in China. Visual impairment seriously affects people's quality of life; however, per capita medical resources in China are low, and the ratio of patients to professional doctors is seriously unbalanced.
  • many researchers have applied deep learning technology to the field of intelligent auxiliary diagnosis of fundus diseases.
  • computer-aided diagnosis and treatment systems based on deep learning can work uninterruptedly and, to a certain extent, reduce the subjectivity of human doctors, making disease diagnosis more objective and stable.
  • deep learning technology can perform pixel-by-pixel analysis and quantification of pathological features in medical images, providing doctors with a reference for disease diagnosis.
  • color fundus images and OCT images can display different sign information from planar and cross-sectional views, respectively. Many fundus diseases, such as central serous chorioretinopathy, age-related macular degeneration, retinal vein occlusion and idiopathic polypoidal choroidal vasculopathy, require at least the sign information provided by both color fundus images and OCT images for diagnosis.
  • the present application aims to solve one of the technical problems in the related art at least to a certain extent.
  • the first purpose of this application is to propose a method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning, which obtains eye sign information from different perspectives and selects appropriate feature fusion strategies for different data sets and task scenarios, so as to improve the accuracy of auxiliary diagnosis of fundus diseases.
  • the second objective of the present application is to propose an auxiliary diagnosis device for fundus diseases based on dual-modal deep learning.
  • the third object of the present application is to propose an electronic device.
  • a fourth object of the present application is to propose a computer-readable storage medium.
  • a fifth object of the present application is to propose a computer program product.
  • an embodiment of the first aspect of the present application proposes a method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning, including:
  • acquiring a color fundus image and an optical coherence tomography OCT image of the same eye;
  • performing feature extraction on the color fundus image and the OCT image, respectively, to obtain a first feature vector and a second feature vector;
  • performing fusion processing on the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
  • in the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning according to the embodiment of the present application, a color fundus image and an optical coherence tomography OCT image of the same eye are acquired; feature extraction is performed on the color fundus image and the OCT image, respectively, to obtain a first feature vector and a second feature vector; the first feature vector and the second feature vector are fused according to the preset feature fusion strategy to obtain a target feature vector, and the target feature vector is input into the trained neural network diagnosis model to obtain the diagnosis result.
  • in this way, eye sign information from different perspectives is obtained, and appropriate feature fusion strategies are selected for different data sets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
  • the performing feature extraction on the color fundus image and the OCT image, respectively, to obtain a first feature vector and a second feature vector includes:
  • performing feature extraction on the color fundus image and the OCT image respectively through a first feature extraction module to obtain the first feature vector and the second feature vector; or,
  • performing feature extraction on the color fundus image through a first feature extraction module to obtain the first feature vector, and performing feature extraction on the OCT image through a second feature extraction module to obtain the second feature vector.
  • the fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain a diagnosis result includes:
  • splicing the first feature vector and the second feature vector to obtain the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain a diagnosis result; or,
  • obtaining a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, the sum of the first weight and the second weight being 1; taking the sum of the first product of the first feature vector and the first weight and the second product of the second feature vector and the second weight as the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain a diagnosis result; or,
  • inputting the first feature vector and the second feature vector into the trained neural network diagnosis model respectively to obtain a first classification result and a second classification result, and taking the sum of the first product of the first classification result and a first weight and the second product of the second classification result and a second weight as the diagnosis result.
  • before the inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result, the method further includes:
  • acquiring a color fundus image sample and an optical coherence tomography OCT image sample of each of a plurality of eyes, the samples having labeling results; extracting a first feature vector sample and a second feature vector sample from the color fundus image sample and the OCT image sample, respectively; fusing the first feature vector sample and the second feature vector sample to obtain a target feature vector sample, and inputting the target feature vector sample into the neural network diagnosis model for training to obtain a training result;
  • calculating the error between the labeling result and the training result through a loss function, and adjusting the parameters of the neural network diagnosis model until the error is less than a preset threshold, thereby generating the trained neural network diagnosis model.
  • a second aspect of the present application provides an auxiliary diagnosis device for fundus diseases based on dual-modal deep learning, including:
  • an acquisition module for acquiring color fundus images and optical coherence tomography OCT images of the same eye
  • an extraction module configured to perform feature extraction on the color fundus image and the OCT image, respectively, to obtain a first feature vector and a second feature vector;
  • a processing module configured to perform fusion processing on the first feature vector and the second feature vector according to a preset feature fusion strategy, obtain a target feature vector, and input the target feature vector into a trained neural network diagnostic model, Get diagnostic results.
  • the device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning obtains a color fundus image and an optical coherence tomography OCT image of the same eye; performs feature extraction on the two images to obtain a first feature vector and a second feature vector; fuses the first feature vector and the second feature vector according to the preset feature fusion strategy to obtain a target feature vector; and inputs the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  • in this way, eye sign information from different perspectives is obtained, and appropriate feature fusion strategies are selected for different data sets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
  • the extraction module is specifically used for:
  • the first feature extraction module is used to perform feature extraction on the color fundus image to obtain the first feature vector
  • the second feature extraction module is used to perform feature extraction on the OCT image to obtain the second feature vector.
  • the processing module is specifically used for:
  • the first feature vector and the second feature vector are spliced to obtain the target feature vector, and the target feature vector is input into the trained neural network diagnosis model to obtain a diagnosis result.
  • the processing module is specifically used for: obtaining a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, the sum of the two weights being 1; taking the sum of the first product of the first feature vector and the first weight and the second product of the second feature vector and the second weight as the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  • an embodiment of a third aspect of the present application provides an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions so as to implement the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning proposed by the embodiment of the first aspect of the present application.
  • a fourth aspect of the present application provides a computer-readable storage medium; when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning proposed in the embodiment of the first aspect of the present application.
  • the fifth aspect of the present application provides a computer program product, including a computer program that, when executed by a processor, implements the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning proposed in the embodiment of the first aspect of the present application.
  • FIG. 1 is an example diagram of an auxiliary diagnosis method for fundus diseases based on dual-modal deep learning according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning provided by an embodiment of the present application;
  • FIG. 3 is a training example diagram of a method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning according to an embodiment of the present application
  • FIG. 4 is a processing example diagram of the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning according to an embodiment of the present application
  • FIG. 5 is a schematic structural diagram of a device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning provided by an embodiment of the present application.
  • This application designs a dual-modality fundus disease auxiliary diagnosis system, and proposes three feature fusion strategies suitable for the system.
  • the system uses color fundus images and OCT images to assist in the diagnosis of common fundus diseases.
  • a dual-modal fundus disease auxiliary diagnosis system is provided, which includes two parts: (1) different feature extraction modules applied to different modal data to extract features; (2) fusion of the extracted feature representations.
  • the complete system structure is shown in Figure 2, where Model-F is the feature extraction module applied to color fundus images and Model-O is the feature extraction module applied to OCT images; (a) is the feature-based connection strategy, (b) is the feature-based weight assignment strategy, and (c) is the weight assignment strategy based on classification results.
  • a dataset D = {x_f, x_O, y} is defined, where x_f and x_O are the color fundus image and the OCT image obtained from the same eye, respectively, and y is the diagnostic label of this pair of images.
  • the dual-modal fundus disease auxiliary diagnosis system receives the paired input {x_f, x_O} and outputs the diagnosis result ŷ of the eye.
  • denoting the dual-modal fundus disease auxiliary diagnosis system by "Our_Model", the system computes ŷ = Our_Model(x_f, x_O) (Equation 1).
  • the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning includes the following steps 101 to 103 .
  • Step 101 acquiring a color fundus image and an optical coherence tomography OCT image of the same eye.
  • Step 102 Perform feature extraction on the color fundus image and the OCT image, respectively, to obtain a first feature vector and a second feature vector.
  • feature extraction is performed on the color fundus image and the OCT image respectively by a first feature extraction module to obtain the first feature vector and the second feature vector; or, feature extraction is performed on the color fundus image by a first feature extraction module to obtain the first feature vector, and on the OCT image by a second feature extraction module to obtain the second feature vector.
  • the bimodal fundus disease auxiliary diagnosis system consists of two symmetrical branches, wherein the feature extraction module for processing color fundus images is denoted as Model-F, and the feature extraction module for processing OCT images is denoted as Model-O.
  • mainstream feature extraction modules in computer vision, such as VGGNet, GoogLeNet and ResNet, can be used in the bimodal fundus disease auxiliary diagnosis system.
  • the framework can choose different feature extraction modules.
  • Model-F and Model-O can be the same feature extraction module (homogeneous) or different feature extraction modules (heterogeneous).
  • the heterogeneous feature extraction module may obtain better results than the homogeneous feature extraction module.
  • let F_f be the feature vector extracted from the color fundus image by Model-F (the upper rectangular block in Figure 2), and let F_O be the feature vector generated by Model-O acting on the OCT image (the lower rectangular block in Figure 2).
  • the sizes of the feature vectors F_f and F_O need to be unified.
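Before fusion, the two branch outputs must be brought to a common size. A minimal NumPy sketch of one way to do this; the dimensions and the random linear projection are illustrative assumptions, not values taken from this application:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical backbone outputs: 512-d from Model-F (color fundus branch)
# and 1024-d from Model-O (OCT branch); sizes are illustrative assumptions.
F_f_raw = rng.standard_normal(512)
F_O_raw = rng.standard_normal(1024)

def project(v, out_dim, rng):
    """Map a feature vector to a common size via a linear projection
    (randomly initialised here; in practice its weights would be learned)."""
    W = rng.standard_normal((out_dim, v.shape[0])) / np.sqrt(v.shape[0])
    return W @ v

D = 256  # common feature size shared by both branches
F_f = project(F_f_raw, D, rng)
F_O = project(F_O_raw, D, rng)
```

After this step both F_f and F_O have shape (D,), so any of the three fusion strategies can be applied to them.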
  • Step 103 Perform fusion processing on the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and input the target feature vector into the trained neural network diagnosis model to obtain a diagnosis result.
  • the dual-modal fundus disease auxiliary diagnosis system can choose different feature fusion strategies, and can also modify the hyperparameters defined in the strategies to achieve The best classification effect.
  • the first feature vector and the second feature vector are spliced to obtain the target feature vector, and the target feature vector is input into the trained neural network diagnosis model to obtain the diagnosis result.
  • the feature-based connection strategy concatenates F_f and F_O to obtain the vector F_con, and the final output ŷ is then obtained through a fully connected layer.
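A rough sketch of this feature-based connection strategy (strategy (a) in Figure 2), with illustrative dimensions and untrained fully connected weights standing in for the learned ones:

```python
import numpy as np

rng = np.random.default_rng(1)
D, C = 256, 4      # illustrative feature size and number of disease classes

F_f = rng.standard_normal(D)   # feature vector from the color fundus branch
F_O = rng.standard_normal(D)   # feature vector from the OCT branch

# Feature-based connection strategy: concatenate, then one fully connected layer.
F_con = np.concatenate([F_f, F_O])          # shape (2D,)
W = rng.standard_normal((C, 2 * D)) * 0.01  # FC weights (learned in practice)
b = np.zeros(C)
y_hat = W @ F_con + b                       # class scores
pred = int(np.argmax(y_hat))                # predicted class index
```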
  • a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector are obtained, where the sum of the first weight and the second weight is 1; the sum of the first product of the first feature vector and the first weight and the second product of the second feature vector and the second weight is used as the target feature vector, and the target feature vector is input into the trained neural network diagnosis model to obtain the diagnosis result.
  • the two feature vectors involved in the weight assignment carry feature information about different images; the weight distribution principle gives different importance to the features of each modality.
  • the weight can be regarded as a measure of how important that modality's information is to the image classification task.
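This feature-based weight assignment strategy (strategy (b)) reduces to a convex combination of the two branch features; a minimal sketch with an assumed weight a = 0.6 (the actual weight is a hyperparameter to be tuned):

```python
import numpy as np

rng = np.random.default_rng(2)
D = 256                       # illustrative feature size
F_f = rng.standard_normal(D)  # color fundus feature vector
F_O = rng.standard_normal(D)  # OCT feature vector

a = 0.6                       # assumed first weight; the second weight is 1 - a
F_target = a * F_f + (1 - a) * F_O  # weighted sum used as the target feature vector
```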
  • the third example is to input the first feature vector and the second feature vector into the trained neural network diagnosis model respectively to obtain a first classification result and a second classification result, obtain a first weight corresponding to the first classification result and a second weight corresponding to the second classification result, and take the sum of the first product of the first classification result and the first weight and the second product of the second classification result and the second weight as the diagnosis result.
  • the weight assignment strategy based on classification results first inputs F_f and F_O into fully connected layers to obtain ŷ_f = W_f·F_f and ŷ_O = W_O·F_O, where W_f and W_O are the parameters of the fully connected layers applied to the color fundus image and the OCT image, respectively. Then ŷ_f and ŷ_O are weighted and summed to obtain the final output score ŷ: ŷ = a·ŷ_f + (1-a)·ŷ_O, where a is a hyperparameter and 0 < a < 1. The classification expressed in Equation 1 is realized by choosing the highest-scoring class in ŷ.
  • the weight assignment strategy based on the classification result is a weighted voting method.
  • the module applied to the color fundus image and the module applied to the OCT image are actually independent of each other; each gives its own prediction of the classification result, and the final result is obtained through weighted voting.
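The weighted-voting behaviour can be sketched as follows (dimensions, weights and the value a = 0.5 are illustrative assumptions; the per-branch classifiers would be trained in practice):

```python
import numpy as np

rng = np.random.default_rng(3)
D, C = 256, 4                 # illustrative feature size and number of classes
F_f = rng.standard_normal(D)  # color fundus feature vector
F_O = rng.standard_normal(D)  # OCT feature vector

# Independent fully connected classifiers per branch (weights would be learned).
W_f = rng.standard_normal((C, D)) * 0.01
W_O = rng.standard_normal((C, D)) * 0.01
y_f = W_f @ F_f               # classification scores from the fundus branch
y_O = W_O @ F_O               # classification scores from the OCT branch

a = 0.5                       # hyperparameter, 0 < a < 1
y_hat = a * y_f + (1 - a) * y_O   # weighted voting over branch predictions
pred = int(np.argmax(y_hat))      # highest-scoring class is the diagnosis
```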
  • a color fundus image sample and an optical coherence tomography OCT image sample of each of a plurality of eyes are obtained, where the color fundus image sample and the OCT image sample have labeling results; a first feature vector sample and a second feature vector sample are extracted from the color fundus image sample and the OCT image sample, respectively; according to the preset feature fusion strategy, the first feature vector sample and the second feature vector sample are fused to obtain a target feature vector sample; the target feature vector sample is input into the neural network diagnosis model for training to obtain a training result; the error between the labeling result and the training result is calculated through a loss function, and the parameters of the neural network diagnosis model are adjusted until the error is less than a preset threshold, generating the trained neural network diagnosis model.
  • the training flow of the dual-modal fundus disease auxiliary diagnosis system is shown in Figure 3, and the specific steps are as follows: 1) according to the data set and task characteristics, select the feature extraction modules Model-F and Model-O applied to color fundus images and OCT images; 2) select a feature fusion strategy according to the data set and task characteristics, that is, the feature-based connection strategy, the feature-based weight assignment strategy or the weight assignment strategy based on classification results; 3) according to the selected feature extraction modules and feature fusion strategy, configure the corresponding functional modules of the dual-modal fundus disease auxiliary diagnosis system; 4) train the system using the existing labeled data set; 5) if the system achieves the expected accuracy, training is complete; if not, modify the feature extraction modules or the feature fusion strategy according to the data set and task characteristics and return to step 3).
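Steps 4) and 5) above, training until the error falls below a preset threshold, can be sketched with a toy softmax classifier on synthetic fused feature vectors (all sizes, the synthetic data and the hyperparameters are illustrative stand-ins, not the system's actual model):

```python
import numpy as np

rng = np.random.default_rng(4)
N, D, C = 64, 32, 2   # toy sizes: samples, fused-feature dimension, classes

# Hypothetical stand-in for fused target feature vector samples and labels;
# in the real system these would come from the two-branch feature fusion.
X = rng.standard_normal((N, D))
y = np.argmax(X @ rng.standard_normal((C, D)).T, axis=1)
onehot = np.eye(C)[y]

W = np.zeros((C, D))          # parameters of the toy diagnosis classifier
threshold, lr = 0.05, 0.5     # preset error threshold and learning rate
losses = []

for step in range(2000):
    logits = X @ W.T
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.mean(np.sum(onehot * np.log(p + 1e-9), axis=1))  # cross-entropy
    losses.append(loss)
    if loss < threshold:                 # step 5): stop once the error is small
        break
    W -= lr * (p - onehot).T @ X / N     # adjust parameters via the gradient
```

If the loop exhausts its budget without reaching the threshold, the analogue of step 5)'s fallback would be to change the feature extraction modules or fusion strategy and retrain.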
  • the flow chart of the use of the system is shown in Figure 4.
  • the specific steps are as follows: 1) obtain the color fundus image and OCT image of the patient's eye and upload them to the dual-modal fundus disease auxiliary diagnosis system; the two images must belong to the same patient; 2) the system performs image preprocessing, such as size adjustment and image enhancement, on the two images; 3) the system inputs the color fundus image into the Model-F feature extraction module to obtain the color fundus image feature vector, and inputs the OCT image into the Model-O feature extraction module to obtain the OCT image feature vector; 4) the system fuses the two feature vectors according to the selected feature fusion strategy and gives the final screening result, which can be used for auxiliary diagnosis.
  • this application takes into account the dual-modal data of color fundus images and OCT images that are widely used in ophthalmology clinical applications, so that the system can obtain eye sign information from different perspectives, which is more in line with the clinical diagnosis process of many fundus diseases, and can also achieve better results.
  • This application proposes a dual-modal auxiliary diagnosis system for fundus diseases.
  • the three feature fusion strategies enable the feature vector extracted by the feature extraction module from the color fundus image and OCT image to be better applied to the judgment of fundus disease screening results. For different datasets and different task scenarios, the selection of the optimal solution of the feature fusion strategy will be different.
  • in the method for auxiliary diagnosis of fundus diseases based on dual-modal deep learning, a color fundus image and an optical coherence tomography OCT image of the same eye are acquired; feature extraction is performed on the two images to obtain a first feature vector and a second feature vector; the two feature vectors are fused according to the preset feature fusion strategy to obtain a target feature vector, which is input into the trained neural network diagnosis model to obtain the diagnosis result.
  • in this way, eye sign information from different perspectives is obtained, and appropriate feature fusion strategies are selected for different data sets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
  • the present application also proposes an auxiliary diagnosis device for fundus diseases based on dual-modal deep learning.
  • FIG. 5 is a schematic structural diagram of a device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning according to an embodiment of the present application.
  • the device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning includes: an acquisition module 510 , an extraction module 520 and a processing module 530 .
  • the acquiring module 510 is configured to acquire the color fundus image and the optical coherence tomography OCT image of the same eye.
  • the extraction module 520 is configured to perform feature extraction on the color fundus image and the OCT image, respectively, to obtain a first feature vector and a second feature vector.
  • the processing module 530 is configured to perform fusion processing on the first feature vector and the second feature vector according to a preset feature fusion strategy, obtain a target feature vector, and input the target feature vector into the trained neural network diagnostic model , to obtain diagnostic results.
  • the extraction module 520 is specifically configured to: perform feature extraction on the color fundus image and the OCT image respectively through a first feature extraction module to obtain the first feature vector and the second feature vector; or, perform feature extraction on the color fundus image through a first feature extraction module to obtain the first feature vector, and perform feature extraction on the OCT image through a second feature extraction module to obtain the second feature vector.
  • the processing module 530 is specifically configured to: splicing the first feature vector and the second feature vector to obtain the target feature vector, and input the target feature vector into the trained Neural network diagnostic model to obtain diagnostic results.
  • the processing module 530 is specifically configured to: obtain a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, wherein the sum of the first weight and the second weight is 1; take the sum of the first product of the first feature vector and the first weight and the second product of the second feature vector and the second weight as the target feature vector; and input the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  • the device for auxiliary diagnosis of fundus diseases based on dual-modal deep learning obtains a color fundus image and an optical coherence tomography OCT image of the same eye; performs feature extraction on the two images to obtain a first feature vector and a second feature vector; fuses the two feature vectors according to the preset feature fusion strategy to obtain a target feature vector; and inputs the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  • eye sign information from different perspectives is obtained, and appropriate strategies are selected for feature fusion for different data sets and different task scenarios, so as to improve the accuracy of auxiliary diagnosis of fundus diseases.
  • "first" and "second" are only used for descriptive purposes and should not be construed as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature delimited with "first" or "second" may expressly or implicitly include at least one such feature.
  • "plurality" means at least two, such as two or three, unless expressly and specifically defined otherwise.
  • a "computer-readable medium" can be any device that can contain, store, communicate, propagate or transport the program for use by or in connection with an instruction execution system, apparatus or device.
  • computer-readable media include the following: an electrical connection with one or more wires (an electronic device), a portable computer disk cartridge (a magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM).
  • the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the paper or other medium can, for example, be optically scanned and then edited, interpreted or otherwise processed in a suitable manner if necessary to obtain the program electronically, which is then stored in computer memory.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, and the like.


Abstract

This application proposes a bimodal deep learning-based auxiliary diagnosis method and device for fundus diseases, relating to the field of data processing technology. The method includes: acquiring a color fundus image and an optical coherence tomography (OCT) image of the same eye; performing feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector; and fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, which is input into a trained neural network diagnosis model to obtain a diagnosis result.

Description

Bimodal deep learning-based auxiliary diagnosis method and device for fundus diseases
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on, and claims priority to, Chinese patent application No. 202110156174.7 filed on February 4, 2021, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application relates to the field of data processing technology, and in particular to a bimodal deep learning-based auxiliary diagnosis method and device for fundus diseases.
BACKGROUND
Fundus diseases generally include inflammation, tumors, and various vascular lesions of the vitreous body, optic nerve, choroid, and retina, as well as ocular tissue lesions caused by multi-system and degenerative diseases. China has one of the largest populations of blind and visually impaired people in the world: currently there are about 27 million patients with diabetic retinopathy, 16 million with glaucoma, and 30 million with macular disease. Visual impairment severely affects quality of life, yet per-capita medical resources in China are low and the ratio of patients to specialist physicians is severely imbalanced. In recent years, to ease the conflict between physicians' workload and patient demand, many researchers have applied deep learning to intelligent auxiliary diagnosis of fundus diseases. Deep learning-based computer-aided diagnosis systems can work around the clock and, to some extent, reduce the subjectivity of human physicians, making diagnosis more objective and stable. Meanwhile, deep learning can analyze and quantify pathological features in medical images pixel by pixel, providing physicians with a reference for diagnosis.
In the related art, (1) only color fundus images are used for intelligent screening of fundus diseases, or (2) only optical coherence tomography (OCT) images are used. Both schemes use medical images of a single modality, which makes data collection convenient but does not match real clinical practice. Color fundus images and OCT images present different sign information from en-face and cross-sectional views respectively, and many fundus diseases, such as central serous chorioretinopathy, age-related macular degeneration, retinal vein occlusion, and idiopathic polypoidal choroidal vasculopathy, can only be confirmed with the sign information provided by both color fundus images and OCT images.
SUMMARY
This application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first objective of this application is to propose a bimodal deep learning-based auxiliary diagnosis method for fundus diseases, which obtains ocular sign information from different views and selects a suitable feature fusion strategy for different datasets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
A second objective of this application is to propose a bimodal deep learning-based auxiliary diagnosis device for fundus diseases.
A third objective of this application is to propose an electronic device.
A fourth objective of this application is to propose a computer-readable storage medium.
A fifth objective of this application is to propose a computer program product.
To achieve the above objectives, an embodiment of the first aspect of this application proposes a bimodal deep learning-based auxiliary diagnosis method for fundus diseases, including:
acquiring a color fundus image and an optical coherence tomography (OCT) image of the same eye;
performing feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
In the method of the embodiments of this application, a color fundus image and an OCT image of the same eye are acquired; features are extracted from each to obtain a first feature vector and a second feature vector; the two vectors are fused according to a preset feature fusion strategy to obtain a target feature vector, which is input into a trained neural network diagnosis model to obtain a diagnosis result. The system thus obtains ocular sign information from different views and can select a suitable fusion strategy for different datasets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
Optionally, in an embodiment of this application, performing feature extraction on the color fundus image and the OCT image respectively to obtain the first feature vector and the second feature vector includes:
extracting features from both the color fundus image and the OCT image with a first feature extraction module to obtain the first feature vector and the second feature vector; or,
extracting features from the color fundus image with a first feature extraction module to obtain the first feature vector, and extracting features from the OCT image with a second feature extraction module to obtain the second feature vector.
Optionally, in an embodiment of this application, the fusing and diagnosis step includes:
concatenating the first feature vector and the second feature vector to obtain the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
Optionally, in an embodiment of this application, the fusing and diagnosis step includes:
obtaining a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, where the first weight and the second weight sum to 1;
taking the sum of a first product of the first feature vector with the first weight and a second product of the second feature vector with the second weight as the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
Optionally, in an embodiment of this application, the fusing and diagnosis step includes:
inputting the first feature vector and the second feature vector separately into the trained neural network diagnosis model to obtain a first classification result and a second classification result;
obtaining a first weight corresponding to the first classification result and a second weight corresponding to the second classification result;
taking the sum of a first product of the first classification result with the first weight and a second product of the second classification result with the second weight as the diagnosis result.
Optionally, in an embodiment of this application, before inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result, the method further includes:
acquiring a color fundus image sample and an optical coherence tomography (OCT) image sample for each of a plurality of eyes, where the color fundus image samples and the OCT image samples carry annotation results;
extracting a first feature vector sample and a second feature vector sample from the color fundus image sample and the OCT image sample respectively;
fusing the first feature vector sample and the second feature vector sample according to the preset feature fusion strategy to obtain a target feature vector sample, inputting the target feature vector sample into a neural network diagnosis model for training to obtain a training result, computing the error between the annotation result and the training result with a loss function, and adjusting the parameters of the neural network diagnosis model until the error is below a preset threshold, thereby generating the trained neural network diagnosis model.
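The "adjust parameters until the error falls below a preset threshold" training loop above can be sketched in plain Python. This is a minimal illustration, not the patent's implementation: `step` and `loss` stand in for a real optimizer and loss function, and the toy data is a 1-D linear fit.

```python
def train_until_threshold(samples, step, loss, params, threshold, max_epochs=100):
    """Adjust model parameters until the loss on labeled samples drops below
    a preset threshold (or max_epochs is exhausted).

    `step(params, sample)` returns updated parameters after one sample;
    `loss(params, samples)` measures the error between annotations and
    predictions. Both are hypothetical stand-ins for real components.
    """
    for _ in range(max_epochs):
        if loss(params, samples) < threshold:   # expected precision reached
            break
        for sample in samples:                  # one pass over the labeled set
            params = step(params, sample)
    return params

# toy 1-D example: fit w so that w*x approximates y, squared-error loss
samples = [(1.0, 2.0), (2.0, 4.0)]                          # (x, y) pairs, y = 2x
sq_loss = lambda w, data: sum((w * x - y) ** 2 for x, y in data) / len(data)
sgd = lambda w, s: w - 0.1 * 2 * (w * s[0] - s[1]) * s[0]   # one SGD step
w = train_until_threshold(samples, sgd, sq_loss, 0.0, 1e-4)
```

A real system would replace `sgd` and `sq_loss` with backpropagation over the fused network and a classification loss, but the stopping condition is the same.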
To achieve the above objectives, an embodiment of the second aspect of this application proposes a bimodal deep learning-based auxiliary diagnosis device for fundus diseases, including:
an acquisition module configured to acquire a color fundus image and an optical coherence tomography (OCT) image of the same eye;
an extraction module configured to perform feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
a processing module configured to fuse the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and to input the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
The device of the embodiments of this application acquires a color fundus image and an OCT image of the same eye, extracts features from each to obtain a first and a second feature vector, fuses them according to a preset feature fusion strategy into a target feature vector, and inputs that vector into a trained neural network diagnosis model to obtain a diagnosis result. The device thus obtains ocular sign information from different views and can select a suitable fusion strategy for different datasets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
In an embodiment of this application, the extraction module is specifically configured to:
extract features from both the color fundus image and the OCT image with a first feature extraction module to obtain the first feature vector and the second feature vector; or,
extract features from the color fundus image with a first feature extraction module to obtain the first feature vector, and extract features from the OCT image with a second feature extraction module to obtain the second feature vector.
In an embodiment of this application, the processing module is specifically configured to:
concatenate the first feature vector and the second feature vector to obtain the target feature vector, and input the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
In an embodiment of this application, the processing module is specifically configured to:
obtain a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, where the first weight and the second weight sum to 1;
take the sum of a first product of the first feature vector with the first weight and a second product of the second feature vector with the second weight as the target feature vector, and input the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
To achieve the above objectives, an embodiment of the third aspect of this application proposes an electronic device, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the instructions to implement the bimodal deep learning-based auxiliary diagnosis method for fundus diseases proposed in the first-aspect embodiment of this application.
To achieve the above objectives, an embodiment of the fourth aspect of this application proposes a computer-readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method proposed in the first-aspect embodiment of this application.
To achieve the above objectives, an embodiment of the fifth aspect of this application proposes a computer program product, including a computer program which, when executed by a processor, implements the method proposed in the first-aspect embodiment of this application.
Additional aspects and advantages of this application will be set forth in part in the following description, and in part will become apparent from the description or be learned by practice of this application.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and/or additional aspects and advantages of this application will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is an example diagram of the bimodal deep learning-based auxiliary diagnosis method for fundus diseases according to an embodiment of this application;
Fig. 2 is a schematic flowchart of a bimodal deep learning-based auxiliary diagnosis method for fundus diseases provided by an embodiment of this application;
Fig. 3 is an example training diagram of the method according to an embodiment of this application;
Fig. 4 is an example processing diagram of the method according to an embodiment of this application;
Fig. 5 is a schematic structural diagram of a bimodal deep learning-based auxiliary diagnosis device for fundus diseases provided by an embodiment of this application.
DETAILED DESCRIPTION
Embodiments of this application are described in detail below, examples of which are illustrated in the accompanying drawings, where identical or similar reference numbers denote identical or similar elements, or elements with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, intended to explain this application, and are not to be construed as limiting it.
The bimodal deep learning-based auxiliary diagnosis method and device for fundus diseases according to embodiments of this application are described below with reference to the drawings.
Fig. 2 is a schematic flowchart of a bimodal deep learning-based auxiliary diagnosis method for fundus diseases provided by an embodiment of this application.
Specifically, most current research on auxiliary diagnosis of fundus diseases uses only one kind of medical image, which does not match real clinical scenarios. Among studies that do use multiple kinds of medical images, most do not investigate whether homogeneous versus heterogeneous feature extraction modules, or different feature fusion strategies, achieve better classification performance.
This application designs a bimodal auxiliary diagnosis system for fundus diseases and proposes three feature fusion strategies for it. Using color fundus images and OCT images, the system can assist in diagnosing common fundus diseases.
Specifically, as shown in Fig. 2, to address the widespread use of only a single medical image in this field, we propose a bimodal fundus disease auxiliary diagnosis system consisting of two parts: (1) different feature extraction modules applied to data of different modalities; and (2) fusion of the feature representations. The complete system structure is shown in Fig. 2, where Model-F is the feature extraction module applied to color fundus images and Model-O is the module applied to OCT images; (a) is the feature concatenation strategy, (b) the feature weighting strategy, and (c) the classification-result weighting strategy.
Define the dataset D = {x_f, x_O | y}, where x_f and x_O are the color fundus image and the OCT image obtained from the same eye, and y is the diagnosis label of the image pair. The bimodal fundus disease auxiliary diagnosis system receives the paired input {x_f, x_O} and outputs a diagnosis result ŷ for the eye. Denoting the system "Our_Model":

ŷ = Our_Model(x_f, x_O)    (Equation 1)
As shown in Fig. 1, the bimodal deep learning-based auxiliary diagnosis method includes the following steps 101 to 103.
Step 101: acquire a color fundus image and an optical coherence tomography (OCT) image of the same eye.
Step 102: perform feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector.
In an embodiment of this application, features are extracted from both the color fundus image and the OCT image with a first feature extraction module to obtain the first and second feature vectors; or, features are extracted from the color fundus image with a first feature extraction module to obtain the first feature vector, and from the OCT image with a second feature extraction module to obtain the second feature vector.
Specifically, the bimodal fundus disease auxiliary diagnosis system consists of two symmetric branches: the feature extraction module for color fundus images is denoted Model-F, and the module for OCT images is denoted Model-O.
Specifically, any mainstream feature extraction module in computer vision, such as VGGNet, GoogLeNet, or ResNet, can be used in the system. For different auxiliary diagnosis tasks, the framework can select different modules; Model-F and Model-O may be the same module (homogeneous) or different modules (heterogeneous). Since in clinical practice a patient usually has more OCT images than color fundus images, and the two image types differ in resolution, heterogeneous feature extraction modules may achieve better results than homogeneous ones.
Let F_f be the feature vector extracted from the color fundus image by Model-F (the upper rectangular block in Fig. 2), and F_O the feature vector generated by Model-O from the OCT image (the lower rectangular block in Fig. 2). To enable the subsequent fusion operations, F_f and F_O must be unified to the same size.
Step 103: fuse the first and second feature vectors according to a preset feature fusion strategy to obtain a target feature vector, and input the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
Specifically, most current bimodal studies combining color fundus images and OCT images use only feature concatenation for fusion. Given the differences in number and resolution between the two image types, their contributions to the final classification are not necessarily equal, so that strategy does not necessarily suit every fundus disease auxiliary diagnosis task.
This application therefore proposes three feature fusion strategies. For different auxiliary diagnosis tasks, the system can choose among them, and can also tune the hyperparameters the strategies define, to achieve the best classification performance.
In a first example, the first and second feature vectors are concatenated to obtain the target feature vector, which is input into the trained neural network diagnosis model to obtain the diagnosis result.
Specifically, as shown in Fig. 2(a), the feature concatenation strategy concatenates F_f and F_O into a vector F_con, which is passed through a fully connected layer to obtain the scores of the final output ŷ:

ŷ = W_con · F_con

where W_con are the parameters of the fully connected layer. The classification expressed in Equation 1 is realized by selecting the class with the highest score in ŷ.
Concatenating the two feature vectors directly raises the dimensionality of the semantic space from k to 2k. For an image classification task, a higher-dimensional semantic space carries more semantic information.
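The concatenation strategy can be sketched in plain Python. This is an illustrative toy (hand-picked dimensions and weights, not trained values): in a real system `F_f` and `F_O` would come from Model-F and Model-O, and `W_con` from the trained fully connected layer.

```python
def concat_fusion(F_f, F_O, W_con):
    """Feature-level concatenation: [F_f; F_O] -> fully connected layer -> argmax."""
    F_con = list(F_f) + list(F_O)                       # k + k = 2k dimensions
    scores = [sum(w * x for w, x in zip(row, F_con))    # one score per class
              for row in W_con]
    return scores.index(max(scores))                    # class with the highest score

# toy example: k = 2 features per branch, 3 disease classes
F_f, F_O = [1.0, 0.0], [0.0, 2.0]
W_con = [[1.0, 0.0, 0.0, 0.0],   # class 0 reads only the fundus features
         [0.0, 0.0, 0.0, 1.0],   # class 1 reads only the OCT features
         [0.5, 0.5, 0.5, 0.5]]   # class 2 reads both
pred = concat_fusion(F_f, F_O, W_con)   # scores: [1.0, 2.0, 1.5] -> class 1
```

The 2k-dimensional `F_con` here makes the doubling of the semantic space explicit.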
In a second example, a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector are obtained, where the two weights sum to 1; the sum of the product of the first feature vector with the first weight and the product of the second feature vector with the second weight is taken as the target feature vector, which is input into the trained neural network diagnosis model to obtain the diagnosis result.
Specifically, as shown in Fig. 2(b), the feature weighting strategy forms the weighted sum F_add = a·F_f + (1 − a)·F_O, where a is a hyperparameter with 0 < a < 1. A fully connected layer then produces the scores of the final output ŷ:

ŷ = W_add · F_add

where W_add are the parameters of the fully connected layer. The classification expressed in Equation 1 is realized by selecting the class with the highest score in ŷ.
The two weighted feature vectors each carry feature information about a different image. Through the weighting principle, each modality's features are assigned a different importance; the weight can be read as how much that information matters to the image classification task.
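The feature weighting strategy F_add = a·F_f + (1 − a)·F_O can likewise be sketched in a few lines (again with toy, hand-picked weights rather than trained ones):

```python
def weighted_feature_fusion(F_f, F_O, a, W_add):
    """F_add = a*F_f + (1-a)*F_O, then a fully connected layer and argmax."""
    assert 0.0 < a < 1.0, "a is a hyperparameter strictly between 0 and 1"
    F_add = [a * f + (1.0 - a) * o for f, o in zip(F_f, F_O)]
    scores = [sum(w * x for w, x in zip(row, F_add)) for row in W_add]
    return scores.index(max(scores))

# a = 0.7 gives the fundus branch more influence than the OCT branch
F_f, F_O = [1.0, 0.0], [0.0, 1.0]
W_add = [[1.0, 0.0],   # class 0 score = F_add[0]
         [0.0, 1.0]]   # class 1 score = F_add[1]
pred = weighted_feature_fusion(F_f, F_O, 0.7, W_add)   # F_add = [0.7, 0.3] -> class 0
```

Note that, unlike concatenation, the fused vector keeps the original dimensionality k; the hyperparameter a is what encodes the relative importance of the two modalities.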
In a third example, the first and second feature vectors are input separately into the trained neural network diagnosis model to obtain a first classification result and a second classification result; a first weight corresponding to the first result and a second weight corresponding to the second result are obtained; and the sum of the product of the first result with the first weight and the product of the second result with the second weight is taken as the diagnosis result.
Specifically, as shown in Fig. 2(c), the classification-result weighting strategy first feeds F_f and F_O into separate fully connected layers to obtain

S_f = W_f · F_f,    S_O = W_O · F_O

where W_f and W_O are the fully connected layer parameters applied to the color fundus image and the OCT image respectively. The two score vectors are then combined by weighted sum into the scores of the final output ŷ:

ŷ = a·S_f + (1 − a)·S_O

where a is a hyperparameter with 0 < a < 1. The classification expressed in Equation 1 is realized by selecting the class with the highest score in ŷ.
The classification-result weighting strategy is a weighted voting method: the module applied to color fundus images and the module applied to OCT images are effectively independent of each other, each giving its own classification prediction, and the final result is obtained by weighted voting.
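The weighted voting described above can be sketched as follows. The example is illustrative only: the two branches are given deliberately disagreeing toy weight matrices so the effect of the vote weight a is visible.

```python
def weighted_vote(F_f, F_O, W_f, W_O, a):
    """Each branch classifies independently; class scores are combined by weighted vote."""
    S_f = [sum(w * x for w, x in zip(row, F_f)) for row in W_f]   # fundus branch scores
    S_O = [sum(w * x for w, x in zip(row, F_O)) for row in W_O]   # OCT branch scores
    S = [a * sf + (1.0 - a) * so for sf, so in zip(S_f, S_O)]     # weighted vote
    return S.index(max(S))

# the two branches disagree (fundus prefers class 0, OCT prefers class 1);
# a = 0.6 lets the fundus branch win the vote
W_f = [[2.0, 0.0], [0.0, 1.0]]
W_O = [[1.0, 0.0], [0.0, 2.0]]
pred = weighted_vote([1.0, 0.0], [0.0, 1.0], W_f, W_O, 0.6)   # -> class 0
```

Lowering a below 0.5 would instead hand the decision to the OCT branch, which is exactly the decision-level trade-off this strategy exposes.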
In an embodiment of this application, a color fundus image sample and an optical coherence tomography (OCT) image sample are acquired for each of a plurality of eyes, the samples carrying annotation results; a first feature vector sample and a second feature vector sample are extracted from them respectively; the two are fused according to the preset feature fusion strategy into a target feature vector sample, which is input into a neural network diagnosis model for training to obtain a training result; the error between the annotation result and the training result is computed with a loss function, and the model's parameters are adjusted until the error is below a preset threshold, producing the trained neural network diagnosis model.
Specifically, the training flow of the bimodal fundus disease auxiliary diagnosis system is shown in Fig. 3, with the following steps: 1) select the feature extraction modules Model-F and Model-O for color fundus images and OCT images according to the dataset and task characteristics; 2) select a feature fusion strategy (feature concatenation, feature weighting, or classification-result weighting) according to the dataset and task characteristics; 3) modify the corresponding functional modules of the system according to the selected modules and strategy; 4) train on the available labeled dataset; 5) if the system reaches the expected accuracy, training ends; otherwise, revise the feature extraction modules or the fusion strategy according to the dataset and task characteristics and return to step 3).
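Steps 1) to 5) above amount to a search over extractor/strategy configurations with an accuracy stopping criterion. A minimal sketch, under the assumption that a `train_and_evaluate` routine (hypothetical here) trains a given configuration on the labeled dataset and returns its validation accuracy:

```python
from itertools import product

def select_configuration(extractors, strategies, train_and_evaluate, target_accuracy):
    """Try (Model-F, Model-O) pairs and fusion strategies until one reaches
    the target accuracy; otherwise fall back to the best configuration seen."""
    best = None  # (accuracy, extractor_pair, strategy)
    for pair, strategy in product(extractors, strategies):
        acc = train_and_evaluate(pair, strategy)        # steps 3) and 4)
        if best is None or acc > best[0]:
            best = (acc, pair, strategy)
        if acc >= target_accuracy:                      # step 5): expected accuracy reached
            return pair, strategy, acc
    return best[1], best[2], best[0]

# toy stand-in: a lookup table instead of real training
table = {(("resnet", "vgg"), "concat"): 0.80,
         (("resnet", "vgg"), "feature-weight"): 0.92}
res = select_configuration([("resnet", "vgg")],
                           ["concat", "feature-weight"],
                           lambda p, s: table[(p, s)], 0.90)
```

In practice the revision in step 5) is guided by the dataset and task characteristics rather than exhaustive enumeration; the loop above only makes the stopping logic concrete.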
Specifically, once trained, the bimodal fundus disease auxiliary diagnosis system can be used for auxiliary diagnosis of fundus diseases. Its usage flow is shown in Fig. 4, with the following steps: 1) acquire the patient's color fundus image and OCT image and upload them to the system; the two images must belong to the same eye of the same patient; 2) the system performs image preprocessing such as resizing and image enhancement; 3) the system feeds the color fundus image into the Model-F feature extraction module to obtain the color fundus image feature vector, and the OCT image into the Model-O module to obtain the OCT image feature vector; 4) the system fuses the two feature vectors according to the selected feature fusion strategy and outputs the final screening result, which can be used to assist diagnosis.
Thus, this application considers bimodal data, color fundus images and OCT images, both widely used in ophthalmic practice, so the system obtains ocular sign information from different views, better matches the clinical diagnosis workflow of many fundus diseases, and achieves better classification performance. Different feature extraction modules can be selected, and the modules applied to color fundus images and OCT images may be homogeneous or heterogeneous; for different datasets and task scenarios the choice of modules can differ, which helps explore system structures suited to different fundus disease auxiliary diagnosis tasks. This application also proposes three feature fusion strategies for the system, so that the feature vectors extracted from color fundus images and OCT images can better support the judgment of screening results; the optimal fusion strategy differs across datasets and task scenarios.
In the method of the embodiments of this application, a color fundus image and an OCT image of the same eye are acquired; features are extracted from each to obtain a first and a second feature vector; the two are fused according to a preset feature fusion strategy into a target feature vector, which is input into a trained neural network diagnosis model to obtain a diagnosis result. The system thereby obtains ocular sign information from different views and selects a suitable fusion strategy for different datasets and task scenarios, improving the accuracy of auxiliary diagnosis of fundus diseases.
To implement the above embodiments, this application further proposes a bimodal deep learning-based auxiliary diagnosis device for fundus diseases.
Fig. 5 is a schematic structural diagram of such a device provided by an embodiment of this application.
As shown in Fig. 5, the device includes an acquisition module 510, an extraction module 520, and a processing module 530.
The acquisition module 510 is configured to acquire a color fundus image and an optical coherence tomography (OCT) image of the same eye.
The extraction module 520 is configured to perform feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector.
The processing module 530 is configured to fuse the first and second feature vectors according to a preset feature fusion strategy to obtain a target feature vector, and to input the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
In an embodiment of this application, the extraction module 520 is specifically configured to: extract features from both images with a first feature extraction module to obtain the first and second feature vectors; or, extract features from the color fundus image with a first feature extraction module to obtain the first feature vector, and from the OCT image with a second feature extraction module to obtain the second feature vector.
In an embodiment of this application, the processing module 530 is specifically configured to: concatenate the first and second feature vectors to obtain the target feature vector, and input it into the trained neural network diagnosis model to obtain the diagnosis result.
In an embodiment of this application, the processing module 530 is specifically configured to: obtain a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, the two weights summing to 1; take the sum of the product of the first feature vector with the first weight and the product of the second feature vector with the second weight as the target feature vector, and input it into the trained neural network diagnosis model to obtain the diagnosis result.
The device of the embodiments of this application acquires a color fundus image and an OCT image of the same eye, extracts features from each to obtain a first and a second feature vector, fuses them according to a preset feature fusion strategy into a target feature vector, and inputs that vector into a trained neural network diagnosis model to obtain a diagnosis result, thereby obtaining ocular sign information from different views and selecting a suitable fusion strategy for different datasets and task scenarios to improve the accuracy of auxiliary diagnosis of fundus diseases.
It should be noted that the foregoing explanation of the method embodiments also applies to the device of this embodiment and is not repeated here.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of this application. Schematic uses of these terms do not necessarily refer to the same embodiment or example, and the described features, structures, materials, or characteristics may be combined in a suitable manner in any one or more embodiments or examples. Moreover, provided they are not mutually contradictory, those skilled in the art may combine different embodiments or examples described in this specification, as well as features of different embodiments or examples.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or the number of indicated technical features; a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of this application, "a plurality of" means at least two, such as two or three, unless specifically defined otherwise.
Any process or method description in the flowcharts, or otherwise described herein, may be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing custom logical functions or steps of the process, and the scope of the preferred embodiments of this application includes additional implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by or in connection with such a system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (electronic device) with one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in a computer memory.
It should be understood that parts of this application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, a hardware implementation, as in another embodiment, may use any of the following techniques known in the art, or a combination thereof: a discrete logic circuit with logic gates for implementing logical functions on data signals, an application-specific integrated circuit with suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps carried by the methods of the above embodiments can be completed by a program instructing relevant hardware, the program being stored in a computer-readable storage medium; when executed, the program performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of this application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module; if implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of this application have been shown and described above, it will be understood that the above embodiments are exemplary and are not to be construed as limiting this application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of this application.

Claims (13)

  1. A bimodal deep learning-based auxiliary diagnosis method for fundus diseases, characterized by comprising the following steps:
    acquiring a color fundus image and an optical coherence tomography (OCT) image of the same eye;
    performing feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
    fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
  2. The method of claim 1, characterized in that performing feature extraction on the color fundus image and the OCT image respectively to obtain the first feature vector and the second feature vector comprises:
    extracting features from both the color fundus image and the OCT image with a first feature extraction module to obtain the first feature vector and the second feature vector; or,
    extracting features from the color fundus image with a first feature extraction module to obtain the first feature vector, and extracting features from the OCT image with a second feature extraction module to obtain the second feature vector.
  3. The method of claim 1, characterized in that the fusing according to the preset feature fusion strategy to obtain the target feature vector, and the inputting of the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result, comprise:
    concatenating the first feature vector and the second feature vector to obtain the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  4. The method of claim 1, characterized in that the fusing according to the preset feature fusion strategy to obtain the target feature vector, and the inputting of the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result, comprise:
    obtaining a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, wherein the first weight and the second weight sum to 1;
    taking the sum of a first product of the first feature vector with the first weight and a second product of the second feature vector with the second weight as the target feature vector, and inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  5. The method of claim 1, characterized in that the fusing according to the preset feature fusion strategy to obtain the target feature vector, and the inputting of the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result, comprise:
    inputting the first feature vector and the second feature vector separately into the trained neural network diagnosis model to obtain a first classification result and a second classification result;
    obtaining a first weight corresponding to the first classification result and a second weight corresponding to the second classification result;
    taking the sum of a first product of the first classification result with the first weight and a second product of the second classification result with the second weight as the diagnosis result.
  6. The method of any one of claims 1 to 5, characterized by further comprising, before inputting the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result:
    acquiring a color fundus image sample and an optical coherence tomography (OCT) image sample for each of a plurality of eyes, wherein the color fundus image samples and the OCT image samples carry annotation results;
    extracting a first feature vector sample and a second feature vector sample from the color fundus image sample and the OCT image sample respectively;
    fusing the first feature vector sample and the second feature vector sample according to the preset feature fusion strategy to obtain a target feature vector sample, inputting the target feature vector sample into a neural network diagnosis model for training to obtain a training result, computing the error between the annotation result and the training result with a loss function, and adjusting the parameters of the neural network diagnosis model until the error is below a preset threshold, thereby generating the trained neural network diagnosis model.
  7. A bimodal deep learning-based auxiliary diagnosis device for fundus diseases, characterized by comprising:
    an acquisition module configured to acquire a color fundus image and an optical coherence tomography (OCT) image of the same eye;
    an extraction module configured to perform feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
    a processing module configured to fuse the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and to input the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
  8. The device of claim 7, characterized in that the extraction module is specifically configured to:
    extract features from both the color fundus image and the OCT image with a first feature extraction module to obtain the first feature vector and the second feature vector; or,
    extract features from the color fundus image with a first feature extraction module to obtain the first feature vector, and extract features from the OCT image with a second feature extraction module to obtain the second feature vector.
  9. The device of claim 7, characterized in that the processing module is specifically configured to:
    concatenate the first feature vector and the second feature vector to obtain the target feature vector, and input the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  10. The device of claim 7, characterized in that the processing module is specifically configured to:
    obtain a first weight corresponding to the first feature vector and a second weight corresponding to the second feature vector, wherein the first weight and the second weight sum to 1;
    take the sum of a first product of the first feature vector with the first weight and a second product of the second feature vector with the second weight as the target feature vector, and input the target feature vector into the trained neural network diagnosis model to obtain the diagnosis result.
  11. An electronic device, characterized by comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following steps:
    acquiring a color fundus image and an optical coherence tomography (OCT) image of the same eye;
    performing feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
    fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
  12. A computer-readable storage medium, characterized in that, when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the following steps:
    acquiring a color fundus image and an optical coherence tomography (OCT) image of the same eye;
    performing feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
    fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
  13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the following steps:
    acquiring a color fundus image and an optical coherence tomography (OCT) image of the same eye;
    performing feature extraction on the color fundus image and the OCT image respectively to obtain a first feature vector and a second feature vector;
    fusing the first feature vector and the second feature vector according to a preset feature fusion strategy to obtain a target feature vector, and inputting the target feature vector into a trained neural network diagnosis model to obtain a diagnosis result.
PCT/CN2021/137145 2021-02-04 2021-12-10 基于双模态深度学习的眼底疾病辅助诊断方法和装置 WO2022166399A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110156174.7A CN112884729B (zh) 2021-02-04 2021-02-04 基于双模态深度学习的眼底疾病辅助诊断方法和装置
CN202110156174.7 2021-02-04

Publications (1)

Publication Number Publication Date
WO2022166399A1 true WO2022166399A1 (zh) 2022-08-11

Family

ID=76057186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137145 WO2022166399A1 (zh) 2021-02-04 2021-12-10 基于双模态深度学习的眼底疾病辅助诊断方法和装置

Country Status (2)

Country Link
CN (1) CN112884729B (zh)
WO (1) WO2022166399A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721760A (zh) * 2023-06-12 2023-09-08 东北林业大学 融合生物标志物的多任务糖尿病性视网膜病变检测算法

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884729B (zh) * 2021-02-04 2023-08-01 北京邮电大学 基于双模态深度学习的眼底疾病辅助诊断方法和装置
CN114494734A (zh) * 2022-01-21 2022-05-13 平安科技(深圳)有限公司 基于眼底图像的病变检测方法、装置、设备及存储介质
CN116433644B (zh) * 2023-04-22 2024-03-08 深圳市江机实业有限公司 一种基于识别模型的眼部图像动态诊断方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110103658A1 (en) * 2009-10-29 2011-05-05 John Davis Enhanced imaging for optical coherence tomography
CN111667468A (zh) * 2020-05-28 2020-09-15 平安科技(深圳)有限公司 基于神经网络的oct图像病灶检测方法、装置及介质
CN111696100A (zh) * 2020-06-17 2020-09-22 上海鹰瞳医疗科技有限公司 基于眼底影像确定吸烟程度的方法及设备
CN112884729A (zh) * 2021-02-04 2021-06-01 北京邮电大学 基于双模态深度学习的眼底疾病辅助诊断方法和装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107045720B (zh) * 2017-05-04 2018-11-30 深圳硅基仿生科技有限公司 基于人工神经网络的识别眼底图像病变的处理系统
EP3783533A1 (en) * 2018-04-17 2021-02-24 BGI Shenzhen Artificial intelligence-based ophthalmic disease diagnostic modeling method, apparatus, and system
CN109998599A (zh) * 2019-03-07 2019-07-12 华中科技大学 一种基于ai技术的光/声双模成像眼底疾病诊断系统
CN111428072A (zh) * 2020-03-31 2020-07-17 南方科技大学 眼科多模态影像的检索方法、装置、服务器及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110103658A1 (en) * 2009-10-29 2011-05-05 John Davis Enhanced imaging for optical coherence tomography
CN111667468A (zh) * 2020-05-28 2020-09-15 平安科技(深圳)有限公司 基于神经网络的oct图像病灶检测方法、装置及介质
CN111696100A (zh) * 2020-06-17 2020-09-22 上海鹰瞳医疗科技有限公司 基于眼底影像确定吸烟程度的方法及设备
CN112884729A (zh) * 2021-02-04 2021-06-01 北京邮电大学 基于双模态深度学习的眼底疾病辅助诊断方法和装置

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"arXiv.org", vol. 18, 10 October 2019, CORNELL UNIVERSITY LIBRARY,, 201 Olin Library Cornell University Ithaca, NY 14853, article WANG WEISEN; XU ZHIYAN; YU WEIHONG; ZHAO JIANCHUN; YANG JINGYUAN; HE FENG; YANG ZHIKUN; CHEN DI; DING DAYONG; CHEN YOUXIN; LI XIRO: "Two-Stream CNN with Loose Pair Training for Multi-modal AMD Categorization", pages: 156 - 164, XP047522509, DOI: 10.1007/978-3-030-32239-7_18 *
XU, ZHIYAN: "Artificial Intelligence Diagnosis System for Age-related Macular Degeneration and Polypoidal Choroidal Vasculopathy", CHINESE DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, PUBLIC MEDICINE & SCIENCE, 1 June 2019 (2019-06-01), pages 1 - 58, XP055956666, [retrieved on 20220831] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721760A (zh) * 2023-06-12 2023-09-08 东北林业大学 融合生物标志物的多任务糖尿病性视网膜病变检测算法
CN116721760B (zh) * 2023-06-12 2024-04-26 东北林业大学 融合生物标志物的多任务糖尿病性视网膜病变检测算法

Also Published As

Publication number Publication date
CN112884729A (zh) 2021-06-01
CN112884729B (zh) 2023-08-01

Similar Documents

Publication Publication Date Title
WO2022166399A1 (zh) 基于双模态深度学习的眼底疾病辅助诊断方法和装置
CN107423571B (zh) 基于眼底图像的糖尿病视网膜病变识别系统
CN108771530B (zh) 基于深度神经网络的眼底病变筛查系统
WO2022188489A1 (zh) 多模态多病种长尾分布眼科疾病分类模型训练方法和装置
WO2019200535A1 (zh) 基于人工智能的眼科疾病诊断建模方法、装置及系统
Borkovkina et al. Real-time retinal layer segmentation of OCT volumes with GPU accelerated inferencing using a compressed, low-latency neural network
KR20200005411A (ko) 심혈관 질병 진단 보조 방법 및 장치
CN111428072A (zh) 眼科多模态影像的检索方法、装置、服务器及存储介质
Zhu et al. Digital image processing for ophthalmology: Detection of the optic nerve head
CN109464120A (zh) 一种糖尿病视网膜病变筛查方法、装置及存储介质
CN112233087A (zh) 一种基于人工智能的眼科超声疾病诊断方法和系统
CN112869697A (zh) 同时识别糖尿病视网膜病变的分期和病变特征的判断方法
CN113887662A (zh) 一种基于残差网络的图像分类方法、装置、设备及介质
Hwang et al. Smartphone-based diabetic macula edema screening with an offline artificial intelligence
Phridviraj et al. A bi-directional Long Short-Term Memory-based Diabetic Retinopathy detection model using retinal fundus images
Giancardo Automated fundus images analysis techniques to screen retinal diseases in diabetic patients
Zhang et al. Artificial intelligence technology for myopia challenges: a review
CN116092667A (zh) 基于多模态影像的疾病检测方法、系统、装置及存储介质
Datta et al. An Integrated Fundus Image Segmentation Algorithm for Multiple Eye Ailments
Li et al. Class-aware attention network for infectious keratitis diagnosis using corneal photographs
Mu et al. Improved model of eye disease recognition based on VGG model
Patil et al. Screening and detection of diabetic retinopathy by using engineering concepts
Sheikh Diabetic Reinopathy Classification Using Deep Learning
Ali et al. Classifying Three Stages of Cataract Disease using CNN
Chang et al. Selective Plane Illumination Microscopy and Computing Reveal Differential Obliteration of Retinal Vascular Plexuses

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21924392

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21924392

Country of ref document: EP

Kind code of ref document: A1