WO2021114818A1 - Method, system, and device for OCT image quality evaluation based on Fourier transform
- Publication number
- WO2021114818A1 (application PCT/CN2020/117943)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- oct image
- network model
- sample set
- modal
- fundus oct
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10101—Optical tomography; Optical coherence tomography [OCT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20056—Discrete and fast Fourier transform, [DFT, FFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30041—Eye; Retina; Ophthalmic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30168—Image quality inspection
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Eye Examination Apparatus (AREA)
- Image Analysis (AREA)
Abstract
A method, system, and device for OCT image quality assessment based on Fourier transform. The method comprises: Fourier transforming the fundus OCT image samples in a fundus OCT image sample set having known image labels to create a corresponding spectral image sample set (S110); creating a multimodal classification network model and training the multimodal classification network model with the fundus OCT image sample set and the spectral image sample set (S120); and, when the training of the multimodal classification network model is completed, inputting a fundus OCT image to be classified and the spectral image to be classified corresponding to said fundus OCT image into the multimodal classification network model, and assessing the quality of said fundus OCT image with the multimodal classification network model (S130). The solution also relates to blockchain technology: the fundus OCT image samples are stored in a blockchain. The technical solution not only automates the quality assessment of fundus OCT images, but also significantly increases assessment precision.
Description
This application claims priority to Chinese patent application No. 202010618087.4, filed with the Chinese Patent Office on June 30, 2020 and entitled "Fourier Transform-Based OCT Image Quality Evaluation Method, System and Device", the entire contents of which are incorporated herein by reference.
This application relates to the field of image recognition technology, and in particular to a Fourier transform-based OCT image quality evaluation method, system, device, and storage medium.
Optical coherence tomography (OCT) is an imaging technique that can be used to diagnose fundus diseases. Because it accurately reflects disease in the patient's fundus and imaging is quick and convenient, it is widely used in artificial intelligence (AI) screening and assisted diagnosis. The inventor realized that current fundus OCT quality evaluation methods rely mainly on the Quality Index (QI) and the Signal Strength Indicator (SSI) to judge whether fundus OCT image quality is acceptable. However, such methods reflect only the overall quality of an OCT image sequence; they cannot determine whether a single fundus OCT image is usable, and they are difficult to apply in the AI field.
However, the inventor found during research that traditional AI image quality evaluation methods usually feed the image into a neural network for classification. This approach considers only the spatial-domain information of the image and ignores its frequency-domain information. Because OCT images are relatively simple and their image-domain information is limited, traditional AI image quality evaluation methods struggle to produce good results, and the same holds when they are applied to fundus OCT quality evaluation.
In view of the above problems, an efficient and high-quality method for evaluating the quality of fundus OCT images is urgently needed.
Summary of the invention
This application provides a Fourier transform-based OCT image quality evaluation method, system, electronic device, and computer storage medium, whose main purpose is to solve the low efficiency and poor quality of existing fundus OCT image evaluation methods.
To achieve the above objective, this application provides a Fourier transform-based OCT image quality evaluation method, which includes the following steps:
performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set;
creating a multi-modal classification network model, and training the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set;
after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the corresponding spectrum image to be classified into the multi-modal classification network model, and evaluating the quality of the fundus OCT image to be classified with the multi-modal classification network model.
In addition, this application also provides a Fourier transform-based OCT image quality evaluation system, the system including:
a sample set establishment unit, configured to perform a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set;
a model training unit, configured to create a multi-modal classification network model and train the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set;
a model application unit, configured to, after the multi-modal classification network model has been trained, input a fundus OCT image to be classified and the corresponding spectrum image to be classified into the multi-modal classification network model, and evaluate the quality of the fundus OCT image to be classified with the multi-modal classification network model.
In addition, to achieve the above objective, this application also provides an electronic device including a memory, a processor, and a Fourier transform-based OCT image quality evaluation program stored in the memory and executable on the processor; when executed by the processor, the program implements the following steps:
performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set;
creating a multi-modal classification network model, and training the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set;
after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the corresponding spectrum image to be classified into the multi-modal classification network model, and evaluating the quality of the fundus OCT image to be classified with the multi-modal classification network model.
In addition, to achieve the above objective, this application also provides a computer-readable storage medium storing a Fourier transform-based OCT image quality evaluation program; when executed by a processor, the program implements the following steps:
performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set;
creating a multi-modal classification network model, and training the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set;
after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the corresponding spectrum image to be classified into the multi-modal classification network model, and evaluating the quality of the fundus OCT image to be classified with the multi-modal classification network model.
This application uses the Fourier transform to obtain the spectrum image sample set of the fundus OCT image sample set, and trains the multi-modal classification network model on both sample sets. By applying AI image recognition technology to evaluate OCT image quality automatically, it both improves the efficiency of OCT image quality evaluation and significantly improves the model's classification performance, thereby improving the accuracy of OCT image quality evaluation.
FIG. 1 is a flowchart of a preferred embodiment of the Fourier transform-based OCT image quality evaluation method according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a preferred embodiment of the electronic device according to an embodiment of this application;
FIG. 3 is a schematic diagram of the internal logic of the Fourier transform-based OCT image quality evaluation program according to an embodiment of this application.
The realization of the purpose, the functional characteristics, and the advantages of this application will be further described with reference to the embodiments and the accompanying drawings.
In the following description, many specific details are set forth for illustrative purposes, to provide a thorough understanding of one or more embodiments. It is evident, however, that these embodiments can also be implemented without these specific details.
The technical solution of this application can be applied to the fields of artificial intelligence, blockchain, and/or big data. Optionally, the data involved in this application, such as image samples, can be stored in a database or in a blockchain.
Specific embodiments of this application are described in detail below with reference to the accompanying drawings.
Example 1
To illustrate the Fourier transform-based OCT image quality evaluation method provided by this application, FIG. 1 shows the flow of the method.
As shown in FIG. 1, the Fourier transform-based OCT image quality evaluation method provided by this application includes:
S110: Perform a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set.
It should be noted that, to better evaluate the quality of fundus OCT images, at least three category labels can be defined for fundus OCT image data, for example good, poor, and usable. Here, good denotes an image of good quality in which the retina and choroid are clear and the doctor's diagnosis is not affected; usable denotes a barely usable image in which the retina or choroid is somewhat blurred or partly missing but the doctor's diagnosis is not affected; poor denotes an image of poor quality in which the retina and choroid are blurred or largely missing, which affects the doctor's diagnosis.
It should be noted that the label of a fundus OCT image sample with a known image label is generally assigned after evaluation by medical experts. The probability value of each of the three labels is evaluated, and the label with the largest probability value is generally taken as the category of the sample. Specifically, for example, multiple medical experts (generally no fewer than twenty) can be selected in advance to score the fundus OCT image. A score of 0 to 50 denotes an image of poor quality whose retina and choroid are blurred or largely missing and which affects the doctor's diagnosis, with lower scores indicating worse quality; 50 to 80 denotes an image whose retina and choroid are somewhat blurred or partly missing but which does not affect the doctor's diagnosis; 80 to 100 denotes an image of good quality with a clear retina and choroid. The label of the band containing the average score is taken as the known label of the sample, and the proportion of doctors whose scores fall in each band is the probability value of that band (label). It should be noted that the innovation of this application lies in the construction of the subsequent model, which uses artificial intelligence to simulate the doctor's process of assessing the image, thereby automating the quality evaluation of fundus OCT images and removing the need for the doctor's assessment.
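The expert-scoring rule described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names and the example scores are hypothetical, and the treatment of scores that fall exactly on a band boundary (50 or 80) is an assumption, since the text does not say which band they belong to.

```python
from collections import Counter
from statistics import mean

LABELS = ("good", "usable", "poor")


def band(score: float) -> str:
    """Map a 0-100 expert score to a label band.

    Assumption: boundary scores go to the higher band (50 -> usable, 80 -> good).
    """
    if score < 50:
        return "poor"
    if score < 80:
        return "usable"
    return "good"


def label_from_scores(scores):
    """Return (known label, per-label probability) for one image."""
    counts = Counter(band(s) for s in scores)
    # Probability of a label = share of experts whose score falls in its band.
    probs = {name: counts.get(name, 0) / len(scores) for name in LABELS}
    # Known label = band that contains the average score.
    return band(mean(scores)), probs


# Hypothetical scores from five experts (the text suggests at least twenty).
label, probs = label_from_scores([85, 90, 88, 82, 70])
print(label, probs)
```

Note that the band of the average score and the band with the largest share of experts can differ for widely spread scores; the text mentions both criteria without saying how such a conflict is resolved.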
It should be further noted that, because doctors differ in their ability to judge images, doctors with more senior professional titles can be preferred for the initial sample labelling, to improve the accuracy of the final model. In addition, the types of labels and the evaluation criteria can be adjusted to actual conditions. This application mainly uses the samples to train a new model that simulates the doctor's ability to assess OCT image quality, so its focus is the learning and training of the subsequent model; the initial label-setting process is not the focus of this application and is not described further here.
The fundus OCT image sample set with known image labels in step S110 is a sample set composed of fundus OCT image samples whose labels were determined by evaluation under preset rules. To better reflect the label proportions of fundus OCT images found in practice, the sample set generally contains no fewer than 10,000 samples, and the good, poor, and usable OCT image samples are in the ratio 3:4:3, a ratio determined from statistics on actual data.
Specifically, performing the Fourier transform on the fundus OCT image sample set with known image labels includes the following steps:
Step 1: Convert each fundus OCT image to grayscale in turn, to improve data-capture accuracy and conversion efficiency in the subsequent Fourier transform.
Specifically, this scheme uses the component method for grayscale conversion. In image processing, true color is generally represented by three RGB components (R: red, G: green, B: blue), that is, the three primary colors red, green, and blue. Each of the R, G, and B components ranges from 0 to 255; for example, a red pixel on a computer screen has component values 255, 0, 0. A pixel is the smallest image unit, and a picture is composed of many pixels. Because the color of a pixel is expressed by three RGB values, one pixel matrix corresponds to three color matrices: the R matrix, the G matrix, and the B matrix. For an 800*800 image, for example, each of the three matrices is also 800*800. The entries at the same position in the three matrices give the component values of one pixel: if the entries in the first row and first column are R: 240, G: 223, B: 204, the color of that pixel is (240, 223, 204).
The grayscale conversion makes every pixel in the pixel matrix satisfy R = G = B (that is, the values of the red, green, and blue variables are equal); this common value is the gray value. Specifically, the following assignment can be used: R after conversion = R before*0.3 + G before*0.59 + B before*0.11; G after conversion = R before*0.3 + G before*0.59 + B before*0.11; B after conversion = R before*0.3 + G before*0.59 + B before*0.11.
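As a minimal sketch of the weighted grayscale assignment above, assuming the image is held as an H x W x 3 `uint8` NumPy array: the function name `to_gray` and the rounding to the nearest integer are assumptions, since the text does not say how fractional gray values are handled.

```python
import numpy as np


def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 uint8 RGB image so that R = G = B at every pixel."""
    weights = np.array([0.3, 0.59, 0.11])  # R, G, B weights from the text
    # Weighted sum over the last (channel) axis gives one gray value per pixel.
    gray = np.rint(rgb.astype(np.float64) @ weights).astype(np.uint8)
    # Write the same gray value back into all three channels.
    return np.stack([gray, gray, gray], axis=-1)


# The example pixel (240, 223, 204) from the text:
# 240*0.3 + 223*0.59 + 204*0.11 = 226.01, which rounds to 226.
pixel = np.array([[[240, 223, 204]]], dtype=np.uint8)
print(to_gray(pixel)[0, 0])
```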
In addition, to improve the grayscale result, the grayscale image can also be binarized. The process is: set a threshold, for example 127; compute the average gray value avg of all pixels in the pixel matrix; then compare the average with the threshold. If the average is greater than the threshold, the pixel is set to white; if the average is less than the threshold, the pixel is set to black.
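The binarization step can be illustrated with a short sketch. The passage phrases the comparison in terms of an average gray value; the sketch below instead assumes the common per-pixel reading, in which each pixel's own gray value is compared with the threshold. That reading, the function name `binarize`, and mapping values equal to the threshold to black are all assumptions.

```python
import numpy as np


def binarize(gray: np.ndarray, threshold: int = 127) -> np.ndarray:
    """Set pixels above the threshold to white (255) and the rest to black (0)."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)


row = np.array([[0, 127, 128, 255]], dtype=np.uint8)
print(binarize(row))
```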
Step 2: Apply a fast Fourier transform to each grayscale fundus OCT image to generate the corresponding frequency-domain sample.
Step 3: Establish the spectrum image sample set from the frequency-domain samples.
It should be noted here that the Fourier transform of the fundus OCT image sample set (the fast Fourier transform is generally chosen for efficiency) is a common technique in image processing. The innovation of this application lies in its use of the frequency-domain information of OCT images, so the details of the Fourier transform are not described further here.
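Since the text leaves the transform details open, the following is only one plausible way to turn a grayscale image into a spectrum image with NumPy's FFT routines. Centering the zero frequency with `fftshift`, log-scaling the magnitude, and normalizing to 0-255 are conventional choices assumed here, not steps stated in the application; the function name is hypothetical.

```python
import numpy as np


def spectrum_image(gray: np.ndarray) -> np.ndarray:
    """Return a uint8 log-magnitude spectrum image of a 2-D grayscale array."""
    # 2-D fast Fourier transform, with the zero-frequency term moved to the center.
    shifted = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    # Log scaling compresses the large dynamic range of the magnitudes.
    magnitude = np.log1p(np.abs(shifted))
    peak = magnitude.max()
    if peak == 0:  # all-zero input: nothing to normalize
        return np.zeros(gray.shape, dtype=np.uint8)
    return np.rint(255 * magnitude / peak).astype(np.uint8)


# A constant image has all of its energy in the zero-frequency (DC) term,
# so only the center pixel of the shifted spectrum is bright.
flat = np.full((8, 8), 10, dtype=np.uint8)
spec = spectrum_image(flat)
print(spec[4, 4])
```

Applying `spectrum_image` to every grayscale sample, and keeping each result paired with the label of its source image, would give a spectrum image sample set in the sense of Step 3.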
In addition, it should be emphasized that, to further ensure the privacy and security of the above data, the fundus OCT image samples can be stored in a node of a blockchain.
S120: Create a multi-modal classification network model, and train the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set.
Specifically, the multi-modal classification network model designed in this application includes at least three branches: the Deep stream branch, the shallow stream branch, and the Simple Modal Image stream branch (the first two are called the trunk branches). The three branches are arranged in parallel, and each is a commonly used classification model.
The Deep stream branch has more convolutional layers. It is trained on the fundus OCT image sample set, extracts the deep features of the medical image (for example a fundus OCT image), and performs a first classification of the fundus OCT image sample based on those deep features to obtain a first classification result. The shallow stream branch has fewer convolutional layers. It is trained on the fundus OCT image sample set, extracts the shallow features of the medical image, and performs a second classification of the fundus OCT image sample based on those shallow features to obtain a second classification result.
In addition, to improve the recognition accuracy of the model, an attention module can be added to the convolutional layers of the shallow stream branch so that it focuses on extracting the shallow features of the image. Combining the Deep stream branch with the shallow stream branch enriches the dimensions of the image features and helps improve the accuracy of feature extraction.
The Simple Modal Image stream branch is trained on the spectrum image sample set. It extracts the shallow frequency-domain features of each spectrum image sample and performs a third classification of the spectrum image sample based on those features to obtain a third classification result. Note that the Simple Modal Image stream branch only needs to extract the shallow frequency-domain features of the image to classify it.
It should be noted that the Deep stream branch is a deep feature extraction network built on a classic convolutional neural network as its backbone, similar to existing networks such as ResNet and DenseNet. It takes an image as input, outputs a multi-dimensional image feature map, and produces a deep-branch prediction probability. The shallow stream branch is a shallow feature extraction network composed mainly of two modules, a down-sampling module and an attention module. The down-sampling module consists of a convolutional layer, an activation layer, and a normalization layer, and performs down-sampling while extracting image features. The attention module consists of a spatial attention module and a channel attention module, which attend to the spatial features and the channel features of the image respectively. After a data sample passes through this branch, the branch outputs the shallow feature map of the image and produces a shallow-branch prediction probability. The Simple Modal Image stream branch has the same structure as the shallow stream branch and differs only in its input: it takes the frequency-domain sample data produced by the Fourier transform.
Specifically, taking the Deep stream branch as an example (the other branches behave the same way), after a fundus OCT image passes through the Deep stream branch, the network outputs n probability values, where n is the number of classes and must be set in advance. For example, if three classes are required (good, poor, and usable as above), the Deep stream branch first extracts the deep features of the fundus OCT image, then performs the first classification of the image based on those features and outputs three probability values (one each for good, poor, and usable) that sum to 1. The class with the largest probability value is usually taken as the first classification result output by the branch.
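How a branch turns its raw outputs into n probabilities that sum to 1 is not spelled out in the text; a softmax over the branch's final scores is the usual choice and is assumed in the sketch below. The class order, the function names, and the example scores are hypothetical.

```python
import numpy as np

CLASSES = ("good", "poor", "usable")  # assumed class order, n = 3


def softmax(scores: np.ndarray) -> np.ndarray:
    """Convert raw branch scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def branch_prediction(scores):
    """Return (predicted class, probability vector) for one branch output."""
    probs = softmax(np.asarray(scores, dtype=np.float64))
    # The class with the largest probability is the branch's classification result.
    return CLASSES[int(np.argmax(probs))], probs


label, probs = branch_prediction([2.0, 0.5, 1.0])  # hypothetical raw scores
print(label, probs.round(3))
```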
In addition, to further improve the recognition accuracy of the multi-modal classification network model, the three branches are also fused by cascading. Specifically, the deep features, the shallow features, and the frequency-domain shallow features are first cascaded to obtain cascaded features; the multi-modal classification network model then performs cascade classification on the fundus OCT image sample according to the cascaded features to obtain the corresponding cascade classification result.
It should be noted that cascade fusion is a Concatenate operation: the feature maps obtained by the different branches (corresponding to the features extracted by each branch) are concatenated along the channel dimension. The concatenation forms a new set of cascaded features for the fundus OCT image, and the corresponding cascade classification result is output according to these cascaded features.
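A minimal sketch of channel-wise concatenation is given here, using nested lists in place of tensors. It assumes that the three feature maps already share the same height and width, which the text does not specify; the helper names and the channel counts are hypothetical.

```python
def make_map(channels, height, width, value):
    """Build a constant feature map shaped (channels, height, width)."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]

def concat_channels(*feature_maps):
    """Concatenate feature maps along the channel axis.

    Each feature map is a list of channels and each channel is an
    H x W nested list. The spatial size must match across all maps.
    """
    spatial = {(len(ch), len(ch[0])) for fm in feature_maps for ch in fm}
    if len(spatial) > 1:
        raise ValueError("feature maps must share the same height and width")
    fused = []
    for fm in feature_maps:
        fused.extend(fm)  # the channel axis grows; H and W are unchanged
    return fused

deep = make_map(4, 2, 2, 1.0)     # hypothetical deep features
shallow = make_map(2, 2, 2, 2.0)  # hypothetical shallow features
freq = make_map(2, 2, 2, 3.0)     # hypothetical frequency-domain features
cascaded = concat_channels(deep, shallow, freq)
```

In a deep learning framework the same operation is normally a single concatenation call on the channel axis; the list-based version only makes the shape bookkeeping explicit.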
After the multi-modal classification network model has been trained and is used to classify and recognize the fundus OCT image to be classified, the above cascade classification result is taken as the final classification result output by the multi-modal classification network model.
At the end of training of the multi-modal classification network model, the corresponding loss function Loss is calculated from the classification results of the Deep stream branch, the Shallow stream branch, and the Simple Modal Image stream branch (and, optionally, the cascade branch, depending on actual requirements).
Specifically, a corresponding loss function is calculated for each of the first classification result, the second classification result, and the third classification result; the total loss function of the multi-modal classification network model is then determined from the calculated loss functions. When the total loss function converges to its minimum, training of the multi-modal classification network model is deemed complete.
In the actual calculation process, the loss function is calculated by a formula in which p is the probability value of the label, q is the predicted probability value output by the classification result, $x_i$ represents the i-th category, and n represents the number of categories.
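The symbols defined for the loss function (label probability p, predicted probability q, category $x_i$, and n categories) are consistent with the standard cross-entropy $-\sum_{i=1}^{n} p(x_i)\log q(x_i)$. The sketch here assumes that form; the text itself does not reproduce the formula, so this is an assumption rather than the applicant's stated expression. The function name, the `eps` guard, and the sample values are hypothetical.

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy -sum_i p(x_i) * log q(x_i) over n categories.

    p: label probabilities (for example, a one-hot vector).
    q: predicted probabilities output by a branch.
    eps: small floor on q to avoid log(0); an implementation detail.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same n categories")
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))

label = [1.0, 0.0, 0.0]      # one-hot label for "good"
predicted = [0.7, 0.1, 0.2]  # hypothetical branch output
loss = cross_entropy(label, predicted)
```

With a one-hot label, only the predicted probability of the true category contributes to the sum.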
The total loss function is calculated as:
$\mathrm{Loss}_{\text{total}} = 0.3 \times \mathrm{Loss}_D + 0.3 \times \mathrm{Loss}_S + 0.4 \times \mathrm{Loss}_P$,
where $\mathrm{Loss}_D$ is the loss function of the first classification result, $\mathrm{Loss}_S$ is the loss function of the second classification result, and $\mathrm{Loss}_P$ is the loss function of the third classification result.
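The weighted sum can be written out as a short sketch. The weights 0.3, 0.3, and 0.4 come from the text; the function name, the dictionary keys, and the per-branch loss values are hypothetical.

```python
# Weights from the total loss formula: D = first, S = second, P = third result.
WEIGHTS = {"D": 0.3, "S": 0.3, "P": 0.4}

def total_loss(loss_d, loss_s, loss_p, weights=WEIGHTS):
    """Weighted sum of the three per-branch losses."""
    return (weights["D"] * loss_d
            + weights["S"] * loss_s
            + weights["P"] * loss_p)

# Hypothetical per-branch losses for one training batch.
total = total_loss(0.5, 1.0, 0.25)
```

Because the weights sum to 1, the total stays on the same scale as the individual branch losses.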
The coefficients in front of each loss function are set according to observed performance and medical experience. During back-propagation, the weight of the first classification result is emphasized while the weights of the other two classification results are also taken into account. After the whole network has converged to the minimum (that is, the loss function has converged), the first classification result (or the cascade classification result, where one is available) is selected as the final output of the model.
In this way, the multi-modal classification network model can be trained using the fundus OCT image sample set together with the spectrum image sample set. Once all samples in the fundus OCT image sample set and the spectrum image sample set have been used, model training is considered complete by default.
S130: After the multi-modal classification network model has been trained, the fundus OCT image to be classified and the corresponding spectrum image to be classified are input into the multi-modal classification network model; the model classifies the fundus OCT image to be classified, and quality evaluation is performed according to the classification result. It should be noted that, because the classification categories defined earlier relate to the quality of the fundus OCT image, the classification result can be used directly for quality evaluation once classification is complete. Performing quality evaluation from a known classification result is a common technique in the art and is not described further here.
Here, the fundus OCT image to be classified is an unlabeled OCT image that needs to be classified automatically, and the spectrum image to be classified is the frequency-domain image obtained by applying the Fourier transform to that fundus OCT image.
It should be noted that the Fourier transform of the fundus OCT image to be classified is the same as the Fourier transform in step S110 and is therefore not described again here.
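As an illustration of how a spectrum image can be derived from a fundus OCT image, the sketch here converts an RGB image to grayscale and computes its two-dimensional discrete Fourier transform. It uses a naive DFT in place of the fast Fourier transform (the two produce the same values; the FFT is only faster). The grayscale weights, the log-magnitude display step, and all function names are assumptions and are not taken from the text.

```python
import cmath
import math

def to_gray(rgb_image):
    """Luma-weighted grayscale of an H x W image of (r, g, b) tuples."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def dft2(gray):
    """Naive 2-D discrete Fourier transform; for small illustrations only."""
    h, w = len(gray), len(gray[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += gray[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc
    return out

def log_magnitude(spectrum):
    """Compress the dynamic range so the spectrum can be stored as an image."""
    return [[math.log1p(abs(c)) for c in row] for row in spectrum]

# Hypothetical 4 x 4 uniform gray image: all energy falls in the DC term.
rgb = [[(200, 200, 200)] * 4 for _ in range(4)]
spectrum = dft2(to_gray(rgb))
image = log_magnitude(spectrum)
```

For images of realistic size, a library FFT routine would replace `dft2`, since the naive version scales with the square of the pixel count.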
It should be noted that the fundus OCT image to be classified is input to the Deep stream branch and the Shallow stream branch, and the spectrum image to be classified is input to the Simple Modal Image stream branch. After processing by the multi-modal classification network model, multiple corresponding classification results are output (including the first classification result, the second classification result, the third classification result, and the cascade classification result). Usually, the category with the highest probability in the cascade classification result is taken as the final classification and recognition result.
In addition, the data processing performed by the multi-modal classification network model here is similar to that in step S120, except that the training process using the loss function is omitted; the specific processing of the fundus OCT image to be classified is therefore not described again here.
It should be noted that, in practical applications, before the fundus OCT image to be classified is recognized, the multi-modal classification network model may first be tested using fundus OCT images whose category information is unknown. The test procedure is similar to step S130 and is not described again here. After the test result is obtained, its correctness is judged by means such as a doctor's evaluation. If the test result agrees with the doctor's evaluation, the multi-modal classification network model is used to recognize the fundus OCT image to be classified; if it does not, further training samples are added and the model continues to be trained until the test result agrees with the doctor's evaluation.
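The acceptance check described above can be sketched as a comparison between the model's test labels and the doctor's labels. The function names and the sample labels are hypothetical, and the retraining step itself is outside the scope of this sketch.

```python
def mismatched_indices(model_labels, doctor_labels):
    """Return the indices where the model disagrees with the doctor."""
    if len(model_labels) != len(doctor_labels):
        raise ValueError("both label lists must cover the same test images")
    return [i for i, (m, d) in enumerate(zip(model_labels, doctor_labels))
            if m != d]

def needs_more_training(model_labels, doctor_labels):
    """True when any test result differs from the doctor's evaluation."""
    return bool(mismatched_indices(model_labels, doctor_labels))

# Hypothetical test run on three images.
model_out = ["good", "poor", "usable"]
doctor_out = ["good", "usable", "usable"]
bad = mismatched_indices(model_out, doctor_out)
retrain = needs_more_training(model_out, doctor_out)
```

The mismatched indices identify which test images could guide the selection of additional training samples.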
Finally, it should be noted that when no cascade classification result is configured, the first classification result is used as the output of the model; when a cascade classification result is configured, the cascade classification result is used as the final classification result of the model.
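The output selection rule can be sketched as follows: the cascade classification result is used when it exists, and the first classification result otherwise. The function name, the use of `None` to mean that no cascade result is configured, and the probability values are assumptions made for illustration.

```python
CATEGORIES = ["good", "poor", "usable"]

def final_result(first_probs, cascade_probs=None, categories=CATEGORIES):
    """Pick the final category from the cascade result if one is configured,
    otherwise from the first (Deep stream) classification result."""
    probs = cascade_probs if cascade_probs is not None else first_probs
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best]

# Hypothetical probabilities for one image.
with_cascade = final_result([0.6, 0.3, 0.1], cascade_probs=[0.2, 0.1, 0.7])
without_cascade = final_result([0.6, 0.3, 0.1])
```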
In addition, in another embodiment, only the Deep stream branch and the Shallow stream branch may be fused by cascading, with the fused output used as the classification result of the Deep stream branch; the subsequent process is the same as in step S120.
As the above description of the technical solution shows, the Fourier-transform-based OCT image quality evaluation method provided by this application evaluates OCT image quality automatically using image recognition technology from artificial intelligence, which can greatly reduce doctors' working time and improve their efficiency. In addition, introducing the fast Fourier transform allows features to be extracted from the image in multiple dimensions, which significantly improves the accuracy of image quality evaluation. Furthermore, the multi-modal classification network model is given multiple branches that extract different feature information, and the final model is determined by calculating a total loss function, which can significantly improve the recognition accuracy of the model. Moreover, the cascade classification result obtained through feature cascading is used as the final model output, which can further improve the classification performance of the model and thereby the OCT image quality evaluation. Finally, in the field of intelligent fundus screening, the quality evaluation of fundus OCT images is key to whether a fundus examination is meaningful, so this quality evaluation method can significantly improve working efficiency during intelligent fundus screening.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not be construed as limiting the implementation of the embodiments of this application in any way.
Embodiment 2
Corresponding to the above method, this application also provides a Fourier-transform-based OCT image quality evaluation system, which includes:
a sample set establishment unit, configured to perform a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, so as to establish a corresponding spectrum image sample set;
a model training unit, configured to create a multi-modal classification network model and to train the multi-modal classification network model using the fundus OCT image sample set and the spectrum image sample set;
a model application unit, configured to, after the multi-modal classification network model has been trained, input a fundus OCT image to be classified and the spectrum image to be classified corresponding to it into the multi-modal classification network model, and to perform quality evaluation on the fundus OCT image to be classified by means of the multi-modal classification network model.
Embodiment 3
This application also provides an electronic device 70. FIG. 2 is a schematic structural diagram of a preferred embodiment of the electronic device 70 provided by this application.
In this embodiment, the electronic device 70 may be a terminal device with computing capability, such as a server, a smartphone, a tablet computer, a portable computer, or a desktop computer.
The electronic device 70 includes a processor 71 and a memory 72.
The memory 72 includes at least one type of readable storage medium, which may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 70, such as a hard disk of the electronic device 70. In other embodiments, the readable storage medium may be an external memory of the electronic device 70, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 70.
In this embodiment, the readable storage medium of the memory 72 is generally used to store the Fourier-transform-based OCT image quality evaluation program 73 installed on the electronic device 70. The memory 72 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 71 may be a central processing unit (CPU), a microprocessor, or another data processing chip, and is used to run program code or process data stored in the memory 72, for example to execute the Fourier-transform-based OCT image quality evaluation program 73.
In some embodiments, the electronic device 70 is a terminal device such as a smartphone, a tablet computer, or a portable computer. In other embodiments, the electronic device 70 may be a server.
FIG. 2 shows only the electronic device 70 with components 71-73, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
Optionally, the electronic device 70 may further include a user interface. The user interface may include an input unit such as a keyboard, a voice input device with a speech recognition function such as a microphone, and a voice output device such as a speaker or earphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.
Optionally, the electronic device 70 may further include a display, which may also be referred to as a display screen or display unit. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display is used to show information processed in the electronic device 70 and to present a visual user interface.
Optionally, the electronic device 70 may further include a touch sensor. The area that the touch sensor provides for the user's touch operations is called the touch area. The touch sensor may be a resistive touch sensor, a capacitive touch sensor, or the like. It may be a contact-type touch sensor or a proximity-type touch sensor, and it may be a single sensor or multiple sensors arranged, for example, in an array.
In addition, the area of the display of the electronic device 70 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen, on the basis of which the device detects touch operations triggered by the user.
Optionally, the electronic device 70 may further include a radio frequency (RF) circuit, sensors, an audio circuit, and so on, which are not described further here.
In the device embodiment shown in FIG. 2, the memory 72, as a computer storage medium, may include an operating system and the Fourier-transform-based OCT image quality evaluation program 73. When the processor 71 executes the Fourier-transform-based OCT image quality evaluation program 73 stored in the memory 72, the following steps are implemented:
performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, so as to establish a corresponding spectrum image sample set;
creating a multi-modal classification network model, and training the multi-modal classification network model using the fundus OCT image sample set and the spectrum image sample set;
after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the spectrum image to be classified corresponding to it into the multi-modal classification network model, and performing quality evaluation on the fundus OCT image to be classified by means of the multi-modal classification network model.
In this embodiment, FIG. 3 is a schematic diagram of the internal logic of the Fourier-transform-based OCT image quality evaluation program according to an embodiment of this application. As shown in FIG. 3, the Fourier-transform-based OCT image quality evaluation program 73 may be divided into one or more modules, which are stored in the memory 72 and executed by the processor 71 to carry out this application. A module, as referred to in this application, is a series of computer program instruction segments capable of performing a specific function. FIG. 3 is a program module diagram of a preferred embodiment of the program 73 in FIG. 2. The program 73 may be divided into a sample set establishment module 74, a model training module 75, and a model application module 76. The functions or operation steps implemented by modules 74-76 are similar to those described above and are not detailed again here. By way of example:
the sample set establishment module 74 is configured to perform a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, so as to establish a corresponding spectrum image sample set;
the model training module 75 is configured to create a multi-modal classification network model and to train the multi-modal classification network model using the fundus OCT image sample set and the spectrum image sample set;
the model application module 76 is configured to, after the multi-modal classification network model has been trained, input a fundus OCT image to be classified and the spectrum image to be classified corresponding to it into the multi-modal classification network model, and to perform quality evaluation on the fundus OCT image to be classified by means of the multi-modal classification network model.
Embodiment 4
This application also provides a computer-readable storage medium storing the Fourier-transform-based OCT image quality evaluation program 73, which, when executed by a processor, implements the following operations:
performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, so as to establish a corresponding spectrum image sample set;
creating a multi-modal classification network model, and training the multi-modal classification network model using the fundus OCT image sample set and the spectrum image sample set;
after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the spectrum image to be classified corresponding to it into the multi-modal classification network model, and performing quality evaluation on the fundus OCT image to be classified by means of the multi-modal classification network model.
The specific implementation of the computer-readable storage medium provided by this application is substantially the same as the specific implementations of the Fourier-transform-based OCT image quality evaluation method and the electronic device described above, and is not described again here.
Optionally, the medium involved in this application, such as the computer-readable storage medium, may be non-volatile or volatile.
It should be noted that the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, each containing information on a batch of network transactions that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
It should be further noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article, or method that includes that element.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods of the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.
Claims (20)
- A Fourier-transform-based OCT image quality evaluation method, applied to an electronic device, wherein the method comprises: performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, so as to establish a corresponding spectrum image sample set; creating a multi-modal classification network model, and training the multi-modal classification network model using the fundus OCT image sample set and the spectrum image sample set; and, after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the spectrum image to be classified corresponding to it into the multi-modal classification network model, and performing quality evaluation on the fundus OCT image to be classified by means of the multi-modal classification network model.
- The Fourier-transform-based OCT image quality evaluation method according to claim 1, wherein the fundus OCT image sample set is stored in a blockchain, and the process of performing the Fourier transform on the fundus OCT image samples comprises: performing grayscale processing on each fundus OCT image in turn; performing a fast Fourier transform on each grayscale-processed fundus OCT image to generate corresponding frequency-domain samples; and establishing the spectrum image sample set from the frequency-domain samples.
- The Fourier-transform-based OCT image quality evaluation method according to claim 1 or 2, wherein the multi-modal classification network model comprises a Deep stream branch, a Shallow stream branch, and a Simple Modal Image stream branch; wherein, in the process of training the multi-modal classification network model: the Deep stream branch is trained using the fundus OCT image sample set so as to extract deep features of each fundus OCT image sample in the fundus OCT image sample set; the Shallow stream branch is trained using the fundus OCT image sample set so as to extract shallow features of each fundus OCT image sample in the fundus OCT image sample set; and the Simple Modal Image stream branch is trained using the spectrum image sample set so as to extract frequency-domain shallow features of each spectrum image sample in the fundus OCT image sample set.
- The Fourier-transform-based OCT image quality evaluation method according to claim 3, wherein, in the process of training the multi-modal classification network model: the Deep stream branch is further used to perform a first classification on the fundus OCT image samples according to the deep features, so as to obtain a corresponding first classification result; the Shallow stream branch is further used to perform a second classification on the fundus OCT image samples according to the shallow features, so as to obtain a corresponding second classification result; and the Simple Modal Image stream branch is further used to perform a third classification on the spectrum image samples according to the frequency-domain shallow features, so as to obtain a corresponding third classification result.
- The Fourier-transform-based OCT image quality evaluation method according to claim 4, wherein, in the process of training the multi-modal classification network model: corresponding loss functions are calculated according to the first classification result, the second classification result, and the third classification result; and the total loss function of the multi-modal classification network model is calculated from the loss functions, and when the total loss function converges to its minimum, training of the multi-modal classification network model is deemed complete.
- The Fourier-transform-based OCT image quality evaluation method according to claim 5, wherein: each loss function is calculated by a formula in which p is the probability value of the label, q is the predicted probability value output by the classification result, x_i denotes the i-th category, and n is the number of categories; and the total loss function is calculated as $\mathrm{Loss}_{\text{total}} = 0.3 \times \mathrm{Loss}_D + 0.3 \times \mathrm{Loss}_S + 0.4 \times \mathrm{Loss}_P$, where $\mathrm{Loss}_D$ is the loss function of the first classification result, $\mathrm{Loss}_S$ is the loss function of the second classification result, and $\mathrm{Loss}_P$ is the loss function of the third classification result.
- The Fourier-transform-based OCT image quality evaluation method according to claim 6, wherein, during training of the multi-modal classification network model: the deep features, the shallow features, and the frequency-domain shallow features are cascaded to obtain cascaded features; the multi-modal classification network model performs a cascaded classification of the fundus OCT image samples according to the cascaded features, to obtain a cascaded classification result corresponding to each fundus OCT image sample; and, once the multi-modal classification network model has been trained and is used to classify and recognize the fundus OCT image to be classified, the cascaded classification result is taken as the final classification result output by the multi-modal classification network model.
- A Fourier-transform-based OCT image quality evaluation system, wherein the system comprises: a sample set establishment unit, configured to perform a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set; a model training unit, configured to create a multi-modal classification network model and to train the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set; and a model application unit, configured to, after the multi-modal classification network model has been trained, input a fundus OCT image to be classified and the spectrum image to be classified that corresponds to it into the multi-modal classification network model, and to evaluate the quality of the fundus OCT image to be classified through the multi-modal classification network model.
- An electronic device, wherein the electronic device comprises: a memory, a processor, and a Fourier-transform-based OCT image quality evaluation program stored in the memory and executable on the processor, the program, when executed by the processor, implementing the following steps: acquiring a fundus OCT image sample set with known image labels and performing a Fourier transform on each fundus OCT image sample, to establish a corresponding spectrum image sample set; creating a multi-modal classification network model, and training the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set; and, after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the spectrum image to be classified that corresponds to it into the multi-modal classification network model, and classifying and recognizing the fundus OCT image to be classified through the multi-modal classification network model.
- The electronic device according to claim 9, wherein the fundus OCT image sample set is stored in a blockchain, and the process of performing a Fourier transform on the fundus OCT image samples comprises: performing grayscale processing on each fundus OCT image in turn; performing a fast Fourier transform on each grayscale-processed fundus OCT image to generate corresponding frequency-domain samples; and establishing the spectrum image sample set from the frequency-domain samples.
- The electronic device according to claim 9 or 10, wherein the multi-modal classification network model comprises a Deep stream branch, a Shallow stream branch, and a Simple Modal Image stream branch; wherein, during training of the multi-modal classification network model: the Deep stream branch is trained on the fundus OCT image sample set to extract deep features of each fundus OCT image sample in the fundus OCT image sample set; the Shallow stream branch is trained on the fundus OCT image sample set to extract shallow features of each fundus OCT image sample in the fundus OCT image sample set; and the Simple Modal Image stream branch is trained on the spectrum image sample set to extract frequency-domain shallow features of each spectrum image sample in the fundus OCT image sample set.
- The electronic device according to claim 11, wherein, during training of the multi-modal classification network model: the Deep stream branch is further used to perform a first classification of the fundus OCT image samples according to the deep features, to obtain a corresponding first classification result; the Shallow stream branch is further used to perform a second classification of the fundus OCT image samples according to the shallow features, to obtain a corresponding second classification result; and the Simple Modal Image stream branch is further used to perform a third classification of the spectrum image samples according to the frequency-domain shallow features, to obtain a corresponding third classification result.
- The electronic device according to claim 12, wherein, during training of the multi-modal classification network model: a corresponding loss function is calculated for each of the first classification result, the second classification result, and the third classification result; and a total loss function of the multi-modal classification network model is calculated from these loss functions, the multi-modal classification network model being deemed fully trained when the total loss function converges to a minimum.
- The electronic device according to claim 13, wherein: each loss function is calculated by a formula in which p is the probability value of the label, q is the predicted probability value output by the classification result, x_i denotes the i-th category, and n is the number of categories; and the total loss function is calculated as $\mathrm{Loss}_{\text{total}} = 0.3 \times \mathrm{Loss}_D + 0.3 \times \mathrm{Loss}_S + 0.4 \times \mathrm{Loss}_P$, where $\mathrm{Loss}_D$ is the loss function of the first classification result, $\mathrm{Loss}_S$ is the loss function of the second classification result, and $\mathrm{Loss}_P$ is the loss function of the third classification result.
- The electronic device according to claim 14, wherein, during training of the multi-modal classification network model: the deep features, the shallow features, and the frequency-domain shallow features are cascaded to obtain cascaded features; the multi-modal classification network model performs a cascaded classification of the fundus OCT image samples according to the cascaded features, to obtain a cascaded classification result corresponding to each fundus OCT image sample; and, once the multi-modal classification network model has been trained and is used to classify and recognize the fundus OCT image to be classified, the cascaded classification result is taken as the final classification result output by the multi-modal classification network model.
- A computer-readable storage medium, wherein the computer-readable storage medium stores a Fourier-transform-based OCT image quality evaluation program which, when executed by a processor, implements the following steps: performing a Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels, to establish a corresponding spectrum image sample set; creating a multi-modal classification network model, and training the multi-modal classification network model with the fundus OCT image sample set and the spectrum image sample set; and, after the multi-modal classification network model has been trained, inputting a fundus OCT image to be classified and the spectrum image to be classified that corresponds to it into the multi-modal classification network model, and evaluating the quality of the fundus OCT image to be classified through the multi-modal classification network model.
- The computer-readable storage medium according to claim 16, wherein the multi-modal classification network model comprises a Deep stream branch, a Shallow stream branch, and a Simple Modal Image stream branch; wherein, during training of the multi-modal classification network model: the Deep stream branch is trained on the fundus OCT image sample set to extract deep features of each fundus OCT image sample in the fundus OCT image sample set; the Shallow stream branch is trained on the fundus OCT image sample set to extract shallow features of each fundus OCT image sample in the fundus OCT image sample set; and the Simple Modal Image stream branch is trained on the spectrum image sample set to extract frequency-domain shallow features of each spectrum image sample in the fundus OCT image sample set.
- The computer-readable storage medium according to claim 17, wherein, during training of the multi-modal classification network model: the Deep stream branch is further used to perform a first classification of the fundus OCT image samples according to the deep features, to obtain a corresponding first classification result; the Shallow stream branch is further used to perform a second classification of the fundus OCT image samples according to the shallow features, to obtain a corresponding second classification result; and the Simple Modal Image stream branch is further used to perform a third classification of the spectrum image samples according to the frequency-domain shallow features, to obtain a corresponding third classification result.
- The computer-readable storage medium according to claim 18, wherein, during training of the multi-modal classification network model: a corresponding loss function is calculated for each of the first classification result, the second classification result, and the third classification result; and a total loss function of the multi-modal classification network model is calculated from these loss functions, the multi-modal classification network model being deemed fully trained when the total loss function converges to a minimum.
- The computer-readable storage medium according to claim 19, wherein: each loss function is calculated by a formula in which p is the probability value of the label, q is the predicted probability value output by the classification result, x_i denotes the i-th category, and n is the number of categories; and the total loss function is calculated as $\mathrm{Loss}_{\text{total}} = 0.3 \times \mathrm{Loss}_D + 0.3 \times \mathrm{Loss}_S + 0.4 \times \mathrm{Loss}_P$, where $\mathrm{Loss}_D$ is the loss function of the first classification result, $\mathrm{Loss}_S$ is the loss function of the second classification result, and $\mathrm{Loss}_P$ is the loss function of the third classification result.
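
The arithmetic recited in the claims above (spectrum image generation, per-branch losses, the weighted total loss, and feature cascading) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicants' implementation. The per-branch loss formula is not reproduced in this text, so `cross_entropy` assumes the standard cross-entropy form suggested by the symbol definitions (p, q, x_i, n). The spectrum is computed with a naive 2-D DFT standing in for the fast Fourier transform named in the claims (same coefficients, far slower), and the log scaling is likewise an assumption. All function names and the two-class example are hypothetical.

```python
import cmath
import math


def log_magnitude_spectrum(gray):
    """Log-magnitude spectrum of a small grayscale image (list of rows).

    Naive O(N^4) 2-D DFT used as a stand-in for an FFT; it yields the same
    coefficients, so it is only suitable for tiny illustrative inputs.
    The log1p scaling is an assumption, not something the claims specify.
    """
    rows, cols = len(gray), len(gray[0])
    spectrum = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2.0 * math.pi * (u * y / rows + v * x / cols)
                    acc += gray[y][x] * cmath.exp(1j * angle)
            spectrum[u][v] = math.log1p(abs(acc))
    return spectrum


def cross_entropy(p, q, eps=1e-12):
    """Assumed per-branch loss: -sum_i p(x_i) * log q(x_i)."""
    return -sum(p_i * math.log(max(q_i, eps)) for p_i, q_i in zip(p, q))


def total_loss(loss_d, loss_s, loss_p):
    """Weighted sum from the claims: 0.3 Loss_D + 0.3 Loss_S + 0.4 Loss_P."""
    return 0.3 * loss_d + 0.3 * loss_s + 0.4 * loss_p


def cascade_features(deep, shallow, freq_shallow):
    """Concatenate the three branch feature vectors into one cascaded vector."""
    return list(deep) + list(shallow) + list(freq_shallow)


if __name__ == "__main__":
    label = [1.0, 0.0]  # hypothetical one-hot label over two quality classes
    branch_outputs = {  # hypothetical softmax outputs of the three branches
        "deep": [0.9, 0.1],
        "shallow": [0.8, 0.2],
        "spectrum": [0.7, 0.3],
    }
    losses = {k: cross_entropy(label, q) for k, q in branch_outputs.items()}
    print(total_loss(losses["deep"], losses["shallow"], losses["spectrum"]))
```

The three weights sum to 1, so the total loss stays on the same scale as each branch loss, with the spectrum branch weighted slightly more than either image branch.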
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010618087.4 | 2020-06-30 | ||
CN202010618087.4A CN111784665B (en) | 2020-06-30 | 2020-06-30 | OCT image quality evaluation method, system and device based on Fourier transform |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021114818A1 (en) | 2021-06-17 |
Family
ID=72760784
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/117943 WO2021114818A1 (en) | 2020-06-30 | 2020-09-25 | Method, system, and device for oct image quality evaluation based on fourier transform |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111784665B (en) |
WO (1) | WO2021114818A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11842490B2 (en) * | 2022-02-21 | 2023-12-12 | Zhejiang University | Fundus image quality evaluation method and device based on multi-source and multi-scale feature fusion |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113011485B (en) * | 2021-03-12 | 2023-04-07 | 北京邮电大学 | Multi-mode multi-disease long-tail distribution ophthalmic disease classification model training method and device |
CN113222985B (en) * | 2021-06-04 | 2022-01-21 | 中国人民解放军总医院 | Image processing method, image processing device, computer equipment and medium |
CN118154995B (en) * | 2024-05-10 | 2024-07-30 | 国网安徽省电力有限公司电力科学研究院 | Image quality cracking evaluation method based on time-frequency association self-adaptive learning model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150062590A1 (en) * | 2013-08-29 | 2015-03-05 | Carl Zeiss Meditec, Inc. | Evaluation of optical coherence tomographic data prior to segmentation |
CN106372661A (en) * | 2016-08-30 | 2017-02-01 | 北京小米移动软件有限公司 | Method and device for constructing classification model |
CN109308692A (en) * | 2018-07-30 | 2019-02-05 | 西北大学 | Based on the OCT image quality evaluating method for improving Resnet and SVR mixed model |
CN110648326A (en) * | 2019-09-29 | 2020-01-03 | 精硕科技(北京)股份有限公司 | Method and device for constructing image quality evaluation convolutional neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140276025A1 (en) * | 2013-03-14 | 2014-09-18 | Carl Zeiss Meditec, Inc. | Multimodal integration of ocular data acquisition and analysis |
2020
- 2020-06-30: CN application CN202010618087.4A, patent CN111784665B (en), status: Active
- 2020-09-25: WO application PCT/CN2020/117943, patent WO2021114818A1 (en), status: Application Filing
Non-Patent Citations (1)
Title |
---|
CHEN DEYUN; FU LIJUN; ZHANG XUESONG; YU LIANG; CHEN HAILONG; LI AO: "Multiple Representations for Image Classification Approaches", JOURNAL OF FRONTIERS OF COMPUTER SCIENCE AND TECHNOLOGY, vol. 13, no. 12, 1 December 2019 (2019-12-01), pages 2138 - 2148, XP009528650, ISSN: 1673-9418, DOI: 10.3778/j.issn.1673-9418.1809054 * |
Also Published As
Publication number | Publication date |
---|---|
CN111784665A (en) | 2020-10-16 |
CN111784665B (en) | 2024-05-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021114818A1 (en) | Method, system, and device for oct image quality evaluation based on fourier transform | |
Shen et al. | An automatic diagnosis method of facial acne vulgaris based on convolutional neural network | |
Li et al. | Deep learning-based automated detection of glaucomatous optic neuropathy on color fundus photographs | |
Agrawal et al. | Grape leaf disease detection and classification using multi-class support vector machine | |
CN110097003A (en) | Check class attendance method, equipment, storage medium and device neural network based | |
US20200279358A1 (en) | Method, device, and system for testing an image | |
WO2022105118A1 (en) | Image-based health status identification method and apparatus, device and storage medium | |
Meng et al. | Tongue images classification based on constrained high dispersal network | |
CN113240655B (en) | Method, storage medium and device for automatically detecting type of fundus image | |
TWI728369B (en) | Method and system for analyzing skin texture and skin lesion using artificial intelligence cloud based platform | |
Kolla et al. | CNN‐Based Brain Tumor Detection Model Using Local Binary Pattern and Multilayered SVM Classifier | |
WO2021159811A1 (en) | Auxiliary diagnostic apparatus and method for glaucoma, and storage medium | |
US11721023B1 (en) | Distinguishing a disease state from a non-disease state in an image | |
WO2021120587A1 (en) | Method and apparatus for retina classification based on oct, computer device, and storage medium | |
CN114372564A (en) | Model training method for object classification, object classification method and device | |
Samant et al. | Comparative analysis of classification based algorithms for diabetes diagnosis using iris images | |
WO2022205779A1 (en) | Processing method and apparatus based on multi-modal eye detection data, and terminal device | |
Huan et al. | Multilevel and multiscale feature aggregation in deep networks for facial constitution classification | |
CN114511759A (en) | Method and system for identifying categories and determining characteristics of skin state images | |
Wang et al. | Diagnosis of cognitive and motor disorders levels in stroke patients through explainable machine learning based on MRI | |
Tang et al. | Explainable survival analysis with uncertainty using convolution-involved vision transformer | |
Ding et al. | HI-MViT: A lightweight model for explainable skin disease classification based on modified MobileViT | |
WO2021114626A1 (en) | Method for detecting quality of medical record data and related device | |
CN114186784B (en) | Electrical examination scoring method, system, medium and equipment based on edge calculation | |
KR102165487B1 (en) | Skin disease discrimination system based on skin image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20898718; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 20898718; Country of ref document: EP; Kind code of ref document: A1 |