CN116824116A

CN116824116A - Super wide angle fundus image identification method, device, equipment and storage medium

Info

Publication number: CN116824116A
Application number: CN202310761433.8A
Authority: CN
Inventors: 戴伟伟; 程宇; 谢靖
Original assignee: Aier Eye Hospital Group Co Ltd
Current assignee: Aier Eye Hospital Group Co Ltd
Priority date: 2023-06-26
Filing date: 2023-06-26
Publication date: 2023-09-29
Anticipated expiration: 2043-06-26
Also published as: CN116824116B

Abstract

This application discloses an ultra-wide-angle fundus image recognition method, device, equipment and storage medium, which relates to the field of image processing technology and the field of deep learning technology, including: performing an image pre-processing operation on the original ultra-wide-angle fundus image to obtain a pre-processed image; Use the pre-trained feature extraction model to extract features from the pre-processed image to obtain the target encoding information; decode the target encoding information through the bilinear decoder model to obtain the output probability value information set; select the category with the largest probability value in the probability value information set Determined as the final prediction category corresponding to the original ultra-wide-angle fundus image. This application extracts feature vectors from pre-processed image features through a pre-trained feature extraction model, uses a bilinear decoder model to perform multiple complex algebraic operations, outputs multiple probability values, and determines the final recognition result by comparing the probability values. It reduces the probability of misidentification caused by manual reading and improves the accuracy of ultra-wide-angle fundus image recognition.

Description

An ultra-wide-angle fundus image recognition method, device, equipment and storage medium

技术领域Technical field

本发明涉及图像处理技术领域以及深度学习技术领域，特别涉及一种超广角眼底影像识别方法、装置、设备及存储介质。The present invention relates to the field of image processing technology and the field of deep learning technology, and in particular to an ultra-wide-angle fundus image recognition method, device, equipment and storage medium.

背景技术Background technique

随着人口老年化的不断加快，若不及时进行干预，各类眼底疾病将会更加迅速的蔓延，当前眼科专家的数量及医疗设备还不足以满足如此庞大的眼病患者群体的需求。As the aging of the population continues to accelerate, various fundus diseases will spread more rapidly if timely intervention is not carried out. The current number of ophthalmologists and medical equipment are not enough to meet the needs of such a large group of eye disease patients.

目前已有的眼底疾病图像识别系统大都基于普通眼底彩超影像进行识别，此类影像的视网膜眼底可视范围较窄，通常只有30°-75°。较为狭窄的可视区域不能提供完整的眼底信息，会增加误识别的概率。而超广角眼底影像的获取难度较小，且眼底可视范围能达到200°，对于眼底病的判断可依据的信息更多，判断精度更高。但是当前基于超广角的眼底影像识别都是采用人工阅片的方式，受临床医师的数量和临床经验限制较大，人工阅片时图像识别错误的概率较高且影像识别的精度较低。Most of the existing fundus disease image recognition systems are based on ordinary fundus color ultrasound images. The visible range of the retinal fundus in such images is narrow, usually only 30°-75°. A relatively narrow visual area cannot provide complete fundus information, which will increase the probability of misidentification. Ultra-wide-angle fundus images are less difficult to obtain, and the fundus visual range can reach 200°. The judgment of fundus diseases can be based on more information and the judgment accuracy is higher. However, current fundus image recognition based on ultra-wide angles uses manual reading, which is greatly limited by the number of clinicians and clinical experience. The probability of image recognition errors during manual reading is high and the accuracy of image recognition is low.

发明内容Contents of the invention

有鉴于此，本发明的目的在于提供一种超广角眼底影像识别方法、装置、设备和存储介质，能够降低人工阅片而导致误识别的概率，提高超广角眼底影像识别的精度。其具体方案如下：In view of this, the purpose of the present invention is to provide an ultra-wide-angle fundus image recognition method, device, equipment and storage medium, which can reduce the probability of false recognition caused by manual reading and improve the accuracy of ultra-wide-angle fundus image recognition. The specific plan is as follows:

第一方面，本申请公开了一种超广角眼底影像识别方法，包括：In the first aspect, this application discloses an ultra-wide-angle fundus image recognition method, including:

对原始超广角眼底影像进行图像预处理操作，以得到预处理图像；Perform image preprocessing operations on the original ultra-wide-angle fundus image to obtain the preprocessed image;

利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息；Using a pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target encoding information;

通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合；Decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model;

将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。The category with the largest probability value in the probability value information set is determined as the final predicted category corresponding to the original ultra-wide-angle fundus image.

可选的，所述对原始超广角眼底影像进行图像预处理操作，以得到预处理图像，包括：Optionally, the image preprocessing operation is performed on the original ultra-wide-angle fundus image to obtain the preprocessed image, including:

基于自适应ROI区域粗提取组件从所述原始超广角眼底影像中提取目标区域；其中，所述目标区域为由满足预设像素条件的像素点构成的区域；Extract a target area from the original ultra-wide-angle fundus image based on an adaptive ROI area rough extraction component; wherein the target area is an area composed of pixels that meet preset pixel conditions;

判断所述目标区域是否达到预设区域质量要求；Determine whether the target area meets the preset area quality requirements;

若所述目标区域达到所述预设区域质量要求，则基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像；If the target area meets the preset area quality requirements, adjust the brightness of the target area based on the preset brightness adjustment rules to obtain the preprocessed image;

若所述目标区域未达到所述预设区域质量要求或未提取到所述目标区域，则进行最大椭圆近似拟合，以得到拟合区域；If the target area does not meet the preset area quality requirements or the target area is not extracted, then perform maximum ellipse approximate fitting to obtain the fitting area;

将所述拟合区域确定为所述目标区域，并重新进入所述基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像的步骤。Determine the fitting area as the target area, and re-enter the step of adjusting the brightness of the target area based on a preset brightness adjustment rule to obtain the preprocessed image.

可选的，所述基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像，包括：Optionally, adjusting the brightness of the target area based on preset brightness adjustment rules to obtain the preprocessed image includes:

获取所述目标区域的当前亮度，并确定所述当前亮度对应的当前亮度区间；Obtain the current brightness of the target area, and determine the current brightness interval corresponding to the current brightness;

基于预设亮度调整策略确定规则确定所述当前亮度区间对应的当前亮度调整策略；Determine the current brightness adjustment strategy corresponding to the current brightness interval based on the preset brightness adjustment strategy determination rules;

利用所述当前亮度调整策略调整所述目标区域的亮度，以得到所述预处理图像。The current brightness adjustment strategy is used to adjust the brightness of the target area to obtain the preprocessed image.

可选的，所述进行最大椭圆近似拟合，以得到拟合区域，包括：Optionally, performing maximum ellipse approximation fitting to obtain the fitting area includes:

根据所述原始超广角眼底影像的长与宽计算对应的长轴长信息以及短轴长信息；Calculate the corresponding long axis length information and short axis length information according to the length and width of the original ultra-wide-angle fundus image;

基于基准点信息确定中心点位置信息；Determine the center point position information based on the reference point information;

基于所述中心点位置信息、所述长轴长信息以及所述短轴长信息生成椭圆区域，并将所述椭圆区域确定为所述拟合区域。An elliptical area is generated based on the center point position information, the major axis length information, and the minor axis length information, and the elliptical area is determined as the fitting area.

可选的，所述利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息之前，还包括：Optionally, before using the pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain the target encoding information, it also includes:

利用开源数据集训练原始特征提取模型，以得到具备初始权重的初始特征提取模型；Use open source data sets to train the original feature extraction model to obtain an initial feature extraction model with initial weights;

利用超广角眼底影像数据集训练所述初始特征提取模型，以得到具备目标权重的所述预训练的特征提取模型；Using the ultra-wide-angle fundus image data set to train the initial feature extraction model to obtain the pre-trained feature extraction model with target weights;

相应的，所述利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息，包括：Correspondingly, using the pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target encoding information includes:

利用具备所述目标权重的所述预训练的特征提取模型提取所述预处理图像中达到预设特征要求的特征信息；Using the pre-trained feature extraction model with the target weight to extract feature information that meets the preset feature requirements in the pre-processed image;

以向量的形式对所述特征信息进行编码，以得到所述目标编码信息。The feature information is encoded in the form of a vector to obtain the target encoding information.

可选的，所述通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合，包括：Optionally, decoding the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model includes:

将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道；Input the target encoding information to the first channel and the second channel of the bilinear decoder model;

获取所述第一通道输出的第一输出信息以及所述第二通道输出的第二输出信息；Obtain the first output information output by the first channel and the second output information output by the second channel;

将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层；Perform spatial dimension feature stacking on the first output information and the second output information to obtain a stacked feature layer;

通过所述堆叠后特征层执行预设预测操作，以输出所述概率值信息集合。A preset prediction operation is performed through the stacked feature layer to output the probability value information set.

可选的，所述将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道，包括：Optionally, inputting the target encoding information to the first channel and the second channel of the bilinear decoder model includes:

将所述目标编码信息输入至所述双线性解码器模型的所述第一通道，根据所述第一通道中的卷积核进行同尺度卷积，以得到卷积后信息；Input the target encoding information into the first channel of the bilinear decoder model, and perform same-scale convolution according to the convolution kernel in the first channel to obtain post-convolution information;

基于所述卷积后信息以及所述目标编码信息计算当前特征层的第一注意力权重；Calculate the first attention weight of the current feature layer based on the convolved information and the target encoding information;

若所述当前特征层为最后一层，则基于所述目标编码信息与上一层的所述第一注意力权重计算所述第一通道输出的所述第一输出信息；其中，所述第一输出信息为加权注意力特征；If the current feature layer is the last layer, the first output information output by the first channel is calculated based on the target encoding information and the first attention weight of the previous layer; wherein, the first One output information is the weighted attention feature;

将所述目标编码信息输入至所述双线性解码器模型的所述第二通道，计算所述目标编码信息的平均值，以得到第二注意力权重；Input the target encoding information to the second channel of the bilinear decoder model, and calculate the average value of the target encoding information to obtain a second attention weight;

基于所述第二注意力权重以及所述目标编码信息计算所述第二通道输出的所述第二输出信息。The second output information output by the second channel is calculated based on the second attention weight and the target encoding information.

第二方面，本申请公开了一种超广角眼底影像识别装置，包括：In the second aspect, this application discloses an ultra-wide-angle fundus image recognition device, including:

图像预处理模块，用于对原始超广角眼底影像进行图像预处理操作，以得到预处理图像；The image preprocessing module is used to perform image preprocessing operations on the original ultra-wide-angle fundus image to obtain the preprocessed image;

特征提取模块，用于利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息；A feature extraction module, used to perform feature extraction on the preprocessed image using a pre-trained feature extraction model to obtain target encoding information;

解码模块，用于通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合；A decoding module configured to decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model;

预测类别确定模块，用于将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。A prediction category determination module, configured to determine the category with the largest probability value in the probability value information set as the final prediction category corresponding to the original ultra-wide-angle fundus image.

第三方面，本申请公开了一种电子设备，包括：In a third aspect, this application discloses an electronic device, including:

存储器，用于保存计算机程序；Memory, used to hold computer programs;

处理器，用于执行所述计算机程序，以实现如前述公开的超广角眼底影像识别方法的步骤。A processor, configured to execute the computer program to implement the steps of the ultra-wide-angle fundus image recognition method disclosed above.

第四方面，本申请公开了一种计算机可读存储介质，用于存储计算机程序；其中，所述计算机程序被处理器执行时实现如前述公开的超广角眼底影像识别方法。In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program; wherein when the computer program is executed by a processor, the ultra-wide-angle fundus image recognition method disclosed above is implemented.

可见，本申请提供了一种超广角眼底影像识别方法，包括：对原始超广角眼底影像进行图像预处理操作，以得到预处理图像；利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息；通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合；将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。由此可见，本申请通过预训练的特征提取模型对所述预处理图像进行特征提取，得到特征向量，然后利用双线性解码器模型将特征向量进行多重复杂代数运算，输出多个概率值，通过比较概率值的大小确定最终的识别结果，降低人工阅片而导致误识别的概率，提高了超广角眼底影像识别的精度。It can be seen that this application provides an ultra-wide-angle fundus image recognition method, which includes: performing an image pre-processing operation on the original ultra-wide-angle fundus image to obtain a pre-processed image; using a pre-trained feature extraction model to characterize the pre-processed image. Extract to obtain target encoding information; decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model; obtain the maximum probability value in the probability value information set The category of is determined as the final predicted category corresponding to the original ultra-wide-angle fundus image. It can be seen that this application uses a pre-trained feature extraction model to extract features from the pre-processed image to obtain a feature vector, and then uses a bilinear decoder model to perform multiple complex algebraic operations on the feature vector to output multiple probability values. The final recognition result is determined by comparing the probability values, which reduces the probability of misrecognition caused by manual reading and improves the accuracy of ultra-wide-angle fundus image recognition.

附图说明Description of the drawings

为了更清楚地说明本发明实施例或现有技术中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本发明的实施例，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据提供的附图获得其他的附图。In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings in the following description are only These are embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on the provided drawings without exerting creative efforts.

图1为本申请公开的一种超广角眼底影像识别方法流程图；Figure 1 is a flow chart of an ultra-wide-angle fundus image recognition method disclosed in this application;

图2为本申请公开的一种具体的超广角眼底影像识别方法流程图；Figure 2 is a flow chart of a specific ultra-wide-angle fundus image recognition method disclosed in this application;

图3为本申请公开的超广角眼底影像预处理流程示意图；Figure 3 is a schematic flowchart of the ultra-wide-angle fundus image preprocessing process disclosed in this application;

图4为本申请公开的一种具体的超广角眼底影像识别方法流程图；Figure 4 is a flow chart of a specific ultra-wide-angle fundus image recognition method disclosed in this application;

图5为本申请公开的双线性特征解码预测流程图；Figure 5 is a flow chart of bilinear feature decoding and prediction disclosed in this application;

图6为本申请提供的超广角眼底影像识别装置结构示意图；Figure 6 is a schematic structural diagram of the ultra-wide-angle fundus image recognition device provided by this application;

图7为本申请提供的一种电子设备结构图。Figure 7 is a structural diagram of an electronic device provided by this application.

具体实施方式Detailed ways

下面将结合本发明实施例中的附图，对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本发明一部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本发明保护的范围。The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of the present invention.

目前，目前已有的眼底疾病图像识别系统大都基于普通眼底彩超超广角眼底影像进行识别，此类超广角眼底影像的视网膜眼底可视范围较窄，通常只有30°-75°。较为狭窄的可视区域不能提供完整的眼底信息，会增加误识别的概率。而超广角眼底超广角眼底影像的获取难度较小，且眼底可视范围能达到200°，对于眼底病的判断可依据的信息更多，判断精度更高。但是当前基于超广角的眼底超广角眼底影像识别都是采用人工阅片的方式，受临床医师的数量和临床经验限制较大，人工阅片时图像识别错误的概率较高且超广角眼底影像识别的精度较低。为此，本申请提供了一种超广角眼底影像识别方法，能够降低人工阅片而导致误识别的概率，提高超广角眼底影像识别的精度。At present, most of the existing fundus disease image recognition systems are based on ordinary fundus color ultrasound ultra-wide-angle fundus images. The visible range of the retinal fundus in such ultra-wide-angle fundus images is narrow, usually only 30°-75°. A relatively narrow visual area cannot provide complete fundus information, which will increase the probability of misidentification. Ultra-wide-angle fundus images are less difficult to obtain, and the visual range of the fundus can reach 200°. The judgment of fundus diseases can be based on more information and the judgment accuracy is higher. However, current ultra-wide-angle fundus image recognition based on ultra-wide-angle fundus images uses manual reading, which is greatly limited by the number of clinicians and clinical experience. The probability of image recognition errors during manual reading is high and ultra-wide-angle fundus image recognition The accuracy is lower. To this end, this application provides an ultra-wide-angle fundus image recognition method, which can reduce the probability of misidentification caused by manual reading and improve the accuracy of ultra-wide-angle fundus image recognition.

本发明实施例公开了一种超广角眼底影像识别方法，参见图1所示，该方法包括：An embodiment of the present invention discloses an ultra-wide-angle fundus image recognition method, as shown in Figure 1. The method includes:

步骤S11：对原始超广角眼底影像进行图像预处理操作，以得到预处理图像。Step S11: Perform an image preprocessing operation on the original ultra-wide-angle fundus image to obtain a preprocessed image.

本实施例中，对原始超广角眼底影像进行图像预处理操作，以得到预处理图像。具体的，基于自适应ROI区域粗提取组件从所述原始超广角眼底影像中提取目标区域；其中，所述目标区域为由满足预设像素条件的像素点构成的区域；判断所述目标区域是否达到预设区域质量要求；若所述目标区域达到所述预设区域质量要求，则基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像；若所述目标区域未达到所述预设区域质量要求或未提取到所述目标区域，则进行最大椭圆近似拟合，以得到拟合区域；将所述拟合区域确定为所述目标区域，并重新进入所述基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像的步骤。可以理解的是，所述目标区域为由满足预设像素条件的像素点构成的区域，例如，超广角眼底影像本身的特性为像素值，则将大于预设像素阈值的像素点确定为足预设像素条件的像素点。In this embodiment, an image preprocessing operation is performed on the original ultra-wide-angle fundus image to obtain a preprocessed image. Specifically, the target area is extracted from the original ultra-wide-angle fundus image based on the adaptive ROI area rough extraction component; wherein the target area is an area composed of pixels that meet preset pixel conditions; it is determined whether the target area Reach the preset area quality requirements; if the target area reaches the preset area quality requirements, adjust the brightness of the target area based on the preset brightness adjustment rules to obtain the preprocessed image; if the target area does not If the preset area quality requirements are reached or the target area is not extracted, then maximum ellipse approximate fitting is performed to obtain the fitting area; the fitting area is determined as the target area, and re-enters the based on The step of adjusting the brightness of the target area by a preset brightness adjustment rule to obtain the preprocessed image. It can be understood that the target area is an area composed of pixels that meet preset pixel conditions. For example, if the characteristics of the ultra-wide-angle fundus image itself are pixel values, then the pixels that are greater than the preset pixel threshold are determined to be sufficient for the preset pixels. Set the pixel point of the pixel condition.

可以理解的是，基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像，包括：获取所述目标区域的当前亮度，并确定所述当前亮度对应的当前亮度区间；基于预设亮度调整策略确定规则确定所述当前亮度区间对应的当前亮度调整策略；利用所述当前亮度调整策略调整所述目标区域的亮度，以得到所述预处理图像。进行最大椭圆近似拟合，以得到拟合区域，包括：根据所述原始超广角眼底影像的长与宽计算对应的长轴长信息以及短轴长信息；基于基准点信息确定中心点位置信息；基于所述中心点位置信息、所述长轴长信息以及所述短轴长信息生成椭圆区域，并将所述椭圆区域确定为所述拟合区域。It can be understood that adjusting the brightness of the target area based on a preset brightness adjustment rule to obtain the preprocessed image includes: obtaining the current brightness of the target area and determining the current brightness interval corresponding to the current brightness; The current brightness adjustment strategy corresponding to the current brightness interval is determined based on the preset brightness adjustment strategy determination rule; the brightness of the target area is adjusted using the current brightness adjustment strategy to obtain the preprocessed image. Perform maximum ellipse approximate fitting to obtain the fitting area, including: calculating the corresponding major axis length information and minor axis length information based on the length and width of the original ultra-wide-angle fundus image; determining the center point position information based on the reference point information; An elliptical area is generated based on the center point position information, the major axis length information, and the minor axis length information, and the elliptical area is determined as the fitting area.

如图2所示，本发明分为三个主要模块：图像预处理模块(包含自适应ROI区域粗提取、最大内接椭圆近似拟合以及自适应亮度调整)；特征提取与编码模块；特征解码与病种预测模块。在图像预处理模块之前，获取原超广角眼底影响，构建6分类训练集和测试集，进行数据平衡操作，保证每类数据量接近于1:1。特征提取与编码模块中，利用它ResNet50进行特征提取，先在ImageNet开源数据集上预训练，得到初始化权重，然后在超广角数据集的训练中对初始化权重进行调整。特征解码与病种预测模块中，接收ResNet50编码的特征向量，通过双支路模型解码特征并输出预测结果。模块之间联系紧密又互相解耦，即上一个模块的输出是下一个模块的输入，但每个模块在执行相应功能时相互独立，整个系统实现的功能是，输入一张超广角眼底超广角眼底影像，经过一系列处理，输出该超广角眼底影像最高概率值对应的眼底病种。As shown in Figure 2, the present invention is divided into three main modules: image preprocessing module (including adaptive ROI area rough extraction, maximum inscribed ellipse approximate fitting and adaptive brightness adjustment); feature extraction and encoding module; feature decoding and disease prediction module. Before the image preprocessing module, the original ultra-wide-angle fundus effect is obtained, a 6-category training set and a test set are constructed, and data balancing operations are performed to ensure that the amount of data for each category is close to 1:1. In the feature extraction and encoding module, ResNet50 is used for feature extraction. It is first pre-trained on the ImageNet open source data set to obtain the initialization weights, and then the initialization weights are adjusted during training on the ultra-wide-angle data set. In the feature decoding and disease prediction module, the feature vector encoded by ResNet50 is received, the features are decoded through the dual-branch model and the prediction results are output. The modules are closely connected and decoupled from each other, that is, the output of the previous module is the input of the next module, but each module is independent of each other when performing the corresponding function. The function of the entire system is to input an ultra-wide-angle fundus ultra-wide-angle After a series of processing of the fundus image, the fundus disease corresponding to the highest probability value of the ultra-wide-angle fundus image is output.

在图像预处理模块中构建一个能自动定位超广角眼底影像眼底区域并分割的模块，对于个别原始图像亮度过低或者畸变较严重的超广角眼底影像，增加一个内接椭圆形拟合模块，最大程度对这些难正确定位的超广角眼底影像进行眼底区域拟合，保证整体数据的强可用性。该模块包含3个组件：自适应ROI区域粗提取组件、最大内接椭圆近似拟合、自适应亮度调整。可以理解的是，超广角眼底超广角眼底影像进入自适应ROI区域粗提取组件基于机器学习和传统图像处理方法，先构建一个KMENAS聚类模型，设置其n_clusters超参数为2，表示二簇聚类任务，KMEANS是一种无监督聚类算法，根据超广角眼底影像本身的特性(即眼底区域的像素值相对较为接近，而其他区域的像素值相对变化较大)，然后通过KMEANS自动捕捉该特性，将眼底区域和其他区域分成2个不同的簇，对于绝大多数拍摄质量正常的超广角眼底影像都可以正常实现眼底和背景分割，从而实现了ROI区域粗提取。然后根据上一步ROI区域提取的效果决定是否进行最大椭圆近似拟合，如图3所示，如果第一步提取的ROI区域质量较差(例如未提取到ROI、ROI区域面积或位置不符合预期)则进一步采用第二个组件进行处理，该组件先根据原始图像的长宽计算出超广角眼底影像的长轴长l、短轴长s(例如，取原始图像的长的一半作为长轴长，取原始图像的宽的一半作为短轴长)，并获取中心点位置(x，y)(通常为预设的基准点位置，例如(0，0))，生成一个该眼底影像最大内接椭圆mask，将原图中对应的mask区域分割出来作为近似拟合，将近似拟合区域确定为粗提取的ROI区域(即目标区域)。最后，基于opencv提供的亮度调整算法，先计算输入图像，即目标区域的亮度，然后判断亮度值处于哪个阶(阶指的是亮度阈值，通过多次人工测算得知)，对于不同阶的亮度采用不同程度的亮度提升策略，直至最后输出的图像亮度基本处于相同水平。该组件主要目的是点亮较为昏暗的超广角眼底影像，同时降低亮度过高的超广角眼底影像亮度，最终提升后续模型的预测精度。In the image preprocessing module, a module is constructed that can automatically locate and segment the fundus area of ultra-wide-angle fundus images. For individual ultra-wide-angle fundus images where the brightness of the original image is too low or the distortion is severe, an inscribed elliptical fitting module is added, with a maximum It can perform fundus area fitting on these ultra-wide-angle fundus images that are difficult to position correctly to ensure strong usability of the overall data. This module contains 3 components: adaptive ROI area rough extraction component, maximum inscribed ellipse approximate fitting, and adaptive brightness adjustment. It can be understood that the ultra-wide-angle fundus image enters the adaptive ROI area rough extraction component. Based on machine learning and traditional image processing methods, a KMENAS clustering model is first built and its n_clusters hyperparameter is set to 2, indicating two-cluster clustering. Task, KMEANS is an unsupervised clustering algorithm, based on the characteristics of the ultra-wide-angle fundus image itself (that is, the pixel values in the fundus area are relatively close, while the pixel values in other areas have relatively large changes), and then automatically capture this characteristic through KMEANS , divide the fundus area and other areas into two different clusters. For most ultra-wide-angle fundus images with normal shooting quality, fundus and background segmentation can be achieved normally, thus achieving rough extraction of the ROI area. Then decide whether to perform maximum ellipse approximate fitting based on the effect of ROI region extraction in the previous step. As shown in Figure 3, if the quality of the ROI region extracted in the first step is poor (for example, no ROI is extracted, the area or position of the ROI region does not meet expectations ) is further processed using the second component, which first calculates the long axis length l and short axis length s of the ultra-wide-angle fundus image based on the length and width of the original image (for example, take half the length of the original image as the long axis length , take half the width of the original image as the short axis length), and obtain the center point position (x, y) (usually the preset reference point position, such as (0, 0)), and generate a maximum inscribed of the fundus image For the elliptical mask, the corresponding mask area in the original image is segmented as an approximate fitting, and the approximate fitting area is determined as the roughly extracted ROI area (i.e., the target area). Finally, based on the brightness adjustment algorithm provided by opencv, first calculate the brightness of the input image, that is, the target area, and then determine which order the brightness value is in (the order refers to the brightness threshold, which is known through multiple manual measurements). For different levels of brightness Different degrees of brightness improvement strategies are adopted until the brightness of the final output image is basically at the same level. The main purpose of this component is to light up the dim ultra-wide-angle fundus images, while reducing the brightness of the ultra-wide-angle fundus images that are too bright, and ultimately improve the prediction accuracy of subsequent models.

如上述图3所示，若所述目标区域达到所述预设区域质量要求，或，执行完最大内接椭圆近似拟合操作得到近似拟合区域，则基于预设亮度调整规则调整所述目标区域的亮度。As shown in Figure 3 above, if the target area meets the preset area quality requirements, or the approximate fitting area is obtained after performing the maximum inscribed ellipse approximate fitting operation, the target is adjusted based on the preset brightness adjustment rules. The brightness of the area.

步骤S12：利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息。Step S12: Use the pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target coding information.

本实施例中，对原始超广角眼底影像进行图像预处理操作，以得到预处理图像之后，利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息。具体的，利用具备所述目标权重的所述预训练的特征提取模型提取所述预处理图像中达到预设特征要求的特征信息；以向量的形式对所述特征信息进行编码，以得到所述目标编码信息。In this embodiment, after performing an image preprocessing operation on the original ultra-wide-angle fundus image to obtain a preprocessed image, a pretrained feature extraction model is used to perform feature extraction on the preprocessed image to obtain target encoding information. Specifically, the pre-trained feature extraction model with the target weight is used to extract feature information that meets the preset feature requirements in the pre-processed image; the feature information is encoded in the form of a vector to obtain the Target encodes information.

可以理解的是，在利用预训练的特征提取模型对所述预处理图像进行特征提取之前，利用开源数据集训练原始特征提取模型，以得到具备初始权重的初始特征提取模型；利用超广角眼底影像数据集训练所述初始特征提取模型，以得到具备目标权重的所述预训练的特征提取模型。例如，ResNet50特征提取模块采用开源模型ResNet50，先在预训练数据集(即开源数据集)ImageNet上训练该模型，因为该数据集包含了超过1400万张图像，涵盖的类别数有1000类，大数据量训练后的ResNet50具有较好的初始权重，可以在其他个性化任务中表现出比随机权重初始化的ResNet50好很多的性能。利用具备初始权重的初始特征提取模型在当前超广角眼底影像数据集上进行训练，得到具备目标权重的所述预训练的特征提取模型，上述预训练的特征提取模型可以高效抽取到眼底特征，便于后续模块对这些特征进行解码。It can be understood that before using the pre-trained feature extraction model to perform feature extraction on the pre-processed image, the open source data set is used to train the original feature extraction model to obtain an initial feature extraction model with initial weights; the ultra-wide-angle fundus image is used The initial feature extraction model is trained on the data set to obtain the pre-trained feature extraction model with target weights. For example, the ResNet50 feature extraction module uses the open source model ResNet50. The model is first trained on the pre-training data set (i.e., the open source data set) ImageNet, because this data set contains more than 14 million images and covers 1,000 categories. ResNet50 trained with a large amount of data has better initial weights and can show much better performance in other personalized tasks than ResNet50 initialized with random weights. The initial feature extraction model with initial weights is used to train on the current ultra-wide-angle fundus image data set to obtain the pre-trained feature extraction model with target weights. The above-mentioned pre-trained feature extraction model can efficiently extract fundus features, which is convenient for Subsequent modules decode these features.

需要指出的是，如上述图2所示，在利用具备初始权重的初始特征提取模型在当前超广角眼底影像数据集上进行训练时，获取当前超广角眼底影像数据集，并构建6分类训练集和测试集，同时进行数据平衡操作，使得每类数据量接近于1:1。It should be pointed out that, as shown in Figure 2 above, when using the initial feature extraction model with initial weights to train on the current ultra-wide-angle fundus image data set, the current ultra-wide-angle fundus image data set is obtained and a 6-category training set is constructed. and test set, perform data balancing operations at the same time, so that the amount of data of each type is close to 1:1.

步骤S13：通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合。Step S13: Decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model.

本实施例中，利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息，通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合。具体的，将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道；获取所述第一通道输出的第一输出信息以及所述第二通道输出的第二输出信息；将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层；通过所述堆叠后特征层执行预设预测操作，以输出所述概率值信息集合。In this embodiment, a pre-trained feature extraction model is used to perform feature extraction on the pre-processed image to obtain target encoding information, and the target encoding information is decoded through a bilinear decoder model to obtain the bilinear decoding The set of probability value information output by the controller model. Specifically, the target encoding information is input to the first channel and the second channel of the bilinear decoder model; the first output information output by the first channel and the second output information output by the second channel are obtained. Output information; perform spatial dimension feature stacking on the first output information and the second output information to obtain a stacked feature layer; perform a preset prediction operation through the stacked feature layer to output the probability value information gather.

步骤S14：将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。Step S14: Determine the category with the largest probability value in the probability value information set as the final predicted category corresponding to the original ultra-wide-angle fundus image.

本实施例中，通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合之后，将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。可以理解的是，每张图像预测6个概率值，分别对应玻璃体混浊，黄斑病变，糖尿病性视网膜病变，青光眼，其他眼底病，正常这6个类别的概率，取概率值最大的类别作为模型最终的输出类别。每张图像预测的概率值数量以及类别可以根据不同的情况自行设置。In this embodiment, after decoding the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model, the category with the largest probability value in the probability value information set is The final prediction category corresponding to the original ultra-wide-angle fundus image is determined. It can be understood that each image predicts 6 probability values, corresponding to the probabilities of the six categories of vitreous opacity, macular degeneration, diabetic retinopathy, glaucoma, other fundus diseases, and normal. The category with the largest probability value is selected as the final model. output category. The number of predicted probability values and categories for each image can be set according to different situations.

本发明真实临床数据实验性能较好，6分类模型在上万张欧堡超广角眼底影像数据上取得了86.2％的Acc，86.1％的特异性，97.3％的灵敏度和86％的f1值，目前即使是具有丰富临床经验的眼科专家，也要结合病人的年龄，性别，病史等信息才能对超广角眼底影像识别，并根据识别结果确定疾病类别，且存在多名专家超广角眼底影像识别结果不一致的情况，直接通过人工阅片更是难以做到精确识别的。目前尚未有基于超广角眼底影像的多眼病辅助AI系统，现有的技术大部分只能执行二分类任务，例如青光眼/非青光眼，而现实情况是可能患者患病为黄斑病变，在这样的二分类模型中无法被检测出来，使得大多数现有技术无法真正用于临床来辅助医生超广角眼底影像识别，本发明建立了超广角眼底影像辅助系统研究范式，该模型建立了一整套多眼底病辅助超广角眼底影像识别研究范式，后续还可以进一步丰富可识别眼病类别，提升模型超广角眼底影像识别精度，同时搭建了基于注意力机制的双线性解码网络。该模型不仅在当前超广角眼底影像数据上取得了较高的测试精度，同时在基于眼底彩超的糖网分级任务上也取得了不错的效果，证明该模型可被进一步微调，扩展到更为丰富的图像识别与分类场景。The real clinical data experimental performance of the present invention is good. The 6-classification model has achieved 86.2% Acc, 86.1% specificity, 97.3% sensitivity and 86% f1 value on tens of thousands of Oubao ultra-wide-angle fundus image data. Currently, Even ophthalmologists with rich clinical experience must combine the patient's age, gender, medical history and other information to identify ultra-wide-angle fundus images, and determine the disease category based on the recognition results, and there are inconsistent ultra-wide-angle fundus image recognition results among multiple experts. In this case, it is even more difficult to accurately identify directly through manual reading of the film. Currently, there is no multi-eye disease auxiliary AI system based on ultra-wide-angle fundus images. Most of the existing technologies can only perform two-category tasks, such as glaucoma/non-glaucoma. The reality is that the patient may be suffering from macular degeneration. In such a two-category category, cannot be detected in the classification model, making most existing technologies unable to be truly used clinically to assist doctors in ultra-wide-angle fundus image recognition. The present invention establishes a research paradigm for an ultra-wide-angle fundus image auxiliary system. This model establishes a complete set of multi-fundus diseases. Assisting the ultra-wide-angle fundus image recognition research paradigm, it can further enrich the categories of identifiable eye diseases, improve the model's ultra-wide-angle fundus image recognition accuracy, and build a bilinear decoding network based on the attention mechanism. This model not only achieves high test accuracy on current ultra-wide-angle fundus imaging data, but also achieves good results on the glucose reticular classification task based on fundus color ultrasound, proving that the model can be further fine-tuned and expanded to richer Image recognition and classification scenarios.

本发明对原超广角眼底影像做数据预处理，其主要目的是从包含部分眼睑、眼睫毛和背景信息的原图中提取出仅含眼底的区域，方法是基于传统图像处理技术构建一个自动识别眼底ROI区域并提取该区域的图像预处理模块，原图经过该模块就会得到较为纯净的眼底图。其次，采用ResNet50深度学习模型进行图像特征编码，提取出眼底图中较为细致的像素级特征并以向量的形式进行编码。最后，对编码好的特征进行解码，通过构建一个双线性解码器模型，将特征向量进行多重复杂代数运算，最终输出6个概率值，6个概率值分别对应5种不同眼病和正常类别的概率，取概率值最大的类别作为输出类别，即超广角眼底影像识别结果。基于超广角眼底影像的多眼底病辅助影像识别系统，能够自动识别多种眼底疾病且具有较高的识别精度，可应用于临床进行早期眼底病筛查，给专业眼科医生提供可靠的超广角眼底影像识别参考，缓解当前一线临床医师的压力，也能降低人工阅片导致的误识别概率，提高阅片效率以及超广角眼底影像识别准确率。This invention performs data preprocessing on the original ultra-wide-angle fundus image. Its main purpose is to extract the area containing only the fundus from the original image containing part of the eyelids, eyelashes and background information. The method is to build an automatic fundus recognition system based on traditional image processing technology. ROI area and extract the image preprocessing module of this area. After passing the original image through this module, a relatively pure fundus image will be obtained. Secondly, the ResNet50 deep learning model is used to encode image features, and the more detailed pixel-level features in the fundus images are extracted and encoded in the form of vectors. Finally, the encoded features are decoded, and a bilinear decoder model is constructed to perform multiple complex algebraic operations on the feature vector, finally outputting 6 probability values, which correspond to 5 different eye diseases and normal categories respectively. Probability, take the category with the largest probability value as the output category, that is, the ultra-wide-angle fundus image recognition result. The multi-fundus disease auxiliary image recognition system based on ultra-wide-angle fundus images can automatically identify multiple fundus diseases with high recognition accuracy. It can be used in clinical early-stage fundus disease screening and provides professional ophthalmologists with reliable ultra-wide-angle fundus images. The image recognition reference relieves the pressure on current front-line clinicians, reduces the probability of misrecognition caused by manual reading, and improves the efficiency of reading and the accuracy of ultra-wide-angle fundus image recognition.

参见图4所示，本发明实施例公开了一种超广角眼底影像识别方法，相对于上一实施例，本实施例对技术方案作了进一步的说明和优化。As shown in Figure 4, an embodiment of the present invention discloses an ultra-wide-angle fundus image recognition method. Compared with the previous embodiment, this embodiment further explains and optimizes the technical solution.

步骤S21：对原始超广角眼底影像进行图像预处理操作，以得到预处理图像。Step S21: Perform an image preprocessing operation on the original ultra-wide-angle fundus image to obtain a preprocessed image.

步骤S22：利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息。Step S22: Use the pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target encoding information.

步骤S23：将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道。Step S23: Input the target encoding information to the first channel and the second channel of the bilinear decoder model.

本实施例中，将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道。具体的，将所述目标编码信息输入至所述双线性解码器模型的所述第一通道，根据所述第一通道中的卷积核进行同尺度卷积，以得到卷积后信息；基于所述卷积后信息以及所述目标编码信息计算当前特征层的第一注意力权重；若所述当前特征层为最后一层，则基于所述目标编码信息与上一层的所述第一注意力权重计算所述第一通道输出的所述第一输出信息；其中，所述第一输出信息为加权注意力特征；将所述目标编码信息输入至所述双线性解码器模型的所述第二通道，计算所述目标编码信息的平均值，以得到第二注意力权重；基于所述第二注意力权重以及所述目标编码信息计算所述第二通道输出的所述第二输出信息。In this embodiment, the target encoding information is input to the first channel and the second channel of the bilinear decoder model. Specifically, the target encoding information is input to the first channel of the bilinear decoder model, and the same-scale convolution is performed according to the convolution kernel in the first channel to obtain the convolved information; The first attention weight of the current feature layer is calculated based on the post-convolution information and the target coding information; if the current feature layer is the last layer, the first attention weight of the current feature layer is calculated based on the target coding information and the first attention weight of the previous layer. An attention weight calculates the first output information output by the first channel; wherein the first output information is a weighted attention feature; inputting the target encoding information to the bilinear decoder model The second channel calculates the average value of the target encoding information to obtain a second attention weight; calculates the second output of the second channel based on the second attention weight and the target encoding information. Output information.

可以理解的是，如图5所示，双线性特征解码预测模块采用2条并行通道注意力支路：AttentionNet1基于卷积支路、AttentionNet2基于池化支路，采用不同的注意力生成范式及初始权重，最大程度寻找到重要特征，对于重要程度不同的特征给予不同的通道权重，让模型有重点的训练和学习。最后将加权特征在空间维度进行融合，保证特征的完整性并进行后续的一系列卷积、池化、全连接层的处理得到最终的类别得分向量，对于每一个类别会给出对应的得分，经过softmax操作将得分以概率值形式进行输出，得到每个类别的概率，取概率值最大的类别作为最终的预测结果。It can be understood that, as shown in Figure 5, the bilinear feature decoding prediction module uses two parallel channel attention branches: AttentionNet1 is based on the convolution branch, and AttentionNet2 is based on the pooling branch, using different attention generation paradigms and Initial weights are used to find important features to the greatest extent, and different channel weights are given to features with different degrees of importance, so that the model can focus on training and learning. Finally, the weighted features are fused in the spatial dimension to ensure the integrity of the features and a subsequent series of convolution, pooling, and fully connected layers are processed to obtain the final category score vector. For each category, a corresponding score will be given. After the softmax operation, the score is output in the form of a probability value to obtain the probability of each category, and the category with the largest probability value is taken as the final prediction result.

AttentionNet1单元(即第一通道)基于卷积方式实现通道注意力机制。采用7×7大小的卷积核对上个模块ResNet50提取后的特征进行同尺度卷积，对每个特征层都将输出一个大小为1×1的浮点数，被确定为当前特征层的注意力权重，原ResNet50输出的(B，N，7，7)经过同尺度卷积后输出为(B，N，1，1)，B代表Batch size，N代表特征通道数，W，H分别代表图像宽高，每个特征层的注意力权重都将在下一阶段与原输出特征相乘得到加权注意力特征。The AttentionNet1 unit (i.e. the first channel) implements the channel attention mechanism based on convolution. A 7×7 convolution kernel is used to perform same-scale convolution on the features extracted by the previous module ResNet50. Each feature layer will output a floating point number of size 1×1, which is determined as the attention of the current feature layer. Weight, the (B, N, 7, 7) output by the original ResNet50 is output as (B, N, 1, 1) after the same scale convolution. B represents the Batch size, N represents the number of feature channels, and W and H represent the image respectively. Width and height, the attention weight of each feature layer will be multiplied with the original output feature in the next stage to obtain the weighted attention feature.

I∈(B，N，W，H)；I∈(B,N,W,H);

K₁＝W×H；K ₁ =W×H;

Attention₁＝I*K₁∈(B，N，1，1)；Attention ₁ =I*K ₁ ∈ (B, N, 1, 1);

Out₁＝I×Attention₁∈(B，N，W，H)；Out ₁ =I×Attention ₁ ∈ (B, N, W, H);

I为目标编码信息，Attention₁为第一注意力权重，Out₁为第一输出信息。I is the target encoding information, Attention ₁ is the first attention weight, and Out ₁ is the first output information.

AttentionNet2单元(即第二通道)基于全局平均池化的方式实现通道注意力机制。将7×7的特征矩阵求取平均值作为该特征的注意力权重，因为在重要特征层中绝大多数细节特征都将具有较高像素值，因为平均后得到的权重值也较大，可认为在后续的训练过程中将得到更大程度的关注，同理，背景等特征大部分像素值趋于0，平均值也较小，在后续训练过程中将被逐渐忽视其影响。The AttentionNet2 unit (i.e., the second channel) implements the channel attention mechanism based on global average pooling. The average value of the 7×7 feature matrix is used as the attention weight of the feature, because most detailed features in the important feature layer will have higher pixel values, because the weight value obtained after averaging is also larger, which can be It is believed that greater attention will be paid in the subsequent training process. Similarly, most pixel values of background and other features tend to be 0, and the average value is also small, and their influence will be gradually ignored in the subsequent training process.

I∈(B，N，W，H)；Attention2＝GlobalAvgPool(I)∈(B，N，1，1)；I∈(B,N,W,H);Attention2=GlobalAvgPool(I)∈(B,N,1,1);

Out₂＝I×Attention₂∈(B，N，W，H)；Out ₂ =I×Attention ₂ ∈ (B, N, W, H);

I为目标编码信息，Attention₂为第二注意力权重，Out₂为第二输出信息。I is the target encoding information, Attention ₂ is the second attention weight, and Out ₂ is the second output information.

步骤S24：获取所述第一通道输出的第一输出信息以及所述第二通道输出的第二输出信息，将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层。Step S24: Obtain the first output information output by the first channel and the second output information output by the second channel, and perform spatial dimension feature stacking on the first output information and the second output information to obtain Stacked feature layers.

本实施例中，将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道之后，获取所述第一通道输出的第一输出信息以及所述第二通道输出的第二输出信息，将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层。可以理解的是，FeatureFusion单元采用空间维度特征堆叠，将Out₁和Out₂的特征层堆叠在一起，将会变成2N个特征层。In this embodiment, after inputting the target encoding information into the first channel and the second channel of the bilinear decoder model, the first output information output by the first channel and the second channel output are obtained. The second output information is stacked with spatial dimension features on the first output information and the second output information to obtain a stacked feature layer. It can be understood that the FeatureFusion unit uses spatial dimension feature stacking. When the feature layers of Out ₁ and Out ₂ are stacked together, it will become 2N feature layers.

步骤S25：通过所述堆叠后特征层执行预设预测操作，以输出所述概率值信息集合。Step S25: Perform a preset prediction operation through the stacked feature layer to output the probability value information set.

本实施例中，将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层之后，通过所述堆叠后特征层执行预设预测操作，以输出所述概率值信息集合。可以理解的是，将Out₁和Out₂的特征层堆叠在一起，将会变成2N个特征层，保留所有加权后的特征层并进行后续的一系列卷积、池化、全连接层之后得到最终类别概率。In this embodiment, after the first output information and the second output information are stacked with spatial dimension features to obtain a stacked feature layer, a preset prediction operation is performed through the stacked feature layer to output the A collection of probability value information. It can be understood that stacking the feature layers of Out ₁ and Out ₂ together will become 2N feature layers, retaining all weighted feature layers and performing a subsequent series of convolution, pooling, and fully connected layers. Get the final class probability.

F＝Concatenate(Out₁，Out₂)∈(B，2N，W，H)；F=Concatenate(Out ₁ , Out ₂ )∈(B, 2N, W, H);

Score＝FC(AvgPool(ConvBlock(F)))∈(B，6，1，1)；Score=FC(AvgPool(ConvBlock(F)))∈(B, 6, 1, 1);

Out＝SoftMax(Score)∈(B，6，1，1)。Out=SoftMax(Score)∈(B, 6, 1, 1).

最终的Out表示对于B张输入图像，每张图像预测6个概率值，6个概率值分别对应玻璃体混浊，黄斑病变，糖尿病性视网膜病变，青光眼，其他眼底病，正常这6个类别的概率，取概率值最大的类别作为模型最终的输出类别。The final Out indicates that for B input images, each image predicts 6 probability values, and the 6 probability values correspond to the probabilities of the six categories of vitreous opacity, macular degeneration, diabetic retinopathy, glaucoma, other fundus diseases, and normal. The category with the largest probability value is taken as the final output category of the model.

步骤S26：将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。Step S26: Determine the category with the largest probability value in the probability value information set as the final predicted category corresponding to the original ultra-wide-angle fundus image.

关于上述步骤S21、S22、S26的具体内容可以参考前述实施例中公开的相应内容，在此不再进行赘述。Regarding the specific content of the above steps S21, S22, and S26, reference may be made to the corresponding content disclosed in the foregoing embodiments and will not be described again here.

可见，本申请实施例通过对原始超广角眼底影像进行图像预处理操作，以得到预处理图像；利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息；将所述目标编码信息输入至所述双线性解码器模型的第一通道与第二通道；获取所述第一通道输出的第一输出信息以及所述第二通道输出的第二输出信息；将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层；通过所述堆叠后特征层执行预设预测操作，以输出所述概率值信息集合；将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别，降低人工阅片而导致误识别的概率，提高了超广角眼底影像识别的精度。It can be seen that the embodiment of the present application performs image preprocessing on the original ultra-wide-angle fundus image to obtain a preprocessed image; uses a pretrained feature extraction model to perform feature extraction on the preprocessed image to obtain target encoding information; The target encoding information is input to the first channel and the second channel of the bilinear decoder model; the first output information output by the first channel and the second output information output by the second channel are obtained; The first output information and the second output information are stacked with spatial dimension features to obtain a stacked feature layer; a preset prediction operation is performed through the stacked feature layer to output the probability value information set; The category with the largest probability value in the probability value information set is determined as the final predicted category corresponding to the original ultra-wide-angle fundus image, which reduces the probability of misrecognition caused by manual reading and improves the accuracy of ultra-wide-angle fundus image recognition.

参见图6所示，本申请实施例还相应公开了一种超广角眼底影像识别装置，包括：Referring to Figure 6, the embodiment of the present application also discloses an ultra-wide-angle fundus image recognition device, which includes:

图像预处理模块11，用于对原始超广角眼底影像进行图像预处理操作，以得到预处理图像；The image preprocessing module 11 is used to perform image preprocessing operations on the original ultra-wide-angle fundus image to obtain a preprocessed image;

特征提取模块12，用于利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息；The feature extraction module 12 is used to perform feature extraction on the pre-processed image using a pre-trained feature extraction model to obtain target encoding information;

解码模块13，用于通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合；Decoding module 13, configured to decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model;

预测类别确定模块14，用于将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。The prediction category determination module 14 is configured to determine the category with the largest probability value in the probability value information set as the final prediction category corresponding to the original ultra-wide-angle fundus image.

可见，本申请包括：对原始超广角眼底影像进行图像预处理操作，以得到预处理图像；利用预训练的特征提取模型对所述预处理图像进行特征提取，以得到目标编码信息；通过双线性解码器模型解码所述目标编码信息，以得到所述双线性解码器模型输出的概率值信息集合；将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。由此可见，本申请通过预训练的特征提取模型对所述预处理图像进行特征提取，得到特征向量，然后利用双线性解码器模型将特征向量进行多重复杂代数运算，输出多个概率值，通过比较概率值的大小确定最终的识别结果，降低人工阅片而导致误识别的概率，提高了超广角眼底影像识别的精度。It can be seen that this application includes: performing image pre-processing operations on the original ultra-wide-angle fundus image to obtain a pre-processed image; using a pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target encoding information; through double lines The linear decoder model decodes the target encoding information to obtain a probability value information set output by the bilinear decoder model; determines the category with the largest probability value in the probability value information set as the original ultra-wide-angle fundus image The corresponding final prediction category. It can be seen that this application uses a pre-trained feature extraction model to extract features from the pre-processed image to obtain a feature vector, and then uses a bilinear decoder model to perform multiple complex algebraic operations on the feature vector to output multiple probability values. The final recognition result is determined by comparing the probability values, which reduces the probability of misrecognition caused by manual reading and improves the accuracy of ultra-wide-angle fundus image recognition.

在一些具体实施例中，所述图像预处理模块11，具体包括：In some specific embodiments, the image preprocessing module 11 specifically includes:

目标区域提取单元，用于基于自适应ROI区域粗提取组件从所述原始超广角眼底影像中提取目标区域；其中，所述目标区域为由满足预设像素条件的像素点构成的区域；A target area extraction unit, configured to extract a target area from the original ultra-wide-angle fundus image based on an adaptive ROI area rough extraction component; wherein the target area is an area composed of pixels that meet preset pixel conditions;

目标区域质量判断单元，用于判断所述目标区域是否达到预设区域质量要求；A target area quality judgment unit, used to judge whether the target area meets the preset area quality requirements;

当前亮度获取单元，用于若所述目标区域达到所述预设区域质量要求，则获取所述目标区域的当前亮度；A current brightness acquisition unit, configured to acquire the current brightness of the target area if the target area meets the preset area quality requirements;

亮度区间确定单元，用于确定所述当前亮度对应的当前亮度区间；A brightness interval determination unit, used to determine the current brightness interval corresponding to the current brightness;

当前亮度调整策略确定单元，用于基于预设亮度调整策略确定规则确定所述当前亮度区间对应的当前亮度调整策略；A current brightness adjustment strategy determination unit, configured to determine the current brightness adjustment strategy corresponding to the current brightness interval based on a preset brightness adjustment strategy determination rule;

亮度调整单元，用于利用所述当前亮度调整策略调整所述目标区域的亮度，以得到所述预处理图像；A brightness adjustment unit configured to adjust the brightness of the target area using the current brightness adjustment strategy to obtain the preprocessed image;

长轴短轴确定单元，用于若所述目标区域未达到所述预设区域质量要求或未提取到所述目标区域，则根据所述原始超广角眼底影像的长与宽计算对应的长轴长信息以及短轴长信息；A long axis and short axis determination unit, configured to calculate the corresponding long axis based on the length and width of the original ultra-wide-angle fundus image if the target area does not meet the preset area quality requirements or the target area is not extracted. long information as well as short axis length information;

中心点确定单元，用于基于基准点信息确定中心点位置信息；The center point determination unit is used to determine the center point position information based on the reference point information;

椭圆区域生成单元，用于基于所述中心点位置信息、所述长轴长信息以及所述短轴长信息生成椭圆区域，并将所述椭圆区域确定为所述拟合区域；An elliptical region generating unit, configured to generate an elliptical region based on the center point position information, the major axis length information, and the minor axis length information, and determine the elliptical region as the fitting region;

目标区域确定单元，用于将所述拟合区域确定为所述目标区域，并重新进入所述基于预设亮度调整规则调整所述目标区域的亮度，以得到所述预处理图像的步骤。A target area determination unit, configured to determine the fitting area as the target area, and re-enter the step of adjusting the brightness of the target area based on a preset brightness adjustment rule to obtain the preprocessed image.

在一些具体实施例中，所述特征提取模块12，具体包括：In some specific embodiments, the feature extraction module 12 specifically includes:

初始特征提取模型获取单元，用于利用开源数据集训练原始特征提取模型，以得到具备初始权重的初始特征提取模型；The initial feature extraction model acquisition unit is used to train the original feature extraction model using open source data sets to obtain an initial feature extraction model with initial weights;

预训练的特征提取模型获取单元，用于利用超广角眼底影像数据集训练所述初始特征提取模型，以得到具备目标权重的所述预训练的特征提取模型；A pre-trained feature extraction model acquisition unit is used to train the initial feature extraction model using an ultra-wide-angle fundus image data set to obtain the pre-trained feature extraction model with target weights;

特征信息获取单元，用于利用具备所述目标权重的所述预训练的特征提取模型提取所述预处理图像中达到预设特征要求的特征信息；A feature information acquisition unit, configured to use the pre-trained feature extraction model with the target weight to extract feature information that meets preset feature requirements in the pre-processed image;

编码单元，用于以向量的形式对所述特征信息进行编码，以得到所述目标编码信息。An encoding unit is used to encode the feature information in the form of a vector to obtain the target encoding information.

在一些具体实施例中，所述解码模块13，具体包括：In some specific embodiments, the decoding module 13 specifically includes:

卷积单元，用于将所述目标编码信息输入至所述双线性解码器模型的所述第一通道，根据所述第一通道中的卷积核进行同尺度卷积，以得到卷积后信息；A convolution unit, configured to input the target encoding information to the first channel of the bilinear decoder model, and perform same-scale convolution according to the convolution kernel in the first channel to obtain convolution post information;

第一注意力权重计算单元，用于基于所述卷积后信息以及所述目标编码信息计算当前特征层的第一注意力权重；A first attention weight calculation unit configured to calculate the first attention weight of the current feature layer based on the convolved information and the target encoding information;

第一输出信息生成单元，用于若所述当前特征层为最后一层，则基于所述目标编码信息与上一层的所述第一注意力权重计算所述第一通道输出的所述第一输出信息；其中，所述第一输出信息为加权注意力特征；A first output information generation unit configured to, if the current feature layer is the last layer, calculate the first output of the first channel based on the target encoding information and the first attention weight of the previous layer. An output information; wherein the first output information is a weighted attention feature;

第二注意力权重计算单元，用于将所述目标编码信息输入至所述双线性解码器模型的所述第二通道，计算所述目标编码信息的平均值，以得到第二注意力权重；A second attention weight calculation unit is used to input the target encoding information to the second channel of the bilinear decoder model, and calculate the average value of the target encoding information to obtain a second attention weight. ;

第二输出信息生成单元，用于基于所述第二注意力权重以及所述目标编码信息计算所述第二通道输出的所述第二输出信息；A second output information generation unit configured to calculate the second output information output by the second channel based on the second attention weight and the target encoding information;

输出信息获取单元，用于获取所述第一通道输出的第一输出信息以及所述第二通道输出的第二输出信息；An output information acquisition unit, configured to acquire the first output information output by the first channel and the second output information output by the second channel;

输出信息堆叠单元，用于将所述第一输出信息与所述第二输出信息进行空间维度特征堆叠，以得到堆叠后特征层；An output information stacking unit is used to stack spatial dimension features on the first output information and the second output information to obtain a stacked feature layer;

概率值信息集合输出单元，用于通过所述堆叠后特征层执行预设预测操作，以输出所述概率值信息集合。A probability value information set output unit is configured to perform a preset prediction operation through the stacked feature layer to output the probability value information set.

在一些具体实施例中，所述预测类别确定模块14，具体包括：In some specific embodiments, the prediction category determination module 14 specifically includes:

预测类别确定单元，用于将所述概率值信息集合中概率值最大的类别确定为所述原始超广角眼底影像对应的最终预测类别。A prediction category determination unit is configured to determine the category with the largest probability value in the probability value information set as the final prediction category corresponding to the original ultra-wide-angle fundus image.

进一步的，本申请实施例还提供了一种电子设备。图7是根据一示例性实施例示出的电子设备20结构图，图中的内容不能认为是对本申请的使用范围的任何限制。Further, the embodiment of the present application also provides an electronic device. FIG. 7 is a structural diagram of the electronic device 20 according to an exemplary embodiment. The content in the figure cannot be considered to limit the scope of the present application.

图7为本申请实施例提供的一种电子设备20的结构示意图。该电子设备20，具体可以包括：至少一个处理器21、至少一个存储器22、电源23、通信接口24、输入输出接口25和通信总线26。其中，所述存储器22用于存储计算机程序，所述计算机程序由所述处理器21加载并执行，以实现前述任一实施例公开的超广角眼底影像识别方法中的相关步骤。另外，本实施例中的电子设备20具体可以为电子计算机。FIG. 7 is a schematic structural diagram of an electronic device 20 provided by an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input-output interface 25 and a communication bus 26. The memory 22 is used to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the ultra-wide-angle fundus image recognition method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may specifically be an electronic computer.

本实施例中，电源23用于为电子设备20上的各硬件设备提供工作电压；通信接口24能够为电子设备20创建与外界设备之间的数据传输通道，其所遵循的通信协议是能够适用于本申请技术方案的任意通信协议，在此不对其进行具体限定；输入输出接口25，用于获取外界输入数据或向外界输出数据，其具体的接口类型可以根据具体应用需要进行选取，在此不进行具体限定。In this embodiment, the power supply 23 is used to provide working voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows can be applicable Any communication protocol of the technical solution of this application is not specifically limited here; the input and output interface 25 is used to obtain external input data or output data to the external world, and its specific interface type can be selected according to specific application needs. Here No specific limitation is made.

另外，存储器22作为资源存储的载体，可以是只读存储器、随机存储器、磁盘或者光盘等，其上所存储的资源可以包括操作系统221、计算机程序222等，存储方式可以是短暂存储或者永久存储。In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, a random access memory, a magnetic disk or an optical disk, etc. The resources stored thereon can include an operating system 221, a computer program 222, etc., and the storage method can be short-term storage or permanent storage. .

其中，操作系统221用于管理与控制电子设备20上的各硬件设备以及计算机程序222，其可以是Windows Server、Netware、Unix、Linux等。计算机程序222除了包括能够用于完成前述任一实施例公开的由电子设备20执行的超广角眼底影像识别方法的计算机程序之外，还可以进一步包括能够用于完成其他特定工作的计算机程序。Among them, the operating system 221 is used to manage and control each hardware device and the computer program 222 on the electronic device 20, which can be Windows Server, Netware, Unix, Linux, etc. In addition to computer programs that can be used to complete the ultra-wide-angle fundus image recognition method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs that can be used to complete other specific tasks.

进一步的，本申请实施例还公开了一种存储介质，所述存储介质中存储有计算机程序，所述计算机程序被处理器加载并执行时，实现前述任一实施例公开的超广角眼底影像识别方法步骤。Further, embodiments of the present application also disclose a storage medium, which stores a computer program. When the computer program is loaded and executed by a processor, the ultra-wide-angle fundus image recognition disclosed in any of the foregoing embodiments is realized. Method steps.

本说明书中各个实施例采用递进的方式描述，每个实施例重点说明的都是与其它实施例的不同之处，各个实施例之间相同或相似部分互相参见即可。对于实施例公开的装置而言，由于其与实施例公开的方法相对应，所以描述的比较简单，相关之处参见方法部分说明即可。Each embodiment in this specification is described in a progressive manner. Each embodiment focuses on its differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other. As for the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple. For relevant details, please refer to the description in the method section.

最后，还需要说明的是，在本文中，诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来，而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且，术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含，从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素，而且还包括没有明确列出的其他要素，或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下，由语句“包括一个……”限定的要素，并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。Finally, it should be noted that in this article, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply that these entities or any such actual relationship or sequence between operations. Furthermore, the terms "comprises," "comprises," or any other variations thereof are intended to cover a non-exclusive inclusion such that a process, method, article, or apparatus that includes a list of elements includes not only those elements, but also those not expressly listed other elements, or elements inherent to the process, method, article or equipment. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method, article, or apparatus that includes the stated element.

以上对本发明所提供的一种超广角眼底影像识别方法、装置、设备及存储介质进行了详细介绍，本文中应用了具体个例对本发明的原理及实施方式进行了阐述，以上实施例的说明只是用于帮助理解本发明的方法及其核心思想；同时，对于本领域的一般技术人员，依据本发明的思想，在具体实施方式及应用范围上均会有改变之处，综上所述，本说明书内容不应理解为对本发明的限制。The above is a detailed introduction to the ultra-wide-angle fundus image recognition method, device, equipment and storage medium provided by the present invention. This article uses specific examples to illustrate the principles and implementation methods of the present invention. The description of the above embodiments is only It is used to help understand the method and its core idea of the present invention; at the same time, for those of ordinary skill in the field, there will be changes in the specific implementation and application scope according to the idea of the present invention. In summary, this The content of the description should not be construed as limiting the invention.

Claims

1. An ultra-wide-angle fundus image recognition method, which is characterized by including:

Perform image preprocessing operations on the original ultra-wide-angle fundus image to obtain the preprocessed image;

Using a pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target encoding information;

Decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model;

The category with the largest probability value in the probability value information set is determined as the final predicted category corresponding to the original ultra-wide-angle fundus image.

2. The ultra-wide-angle fundus image recognition method according to claim 1, characterized in that the image pre-processing operation is performed on the original ultra-wide-angle fundus image to obtain the pre-processed image, including:

Extract a target area from the original ultra-wide-angle fundus image based on an adaptive ROI area rough extraction component; wherein the target area is an area composed of pixels that meet preset pixel conditions;

Determine whether the target area meets the preset area quality requirements;

If the target area meets the preset area quality requirements, adjust the brightness of the target area based on the preset brightness adjustment rules to obtain the preprocessed image;

If the target area does not meet the preset area quality requirements or the target area is not extracted, then perform maximum ellipse approximate fitting to obtain the fitting area;

Determine the fitting area as the target area, and re-enter the step of adjusting the brightness of the target area based on a preset brightness adjustment rule to obtain the preprocessed image.

3. The ultra-wide-angle fundus image recognition method according to claim 2, wherein the adjusting the brightness of the target area based on a preset brightness adjustment rule to obtain the pre-processed image includes:

Obtain the current brightness of the target area, and determine the current brightness interval corresponding to the current brightness;

Determine the current brightness adjustment strategy corresponding to the current brightness interval based on the preset brightness adjustment strategy determination rules;

The current brightness adjustment strategy is used to adjust the brightness of the target area to obtain the preprocessed image.

4. The ultra-wide-angle fundus image recognition method according to claim 2, wherein the maximum ellipse approximate fitting is performed to obtain the fitting area, including:

Calculate the corresponding long axis length information and short axis length information according to the length and width of the original ultra-wide-angle fundus image;

Determine the center point position information based on the reference point information;

An elliptical area is generated based on the center point position information, the major axis length information, and the minor axis length information, and the elliptical area is determined as the fitting area.

5. The ultra-wide-angle fundus image recognition method according to claim 1, characterized in that, before using a pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain the target encoding information, it also includes:

Use open source data sets to train the original feature extraction model to obtain an initial feature extraction model with initial weights;

Using the ultra-wide-angle fundus image data set to train the initial feature extraction model to obtain the pre-trained feature extraction model with target weights;

Correspondingly, using the pre-trained feature extraction model to perform feature extraction on the pre-processed image to obtain target encoding information includes:

Using the pre-trained feature extraction model with the target weight to extract feature information that meets the preset feature requirements in the pre-processed image;

The feature information is encoded in the form of a vector to obtain the target encoding information.

6. The ultra-wide-angle fundus image recognition method according to claim 1, characterized in that the target encoding information is decoded through a bilinear decoder model to obtain the probability value output by the bilinear decoder model. Collection of information including:

Input the target encoding information to the first channel and the second channel of the bilinear decoder model;

Obtain the first output information output by the first channel and the second output information output by the second channel;

Perform spatial dimension feature stacking on the first output information and the second output information to obtain a stacked feature layer;

A preset prediction operation is performed through the stacked feature layer to output the probability value information set.

7. The ultra-wide-angle fundus image recognition method according to claim 6, wherein the input of the target encoding information to the first channel and the second channel of the bilinear decoder model includes:

Input the target encoding information into the first channel of the bilinear decoder model, and perform same-scale convolution according to the convolution kernel in the first channel to obtain post-convolution information;

Calculate the first attention weight of the current feature layer based on the convolved information and the target encoding information;

If the current feature layer is the last layer, the first output information output by the first channel is calculated based on the target encoding information and the first attention weight of the previous layer; wherein, the first One output information is the weighted attention feature;

Input the target encoding information to the second channel of the bilinear decoder model, and calculate the average value of the target encoding information to obtain a second attention weight;

The second output information output by the second channel is calculated based on the second attention weight and the target encoding information.

8. An ultra-wide-angle fundus image recognition device, characterized by including:

The image preprocessing module is used to perform image preprocessing operations on the original ultra-wide-angle fundus image to obtain the preprocessed image;

A feature extraction module, used to perform feature extraction on the preprocessed image using a pre-trained feature extraction model to obtain target encoding information;

A decoding module configured to decode the target encoding information through a bilinear decoder model to obtain a probability value information set output by the bilinear decoder model;

A prediction category determination module, configured to determine the category with the largest probability value in the probability value information set as the final prediction category corresponding to the original ultra-wide-angle fundus image.

9. An electronic device, characterized in that it includes:

Memory, used to hold computer programs;

A processor, configured to execute the computer program to implement the steps of the ultra-wide-angle fundus image recognition method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that it is used to store a computer program; wherein when the computer program is executed by a processor, the ultra-wide-angle fundus image recognition method according to any one of claims 1 to 7 is implemented. .