WO2022188489A1

WO2022188489A1 - Training method and apparatus for multi-mode multi-disease long-tail distribution ophthalmic disease classification model

Info

Publication number: WO2022188489A1
Application number: PCT/CN2021/137142
Authority: WO
Inventors: 欧中洪; 王莉菲; 柴文俊; 宋美娜; 鄂海红; 何佳雯; 张如如; 李峻迪; 袁立飞; 贾鑫; 黄儒剑
Original assignee: 北京邮电大学
Priority date: 2021-03-12
Filing date: 2021-12-10
Publication date: 2022-09-15
Also published as: CN113011485B; CN113011485A

Abstract

The present application relates to the technical field of deep learning. Provided are a training method and apparatus for a multi-mode multi-disease long-tail distribution ophthalmic disease classification model, and an identification method and apparatus for the model. A classification method comprises: acquiring a dual-mode image sample, and performing diagnosis label annotation on the dual-mode image sample; inputting an infrared macular region ophthalmoscopic image sample and an OCT image sample into a first neural network at the same time for training, so as to acquire first image feature information and second image feature information; calculating overall image feature information according to the first image feature information, a first weight, the second image feature information and a second weight, and inputting same into a fully-connected network to acquire a prediction result; and continuously adjusting parameters of the neural network by means of a backpropagation technique until an error value is kept at a preset threshold value, and generating an ophthalmic disease classification model.

Description

Multimodal and multi-disease long-tail distribution ophthalmic disease classification model training method and device

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on the Chinese patent application with the application number of 202110270878.7 and the filing date of March 12, 2021, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is incorporated herein by reference.

TECHNICAL FIELD

The present application relates to the technical field of deep learning, and in particular to a method and device for training an ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution, and to a method and device for identification using that model.

BACKGROUND

In recent years, deep learning has developed rapidly in the medical field owing to its efficiency and accuracy. Deep learning can analyze and quantify pathological features in medical images pixel by pixel and, to a certain extent, reduce the subjectivity of doctors' judgment, making disease diagnosis more objective and stable. Optical coherence tomography (OCT) is a non-contact, non-invasive imaging technology that provides clear cross-sectional images of a pathological macular region; fundus imaging provides clear planar images of the fundus. Intelligent auxiliary diagnosis of ophthalmic diseases using deep learning on single-modality OCT or fundus data has attracted extensive research, but effectively assisting the diagnosis of ophthalmic images in a clinical environment still faces great challenges.

In the related art: (1) color fundus photos and their corresponding disease labels are input into a neural network for training, and fundus image features are extracted to give the disease classification result; (2) OCT images and their corresponding disease labels are input into a neural network for training, and OCT image features are extracted to give the disease classification result; (3) fundus images, OCT images and their corresponding disease labels are input into a neural network simultaneously for training, and the combined features of the two modalities are extracted to give the disease classification result.

Schemes 1 and 2 can easily collect a large number of images, but using a single image for auxiliary diagnosis does not conform to the actual clinical process for most eye diseases, in which doctors usually combine information from multiple modalities to make a comprehensive judgment. Moreover, when a deep learning model classifies eye diseases from a single image, the number of features is limited and the recognition accuracy is insufficient. Scheme 3 combines the features of fundus images and OCT images, which matches the clinical situation, but because it is difficult to collect a large number of corresponding fundus and OCT images at the same time, little data is available, and the diseases studied so far are limited to AMD.

In addition, there are many types of ophthalmic diseases, their incidence is severely unbalanced, and many of them are rare. However, most image data in existing research has a balanced distribution over a small number of disease types, and therefore cannot effectively handle the long-tailed data distribution that may occur in real scenarios.

SUMMARY OF THE INVENTION

The present application aims to solve one of the technical problems in the related art at least to a certain extent.

Therefore, the first purpose of this application is to propose a method for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution. By collecting the infrared macular fundus images and OCT images produced on OCT equipment, a large number of paired dual-modality images are obtained, and a dual-channel convolutional neural network model learns the image features of the two modalities, yielding a deep learning model similar to the clinical diagnosis process. This solves the technical problems that, for ophthalmic images whose diagnosis relies on features from multiple modalities, classification with a single modality is not accurate enough; that paired color fundus and OCT images are difficult to collect; that few diseases are covered; and that disease categories in real scenarios follow a long-tailed distribution, so that classes are unbalanced and diseases with few samples are classified poorly.

The second objective of the present application is to propose an ophthalmic disease classification device under the long-tailed distribution of multi-modality and multi-disease.

The third object of the present application is to propose an electronic device.

A fourth object of the present application is to propose a computer-readable storage medium.

A fifth object of the present application is to propose a computer program product.

In order to achieve the above purpose, the embodiment of the first aspect of the present application proposes a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease, including:

Data collection is performed on electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples include an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and the dual-modality image samples are annotated with diagnostic labels;

inputting the infrared macular region fundus image sample and the OCT image sample into a first neural network for training at the same time, to obtain first image feature information and second image feature information;

Calculate the total image feature information according to the first image feature information and the first weight, the second image feature information and the second weight, and input the total image feature information into a fully connected network to obtain a prediction result;

The error value of the prediction result and the diagnostic label is calculated by a loss function, and the parameters of the neural network are continuously adjusted by back-propagation technology until the error value is maintained at a preset threshold, and an ophthalmic disease classification model is generated.

According to the method for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution of the embodiment of the present application, dual-modality image samples are acquired and annotated with diagnostic labels; the infrared macular region fundus image sample and the OCT image sample are respectively input into the first neural network for training to obtain the first image feature information and the second image feature information; the total image feature information is calculated according to the first image feature information and the first weight, and the second image feature information and the second weight, and is input into the fully connected network to obtain the prediction result; and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, generating the ophthalmic disease classification model. In this way, a dual-channel convolutional neural network model learns the image features of both modalities, yielding a deep learning model similar to the clinical diagnosis process. This solves the technical problems that, for ophthalmic images whose diagnosis relies on features from multiple modalities, classification with a single modality is not accurate enough; that paired color fundus and OCT images are difficult to collect; that few diseases are covered; and that disease categories in real scenarios follow a long-tailed distribution, so that classes are unbalanced and diseases with few samples are classified poorly.

Optionally, in an embodiment of the present application, the performing data collection on an electronic medical record, acquiring a dual-modality image sample, and labeling the dual-modality image sample with a diagnostic label, includes:

An electronic medical record parsing algorithm designed for the document format parses the dual-modality images and the corresponding diagnosis information from the electronic medical record, and the dual-modality image samples are annotated with diagnostic labels according to the diagnosis information.

Optionally, in an embodiment of the present application, the method further includes:

Adjust the size of the infrared macular fundus image sample and the optical coherence tomography (OCT) image sample, and perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement and random horizontal flip.

Optionally, in an embodiment of the present application, the loss function is shown in formula (1):

where the two arguments of the loss are the one-hot encoded forms of the diagnostic label $y$ and of the prediction result, respectively; $\gamma \ge 0$ is a hyperparameter; $E = [E_1, E_2, \ldots, E_N]$; $N = 12$ is the total number of labels; $i \in \{1, 2, \ldots, N\}$; and $n_i$ is the number of samples of the $i$-th label.

Optionally, in an embodiment of the present application, an identification method using the ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution includes:

Acquiring the infrared macular region fundus image sample and the OCT image to be identified;

Inputting the infrared macular region fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.

In order to achieve the above purpose, a second aspect embodiment of the present application proposes a multi-modal multi-disease long-tailed distribution device for classifying ophthalmic diseases, including:

an acquisition and annotation module, configured to collect data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples include an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and to annotate the dual-modality image samples with diagnostic labels;

an extraction module, configured to input the infrared macular fundus image sample and the OCT image sample into the first neural network for training at the same time, and obtain the first image feature information and the second image feature information;

a prediction module, configured to calculate the total image feature information according to the first image feature information and the first weight, the second image feature information and the second weight, and input it into a fully connected network to obtain a prediction result;

a generating module, configured to calculate the error value between the prediction result and the diagnostic label through a loss function, and to continuously adjust the parameters of the neural network through back-propagation until the error value is maintained at the preset threshold, generating an ophthalmic disease classification model.

According to the device for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution of the embodiment of the present application, dual-modality image samples are acquired and annotated with diagnostic labels; the infrared macular region fundus image sample and the OCT image sample are respectively input into the first neural network for training to obtain the first image feature information and the second image feature information; the total image feature information is calculated according to the first image feature information and the first weight, and the second image feature information and the second weight, and is input into the fully connected network to obtain the prediction result; and the parameters of the neural network are continuously adjusted by back-propagation until the error value is maintained at a preset threshold, generating the ophthalmic disease classification model. In this way, a dual-channel convolutional neural network model learns the image features of both modalities, yielding a deep learning model similar to the clinical diagnosis process. This solves the technical problems that, for ophthalmic images whose diagnosis relies on features from multiple modalities, classification with a single modality is not accurate enough; that paired color fundus and OCT images are difficult to collect; that few diseases are covered; and that disease categories in real scenarios follow a long-tailed distribution, so that classes are unbalanced and diseases with few samples are classified poorly.

Optionally, in an embodiment of the present application, the obtaining and labeling module is specifically used for:

Parsing the dual-modality images and the corresponding diagnosis information from the electronic medical record with an electronic medical record parsing algorithm designed for the document format, and annotating the dual-modality image samples with diagnostic labels according to the diagnosis information.

Optionally, in an embodiment of the present application, the device further includes:

a preprocessing module, configured to adjust the size of the infrared macular fundus image sample and the optical coherence tomography (OCT) image sample, and to perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement and random horizontal flip.

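Optionally, in an embodiment of the present application, the loss function is as shown in formula (1), whose arguments are the one-hot encoded forms of the diagnostic label $y$ and of the prediction result, respectively.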

Optionally, in an embodiment of the present application, an identification device using the ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution includes:

an acquisition module, configured to acquire the infrared macular region fundus image sample and the OCT image to be identified;

a diagnosis module, configured to input the infrared macular region fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.

To achieve the above purpose, an embodiment of a third aspect of the present application provides an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions so as to implement the method for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution proposed in the embodiment of the first aspect of the present application.

To achieve the above purpose, an embodiment of a fourth aspect of the present application provides a computer-readable storage medium; when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the method for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution proposed in the embodiment of the first aspect of the present application.

To achieve the above purpose, an embodiment of a fifth aspect of the present application provides a computer program product, including a computer program that, when executed by a processor, implements the method for classifying ophthalmic diseases under a multimodal, multi-disease long-tailed distribution proposed in the embodiment of the first aspect of the present application.

Additional aspects and advantages of the present application will be set forth, in part, in the following description, and in part will be apparent from the following description, or learned by practice of the present application.

DESCRIPTION OF DRAWINGS

The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, wherein:

FIG. 1 is a schematic flowchart of a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-diseases provided in Embodiment 1 of the present application;

FIG. 2 is an example diagram of a two-way model provided by Embodiment 1 of the present application;

FIG. 3 is a schematic flowchart of a method for classifying ophthalmic diseases under a multimodal multi-disease long-tail distribution provided by Embodiment 2 of the present application;

FIG. 4 is a schematic structural diagram of a device for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease according to an embodiment of the present application.

DETAILED DESCRIPTION

The following describes in detail the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary, and are intended to be used to explain the present application, but should not be construed as a limitation to the present application.

The following describes the method and device for classifying ophthalmic diseases under the long-tailed distribution of multimodality and multidisease according to the embodiments of the present application with reference to the accompanying drawings.

FIG. 1 is a schematic flowchart of a method for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease according to the first embodiment of the present application.

It is very difficult to collect data with the existing technology. In most hospitals, the color fundus images and OCT images of the same eye are held by different departments, which makes data sharing difficult. The existing technology adopts a loose-pair training method, that is, training on combinations of images of the same disease rather than on the multimodal images of the same eye. Although this effectively expands the number of samples, it weakens the correlation between the two images input to the model and reduces the interpretability of the model.

This application uses as dual-modality data the infrared macular fundus image and the OCT image of the same eye that doctors use when diagnosing with OCT equipment. Both images are present together, in large numbers, in electronic diagnosis reports. A large amount of effective multimodal data can therefore be obtained, which better matches the actual clinical diagnosis process and can improve the classification effect. The electronic medical record data collection module and data labeling module designed in this application make effective use of this data.

In addition, the existing technology has few classification labels and performs only a three-class internal classification of a single disease, AMD, so it cannot effectively handle the long-tailed distribution of multi-disease data in real scenarios. This application uses a two-stage training model and a training scheme combined with class-balanced-loss to classify more than ten diseases, which improves both the overall classification effect and the classification effect for diseases with few samples.

That is to say, current mainstream research on ophthalmic disease image classification mainly covers lesion recognition based on fundus images and lesion recognition based on OCT images, in which a convolutional neural network model extracts classification features and gives a prediction result. However, most existing schemes use a single-modality image; for eye diseases whose diagnosis requires feature information from multiple modalities, the number of features is limited and the recognition accuracy is insufficient. Most existing methods also assume that disease categories are uniformly distributed, which does not match actual clinical data, so they have difficulty handling the long-tailed data distribution of real scenarios. To solve these problems, the present application conveniently collects a large number of paired dual-modality images by collecting the infrared macular fundus images and OCT images produced on OCT equipment, and learns the image features of the two modalities through a dual-channel convolutional neural network model, obtaining a deep learning model similar to the clinical diagnosis process.

As shown in FIG. 1 , the method for classifying ophthalmic diseases under the multimodal and multi-disease long-tail distribution includes the following steps 101 to 104 .

Step 101: Collect data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples include infrared macular fundus image samples and optical coherence tomography (OCT) image samples, and annotate the dual-modality image samples with diagnostic labels.

In the embodiment of the present application, an electronic case parsing algorithm that parses the document format is designed to parse the bimodal image of the electronic medical record and the current diagnosis information, and the bimodal image sample is marked with a diagnostic label according to the diagnosis information.

In the embodiment of the present application, the size of the infrared macular fundus image sample and the optical coherence tomography (OCT) image sample is adjusted, and one or more of the following operations are performed: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement and random horizontal flip.

Specifically, because the OCT device uses the infrared macular fundus image to locate the corresponding OCT image slice during use, the generated electronic medical record contains both the infrared macular fundus image and the corresponding OCT image slice. An electronic medical record parsing algorithm designed for the PDF format parses the dual-modality images and the corresponding diagnosis information from the electronic medical record, and the images are preliminarily preprocessed.

Specifically, the disease labels to be annotated are established according to the actual clinical situation; the parsed dual-modality images and case diagnosis information are selected and uploaded to an image labeling platform; and professional annotators (chief physicians, etc.) annotate the images.

Further, data augmentation is performed. Before being input into the model, the data is cropped into fundus images and OCT images, and each image is resized to 224×224×3. Random 30° rotation, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement and random horizontal flip operations are applied to the training data.
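The preprocessing and augmentation steps above could be sketched as follows. This is a minimal illustration assuming the Pillow library; the source does not name a library. The enhancement factor range, the 0.5 flip probability, the reading of "random 30° rotation" as an angle drawn from [-30°, 30°], and the function names `augment` and `augment_pair` are all assumptions made for the example.

```python
import random

from PIL import Image, ImageEnhance, ImageOps

TARGET_SIZE = (224, 224)    # each image becomes 224x224 with 3 channels (RGB)
MAX_ANGLE = 30.0            # assumption: angle drawn uniformly from [-30, 30] degrees
FACTOR_RANGE = (0.8, 1.2)   # assumption: the source gives no enhancement range


def augment(img, rng):
    """Resize one image and apply the random augmentations listed in the text."""
    img = img.convert("RGB").resize(TARGET_SIZE)
    img = img.rotate(rng.uniform(-MAX_ANGLE, MAX_ANGLE))
    # Sharpness, brightness, chromaticity (colour) and contrast enhancement.
    for enhancer in (ImageEnhance.Sharpness, ImageEnhance.Brightness,
                     ImageEnhance.Color, ImageEnhance.Contrast):
        img = enhancer(img).enhance(rng.uniform(*FACTOR_RANGE))
    if rng.random() < 0.5:  # random horizontal flip
        img = ImageOps.mirror(img)
    return img


def augment_pair(fundus, oct_img, seed=None):
    """Augment a paired (fundus, OCT) training sample from the same eye."""
    rng = random.Random(seed)
    return augment(fundus, rng), augment(oct_img, rng)
```

In this sketch the two modalities are augmented independently; whether the original scheme shares the random parameters across the pair is not stated in the source.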

Step 102: Input the infrared macular region fundus image sample and the OCT image sample respectively into the first neural network for training, and obtain the first image feature information and the second image feature information.

Step 103: Calculate the total image feature information according to the first image feature information and the first weight, the second image feature information and the second weight, and input the total image feature information into a fully connected network to obtain a prediction result.

Specifically, define a dataset $D = \{x_f, x_o \mid y\}$, where $x_f$ and $x_o$ are the fundus image and the OCT image obtained from the same eye, respectively, and $y$ is the diagnostic label of this group of images, covering 11 types of ophthalmic diseases plus "no obvious lesions". The model is denoted "OurModel". OurModel receives the paired input $\{x_f, x_o\}$ and outputs the diagnosis result $\hat{y}$ for the eye, as in the following formula:

$$\hat{y} = \mathrm{OurModel}(x_f, x_o)$$

Specifically, as shown in FIG. 2, the network model consists of two symmetrical branches, one processing fundus images and the other processing OCT images; the weights of the two branches are not shared. Each branch uses ResNet18 with all fully connected layers removed as its backbone network (ResNet18-backbone in FIG. 2), followed by a CBAM (Convolutional Block Attention Module) attention module, to extract image feature information. Finally, the two branches are merged according to their weights and followed by a fully connected layer that gives the prediction result, which is one of: no obvious lesions, epiretinal membrane, central serous chorioretinopathy, macular hole, macular schisis, choroidal neovascularization, age-related macular degeneration, retinal detachment, branch vein occlusion, arterial occlusion, central vein occlusion, and Harada disease.
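The fusion and classification head described above (steps 102 to 103) could be sketched in NumPy as follows. This is an illustration only: it assumes the two branch features are combined as a weighted sum, which is one plausible reading of "first weight" and "second weight", and it uses random arrays in place of the ResNet18 + CBAM branch outputs. The 512-dimensional feature size matches a ResNet18 backbone after global pooling; the names `fuse_and_classify` and `LABELS` are hypothetical.

```python
import numpy as np

# The 12 output classes listed in the description.
LABELS = [
    "no obvious lesions", "epiretinal membrane",
    "central serous chorioretinopathy", "macular hole", "macular schisis",
    "choroidal neovascularization", "age-related macular degeneration",
    "retinal detachment", "branch vein occlusion", "arterial occlusion",
    "central vein occlusion", "Harada disease",
]


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def fuse_and_classify(feat_fundus, feat_oct, w_fundus, w_oct, fc_weight, fc_bias):
    """Weighted sum of the two branch features, then one fully connected layer."""
    total = w_fundus * feat_fundus + w_oct * feat_oct   # total image feature
    return softmax(total @ fc_weight + fc_bias)         # class probabilities


# Stand-in features for a batch of two eyes (assumption: 512-d per branch).
rng = np.random.default_rng(0)
feat_f = rng.normal(size=(2, 512))
feat_o = rng.normal(size=(2, 512))
fc_w = rng.normal(scale=0.01, size=(512, len(LABELS)))
fc_b = np.zeros(len(LABELS))

probs = fuse_and_classify(feat_f, feat_o, 0.5, 0.5, fc_w, fc_b)
predicted = [LABELS[i] for i in probs.argmax(axis=1)]
```

If the two branches were instead concatenated, the fully connected layer would take a 1024-dimensional input; the source text does not settle this detail.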

Step 104: Calculate the error value of the prediction result and the diagnostic label through the loss function, and continuously adjust the neural network parameters through the back-propagation technology until the error value is maintained at a preset threshold, and an ophthalmic disease classification model is generated.

In this embodiment of the present application, the loss function is shown in formula (1):

where the loss takes the one-hot encoded forms of the diagnostic label $y$ and of the prediction result, respectively.

Specifically, the entire model is first trained with the cross-entropy loss function. After the validation set loss converges, all weights other than those of the fully connected layer are frozen, and the weights of the fully connected layer are retrained with the class-balanced-loss. Once the validation set loss converges again, the final output model in FIG. 3 is obtained.
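The two-stage schedule described above could be sketched as a small framework-independent driver. The source only says that each stage runs until the validation loss converges; the patience-based convergence rule and the names `has_converged`, `run_stage`, `two_stage_training`, `train_epoch` and `validate` are assumptions made for this illustration.

```python
def has_converged(val_losses, patience=3, min_delta=1e-3):
    """True when the last `patience` epochs did not improve on the earlier
    best validation loss by more than `min_delta` (assumed criterion)."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) > best_before - min_delta


def run_stage(train_epoch, validate, loss_name, trainable, max_epochs=100):
    """Train with one loss and one set of trainable parts until convergence."""
    history = []
    for _ in range(max_epochs):
        train_epoch(loss_name, trainable)   # caller-supplied training step
        history.append(validate())          # caller-supplied validation loss
        if has_converged(history):
            break
    return history


def two_stage_training(train_epoch, validate, max_epochs=100):
    # Stage 1: cross-entropy loss, whole model trainable.
    stage1 = run_stage(train_epoch, validate, "cross_entropy",
                       ("backbone", "fc"), max_epochs)
    # Stage 2: everything except the fully connected layer is frozen and the
    # fully connected layer is retrained with the class-balanced loss.
    stage2 = run_stage(train_epoch, validate, "class_balanced",
                       ("fc",), max_epochs)
    return stage1, stage2
```

Here `train_epoch(loss_name, trainable)` and `validate()` stand for whatever the chosen deep learning framework provides; the driver only fixes the order of the two stages and what is trainable in each.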

First, the effective number of samples of each class, $E_i$, is defined. Here $N = 12$ is the total number of labels, $i \in \{1, 2, \ldots, N\}$, $n_i$ is the number of samples of the $i$-th label, and $\beta \in [0, 1)$ is a hyperparameter. The loss function is reweighted by the inverse of the effective number of samples of each class in order to balance the loss, which effectively improves classification performance on classes with few samples.
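The reweighting described above could be sketched as follows. The formula itself is not reproduced in this text, so the sketch assumes the usual effective-number definition from the class-balanced loss literature, E_i = (1 - β^{n_i}) / (1 - β). The function names and the normalization of the weights so that they sum to the number of classes are assumptions for illustration.

```python
def effective_numbers(counts, beta):
    """Effective number of samples per class (assumed standard form).

    counts: number of samples n_i of each class, each >= 1
    beta:   hyperparameter in [0, 1)
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return [(1.0 - beta ** n) / (1.0 - beta) for n in counts]


def class_weights(counts, beta):
    """Inverse effective numbers, rescaled to sum to the number of classes."""
    inverse = [1.0 / e for e in effective_numbers(counts, beta)]
    scale = len(inverse) / sum(inverse)   # optional normalization (assumption)
    return [w * scale for w in inverse]
```

With β = 0 every class has an effective number of 1, so no reweighting takes place; as β approaches 1 the effective number approaches the raw count n_i and the weights approach inverse class frequency.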

Focal loss is a loss function originally proposed to address the severe imbalance between positive and negative samples in one-stage object detection. Focal loss is therefore selected as the loss function in this scheme. In its definition, the loss takes the one-hot encoded forms of the label $y$ and of the model prediction result, respectively, and $\gamma \ge 0$ is a hyperparameter.

The class-balanced-loss of this application is then defined by reweighting the focal loss with the inverse of the effective number of samples, where $E = [E_1, E_2, \ldots, E_N]$ and $E \in \mathbb{R}^{12}$.
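The formulas referred to in this passage are not reproduced in the text. As a hedged reconstruction, assuming the standard definitions of the effective number of samples and of the multi-class focal loss, they could take the following form. The symbols $\tilde{y}_i$ (the $i$-th component of the one-hot label) and $\hat{p}_i$ (the predicted probability of class $i$) are notation introduced here for illustration and may differ from the original.

```latex
% Effective number of samples of class i (assumed standard form)
E_i = \frac{1 - \beta^{\,n_i}}{1 - \beta},
\qquad i \in \{1, 2, \ldots, N\},\quad N = 12,\quad \beta \in [0, 1)

% Multi-class focal loss with focusing parameter \gamma \ge 0
\mathrm{FL}(\hat{p}, \tilde{y})
  = -\sum_{i=1}^{N} \tilde{y}_i \,(1 - \hat{p}_i)^{\gamma} \log \hat{p}_i

% Class-balanced focal loss: each class term is scaled by 1 / E_i
\mathrm{CB}(\hat{p}, \tilde{y})
  = -\sum_{i=1}^{N} \frac{1}{E_i}\, \tilde{y}_i \,(1 - \hat{p}_i)^{\gamma} \log \hat{p}_i
```

Under these assumptions, setting $\gamma = 0$ and $\beta = 0$ reduces the class-balanced loss to the ordinary cross-entropy loss used in the first training stage.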

Further, in the embodiments of the present application, the infrared macular fundus image sample and the OCT image to be identified are acquired, and both are input into the ophthalmic disease classification model for processing to obtain a diagnosis result.

Specifically, as shown in FIG. 3, the model is loaded through TensorFlow Serving, with Docker as the service container to complete the model deployment, and is exposed as an HTTP interface. The basic back-end functions of the system are developed with the Django framework: the back end receives a multimodal image request, forwards it to the TensorFlow Serving instance running in Docker, obtains the model's recognition result, and finally passes the result to the front end for display.
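The request that the back end forwards to the model service could be sketched as follows, using only the Python standard library and the TensorFlow Serving REST predict endpoint (`/v1/models/<name>:predict`, row format with an `instances` list). The host, the model name `ophthalmic_classifier` and the input names `fundus` and `oct` are hypothetical; they depend on how the model was exported, which the source does not describe.

```python
import json
import urllib.request


def build_predict_request(fundus_pixels, oct_pixels,
                          host="localhost:8501",
                          model_name="ophthalmic_classifier"):
    """Build (but do not send) a TensorFlow Serving REST predict request.

    fundus_pixels / oct_pixels: nested lists holding the preprocessed images.
    """
    url = f"http://{host}/v1/models/{model_name}:predict"
    # Row format: one instance carrying both named inputs (names are assumed).
    payload = {"instances": [{"fundus": fundus_pixels, "oct": oct_pixels}]}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(response_body):
    """Return the index of the highest-scoring class in a predict response."""
    scores = json.loads(response_body)["predictions"][0]
    return max(range(len(scores)), key=scores.__getitem__)
```

A Django view could pass the request object to `urllib.request.urlopen` and hand the decoded class index, mapped to a disease name, to the front end.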

In summary, the infrared macular fundus image from the OCT device is used as an auxiliary image and combined with the OCT image to construct a dual-modality image input; an efficient acquisition algorithm is designed to obtain the dual-modality data; and a two-stage model training method is adopted which, according to the data distribution characteristics, freezes the convolutional layers in the second stage and retrains with the class-balanced-loss weighted by the sample statistics of each disease category. The model training scheme designed in this application can significantly improve the overall classification effect, especially the classification effect for diseases with few samples.
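The class-balanced retraining loss summarized above could be sketched in NumPy as follows. This is an illustration under the same assumptions as the standard class-balanced focal loss: the effective number of samples is (1 - β^{n_i}) / (1 - β), and the focal term of each sample is divided by the effective number of its true class. The default values of `beta` and `gamma` and the function name are assumptions; the source does not give the hyperparameter values.

```python
import numpy as np


def class_balanced_focal_loss(probs, labels, counts, beta=0.999, gamma=2.0):
    """Mean class-balanced focal loss over a batch.

    probs:  (batch, N) predicted class probabilities
    labels: (batch,) integer class indices of the diagnostic labels
    counts: (N,) number of training samples n_i per class, each >= 1
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    counts = np.asarray(counts, dtype=float)

    effective = (1.0 - beta ** counts) / (1.0 - beta)       # E_i per class
    p_true = probs[np.arange(len(labels)), labels]          # prob. of true class
    p_true = np.clip(p_true, 1e-12, 1.0)                    # avoid log(0)
    focal = -((1.0 - p_true) ** gamma) * np.log(p_true)     # focal term
    return float(np.mean(focal / effective[labels]))        # weight by 1 / E_y
```

For the same predicted probability, a sample from a rare class contributes more to this loss than a sample from a frequent class, which is the intended effect of retraining the fully connected layer in the second stage.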

In order to realize the embodiments of the present application, the present application also proposes a device for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease.

As shown in FIG. 4 , the apparatus for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease includes: an acquisition and annotation module 410 , an extraction module 420 , a prediction module 430 and a generation module 440 .

The acquisition and labeling module 410 is configured to collect data from electronic medical records to obtain dual-modality image samples, wherein the dual-modality image samples include infrared macular region fundus image samples and optical coherence tomography (OCT) image samples, and to annotate the dual-modality image samples with diagnostic labels.

The extraction module 420 is configured to input the infrared macular fundus image samples and the OCT image samples into the first neural network for training at the same time, and obtain the first image feature information and the second image feature information.

The prediction module 430 is configured to calculate the total image feature information according to the first image feature information and the first weight, the second image feature information and the second weight, and input the total image feature information to a fully connected network to obtain a prediction result.

The generating module 440 is configured to calculate the error value between the prediction result and the diagnostic label through a loss function, and to continuously adjust the parameters of the neural network through back-propagation until the error value is maintained at the preset threshold, generating an ophthalmic disease classification model.

In an embodiment of the present application, the acquiring and labeling module is specifically used for:

In an embodiment of the present application, the device further includes:

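In an embodiment of the present application, the loss function is shown in formula (1):

[Formula (1) is not reproduced in this text.]

where the symbols of formula (1) denote the one-hot encoded forms of the diagnostic label y and of the prediction result, respectively.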
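Formula (1) cannot be recovered from this text, so the following sketch is not the loss function of the present application. It only illustrates one commonly used form of loss for long-tailed classification that involves the same kinds of quantities named in the claims: a hyperparameter γ ≥ 0 and the per-label sample counts n_i. The focal-style modulating term and the inverse-frequency weighting rule are assumptions, as are all function names.

```python
import math
from typing import List, Sequence


def class_weights(counts: Sequence[int]) -> List[float]:
    """Hypothetical per-label weights: inverse sample count, scaled to mean 1."""
    inverse = [1.0 / n for n in counts]
    scale = len(counts) / sum(inverse)
    return [v * scale for v in inverse]


def weighted_focal_loss(probs: Sequence[float], label: int,
                        weights: Sequence[float], gamma: float = 2.0) -> float:
    """Loss for one sample: -w_y * (1 - p_y) ** gamma * log(p_y)."""
    p = min(max(probs[label], 1e-12), 1.0)   # keep log() finite
    return -weights[label] * (1.0 - p) ** gamma * math.log(p)


# Toy example: three labels with a long-tailed sample count.
weights = class_weights([1000, 100, 10])
loss_head = weighted_focal_loss([0.7, 0.2, 0.1], 0, weights)
loss_tail = weighted_focal_loss([0.7, 0.2, 0.1], 2, weights)
```

With γ = 0 the modulating term equals 1 and the expression reduces to a class-weighted cross-entropy; larger γ reduces the contribution of samples that are already classified confidently.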

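In an embodiment of the present application, the device for identifying an ophthalmic disease classification model under a multimodal and multi-disease long-tailed distribution includes: an acquisition module configured to acquire an infrared macular region fundus image sample and an OCT image to be identified; and a diagnosis module configured to input the infrared macular region fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.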

The device for classifying ophthalmic diseases under the long-tailed distribution of multi-modality and multi-disease according to the embodiment of the present application acquires dual-modality image samples and labels them with diagnostic labels; inputs the infrared macular region fundus image samples and the OCT image samples into the first neural network at the same time for training, to obtain the first image feature information and the second image feature information; calculates the total image feature information according to the first image feature information and the first weight, and the second image feature information and the second weight, and inputs it into the fully connected network to obtain a prediction result; and continuously adjusts the parameters of the neural network through back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model. A two-way convolutional neural network model is thus used to learn the image features of the two modalities, yielding a deep learning model that resembles the clinical diagnosis process. This addresses the following problems: insufficient accuracy when ophthalmic images that depend on features from multiple modalities are classified using only a single modality; the difficulty of collecting paired color fundus and OCT images; the small number of disease categories covered; and the long-tailed, class-imbalanced data distribution of real-world scenarios, which leads to poor classification of diseases with few samples.

It should be noted that the foregoing explanations of the embodiment of the method for classifying ophthalmic diseases under the long-tailed distribution of multimodality and multidiseases are also applicable to the device for classifying ophthalmic diseases under the long-tailed distribution of multimodality and multidiseases in this embodiment, and are not repeated here.

In the description of this specification, references to the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of the different embodiments or examples, provided they do not conflict with each other.

In addition, the terms "first" and "second" are used for descriptive purposes only, and should not be construed as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, such as two, three, etc., unless expressly and specifically defined otherwise.

Any process or method description in the flowcharts or otherwise described herein may be understood to represent a module, segment, or portion of code comprising one or more executable instructions for implementing custom logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes alternative implementations in which the functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner as necessary, and then stored in computer memory.

It should be understood that various parts of this application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGA), field-programmable gate arrays (FPGA), and the like.

Those of ordinary skill in the art can understand that all or part of the steps of the methods of the above embodiments can be completed by a program instructing relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.

The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. Although the embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present application. Those of ordinary skill in the art may make variations, modifications, substitutions and alterations to the above embodiments within the scope of the present application.

Claims

A method for training an ophthalmic disease classification model under multimodal and multi-disease long-tail distribution, characterized in that it includes the following steps:

collecting data from electronic medical records to obtain a dual-modality image sample, wherein the dual-modality image sample includes an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and labeling the dual-modality image sample with a diagnostic label;

inputting the infrared macular region fundus image sample and the OCT image sample into a first neural network for training at the same time, to obtain first image feature information and second image feature information;

calculating total image feature information according to the first image feature information and a first weight, and the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;

calculating an error value between the prediction result and the diagnostic label by a loss function, and continuously adjusting the parameters of the neural network by back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model.
The method according to claim 1, wherein the collecting data from the electronic medical record, acquiring the dual-modality image samples, and labeling the dual-modality image samples with diagnostic labels, comprises:

parsing the dual-modality images and the current diagnosis information from the electronic medical record by means of an electronic medical record parsing algorithm designed for the document format, and labeling the dual-modality image sample with a diagnostic label according to the diagnosis information.
The method of claim 1, further comprising:

resizing the infrared macular region fundus image sample and the optical coherence tomography (OCT) image sample, and performing one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement and random horizontal flipping.
The method of claim 1, wherein the loss function is shown in formula (1):

[Formula (1) is not reproduced in this text.]

where s_y and a companion vector are the one-hot encoded forms of the diagnostic label y and of the prediction result, respectively; γ ≥ 0 is a hyperparameter; E = [E_1, E_2, ..., E_N]; N = 12 is the total number of labels; i ∈ {1, 2, ..., N}; and n_i is the number of samples of the i-th label.
An identification method for the ophthalmic disease classification model under a multimodal, multi-disease long-tailed distribution according to any one of claims 1 to 4, characterized in that it comprises:

acquiring an infrared macular region fundus image sample and an OCT image to be identified;

inputting the infrared macular region fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.
An ophthalmic disease classification model training device under multimodal and multi-disease long-tail distribution, characterized in that it includes:

an acquisition and annotation module, configured to collect data from electronic medical records to acquire dual-modality image samples, wherein the dual-modality image samples include infrared macular region fundus image samples and optical coherence tomography (OCT) image samples, and to label the dual-modality image samples with diagnostic labels;

an extraction module, configured to input the infrared macular region fundus image sample and the OCT image sample into the first neural network for training at the same time, and obtain the first image feature information and the second image feature information;

a prediction module, configured to calculate the total image feature information according to the first image feature information and the first weight, and the second image feature information and the second weight, and input it into a fully connected network to obtain a prediction result;

a generation module, configured to calculate the error value between the prediction result and the diagnostic label through a loss function, and continuously adjust the parameters of the neural network through back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model.
The device according to claim 6, wherein the acquiring and labeling module is specifically used for:

parsing the dual-modality images and the current diagnosis information from the electronic medical record by means of an electronic medical record parsing algorithm designed for the document format, and labeling the dual-modality image sample with a diagnostic label according to the diagnosis information.
The apparatus of claim 6, further comprising:

a preprocessing module, configured to resize the infrared macular region fundus image sample and the optical coherence tomography (OCT) image sample, and to perform one or more of the following operations: random rotation by a preset angle, random sharpness enhancement, random brightness enhancement, random chromaticity enhancement, random contrast enhancement and random horizontal flipping.
The device of claim 6, wherein the loss function is shown in formula (1):

[Formula (1) is not reproduced in this text.]

where s_y and a companion vector are the one-hot encoded forms of the diagnostic label y and of the prediction result, respectively; γ ≥ 0 is a hyperparameter; E = [E_1, E_2, ..., E_N]; N = 12 is the total number of labels; i ∈ {1, 2, ..., N}; and n_i is the number of samples of the i-th label.
An identification device for the ophthalmic disease classification model under a multimodal and multi-disease long-tailed distribution according to any one of claims 6 to 9, characterized in that it includes:

an acquisition module, configured to acquire an infrared macular region fundus image sample and an OCT image to be identified;

a diagnosis module, configured to input the infrared macular region fundus image sample and the OCT image into the ophthalmic disease classification model for processing to obtain a diagnosis result.
An electronic device, comprising:

a processor;

a memory for storing instructions executable by the processor;

wherein the processor is configured to execute the instructions to implement the following steps:

collecting data from electronic medical records to obtain a dual-modality image sample, wherein the dual-modality image sample includes an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and labeling the dual-modality image sample with a diagnostic label;

inputting the infrared macular region fundus image sample and the OCT image sample into a first neural network for training at the same time, to obtain first image feature information and second image feature information;

calculating total image feature information according to the first image feature information and a first weight, and the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;

calculating an error value between the prediction result and the diagnostic label by a loss function, and continuously adjusting the parameters of the neural network by back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model.
A computer-readable storage medium, characterized in that, when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device can perform the following steps:

collecting data from electronic medical records to obtain a dual-modality image sample, wherein the dual-modality image sample includes an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and labeling the dual-modality image sample with a diagnostic label;

inputting the infrared macular region fundus image sample and the OCT image sample into a first neural network for training at the same time, to obtain first image feature information and second image feature information;

calculating total image feature information according to the first image feature information and a first weight, and the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;

calculating an error value between the prediction result and the diagnostic label by a loss function, and continuously adjusting the parameters of the neural network by back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model.
A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, the following steps are implemented:

collecting data from electronic medical records to obtain a dual-modality image sample, wherein the dual-modality image sample includes an infrared macular region fundus image sample and an optical coherence tomography (OCT) image sample, and labeling the dual-modality image sample with a diagnostic label;

inputting the infrared macular region fundus image sample and the OCT image sample into a first neural network for training at the same time, to obtain first image feature information and second image feature information;

calculating total image feature information according to the first image feature information and a first weight, and the second image feature information and a second weight, and inputting the total image feature information into a fully connected network to obtain a prediction result;

calculating an error value between the prediction result and the diagnostic label by a loss function, and continuously adjusting the parameters of the neural network by back-propagation until the error value is maintained at a preset threshold, thereby generating an ophthalmic disease classification model.