CN115272743A - Two-dimensional IOLMaster lens image grading method and device based on deep learning - Google Patents

Two-dimensional IOLMaster lens image grading method and device based on deep learning

Info

Publication number
CN115272743A
Authority
CN
China
Prior art keywords
data set
iolmaster
cataract
dimensional
level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210689298.6A
Other languages
Chinese (zh)
Inventor
刘芳
赵一天
周愉
方利鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Tenth Peoples Hospital
Original Assignee
Shanghai Tenth Peoples Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Tenth Peoples Hospital
Priority to CN202210689298.6A
Publication of CN115272743A
Legal status: Pending


Classifications

    • G06V 10/764 — Image or video recognition or understanding using pattern recognition or machine learning; classification, e.g. of video objects
    • G06V 10/774 — Image or video recognition or understanding; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06N 3/08 — Computing arrangements based on biological models; neural networks; learning methods


Abstract

The invention relates to a two-dimensional IOLMaster lens image grading method and device based on deep learning. The method comprises the following steps: acquiring a two-dimensional IOLMaster lens image data set, wherein the data set comprises a plurality of two-dimensional IOLMaster lens images; removing noise labels from the data set through the Cleanlab library to obtain a cleaned data set; dividing the cleaned data set into a test set and a training set, and performing level-by-level binary classification to obtain a trained classification network model corresponding to each level; and inputting the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result. The application addresses the problems that grading of lens opacity relies on manual assessment, is highly subjective, and yields biased grading results, and improves the accuracy of the cataract grading result.

Description

Two-dimensional IOLMaster lens image grading method and device based on deep learning
Technical Field
The invention relates to the technical field of image processing, and in particular to a two-dimensional IOLMaster lens image grading method and device based on deep learning, a computer device, and a computer-readable storage medium.
Background
Age-related cataract remains the leading cause of blindness and visual impairment worldwide. Relevant statistics indicate that the number of elderly cataract patients in China reached 35 million in 2018. The prevalence of cataract increases with age, from 3.9% at ages 55-64 to 92.6% at age 80 and above. With phacoemulsification (Phaco) combined with intraocular lens (IOL) implantation and advanced IOL designs, cataract surgery has fully evolved from reconstructive surgery into the era of refractive surgery, and the significance and indications of cataract surgery are constantly changing. The evaluation criteria for cataract surgery indications mainly include the degree of lens opacity, visual quality (visual acuity, contrast sensitivity, glare, etc.), and whether the cataract affects the diagnosis and treatment of posterior-segment diseases. Because patients lack professional knowledge of cataract, grading is needed so that they can understand their condition intuitively. A convenient and accurate detection method can provide quantitative indexes for cataract diagnosis and treatment, which facilitates communication between doctors and patients, allows patients to understand cataract more simply and intuitively, and assists clinicians in monitoring disease progression and follow-up, determining the timing of cataract treatment, and selecting a treatment method.
At present, the evaluation of cataract lens opacity falls into two main categories: subjective methods and objective methods. The most common subjective grading method in clinical practice is the Lens Opacities Classification System III (LOCS III); objective methods mainly include the Dysfunctional Lens Index (DLI) of the iTrace ray-tracing aberration analysis system and the PNS score of the Pentacam anterior segment analysis system. LOCS III is currently the most widely used grading system in clinical practice; however, because the evaluation is performed by an ophthalmologist under a slit lamp, its reliability is affected by the ophthalmologist's experience and the slit-lamp settings, and studies have shown that the method is highly subjective and has poor repeatability. The DLI provided by the iTrace aberration analysis system is an index computed from higher-order aberrations, contrast sensitivity, pupil diameter and other data; it ranges from 0 to 10, where a smaller value indicates more severe lens opacity and a transparent lens has a DLI of 10. The Pentacam anterior segment analysis system acquires Scheimpflug images of the lens, and its built-in PNS software provides an average density reading of the entire nucleus, defined as the Pentacam nucleus density, ranging from 0% to 100%. Both instruments offer advanced functionality and can grade the degree of lens opacity, but they are expensive and single-purpose: they provide only lens opacity indexes and cannot perform intraocular lens biometry. Most hospitals therefore are not equipped with them, even though this function is very important in clinical work, and most clinicians lack an effective means of cataract assessment.
The IOLMaster 700 is a swept-source optical coherence tomography (SS-OCT) biometer, the first biometry device based on swept-source OCT. It enables OCT imaging and visualization of the entire eye, allowing the ophthalmologist to view a longitudinal cross section of the whole eye. Satisfactory refractive results after intraocular lens (IOL) implantation depend on optimal biometry: accurate axial length (AL), anterior chamber depth (ACD) and corneal curvature (K) are critical inputs to all biometric formulas for calculating the desired outcome. With the widespread use of astigmatism-correcting and multifocal intraocular lenses, accurate biometry is more important than ever for achieving the patient's desired visual quality. The IOLMaster 700 can also clearly image pathological changes in the macular area; its examination range covers the anterior segment and the macular region of the posterior segment, which has the greatest influence on vision, thereby breaking through the limitation of the scanning range of traditional ophthalmic auxiliary examinations. The instrument is therefore widely used in auxiliary ophthalmic examination with the aim of improving the refractive outcome after cataract surgery.
At present, no effective solution has been proposed in the related art for the problems that grading of lens opacity relies on manual assessment, is highly subjective, and yields biased grading results.
Disclosure of Invention
The present application aims to overcome the defects of the prior art and provide a two-dimensional IOLMaster lens image grading method and device based on deep learning, a computer device, and a computer-readable storage medium, so as to solve the problems in the related art that grading of lens opacity relies on manual assessment, is highly subjective, and yields biased grading results.
In order to achieve the above purpose, the technical solution adopted by the present application is as follows:
In a first aspect, an embodiment of the present application provides a two-dimensional IOLMaster lens image grading method based on deep learning, including:
acquiring a two-dimensional IOLMaster lens image data set, wherein the data set includes a plurality of two-dimensional IOLMaster lens images;
removing noise labels from the data set through the Cleanlab library to obtain a cleaned data set;
dividing the cleaned data set into a test set and a training set, and performing level-by-level binary classification to obtain a trained classification network model corresponding to each level;
and inputting the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result.
In some embodiments, removing noise labels from the data set through the Cleanlab library to obtain a cleaned data set includes:
dividing the data set into a test set and a training set;
training a ResNet34 model using the training set of the data set to obtain a trained ResNet34 model;
inputting the test set of the data set into the trained ResNet34 model to obtain, for each two-dimensional IOLMaster lens image in the test set, the probability of being predicted as each cataract level;
inputting the probabilities of each cataract level into the Cleanlab library to obtain the distribution interval of noise labels in the data set;
and removing the two-dimensional IOLMaster lens images whose labels fall within the distribution interval to obtain the cleaned data set.
In some embodiments, dividing the cleaned data set into a test set and a training set and performing level-by-level binary classification to obtain trained classification network models corresponding to each level includes:
taking the two-dimensional IOLMaster lens images of first-level cataracts in the training set and test set of the cleaned data set as one class and the two-dimensional IOLMaster lens images of cataracts of all other levels as the other class, and training the classification network model corresponding to the first level to obtain a trained classification network model corresponding to the first level;
performing the following process separately for each level after the first level in the training set and test set of the cleaned data set:
taking the two-dimensional IOLMaster lens images of level-N cataracts as one class, excluding the two-dimensional IOLMaster lens images of cataracts of levels 1 to N-1, taking the two-dimensional IOLMaster lens images of cataracts of the remaining levels as the other class, and training the classification network model corresponding to level N to obtain a trained classification network model corresponding to level N, where N is an integer greater than 1.
In some embodiments, inputting the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result includes:
repeatedly performing the following process on the test set of the cleaned data set: inputting the test set of the cleaned data set into the trained classification network model corresponding to the m-th level to obtain the classification results of level-m cataracts versus cataracts of the other levels, where m is an integer and 1 ≤ m ≤ N; removing the two-dimensional IOLMaster lens images of level-m cataracts from the test set of the cleaned data set to form a new test set; and taking the new test set as the test set of the cleaned data set;
and aggregating the grading results of all levels to obtain the cataract grading result.
In some embodiments, the method performs the level-by-level binary classification using a ResNet18-CBAM network, where the ResNet18-CBAM network uses ResNet18 as the backbone and adds spatial and channel attention to the residual modules of ResNet18, and includes the following steps:
performing global max pooling and global average pooling over width and height on the input feature map F1 (H × W × C) to obtain two 1 × 1 × C feature maps; feeding the two 1 × 1 × C feature maps into two fully connected layers, where the first layer has C/r neurons (r is the reduction ratio) with a ReLU activation function and the second layer has C neurons;
summing the output features of the two fully connected layers and passing the result through a Sigmoid activation function to generate the channel attention weight Mc;
applying channel attention by multiplying Mc with the input feature map F1 and outputting the feature map F2;
performing global max pooling and global average pooling over the channel dimension on F2 to obtain two H × W × 1 feature maps, and concatenating the two feature maps along the channel dimension to generate an H × W × 2 feature map;
applying a 7 × 7 convolution to the H × W × 2 feature map to reduce it to 1 channel, yielding an H × W × 1 feature map, and generating the spatial attention map Ms through a Sigmoid; multiplying Ms with F2 element-wise to obtain the final features.
In some of these embodiments, the method trains the classification network models with a label-smoothing cross-entropy loss $L_{lsr}$:

$$L_{lsr} = -\sum_{i=1}^{K}\left[(1-a)\,p_i + \frac{a}{K}\right]\log q_i^{\,n}$$

where $q_i^{\,n}$ denotes the probability that class $i$ is predicted by the n-th classification network model, $p_i$ is 1 for the positive class and 0 for the other classes, $K$ is the number of classes, and $a = 0.1$ is the smoothing factor.
In a second aspect, an embodiment of the present application provides a two-dimensional IOLMaster lens image grading apparatus based on deep learning, including:
an acquisition unit configured to acquire a two-dimensional IOLMaster lens image data set, wherein the data set includes a plurality of two-dimensional IOLMaster lens images;
an elimination unit configured to remove noise labels from the data set through the Cleanlab library to obtain a cleaned data set;
a training unit configured to divide the cleaned data set into a test set and a training set and perform level-by-level binary classification to obtain trained classification network models corresponding to each level;
and a classification unit configured to input the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result.
In some of these embodiments, the elimination unit includes:
a dividing module configured to divide the data set into a test set and a training set;
a first training module configured to train a ResNet34 model using the training set of the data set to obtain a trained ResNet34 model;
an input module configured to input the test set of the data set into the trained ResNet34 model to obtain, for each two-dimensional IOLMaster lens image in the test set, the probability of being predicted as each cataract level;
an obtaining module configured to input the probabilities of each cataract level into the Cleanlab library to obtain the distribution interval of noise labels in the data set;
and a removing module configured to remove the two-dimensional IOLMaster lens images whose labels fall within the distribution interval to obtain the cleaned data set.
In a third aspect, an embodiment of the present application provides a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the two-dimensional IOLMaster lens image grading method based on deep learning as described in the first aspect above.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the two-dimensional IOLMaster lens image grading method based on deep learning as described in the first aspect above.
With the above technical solution, and compared with the prior art, the two-dimensional IOLMaster lens image grading algorithm based on deep learning provided by the embodiments of the present application converts cataract grading into a ranking problem, implemented as a series of binary classifications whose results are finally aggregated to obtain the cataract grading result. This solves the problems that existing grading of lens opacity relies on manual assessment, is highly subjective, and yields biased grading results, and improves the accuracy of the cataract grading result.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a block diagram of a mobile terminal according to an embodiment of the present application;
FIG. 2 is a flow chart of a two-dimensional IOLMaster lens image ranking method based on deep learning according to an embodiment of the present application;
FIG. 3 is a diagram of a cataract grading network framework according to a preferred embodiment of the present application;
FIG. 4 is a schematic illustration of a cataract classification outcome prediction procedure according to a preferred embodiment of the present application;
FIG. 5 is a schematic illustration of the prediction of the outcome of a laboratory cataract classification in accordance with a preferred embodiment of the present application;
fig. 6 is a block diagram of a two-dimensional IOLMaster lens image grading apparatus based on deep learning according to an embodiment of the present application;
fig. 7 is a schematic hardware structure diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application clearer, the present application is described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step fall within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. Reference to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The present application is directed to the use of the terms "including," "comprising," "having," and any variations thereof, which are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist, for example, "A and/or B" may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. Reference herein to the terms "first," "second," "third," and the like, are merely to distinguish similar objects and do not denote a particular ordering for the objects.
The embodiment provides a mobile terminal. Fig. 1 is a block diagram of a mobile terminal according to an embodiment of the present application. As shown in fig. 1, the mobile terminal includes: a Radio Frequency (RF) circuit 110, a memory 120, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a wireless fidelity (WiFi) module 170, a processor 180, and a power supply 190. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following specifically describes each constituent element of the mobile terminal with reference to fig. 1:
the RF circuit 110 may be used for receiving and transmitting signals during information transmission and reception or during a call, and in particular, receives downlink information of a base station and then processes the received downlink information to the processor 180; in addition, the data for designing uplink is transmitted to the base station. In general, RF circuits include, but are not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 110 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), general Packet Radio Service (GPRS), code Division Multiple Access (CDMA), wideband Code Division Multiple Access (WCDMA), long Term Evolution (LTE), email, short message Service (Short Messaging Service (SMS)), and so on.
The memory 120 may be used to store software programs and modules, and the processor 180 executes various functional applications and data processing of the mobile terminal by operating the software programs and modules stored in the memory 120. The memory 120 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile terminal, and the like. Further, the memory 120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 130 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the input unit 130 may include a touch panel 131 and other input devices 132. The touch panel 131, also referred to as a touch screen, may collect touch operations of a user on or near the touch panel 131 (e.g., operations of the user on or near the touch panel 131 using any suitable object or accessory such as a finger or a stylus pen), and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 131 may include two parts, i.e., a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 180, and receives and executes commands sent from the processor 180. In addition, the touch panel 131 may be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 130 may include other input devices 132 in addition to the touch panel 131. In particular, other input devices 132 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 140 may be used to display information input by a user or information provided to the user and various menus of the mobile terminal. The Display unit 140 may include a Display panel 141, and optionally, the Display panel 141 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 131 can cover the display panel 141, and when the touch panel 131 detects a touch operation on or near the touch panel 131, the touch operation is transmitted to the processor 180 to determine the type of the touch event, and then the processor 180 provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although the touch panel 131 and the display panel 141 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 131 and the display panel 141 may be integrated to implement the input and output functions of the mobile terminal.
The mobile terminal may also include at least one sensor 150, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 141 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 141 and/or the backlight when the mobile terminal is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration), vibration recognition related functions (such as pedometer and tapping) and the like for recognizing the attitude of the mobile terminal; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile terminal, further description is omitted here.
A speaker 161 and a microphone 162 in the audio circuit 160 may provide an audio interface between the user and the mobile terminal. The audio circuit 160 may transmit the electrical signal converted from received audio data to the speaker 161, which converts it into a sound signal for output; conversely, the microphone 162 converts the collected sound signal into an electrical signal, which is received by the audio circuit 160 and converted into audio data; the audio data is then processed by the processor 180 and transmitted, for example, to another mobile terminal via the RF circuit 110, or output to the memory 120 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 170, the mobile terminal can help the user send and receive e-mails, browse web pages, access streaming media and the like, providing wireless broadband internet access. Although fig. 1 shows the WiFi module 170, it is understood that it is not an essential component of the mobile terminal and may be omitted or replaced with another short-range wireless transmission module, such as a Zigbee module or a WAPI module, as required, without changing the essence of the invention.
The processor 180 is a control center of the mobile terminal, connects various parts of the entire mobile terminal using various interfaces and lines, and performs various functions of the mobile terminal and processes data by operating or executing software programs and/or modules stored in the memory 120 and calling data stored in the memory 120, thereby performing overall monitoring of the mobile terminal. Alternatively, processor 180 may include one or more processing units; preferably, the processor 180 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 180.
The mobile terminal also includes a power supply 190 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 180 via a power management system that may be configured to manage charging, discharging, and power consumption.
Although not shown, the mobile terminal may further include a camera, a bluetooth module, and the like, which will not be described herein.
In this embodiment, the processor 180 is configured to: acquire a two-dimensional IOLMaster lens image data set, wherein the data set comprises a plurality of two-dimensional IOLMaster lens images; remove noise labels from the data set through the Cleanlab library to obtain a cleaned data set; divide the cleaned data set into a test set and a training set and perform level-by-level binary classification to obtain trained classification network models corresponding to each level; and input the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result.
In some of these embodiments, the processor 180 is further configured to: divide the data set into a test set and a training set; train a ResNet34 model with the training set of the data set to obtain a trained ResNet34 model; input the test set of the data set into the trained ResNet34 model to obtain, for each two-dimensional IOLMaster lens image in the test set, the probability of being predicted as each cataract level; input the probabilities of each cataract level into the Cleanlab library to obtain the distribution interval of noise labels in the data set; and remove the two-dimensional IOLMaster lens images whose labels fall within the distribution interval to obtain the cleaned data set.
In some of these embodiments, the processor 180 is further configured to: take the two-dimensional IOLMaster lens images of first-level cataracts in the training set and test set of the cleaned data set as one class and the two-dimensional IOLMaster lens images of cataracts of all other levels as the other class, and train the classification network model corresponding to the first level to obtain a trained classification network model corresponding to the first level; and perform the following process separately for each level after the first level in the training set and test set of the cleaned data set: take the two-dimensional IOLMaster lens images of level-N cataracts as one class, exclude the two-dimensional IOLMaster lens images of cataracts of levels 1 to N-1, take the two-dimensional IOLMaster lens images of cataracts of the remaining levels as the other class, and train the classification network model corresponding to level N to obtain a trained classification network model corresponding to level N, where N is an integer greater than 1.
In some of these embodiments, the processor 180 is further configured to: repeat the following process on the test set of the cleaned data set: input the test set of the cleaned data set into the trained classification network model corresponding to the m-th level to obtain the classification results of level-m cataracts versus cataracts of the other levels, where m is an integer and 1 ≤ m ≤ N; remove the two-dimensional IOLMaster lens images of level-m cataracts from the test set of the cleaned data set to form a new test set; take the new test set as the test set of the cleaned data set; and aggregate the grading results of all levels to obtain the cataract grading result.
In some of these embodiments, the processor 180 is further configured to perform the level-by-level binary classification using a ResNet18-CBAM network, where the ResNet18-CBAM network uses ResNet18 as the backbone and adds spatial and channel attention to the residual modules of ResNet18, including the following steps:
performing global max pooling and global average pooling over width and height on the input feature map F1 (H × W × C) to obtain two 1 × 1 × C feature maps; feeding the two 1 × 1 × C feature maps into two fully connected layers, where the first layer has C/r neurons (r is the reduction ratio) with a ReLU activation function and the second layer has C neurons;
summing the output features of the two fully connected layers and passing the result through a Sigmoid activation function to generate the channel attention weight Mc;
applying channel attention by multiplying Mc with the input feature map F1 and outputting the feature map F2;
performing global max pooling and global average pooling over the channel dimension on F2 to obtain two H × W × 1 feature maps, and concatenating the two feature maps along the channel dimension to generate an H × W × 2 feature map;
applying a 7 × 7 convolution to the H × W × 2 feature map to reduce it to 1 channel, yielding an H × W × 1 feature map, and generating the spatial attention map Ms through a Sigmoid; multiplying Ms with F2 element-wise to obtain the final features.
In some of these embodiments, the processor 180 is further configured to train the classification network models with a label-smoothing cross-entropy loss $L_{lsr}$:

$$L_{lsr} = -\sum_{i=1}^{K}\left[(1-a)\,p_i + \frac{a}{K}\right]\log q_i^{\,n}$$

where $q_i^{\,n}$ denotes the probability that class $i$ is predicted by the n-th classification network model, $p_i$ is 1 for the positive class and 0 for the other classes, $K$ is the number of classes, and $a = 0.1$ is the smoothing factor.
The embodiment provides a two-dimensional IOLMaster lens image grading method based on deep learning, belongs to the field of image processing, and can be used for cataract grading and intelligent medical diagnosis.
Fig. 2 is a flowchart of a two-dimensional IOLMaster lens image grading method based on deep learning according to an embodiment of the present application. As shown in fig. 2, the flow includes the following steps:
Step S201, acquiring a two-dimensional IOLMaster lens image data set, wherein the data set comprises a plurality of two-dimensional IOLMaster lens images;
Step S202, removing noise labels from the data set through the Cleanlab library to obtain a cleaned data set;
Step S203, dividing the cleaned data set into a test set and a training set, and performing level-by-level binary classification to obtain trained classification network models corresponding to each level;
Step S204, inputting the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result.
Through the above steps, the two-dimensional IOLMaster lens image grading algorithm based on deep learning provided by the embodiment of the present application converts cataract grading into a ranking problem: the grading is implemented as a series of binary classifications, and the cataract grading result is finally obtained by aggregating the binary classification results.
In some embodiments, step S202 of removing noise labels from the data set through the Cleanlab library to obtain a cleaned data set includes:
dividing the data set into a test set and a training set;
training a ResNet34 model with the training set of the data set to obtain a trained ResNet34 model;
inputting the test set of the data set into the trained ResNet34 model to obtain, for each two-dimensional IOLMaster lens image in the test set, the probability of being predicted as each cataract level;
inputting the probabilities of each cataract level into the Cleanlab library to obtain the distribution interval of noise labels in the data set;
and removing the two-dimensional IOLMaster lens images whose labels fall within the distribution interval to obtain the cleaned data set.
The labels for cataract grading were produced visually by professional physicians against the gold standard of the LOCS III lens opacity grading system, and adjacent grades are highly similar. In addition, because manual labeling is subjective and subject to visual fatigue, some manually created labels may not match the true cataract level. To mitigate the influence of such erroneous labels on model training, the method removes noise labels from the data set using Cleanlab. First, the data set is split into five equal groups; one group is selected as the test set and the remaining four groups as the training set, and a ResNet34 model is trained on this data split. The test set is then fed into the trained model to obtain the probability that each sample in the test set is predicted as each cataract level. These per-level probabilities are then input into the Cleanlab library to obtain the distribution of noise labels over the whole data set. Finally, the cataract images with erroneous labels are removed to obtain the cleaned IOLMaster lens image data set.
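The following is a minimal sketch of this cleaning step, assuming the Cleanlab 2.x API; `image_paths`, `labels`, and the out-of-sample `pred_probs` from the trained ResNet34 are stand-ins for whatever pipeline is actually used:

```python
import numpy as np
from cleanlab.filter import find_label_issues

def clean_dataset(image_paths, labels, pred_probs):
    """Assumed inputs: `labels` is an (N,) array of physician-assigned cataract
    levels (e.g. 0-5 for levels 1-6), and `pred_probs` is an (N, 6) array of
    held-out probabilities produced by the trained ResNet34 described above."""
    labels = np.asarray(labels)
    # Cleanlab flags samples whose given label disagrees suspiciously
    # with the model's predicted class probabilities.
    issue_mask = find_label_issues(
        labels=labels,
        pred_probs=pred_probs,
        return_indices_ranked_by=None,  # return a boolean mask over all samples
    )
    keep = ~issue_mask
    cleaned_paths = [p for p, k in zip(image_paths, keep) if k]
    cleaned_labels = labels[keep]
    return cleaned_paths, cleaned_labels
```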
In some embodiments, step S203 of dividing the cleaned data set into a test set and a training set and performing level-by-level binary classification to obtain trained classification network models corresponding to each level includes:
taking the two-dimensional IOLMaster lens images of first-level cataracts in the training set and test set of the cleaned data set as one class and the two-dimensional IOLMaster lens images of cataracts of all other levels as the other class, and training the classification network model corresponding to the first level to obtain a trained classification network model corresponding to the first level;
performing the following process separately for each level after the first level in the training set and test set of the cleaned data set:
taking the two-dimensional IOLMaster lens images of level-N cataracts as one class, excluding the two-dimensional IOLMaster lens images of cataracts of levels 1 to N-1, taking the two-dimensional IOLMaster lens images of cataracts of the remaining levels as the other class, and training the classification network model corresponding to level N to obtain a trained classification network model corresponding to level N, where N is an integer greater than 1.
Traditional cataract grading ignores the ordinal information of the cataract levels and over-simplifies cataract grading into a linear model. Therefore, to incorporate the adjacency relations between different cataract levels into the grading model, the method adopts the Ranking-Net idea and extracts features independently for images of different cataract levels, so that the learned features can distinguish adjacent cataract levels more effectively. With reference to the cataract grading network framework shown in fig. 3, the detailed steps of the whole algorithm are as follows. In the first step, the data set is divided into five equal parts; one part is selected as the test set and the other four as the training set. The level-1 cataracts in the training set and test set are taken as one class and the level-2 to level-6 cataracts as the other class, forming a binary classification task. A first classification network, denoted Level-1, is then trained on this binary classification data set. In the second step, the level-2 cataract images in the training set and test set are taken as one class, the level-1 cataract images from the previous step are removed, and the remaining level-3 to level-6 cataracts are taken as the other class, forming the second binary classification task. Similarly, a second classification network, denoted Level-2, is trained on this binary classification data set. This continues until, finally, the level-5 and level-6 cataracts each form a single class and the fifth classification network, denoted Level-5, is trained. A minimal sketch of how these cascaded binary training tasks can be constructed is given below.
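The sketch below assumes the cleaned data set is held as (image path, level) pairs with levels 1-6; `train_binary_net` is a placeholder for training one of the classification networks described later:

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, int]  # (image path, cataract level 1..6)

def build_level_task(samples: List[Sample], level: int) -> List[Tuple[str, int]]:
    """Binary task for network Level-`level`:
    positive class = images of this cataract level,
    negative class = images of all higher levels,
    lower levels   = excluded (already handled by earlier networks)."""
    task = []
    for path, lv in samples:
        if lv < level:
            continue                      # excluded from this stage
        task.append((path, 1 if lv == level else 0))
    return task

def train_cascade(samples: List[Sample],
                  train_binary_net: Callable[[List[Tuple[str, int]]], object],
                  num_levels: int = 6) -> List[object]:
    """Train the Level-1 .. Level-(num_levels-1) binary networks."""
    models = []
    for level in range(1, num_levels):    # 1..5 for six cataract levels
        task = build_level_task(samples, level)
        models.append(train_binary_net(task))
    return models
```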
In some embodiments, step S204 of inputting the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result includes:
repeatedly performing the following process on the test set of the cleaned data set: inputting the test set of the cleaned data set into the trained classification network model corresponding to the m-th level to obtain the classification results of level-m cataracts versus cataracts of the other levels, where m is an integer and 1 ≤ m ≤ N; removing the two-dimensional IOLMaster lens images of level-m cataracts from the test set of the cleaned data set to form a new test set; and taking the new test set as the test set of the cleaned data set;
and aggregating the grading results of all levels to obtain the cataract grading result.
In the network framework shown in fig. 3, after the five binary classification networks have been obtained, the test set data is first input into Level-1 to obtain the grading results for level-1 cataracts versus cataracts of the other levels; the samples predicted as level 1 are then removed from the test set to form a new test set, which is input into the Level-2 network to obtain the prediction results for level-2 cataracts versus cataracts of the other levels. This continues until the last network, Level-5, yields the grading results for level-5 and level-6 cataracts. Finally, the grading results of all levels are aggregated to obtain the final cataract grading result. A sketch of this cascaded inference procedure follows.
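This is a minimal sketch of the cascaded prediction and aggregation, assuming the `models` list returned by the training sketch above and a hypothetical `predict_positive` helper that runs one binary network and returns, per sample, whether it is assigned to that network's reference level:

```python
from typing import Callable, Dict, List

def grade_cataracts(test_ids: List[str],
                    models: List[object],
                    predict_positive: Callable[[object, List[str]], List[bool]],
                    num_levels: int = 6) -> Dict[str, int]:
    """Cascade the binary networks Level-1..Level-5 over the test set and
    aggregate their decisions into one final level per image."""
    grades: Dict[str, int] = {}
    remaining = list(test_ids)
    for level, model in enumerate(models, start=1):    # Level-1 .. Level-5
        is_level = predict_positive(model, remaining)
        next_remaining = []
        for sample_id, positive in zip(remaining, is_level):
            if positive:
                grades[sample_id] = level               # decided at this stage
            else:
                next_remaining.append(sample_id)        # passed to the next network
        remaining = next_remaining
    for sample_id in remaining:                         # rejected by Level-5
        grades[sample_id] = num_levels                  # assigned the highest level
    return grades
```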
In some embodiments, the method performs the level-by-level binary classification using a ResNet18-CBAM network, where the ResNet18-CBAM network uses ResNet18 as the backbone and adds spatial and channel attention to the residual modules of ResNet18, and includes the following steps (a code sketch of the attention block follows this list):
(1) Perform global max pooling and global average pooling over width and height on the input feature map F1 (H × W × C) to obtain two 1 × 1 × C feature maps; feed the two 1 × 1 × C feature maps into two fully connected layers, where the first layer has C/r neurons (r is the reduction ratio) with a ReLU activation function and the second layer has C neurons. Note that the weight parameters of the two fully connected layers are shared.
(2) Sum the output features of the two fully connected layers and pass the result through a Sigmoid activation function to generate the channel attention weight Mc.
(3) Multiply Mc with the input feature map F1 element by element, i.e. the channel attention operation, and output the feature map F2.
(4) Feed the feature map F2 output by the channel attention module into the spatial attention module: perform global max pooling and global average pooling over the channel dimension on F2 to obtain two H × W × 1 feature maps, and concatenate the two feature maps along the channel dimension to generate an H × W × 2 feature map.
(5) Apply a 7 × 7 convolution to the H × W × 2 feature map to reduce it to 1 channel, yielding an H × W × 1 feature map, and generate the spatial attention map Ms through a Sigmoid; multiply Ms with F2 element-wise to obtain the final features.
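The following is a minimal PyTorch sketch of a CBAM-style attention block matching steps (1)-(5), under the assumption that it is inserted into the ResNet18 residual blocks; the layer names and the reduction ratio r are illustrative rather than taken from the patent:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # Channel attention: shared two-layer MLP (C -> C/r -> C), steps (1)-(2)
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over the 2-channel pooled map, step (5)
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f1: torch.Tensor) -> torch.Tensor:   # f1: (B, C, H, W)
        b, c, _, _ = f1.shape
        # (1) global max / average pooling over H and W -> two (B, C) descriptors
        max_desc = torch.amax(f1, dim=(2, 3))
        avg_desc = torch.mean(f1, dim=(2, 3))
        # (2) shared MLP, sum, Sigmoid -> channel attention weight Mc
        mc = torch.sigmoid(self.mlp(max_desc) + self.mlp(avg_desc)).view(b, c, 1, 1)
        # (3) channel attention: element-wise multiplication -> F2
        f2 = f1 * mc
        # (4) channel-wise max / average pooling, concatenate -> (B, 2, H, W)
        pooled = torch.cat(
            [torch.amax(f2, dim=1, keepdim=True), torch.mean(f2, dim=1, keepdim=True)],
            dim=1,
        )
        # (5) 7x7 conv + Sigmoid -> spatial attention Ms, multiply with F2
        ms = torch.sigmoid(self.spatial_conv(pooled))
        return f2 * ms
```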
In some of these embodiments, the method trains the classification network models with a label-smoothing cross-entropy loss $L_{lsr}$:

$$L_{lsr} = -\sum_{i=1}^{K}\left[(1-a)\,p_i + \frac{a}{K}\right]\log q_i^{\,n}$$

where $q_i^{\,n}$ denotes the probability that class $i$ is predicted by the n-th classification network model, $p_i$ is 1 for the positive class and 0 for the other classes, $K$ is the number of classes, and $a = 0.1$ is the smoothing factor.
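As a minimal sketch, the label-smoothing cross-entropy above can be written as follows in PyTorch (a hand-rolled version for illustration; recent PyTorch versions also expose an equivalent `label_smoothing` argument on `nn.CrossEntropyLoss`):

```python
import torch
import torch.nn.functional as F

def label_smoothing_ce(logits: torch.Tensor, target: torch.Tensor,
                       a: float = 0.1) -> torch.Tensor:
    """logits: (B, K) raw scores, target: (B,) class indices.
    Smoothed target distribution: (1 - a) on the true class plus a/K on every class."""
    k = logits.size(1)
    log_q = F.log_softmax(logits, dim=1)                       # log q_i
    smooth = torch.full_like(log_q, a / k)                     # a / K everywhere
    smooth.scatter_(1, target.unsqueeze(1), 1.0 - a + a / k)   # (1 - a) + a/K on true class
    return -(smooth * log_q).sum(dim=1).mean()
```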
The embodiment of the present application provides a two-dimensional IOLMaster lens image grading method based on deep learning, addressing the problems that existing grading of lens opacity relies on manual assessment, is highly subjective, and yields biased grading results. As shown in fig. 4, the technical solution of the embodiment of the present application comprises the following steps: 1) Data acquisition; 2) Data cleaning: removing label noise from the data set; 3) Binary classification: grouping the cataract data set and performing level-by-level binary classification; 4) Result aggregation: aggregating the binary classification results to obtain the cataract level prediction.
As shown in fig. 5, the convolutional-neural-network-based ResNet34 model and the Ranking-CNN can each be used in the experiments. The Ranking-CNN performs grading by creating N-1 CNN models, each of which performs a binary classification with one level as the reference. For example, when predicting cataract levels 1 to 6, the first CNN model takes level 1 as one class and the remaining levels as the second class, while the fifth CNN model takes level 5 as one class and level 6 as the second class. In this way, N-1 binary predictions are obtained for the N classes, and the binary results are aggregated to obtain the cataract grading result.
Compared with the prior art, the embodiment of the application has the following advantages:
1. the IOLMaster lens image grading algorithm guided by a LOCS III lens opacity degree grading system is provided for the first time, an effective tool is provided for objectively evaluating the lens opacity degree, and the possibility is provided for accurate diagnosis and treatment of cataract;
2. the method for removing the noise labels is provided for the first time, the noise labels existing in the IOLMaster lens image data set are removed, and the robustness and the accuracy of the hierarchical model are improved;
3. the IOLMaster lens image cataract grading method based on ranking classification is proposed for the first time; by creating N-1 CNN models, each of which performs a binary classification with one level as the reference, the features learned by each model have more effective discriminative power;
4. preliminary experiments were carried out on a batch of IOLMaster lens images, and the results show that the algorithm used in the invention can accurately predict the cataract level.
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment provides a two-dimensional IOLMaster lens image grading apparatus based on deep learning, which is used to implement the above embodiments and preferred embodiments; what has already been described will not be repeated. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware with a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 6 is a block diagram of a two-dimensional IOLMaster lens image grading apparatus based on deep learning according to an embodiment of the present application. As shown in fig. 6, the apparatus includes:
an acquisition unit 61 for acquiring a two-dimensional IOLMaster lens image data set, wherein the data set includes a plurality of two-dimensional IOLMaster lens images;
an elimination unit 62 for removing noise labels from the data set through the Cleanlab library to obtain a cleaned data set;
a training unit 63 for dividing the cleaned data set into a test set and a training set and performing level-by-level binary classification to obtain trained classification network models corresponding to each level;
and a classification unit 64 for inputting the test set of the cleaned data set into the trained classification network model corresponding to each level to obtain the cataract grading result.
In some of these embodiments, the elimination unit 62 includes:
a dividing module for dividing the data set into a test set and a training set;
a first training module for training a ResNet34 model using the training set of the data set to obtain a trained ResNet34 model;
an input module for inputting the test set of the data set into the trained ResNet34 model to obtain, for each two-dimensional IOLMaster lens image in the test set, the probability of being predicted as each cataract level;
an obtaining module for inputting the probabilities of each cataract level into the Cleanlab library to obtain the distribution interval of noise labels in the data set;
and a removing module for removing the two-dimensional IOLMaster lens images whose labels fall within the distribution interval to obtain the cleaned data set.
In some of these embodiments, the training unit 63 comprises:
a second training module for taking the two-dimensional IOLMaster lens images of first-level cataracts in the training set and test set of the cleaned data set as one class and the two-dimensional IOLMaster lens images of cataracts of all other levels as the other class, and training the classification network model corresponding to the first level to obtain a trained classification network model corresponding to the first level;
a third training module for performing the following process separately for each level after the first level in the training set and test set of the cleaned data set:
taking the two-dimensional IOLMaster lens images of level-N cataracts as one class, excluding the two-dimensional IOLMaster lens images of cataracts of levels 1 to N-1, taking the two-dimensional IOLMaster lens images of cataracts of the remaining levels as the other class, and training the classification network model corresponding to level N to obtain a trained classification network model corresponding to level N, where N is an integer greater than 1.
In some of these embodiments, the classification unit 64 includes the following modules (a sketch of the cascaded inference follows the list):
an execution module, configured to repeatedly execute the following processes on the cleaned test set of the data set: inputting the test set of the cleaned data set into a trained classification network model corresponding to the mth level to obtain classification results of the mth level cataract and other levels of cataracts, wherein m is an integer and is more than or equal to 1 and less than or equal to N; removing the two-dimensional IOLMaster lens image of the mth grade cataract from the test set of the cleaned data set to form a new test set; taking the new test set as a test set of the cleaned data set;
and aggregating the grading results of all grades to obtain the grading result of the cataract.
In some embodiments, the apparatus performs the step-by-step binary classification processing using a Resnet18-CBAM network, where the Resnet18-CBAM network is based on Resnet18 with spatial and channel attention added to the residual modules of Resnet18, and includes the following steps (a code sketch of this attention block follows these steps):
performing global maximum pooling and global average pooling over the width and height of the input feature map F_1 (H × W × C) to obtain two 1 × 1 × C feature maps; feeding each of the two 1 × 1 × C feature maps into two fully-connected layers, where the number of neurons in the first layer is C/r, r is the reduction rate, the activation function is ReLU, and the number of neurons in the second layer is C;
summing the outputs of the two fully-connected branches element-wise and passing the result through a Sigmoid activation function to generate the channel attention weights M_c;
applying the channel attention operation to M_c and the input feature map F_1, and outputting the feature map F_2;
performing channel-wise global maximum pooling and global average pooling on the feature map F_2 to obtain two H × W × 1 feature maps, and concatenating the two feature maps along the channel dimension to generate an H × W × 2 feature map;
performing a 7 × 7 convolution on the H × W × 2 feature map to reduce it to a single channel, obtaining an H × W × 1 feature map, and generating the spatial attention M_s through a Sigmoid; multiplying M_s and F_2 element-wise to obtain the final features.
In some of these embodiments, the apparatus trains the classification network models with a label-smoothing cross-entropy loss $L_{lsr}$ (a code sketch of this loss follows):

$$L_{lsr} = -\sum_{i=1}^{K} q_i \log \hat{y}_i, \qquad q_i = (1 - a)\, p_i + \frac{a}{K}$$

where $\hat{y}_i$ denotes the probability that the i-th class is predicted by the n-th classification network model, $p_i$ is 1 for the positive class and 0 for the other classes, $a = 0.1$ is the smoothing factor, and $K$ is the number of classes handled by each classifier.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. Modules implemented in hardware may be located in the same processor, or may be distributed among different processors in any combination.
An embodiment provides a computer device. The two-dimensional IOLMaster lens image grading method based on deep learning described in the embodiments of the present application can be implemented by a computer device. Fig. 7 is a hardware structure diagram of a computer device according to an embodiment of the present application.
The computer device may comprise a processor 71 and a memory 72 in which computer program instructions are stored.
Specifically, the processor 71 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 72 may include mass storage for data or instructions. By way of example and not limitation, the memory 72 may include a hard disk drive (HDD), a floppy disk drive, a solid state drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 72 may include removable or non-removable (or fixed) media, where appropriate. The memory 72 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 72 is a non-volatile memory. In certain embodiments, memory 72 includes read-only memory (ROM) and random access memory (RAM). The ROM may be a mask-programmed ROM, a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), an electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be a static random-access memory (SRAM) or a dynamic random-access memory (DRAM), where the DRAM may be a fast page mode DRAM (FPM DRAM), an extended data output DRAM (EDO DRAM), a synchronous DRAM (SDRAM), and the like.
The memory 72 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 71.
The processor 71 implements any one of the two-dimensional IOLmaster lens image grading methods based on deep learning in the above embodiments by reading and executing computer program instructions stored in the memory 72.
In some of these embodiments, the computer device may also include a communication interface 73 and a bus 70. As shown in fig. 7, the processor 71, the memory 72, and the communication interface 73 are connected via a bus 70 to complete communication therebetween.
The communication interface 73 is used to realize communication among the modules, devices, units and/or apparatuses in the embodiments of the present application. The communication interface 73 may also enable data communication with other components, such as external devices, image/data acquisition devices, databases, external storage, and image/data processing workstations.
The bus 70 includes hardware, software, or both, coupling the components of the computer device to each other. Bus 70 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example and not limitation, bus 70 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), a video bus, or a combination of two or more of these suitable electronic buses. Bus 70 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the present application, any suitable buses or interconnects are contemplated.
In addition, in combination with the two-dimensional IOLMaster lens image grading method based on deep learning in the foregoing embodiments, an embodiment of the present application may provide a computer-readable storage medium having computer program instructions stored thereon; when executed by a processor, the computer program instructions implement any of the above embodiments of the two-dimensional IOLMaster lens image grading method based on deep learning.
For the sake of brevity, not all possible combinations of the technical features of the above embodiments are described; however, as long as there is no contradiction between such combinations, they should be considered within the scope of the present disclosure.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but this should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A two-dimensional IOLMaster lens image grading method based on deep learning is characterized by comprising the following steps:
acquiring a two-dimensional IOLMaster lens image dataset, wherein the dataset comprises a plurality of two-dimensional IOLMaster lens images;
noise label elimination is carried out on the data set through a Cleanlab library to obtain a cleaned data set;
dividing the cleaned data set into a test set and a training set, and performing step-by-step binary classification processing to obtain trained classification network models corresponding to each level;
and respectively inputting the test set of the cleaned data set into the trained classification network model corresponding to each grade to obtain the grading result of the cataract.
2. The method of claim 1, wherein the performing noise label elimination on the data set through a Cleanlab library to obtain a cleaned data set comprises:
dividing the data set into a test set and a training set;
training a ResNet34 model with a training set of the data set; obtaining a trained ResNet34 model;
inputting the test set of the data set into a trained ResNet34 model to obtain the probability that each two-dimensional IOLMaster lens image in the test set of the data set is predicted to be in each cataract grade;
inputting the probability of each cataract grade into a Cleanlab library to obtain a distribution interval of a noise label in the data set;
and removing from the data set the two-dimensional IOLMaster lens images whose labels fall within the distribution interval, to obtain the cleaned data set.
3. The method according to claim 1, wherein the dividing the cleaned data set into a test set and a training set, and performing a step-by-step binary classification process to obtain trained classification network models corresponding to each level comprises:
taking the two-dimensional IOLMaster lens images of the first-level cataract in the training set and the testing set of the cleaned data set as one class, taking the two-dimensional IOLMaster lens images of the other cataract levels as the other class, and training a classification network model corresponding to the first level to obtain a trained classification network model corresponding to the first level;
performing the following process separately for each level of IOLMaster lens images after the first level in the training set and test set of the cleaned data set:
taking the two-dimensional IOLMaster lens images of the N-level cataract as one class, removing the two-dimensional IOLMaster lens images of the N-1 level cataract, taking the two-dimensional IOLMaster lens images of the rest levels cataract as another class, training the classification network model corresponding to the N level, and obtaining the trained classification network model corresponding to the N level, wherein N is an integer greater than 1.
4. The method according to claim 3, wherein the inputting the test set of the cleaned data set into the trained classification network model corresponding to each level respectively to obtain the grading result of the cataract comprises:
repeating the following process on the test set of the cleaned data set: inputting the test set of the cleaned data set into a trained classification network model corresponding to the mth level to obtain classification results of the mth level cataract and other levels of cataracts, wherein m is an integer and is more than or equal to 1 and less than or equal to N; removing the two-dimensional IOLMaster lens image of the mth level cataract from the test set of the cleaned data set to form a new test set; taking the new test set as a test set of the cleaned data set;
and aggregating the grading results of all grades to obtain the grading result of the cataract.
5. The method according to any one of claims 1 to 4, wherein the method performs a level-by-level binary classification process using a Resnet18-CBAM network, the Resnet18-CBAM network is based on Resnet18 with spatial and channel attention added to the residual modules of Resnet18, and the method comprises the following steps:
performing global maximum pooling and global average pooling over the width and height of the input feature map F_1 (H × W × C) to obtain two 1 × 1 × C feature maps; feeding each of the two 1 × 1 × C feature maps into two fully-connected layers, wherein the number of neurons in the first layer is C/r, r is the reduction rate, the activation function is ReLU, and the number of neurons in the second layer is C;
summing the outputs of the two fully-connected branches element-wise and passing the result through a Sigmoid activation function to generate the channel attention weights M_c;
applying the channel attention operation to M_c and the input feature map F_1, and outputting the feature map F_2;
performing channel-wise global maximum pooling and global average pooling on the feature map F_2 to obtain two H × W × 1 feature maps, and concatenating the two feature maps along the channel dimension to generate an H × W × 2 feature map;
and performing a 7 × 7 convolution on the H × W × 2 feature map to reduce it to a single channel, obtaining an H × W × 1 feature map, generating the spatial attention M_s through a Sigmoid, and multiplying M_s and F_2 element-wise to obtain the final features.
6. The method of claim 5, wherein the method trains the classification network models with a label-smoothing cross-entropy loss $L_{lsr}$:

$$L_{lsr} = -\sum_{i=1}^{K} q_i \log \hat{y}_i, \qquad q_i = (1 - a)\, p_i + \frac{a}{K}$$

wherein $\hat{y}_i$ represents the probability that the i-th class is predicted by the n-th classification network model, $p_i$ is 1 for the positive class and 0 for the other classes, $a = 0.1$ is the smoothing factor, and $K$ is the number of classes.
7. A two-dimensional IOLmaster lens image grading device based on deep learning, comprising:
an acquisition unit configured to acquire a two-dimensional IOLmaster lens image dataset, wherein the dataset includes a plurality of two-dimensional IOLmaster lens images;
the elimination unit is used for eliminating the noise label of the data set through a Cleanlab library to obtain a cleaned data set;
the training unit is used for dividing the cleaned data set into a test set and a training set and carrying out step-by-step binary classification processing to obtain trained classification network models corresponding to all the levels;
and the classification unit is used for respectively inputting the test set of the cleaned data set into the trained classification network model corresponding to each grade to obtain the classification result of the cataract.
8. The apparatus of claim 7, wherein the elimination unit comprises:
the dividing module is used for dividing the data set into a test set and a training set;
a first training module for training a ResNet34 model with a training set of the data set; obtaining a trained ResNet34 model;
an input module, configured to input the test set of the data set into a trained ResNet34 model, to obtain a probability that each two-dimensional IOLmaster lens image in the test set of the data set is predicted as each cataract level;
the obtaining module is used for inputting the probability of each cataract grade into a Cleanlab library to obtain a distribution interval of the noise label in the data set;
and the removing module is used for removing from the data set the two-dimensional IOLMaster lens images whose labels fall within the distribution interval, to obtain the cleaned data set.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 6 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
CN202210689298.6A 2022-06-16 2022-06-16 Two-dimensional IOLMaster lens image grading method and device based on deep learning Pending CN115272743A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210689298.6A CN115272743A (en) 2022-06-16 2022-06-16 Two-dimensional IOLMaster lens image grading method and device based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210689298.6A CN115272743A (en) 2022-06-16 2022-06-16 Two-dimensional IOLMaster lens image grading method and device based on deep learning

Publications (1)

Publication Number Publication Date
CN115272743A true CN115272743A (en) 2022-11-01

Family

ID=83761544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210689298.6A Pending CN115272743A (en) 2022-06-16 2022-06-16 Two-dimensional IOLMaster lens image grading method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN115272743A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116612339A (en) * 2023-07-21 2023-08-18 中国科学院宁波材料技术与工程研究所 Construction device and grading device of nuclear cataract image grading model
CN116612339B (en) * 2023-07-21 2023-11-14 中国科学院宁波材料技术与工程研究所 Construction device and grading device of nuclear cataract image grading model

Similar Documents

Publication Publication Date Title
Bilal et al. Diabetic retinopathy detection and classification using mixed models for a disease grading database
Prince et al. Multi-source ensemble learning for the remote prediction of Parkinson's disease in the presence of source-wise missing data
Lin et al. Transforming retinal photographs to entropy images in deep learning to improve automated detection for diabetic retinopathy
WO2020181805A1 (en) Diabetes prediction method and apparatus, storage medium, and computer device
CN111598900B (en) Image region segmentation model training method, segmentation method and device
Claro et al. An hybrid feature space from texture information and transfer learning for glaucoma classification
KR20170061222A (en) The method for prediction health data value through generation of health data pattern and the apparatus thereof
CN110175995A (en) A kind of image state based on pathological image determines method, apparatus and system
Coan et al. Automatic detection of glaucoma via fundus imaging and artificial intelligence: A review
CN113870239A (en) Vision detection method and device, electronic equipment and storage medium
CN115272743A (en) Two-dimensional IOLMaster lens image grading method and device based on deep learning
WO2022120163A1 (en) Automated screening for diabetic retinopathy severity using color fundus image data
US9883793B2 (en) Spatial modeling of visual fields
US20200229870A1 (en) Systems and methods for intraocular lens selection using emmetropia zone prediction
CN112397195B (en) Method, apparatus, electronic device and medium for generating physical examination model
Phankokkruad Evaluation of Deep transfer learning models in glaucoma detection for clinical application
CN110610766A (en) Apparatus and storage medium for deriving probability of disease based on symptom feature weight
Pin et al. Retinal diseases classification based on hybrid ensemble deep learning and optical coherence tomography images
CN108346471B (en) Pathological data analysis method and device
CN109543187A (en) Generation method, device and the storage medium of electronic health record feature
CN112205960B (en) Vision monitoring method, system, management end and storage medium
Prince et al. Evaluation of source-wise missing data techniques for the prediction of parkinson’s disease using smartphones
Denandra et al. Eye Disease Classification Based on Fundus Images Using Convolutional Neural Network
CN116564540B (en) Cornea shaping mirror parameter prediction method and system based on ensemble learning algorithm
Saju et al. Eye-Vision Net: Cataract Detection and Classification in Retinal and Slit Lamp Images using Deep Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination