CN108876776B - Classification model generation method, fundus image classification method and device - Google Patents

Classification model generation method, fundus image classification method and device

Info

Publication number
CN108876776B
CN108876776B (application CN201810607909.1A)
Authority
CN
China
Prior art keywords: fundus; image; original; original image; classified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810607909.1A
Other languages
Chinese (zh)
Other versions
CN108876776A (en)
Inventor
王晓婷
栾欣泽
何光宇
孟健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp filed Critical Neusoft Corp
Priority to CN201810607909.1A priority Critical patent/CN108876776B/en
Publication of CN108876776A publication Critical patent/CN108876776A/en
Application granted granted Critical
Publication of CN108876776B publication Critical patent/CN108876776B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 7/0012: Biomedical image inspection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10004: Still image; Photographic image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10024: Color image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30004: Biomedical image processing
    • G06T 2207/30041: Eye; Retina; Ophthalmic
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/03: Recognition of patterns in medical or anatomical images

Abstract

The embodiment of the application discloses a classification model generation method, a fundus image classification method and a fundus image classification device. One or more of a fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image are taken as fundus training images, and an initial deep learning model is trained using the fundus training images and the retina classification labels corresponding to them to generate a retina classification model. The generated retina classification model can classify the retina types of fundus images, so that retina types are classified automatically and rapidly, and the classification result is free of subjective influence and therefore more accurate. Meanwhile, using multiple kinds of images as training images effectively increases the number of training images, so that the generated retina classification model is more accurate.

Description

Classification model generation method, fundus image classification method and device
Technical Field
The application relates to the technical field of image processing, and in particular to a classification model generation method and apparatus and a fundus image classification method and apparatus.
Background
With the development of information acquisition technology and the popularization of big data, effective information can be obtained by processing acquired images. For example, schemes have appeared that use intelligent terminal devices to acquire images of parts of the human body such as the tongue and the fundus, which greatly facilitates the collection of human-body information.
In the prior art, acquired fundus images can be transmitted to professionals who screen the fundus for retinopathy, but such manual judgment is highly subjective, difficult to quantify, and inefficient. The prior art therefore lacks a way of classifying the retina types in fundus images quickly and accurately.
Disclosure of Invention
In view of this, embodiments of the present application provide a classification model generation method and apparatus and a fundus image classification method and apparatus, so as to solve the technical problem in the prior art that retina types of fundus images cannot be classified quickly and accurately.
In order to solve the above problem, the technical solution provided by the embodiment of the present application is as follows:
a classification model generation method, the method comprising:
acquiring an original fundus image;
taking one or more of the fundus original image, the feature vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image as fundus training images;
and training an initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model.
In one possible implementation manner, the generation process of the feature vector image corresponding to the fundus original image includes:
extracting image characteristic vectors of the fundus original image;
and drawing the image characteristic vector of the fundus original image into a characteristic vector image corresponding to the fundus original image.
In one possible implementation, the rendering of the image feature vector of the fundus original image as a feature vector image corresponding to the fundus original image includes:
drawing the image characteristic vector of the fundus original image into an original characteristic vector image;
and carrying out scale change processing on the original characteristic vector image to generate a characteristic vector image corresponding to the fundus original image.
In one possible implementation, the image feature vector includes a scale-invariant feature transform feature vector and a corner detection feature vector.
In one possible implementation, the generation process of the pre-processed image corresponding to the fundus original image includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to generate a preprocessed image corresponding to the fundus original image.
In a possible implementation manner, training the initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model includes:
training the initial deep learning model by using a general training image set to generate a general classification model;
and training the general classification model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model.
A fundus image classification method, the method comprising:
acquiring an original fundus image to be classified;
inputting one or more of the fundus original image to be classified, the feature vector image corresponding to the fundus original image to be classified and the preprocessed image corresponding to the fundus original image to be classified into a retina classification model to obtain at least one retina classification result, and determining the retina classification result of the fundus original image to be classified from the at least one retina classification result according to a voting mechanism, wherein the retina classification model is generated according to the classification model generation method.
In a possible implementation manner, the generation process of the feature vector image corresponding to the fundus original image to be classified includes:
extracting image characteristic vectors of the fundus original images to be classified;
and drawing the image characteristic vector of the fundus original image to be classified into a characteristic vector image corresponding to the fundus original image to be classified.
In a possible implementation manner, the rendering of the image feature vector of the fundus original image to be classified as the feature vector image corresponding to the fundus original image to be classified includes:
drawing the image characteristic vector of the fundus original image to be classified into an original characteristic vector image;
and carrying out scale change processing on the original characteristic vector image to generate a characteristic vector image corresponding to the fundus original image to be classified.
In one possible implementation, the image feature vector includes a scale-invariant feature transform feature vector and a corner detection feature vector.
In a possible implementation manner, the generation process of the preprocessed image corresponding to the fundus original image to be classified includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to be classified to generate a preprocessed image corresponding to the fundus original image to be classified.
A classification model generation apparatus, the apparatus comprising:
a first acquisition unit configured to acquire an original image of a fundus;
the second acquisition unit is used for taking one or more of the fundus original image, the characteristic vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image as a fundus training image;
and the generating unit is used for training the initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model.
In one possible implementation manner, the generation process of the feature vector image corresponding to the fundus original image includes:
extracting image characteristic vectors of the fundus original image;
and drawing the image characteristic vector of the fundus original image into a characteristic vector image corresponding to the fundus original image.
In one possible implementation, the rendering of the image feature vector of the fundus original image as a feature vector image corresponding to the fundus original image includes:
drawing the image characteristic vector of the fundus original image into an original characteristic vector image;
and carrying out scale change processing on the original characteristic vector image to generate a characteristic vector image corresponding to the fundus original image.
In one possible implementation, the image feature vector includes a scale-invariant feature transform feature vector and a corner detection feature vector.
In one possible implementation, the generation process of the pre-processed image corresponding to the fundus original image includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to generate a preprocessed image corresponding to the fundus original image.
In one possible implementation, the generating unit includes:
the first generation subunit is used for training the initial deep learning model by using the universal training image set to generate a universal classification model;
and the second generation subunit is used for training the general classification model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model.
A fundus image classification apparatus, the apparatus comprising:
an acquisition unit for acquiring an original image of the fundus to be classified;
an obtaining unit, configured to input one or more of the fundus original image to be classified, the feature vector image corresponding to the fundus original image to be classified, and the preprocessed image corresponding to the fundus original image to be classified into a retina classification model, obtain at least one retina classification result, and determine, according to a voting mechanism, a retina classification result of the fundus original image to be classified from the at least one retina classification result, where the retina classification model is generated by the classification model generation device.
In a possible implementation manner, the generation process of the feature vector image corresponding to the fundus original image to be classified includes:
extracting image characteristic vectors of the fundus original images to be classified;
and drawing the image characteristic vector of the fundus original image to be classified into a characteristic vector image corresponding to the fundus original image to be classified.
In a possible implementation manner, the rendering of the image feature vector of the fundus original image to be classified as the feature vector image corresponding to the fundus original image to be classified includes:
drawing the image characteristic vector of the fundus original image to be classified into an original characteristic vector image;
and carrying out scale change processing on the original characteristic vector image to generate a characteristic vector image corresponding to the fundus original image to be classified.
In one possible implementation, the image feature vector includes a scale-invariant feature transform feature vector and a corner detection feature vector.
In a possible implementation manner, the generation process of the preprocessed image corresponding to the fundus original image to be classified includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to be classified to generate a preprocessed image corresponding to the fundus original image to be classified.
A computer-readable storage medium having stored therein instructions that, when executed on a terminal device, cause the terminal device to execute the above-described classification model generation method or the above-described fundus image classification method.
A computer program product which, when run on a terminal device, causes the terminal device to execute the above-described classification model generation method or the above-described fundus image classification method.
Therefore, the embodiment of the application has the following beneficial effects:
according to the embodiment of the application, one or more of the fundus original image, the feature vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image are used as the fundus training image, the initial deep learning model is trained by using the fundus training image and the retina classification label corresponding to the fundus training image, the retina classification model is generated, the generated retina classification model can classify the retina types of the fundus images, the retina types of the fundus images are automatically and rapidly classified, the subjective influence is eliminated by the classification result, and the method is more accurate. Meanwhile, the number of training images is effectively increased by using various images as the training images, so that the generated retina classification model is more accurate.
Drawings
Fig. 1 is a flowchart of a classification model generation method according to an embodiment of the present application;
FIG. 2 is a flowchart of classification model training provided by an embodiment of the present application;
fig. 3 is a flowchart of a process of generating a feature vector image corresponding to an original fundus image according to an embodiment of the present application;
fig. 4 is a flowchart of a classification model verification method according to an embodiment of the present application;
fig. 5 is a flowchart of a fundus image classification method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a classification model generation apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a fundus image classification apparatus according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanying the drawings are described in detail below.
In order to facilitate understanding of the technical solutions provided in the present application, the following briefly describes the research background of the technical solutions in the present application.
In recent years, with the development of computer technology, people can process collected images by using more advanced technology to obtain effective information. For example, a scheme that an intelligent terminal such as a mobile phone with a camera is used for acquiring images of parts of a human body such as a tongue and eyes can be utilized, and great convenience is brought to information acquisition of the human body.
However, at present, retina types in acquired fundus images can only be identified and classified by professionals. This manual judgment is highly subjective, difficult to quantify, and inefficient, and the accuracy of classifying and identifying fundus images is low.
On this basis, the application provides a classification model generation method, a fundus image classification method, and corresponding devices. A retina classification model is generated by training and can be used to classify the retina types of fundus images, so that retina types are classified automatically and rapidly, and the classification result is free of subjective influence and therefore more accurate.
The following describes a classification model generation method provided in an embodiment of the present application with reference to the drawings.
Referring to fig. 1, which shows a flowchart of a classification model generation method provided in an embodiment of the present application, as shown in fig. 1, the method includes:
step 101: an original image of the fundus is acquired.
In practical applications, in order to classify the retina type of a fundus image, a retina classification model needs to be generated by training, and in the generation process of the classification model, fundus original images need to be acquired first. A fundus original image is one of a group of basic images that can be used to train the classification model, and can be obtained by photographing the fundus with a dedicated ophthalmoscope device.
Further, a fundus training image for retina classification model training may be generated using the fundus original image, whereby after the fundus original image is acquired, execution of step 102 may continue.
Step 102: and taking one or more of the fundus original image, the characteristic vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image as fundus training images.
In practical applications, after the fundus original image is acquired in step 101, it can itself be used as a fundus training image. Further, when the amount of fundus original image data is limited, a data-increment approach can be adopted to improve the classification accuracy of the generated retina classification model: the fundus original image is used to generate several kinds of fundus training images, such as the feature vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image. Using these several kinds of images as training images effectively expands the amount of training data and thus improves the accuracy of the generated classification model.
By way of example, assume that 100 fundus original images are acquired. The 100 fundus original images can be used as fundus training images; in addition, the 100 corresponding feature vector images and the 100 corresponding preprocessed images generated from them can also be used as fundus training images. Any one or more of the fundus original images, the feature vector images corresponding to them, and the preprocessed images corresponding to them can therefore be selected as fundus training images for classification model training, according to the actual situation.
It should be noted that the feature vector image corresponding to the fundus original image may be generated by performing feature extraction on the fundus original image to obtain vectors and then drawing a picture from those vectors; the vectors obtained from feature extraction may include a scale-invariant feature transform feature vector and a corner detection feature vector. The preprocessed image corresponding to the fundus original image may be obtained by preprocessing the fundus original image, for example by scaling, cropping, and flipping it. After one or more of the fundus original image, the feature vector image corresponding to it, and the preprocessed image corresponding to it are taken as the fundus training image, step 103 can be performed. The specific generation of the feature vector image and the preprocessed image corresponding to the fundus original image will be described in detail in the following embodiments.
Step 103: and training the initial deep learning model according to the eye ground training image and the retina classification label corresponding to the eye ground training image to generate a retina classification model.
In a specific implementation process, after step 102 takes one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image as the fundus training image, the initial deep learning model may be trained according to the fundus training image and the retina classification label corresponding to the fundus training image, so as to generate a retina classification model.
Each fundus training image carries a known retina classification label, i.e., a label corresponding to the retina type of the fundus image that has been annotated in advance. For example, the retina classification of fundus images can generally be divided into six types: a retina with small bleeding points, a retina with bleeding spots, a retina with cotton-wool (lint) spots, a retina with new blood vessels, a retina with fibrous proliferation, and a retina with retinal detachment. Correspondingly, the retina classification labels can be identified with different characters: label 1 for a retina with small bleeding points, label 2 for a retina with bleeding spots, label 3 for a retina with cotton-wool spots, label 4 for a retina with new blood vessels, label 5 for a retina with fibrous proliferation, and label 6 for a retina with retinal detachment. It should be noted that the specific classification of the retina and the label form corresponding to each class can be set according to the actual situation, which is not limited in the embodiments of the present application.
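For readability, the six example categories and their labels can be written out as a simple mapping. The sketch below is illustrative only; the English class names paraphrase the description and are not terms fixed by the application:

```python
# Hypothetical label mapping mirroring the six example retina categories above.
RETINA_LABELS = {
    1: "retina with small bleeding points",
    2: "retina with bleeding spots",
    3: "retina with cotton-wool (lint) spots",
    4: "retina with new blood vessels",
    5: "retina with fibrous proliferation",
    6: "retina with retinal detachment",
}
```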
In the embodiments of the present application, in an optional implementation, the initial deep learning model may be the GoogLeNet model, a 22-layer deep network. GoogLeNet replaces fully-connected layers with sparse connections, which alleviates the limits on network depth and width and can further improve the accuracy of the retina classification model's results.
It can be seen from the above that one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image are used as fundus training images, and the initial deep learning model is trained with the fundus training images and their corresponding retina classification labels to generate a retina classification model. The generated retina classification model can classify the retina types of fundus images automatically and rapidly, and the classification result is free of subjective influence and therefore more accurate. Meanwhile, using multiple kinds of images as training images effectively increases the number of training images, so that the generated retina classification model is more accurate.
Referring to fig. 2, which shows a flowchart of classification model training provided in an embodiment of the present application: in the process of training the classification model, the fundus original image is acquired first. Corner detection (Harris) feature extraction and scale-invariant feature transform (SIFT) feature extraction can then be performed on the fundus original image to generate the corresponding feature vector images, and preprocessing such as scaling, cropping, and flipping can be performed on the fundus original image to obtain the corresponding preprocessed images. One or more of the fundus original image, the feature vector image corresponding to it, and the preprocessed image corresponding to it can be combined as fundus training images, and an initial deep learning model, such as the GoogLeNet model, is trained to generate a retina classification model.
Next, a specific generation method of the feature vector image corresponding to the fundus oculi original image in step 102 will be described.
Referring to fig. 3, in an alternative embodiment, the generation process of the feature vector image corresponding to the fundus original image includes:
step 301: and extracting image characteristic vectors of the fundus original image.
In practical applications, in order to generate the feature vector image corresponding to the fundus original image, as shown in fig. 2, feature vector extraction needs to be performed on the fundus original image first. In some possible implementations of the present application, the obtained image feature vectors include a scale-invariant feature transform feature vector and a corner detection feature vector. Specific implementations of the SIFT feature extraction and Harris feature extraction shown in fig. 2 are described next.
(I) SIFT feature extraction
The SIFT feature is a descriptor used in the field of image processing. SIFT establishes a scale space by convolving the original image with Gaussian kernels, and extracts scale-invariant feature points on a difference-of-Gaussians pyramid. The algorithm has a degree of affine invariance, viewpoint invariance, rotation invariance, and illumination invariance, and is therefore among the most widely used methods for image feature extraction.
The SIFT feature extraction algorithm proceeds roughly as follows:
(1) construct a difference-of-Gaussians pyramid;
(2) search for feature points;
(3) describe the features.
In practical applications, following the rough outline above, the detailed process of each step of SIFT feature extraction on the fundus original image is as follows:
(1) In constructing the difference-of-Gaussians pyramid, a pyramid structure with a linear scale relation is built from octaves (groups) and layers, so that feature points can be searched over a continuous range of Gaussian kernel scales.
(2) In the feature point search process of the present application, the key step is interpolation of the extreme points: in a discrete space, local extreme points may not be the true extreme points, which may fall in the gaps between the discrete sample points. These gaps are therefore interpolated, and the coordinate positions of the extreme points are then located.
(3) In the feature description process of the present application, the orientation of a feature point is computed by taking histogram statistics over the gradient directions of the points in its neighborhood; the direction with the largest weight in the histogram is selected as the main orientation of the feature point, and an auxiliary orientation can also be selected. When computing the feature vector, the local image is first rotated to the main orientation, and gradient histogram statistics are then computed over the neighborhood (4 × 4 × 8).
Further, the feature vector of the image obtained through the SIFT feature extraction algorithm can be denoted [a_1, …, a_n].
The algorithm's affine, viewpoint, rotation, and illumination invariance helps improve the accuracy of subsequent classification and identification once the image has undergone feature extraction.
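As an illustration of steps (1) to (3), the following minimal sketch computes an SIFT feature vector for a fundus original image. OpenCV is assumed here purely for illustration; the embodiment does not prescribe a library, and flattening the descriptors into a single vector [a_1, …, a_n] is one possible convention:

```python
# A minimal SIFT extraction sketch, assuming OpenCV (cv2) >= 4.4 is available.
import cv2
import numpy as np

def extract_sift_vector(image_path: str) -> np.ndarray:
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()  # builds the DoG pyramid, searches and describes feature points
    keypoints, descriptors = sift.detectAndCompute(img, None)
    if descriptors is None:   # no feature points found
        return np.zeros(128, dtype=np.float32)
    # Flatten the n x 128 descriptor matrix into one vector [a_1, ..., a_n].
    return descriptors.flatten()
```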
(II) Harris feature extraction
Harris corner detection is a detection method based on the first-derivative matrix of image gray levels. The main idea of the detector is local auto-similarity/auto-correlation, i.e. the similarity between an image block within a local window and the image blocks within that window after small movements in each direction.
In the neighborhood of a pixel, the derivative matrix describes how the data signal changes. Suppose a block region is moved in any direction within the neighborhood of the pixel: if the intensity changes sharply, the pixel at that position is a corner. The 2 × 2 Harris matrix is defined as:

A = Σ_(x,y) ω(x, y) [ C_x²  C_xC_y ; C_xC_y  C_y² ] = [ a  b ; b  c ]

where C_x and C_y denote the first derivatives of the intensity information at point x in the x and y directions, respectively, and ω(x, y) denotes the weight of the corresponding position. Whether a point is a corner is judged by computing the corner response value D of the Harris matrix, using the formula:

D = det A − m (trace A)² = (ac − b²) − m (a + c)²

where det and trace denote the determinant and trace operators, and m is a constant with a value of 0.04 to 0.06. When the corner response value is greater than the set threshold and is a local maximum within the neighborhood of the point, the point is taken as a corner.
In this way, feature extraction can be performed on the labeled fundus original image with the Harris algorithm to obtain the corresponding feature vector, which can be denoted [b_1, …, b_n].
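A hedged sketch of this computation follows, with Sobel derivatives standing in for C_x and C_y and a Gaussian window for ω(x, y). OpenCV, the 5 × 5 window, and m = 0.04 are illustrative choices, m being within the stated 0.04 to 0.06 range:

```python
# Computes the Harris response D = (ac - b^2) - m(a + c)^2 described above.
import cv2
import numpy as np

def harris_response(gray: np.ndarray, m: float = 0.04) -> np.ndarray:
    g = gray.astype(np.float32)
    Cx = cv2.Sobel(g, cv2.CV_32F, 1, 0)   # first derivative in x
    Cy = cv2.Sobel(g, cv2.CV_32F, 0, 1)   # first derivative in y
    # a, b, c: window-weighted entries of the 2 x 2 Harris matrix.
    a = cv2.GaussianBlur(Cx * Cx, (5, 5), 1.0)
    b = cv2.GaussianBlur(Cx * Cy, (5, 5), 1.0)
    c = cv2.GaussianBlur(Cy * Cy, (5, 5), 1.0)
    return (a * c - b * b) - m * (a + c) ** 2

# Points where the response exceeds a threshold and is a local maximum are corners.
```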
In the above manner, after SIFT feature extraction and Harris feature extraction are performed on the fundus original image, the two groups of image feature vectors of the fundus original image, [a_1, …, a_n] and [b_1, …, b_n], can be extracted, and step 302 can then be performed.
Step 302: and drawing the image characteristic vector of the fundus original image into a characteristic vector image corresponding to the fundus original image.
In a specific implementation process, after the image feature vectors of the fundus original image are extracted in step 301, each of them may further be drawn as a feature vector image corresponding to the fundus original image; for example, the plot function in MATLAB may be used for the drawing.
In some possible implementation manners of the present application, the implementation process of the step 302 specifically includes:
step A: drawing an image characteristic vector of the fundus original image into an original characteristic vector image;
and B: and carrying out scale change processing on the original characteristic vector image to generate a characteristic vector image corresponding to the fundus original image.
In practical applications, after the image feature vectors of the fundus original image are extracted in step 301, they may be drawn as original feature vector images using, for example, the plot function in MATLAB; that is, the two groups of one-dimensional vectors are each drawn as an image. It can be understood that the feature vectors of fundus original images of the same type are similar. After the drawing is completed, step B is performed to unify the sizes of the feature maps: a scaling adjustment is applied to the drawn images, i.e., scale change processing is performed on the original feature vector images to generate the feature vector images corresponding to the fundus original image. For example, the adjusted feature vector images are uniformly 256 × 256 images.
After SIFT feature extraction and Harris feature extraction are performed on the fundus original image, the two groups of image feature vectors [a_1, …, a_n] and [b_1, …, b_n] can be extracted and drawn, generating the feature vector images corresponding to the fundus original image. These can serve as part of the fundus training images, increasing the number of fundus training images and improving the classification accuracy of the retina classification model generated by training.
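The following sketch illustrates steps A and B in Python rather than MATLAB (an assumption, since the text only mentions MATLAB's plot function as an example): the vector is drawn as a curve, rasterized, and rescaled to the unified 256 × 256 size:

```python
# Draws a one-dimensional feature vector as an image and unifies its size.
import numpy as np
import matplotlib
matplotlib.use("Agg")          # render off-screen
import matplotlib.pyplot as plt
import cv2

def vector_to_image(vec: np.ndarray, out_path: str) -> None:
    fig, ax = plt.subplots()
    ax.plot(vec)               # step A: draw the original feature vector image
    ax.axis("off")
    fig.canvas.draw()
    rgba = np.asarray(fig.canvas.buffer_rgba())   # H x W x 4 pixel buffer
    plt.close(fig)
    rgb = np.ascontiguousarray(rgba[..., :3])
    bgr = cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR)
    out = cv2.resize(bgr, (256, 256), interpolation=cv2.INTER_LINEAR)  # step B
    cv2.imwrite(out_path, out)
```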
Next, a specific generation method of the preprocessed image corresponding to the original fundus image in step 102 will be described.
In an alternative embodiment, the generation of the pre-processed image corresponding to the fundus original image comprises:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to generate a preprocessed image corresponding to the fundus original image.
In practical applications, in order to increase the number of fundus training images and improve the classification accuracy of the generated retina classification model, a data-increment approach can be adopted: scale change processing, cropping processing and/or flipping processing are performed on the fundus original images, and the resulting preprocessed images are used as fundus training images. For example, if 100 fundus original images are acquired, not only can the 100 fundus original images themselves be used as fundus training images, but the 100 images can also each undergo scale change processing, cropping processing, and flipping processing, generating 100 corresponding preprocessed images per operation, i.e., 300 preprocessed images in total. These 300 preprocessed images can then also be used as fundus training images for classification model training, increasing the number of fundus training images.
It can be understood that, to unify the sizes of the feature maps, the size of the image output after the fundus original image undergoes scale change processing and cropping processing needs to match the feature vector images above, e.g., 256 × 256. The image is then flipped left-right about the vertical axis, and the preprocessed image is output. These images can be used as fundus training images, effectively expanding the amount of training data.
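A hedged sketch of this data increment follows; the central-crop fraction is an illustrative choice, since the text does not specify the cropping geometry:

```python
# Produces the scaled, cropped, and flipped variants of one fundus original image.
import cv2
import numpy as np

def preprocess_variants(img: np.ndarray):
    size = (256, 256)
    scaled = cv2.resize(img, size, interpolation=cv2.INTER_LINEAR)  # bilinear scaling
    h, w = img.shape[:2]
    crop = img[h // 8 : h - h // 8, w // 8 : w - w // 8]            # illustrative central crop
    cropped = cv2.resize(crop, size, interpolation=cv2.INTER_LINEAR)
    flipped = cv2.flip(scaled, 1)       # left-right flip about the vertical axis
    return [scaled, cropped, flipped]   # three training images per original
```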
In an optional implementation, when the application performs scale change processing on the fundus original image, the method used is the same as the one used in step B to generate the feature vector image corresponding to the fundus original image from the original feature vector image: a bilinear interpolation algorithm. Mathematically, bilinear interpolation is the extension of linear interpolation to an interpolation function of two variables; its core idea is to perform linear interpolation separately in each of the two directions.
Suppose the value of the unknown function f at the point P = (x, y) is desired, and the values of f at the four points Q_11 = (x_1, y_1), Q_12 = (x_1, y_2), Q_21 = (x_2, y_1) and Q_22 = (x_2, y_2) are known.

First, linear interpolation is performed in the x direction, giving:

f(R_1) ≈ ((x_2 − x) / (x_2 − x_1)) f(Q_11) + ((x − x_1) / (x_2 − x_1)) f(Q_21), where R_1 = (x, y_1);

f(R_2) ≈ ((x_2 − x) / (x_2 − x_1)) f(Q_12) + ((x − x_1) / (x_2 − x_1)) f(Q_22), where R_2 = (x, y_2).

Then, linear interpolation is performed in the y direction, giving:

f(P) ≈ ((y_2 − y) / (y_2 − y_1)) f(R_1) + ((y − y_1) / (y_2 − y_1)) f(R_2).

In this way, the desired result f(x, y) is obtained:

f(x, y) ≈ [ f(Q_11)(x_2 − x)(y_2 − y) + f(Q_21)(x − x_1)(y_2 − y) + f(Q_12)(x_2 − x)(y − y_1) + f(Q_22)(x − x_1)(y − y_1) ] / ((x_2 − x_1)(y_2 − y_1)).

If a coordinate system is chosen such that the four known points of f are (0, 0), (0, 1), (1, 0) and (1, 1), respectively, the interpolation formula simplifies to:

f(x, y) ≈ f(0, 0)(1 − x)(1 − y) + f(1, 0)x(1 − y) + f(0, 1)(1 − x)y + f(1, 1)xy

or, in matrix form:

f(x, y) ≈ [1 − x  x] [ f(0, 0)  f(0, 1) ; f(1, 0)  f(1, 1) ] [1 − y  y]ᵀ.
the result of such interpolation methods is generally not linear, the result of linear interpolation being independent of the order of interpolation. The same applies to the results obtained by first performing an interpolation in the y-direction and then an interpolation in the x-direction.
After the fundus original image undergoes scale change processing, cropping processing and/or flipping processing in the above manner, the preprocessed image corresponding to the fundus original image can be generated and used as part of the fundus training images, increasing the number of fundus training images and improving the classification accuracy of the retina classification model generated by training.
Further, in step 103, the GoogLeNet model may be trained using one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image as training images, together with the corresponding retina classification labels (for example, label 1 for a retina with small bleeding points, label 2 for a retina with bleeding spots, label 3 for a retina with cotton-wool spots, label 4 for a retina with new blood vessels, label 5 for a retina with fibrous proliferation, and label 6 for a retina with retinal detachment), so as to generate the retina classification model.
In some possible implementations of the present application, the implementation of step 103, "training the initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model", specifically includes:
and C: training the initial deep learning model by using a general training image set to generate a general classification model;
in practical application, the initial deep learning training model adopted by the application is a google lenet model, and the general architecture of the google lenet model is as follows:
(1) all convolutions, including those inside the Inception modules, use rectified linear units (ReLU);
(2) the input is a 224 × 224 image in RGB color space, with the mean subtracted;
(3) #3x3 reduce and #5x5 reduce denote the number of 1 × 1 filters in the reduction layers placed before the 3 × 3 and 5 × 5 convolutions, respectively; pool proj denotes the number of 1 × 1 filters in the projection layer after the built-in max-pooling; ReLU is used in both the reduction layers and the projection layers;
(4) the network contains 22 layers with parameters (27 if pooling layers are counted), with about 100 independent building blocks in total;
(5) the features generated at intermediate levels of the network are discriminative, and auxiliary classifiers can be added at these levels. These classifiers take the form of small convolutional networks placed on the outputs of the Inception (4a) and Inception (4b) modules. During training, their losses are added to the total loss with a discount weight (the discount weight is 0.3).
In a specific model training process, the GoogLeNet model is first trained with a general training image set, the ImageNet data set, and the trained model is used as the general classification model; step D can then be performed.
Step D: and training the general classification model according to the eye ground training image and the retina classification label corresponding to the eye ground training image to generate a retina classification model.
In practical application, after the general classification model is generated in step C, one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image may be used as a training image and a retina classification label corresponding thereto (for example, a retina with a small bleeding point corresponding to label 1, a retina with a bleeding spot corresponding to label 2, a retina with a lint spot corresponding to label 3, a retina with a new blood vessel corresponding to label 4, a retina with a fiber proliferation corresponding to label 5, and a retina with a retinal detachment corresponding to label 6) to train the general classification model, and then the retina classification model is generated.
The application has determined through extensive experiments that, when training with the fundus training images, asynchronous stochastic gradient descent is used with a momentum of 0.9, and the learning rate is reduced by 4% every 8 epochs. The patch size of the image samples ranges from 8% to 100% of the image, and the aspect ratio is chosen between 3/4 and 4/3, so that photometric distortions help reduce overfitting; the image size is also adjusted with the bilinear interpolation method, in combination with changes to other hyper-parameters.
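For orientation, the two-stage training of steps C and D might be sketched as follows with torchvision's GoogLeNet (an assumption; the embodiment does not name a framework). Plain synchronous SGD stands in for the asynchronous variant, the auxiliary classifiers are dropped for brevity, and the base learning rate is an illustrative value:

```python
# Step C analogue: an ImageNet-pretrained GoogLeNet serves as the general model.
# Step D analogue: the final layer is retrained for the six retina classes.
import torch
import torch.nn as nn
from torchvision.models import googlenet, GoogLeNet_Weights

model = googlenet(weights=GoogLeNet_Weights.IMAGENET1K_V1, aux_logits=False)
model.fc = nn.Linear(model.fc.in_features, 6)   # six retina classification labels

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Reduce the learning rate by 4% every 8 epochs, as stated in the text.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=8, gamma=0.96)
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    model.train()
    for images, labels in loader:               # fundus training images + labels
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                            # called once per epoch
```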
As described above, the GoogLeNet model adopted in the application is a 22-layer deep network in which fully-connected layers can be replaced with sparse connections, alleviating the limits on depth and width and improving the accuracy of the retina classification model's results.
With the above-described embodiment, the retina classification model can be generated by fundus training image training, and further, the generated retina classification model can be verified by a fundus verification image.
The classification model verification method provided by the embodiment of the present application is described below with reference to the accompanying drawings.
Referring to fig. 4, which shows a flowchart of a classification model verification method provided in an embodiment of the present application, as shown in fig. 4, the method includes:
step 401: an original image of the fundus is acquired.
Step 402: and taking one or more of the fundus original image, the characteristic vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image as fundus verification images.
It should be noted that steps 401 to 402 are similar to steps 101 to 102 in the above embodiment, and please refer to the description of the above embodiment for related parts, which is not described herein again.
Step 403: and inputting the fundus verification image into the retina classification model to obtain a retina classification result of the fundus verification image.
In a specific implementation process, in step 402, after one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image are used as a fundus verification image, the fundus verification image may be further input into the retina classification model to obtain a retina classification result of the fundus verification image, and then step 404 may be continuously performed.
Specifically, the fundus verification image may be input into the retina classification model, at least one retina classification result may be obtained, and the retina classification result of the fundus verification image may be determined from the at least one retina classification result according to a voting mechanism. For a detailed description of the voting mechanism, reference may be made to the following examples.
Step 404: and when the retina classification result of the fundus verification image is inconsistent with the retina classification label corresponding to the fundus verification image, the fundus verification image is used as a fundus training image again, and the retina classification model is updated.
In practical applications, step 403 yields the retina classification result of the fundus verification image. When this result is inconsistent with the retina classification label corresponding to the fundus verification image, the fundus verification image may be used again as a fundus training image to update the retina classification model. For example, suppose label 1 corresponds to a retina with small bleeding points, and a fundus verification image whose retina has small bleeding points is input into the retina classification model, but the returned label is label 2: the retina classification result of the fundus verification image is then inconsistent with the retina classification label corresponding to it, so the fundus verification image can be used again as a fundus training image to update the retina classification model and improve its accuracy.
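As a sketch, the feedback loop of steps 401 to 404 might look like this; all names here are hypothetical, and retrain stands for whatever training routine produced the model:

```python
# Misclassified verification images are returned to the training set (step 404).
def verify_and_update(model, verification_set, training_set, retrain):
    for image, true_label in verification_set:
        predicted = model.classify(image)      # retina classification result (step 403)
        if predicted != true_label:            # inconsistent with the retina label
            training_set.append((image, true_label))
    return retrain(model, training_set)        # updated retina classification model
```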
Through the embodiment, the retina classification model can be effectively verified by utilizing the fundus verification image, and when the retina classification result of the fundus verification image is inconsistent with the retina classification label corresponding to the fundus verification image, the retina classification model can be timely adjusted and updated, so that the classification precision and accuracy of the classification model can be improved.
The above is a specific implementation manner of the classification model generation method provided in the embodiment of the present application, and based on the retina classification model in the above embodiment, the embodiment of the present application further provides a fundus image classification method.
Referring to fig. 5, which shows a flowchart of a fundus image classification method provided in an embodiment of the present application, as shown in fig. 5, the method includes:
step 501: acquiring an original fundus image to be classified.
In practical applications, the acquired fundus original images can be classified based on the retina classification model generated in the above embodiments. In the classification process, the fundus original image to be classified is acquired first; for the related description, refer to step 101 in the above embodiments, which is not repeated here. Because a data-increment approach was adopted when training the retina classification model, after the fundus original image to be classified is acquired, several fundus images to be classified, such as the feature vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image, can likewise be generated from it by data increment; step 502 can then be performed.
Step 502: inputting one or more of the fundus original image to be classified, the characteristic vector image corresponding to the fundus original image to be classified and the preprocessed image corresponding to the fundus original image to be classified into a retina classification model to obtain at least one retina classification result, and determining the retina classification result of the fundus original image to be classified from the at least one retina classification result according to a voting mechanism.
In practical applications, after the fundus original image to be classified, the feature vector image corresponding to it, and the preprocessed image corresponding to it are acquired in step 501, one or more of these fundus images to be classified may further be input into the retina classification model to obtain at least one retina classification result. It can be understood that the input fundus images to be classified need to be consistent with the data types of the fundus training images used to train the retina classification model: if the fundus training images included the fundus original image, the input fundus images to be classified should include the fundus original image to be classified; likewise, if the fundus training images included the feature vector image corresponding to the fundus original image, the input should include the feature vector image corresponding to the fundus original image to be classified; and if the fundus training images included the preprocessed image corresponding to the fundus original image, the input should include the preprocessed image corresponding to the fundus original image to be classified; and so on.
After the fundus original image to be classified is input into the retina classification model, one retina classification result is obtained; inputting the feature vector image corresponding to the fundus original image to be classified yields another retina classification result; and inputting the preprocessed image corresponding to the fundus original image to be classified yields yet another. These retina classification results may be the same or different, and the final retina classification result of the fundus original image to be classified needs to be determined through a voting mechanism.
That is, further, after at least one retina classification result is obtained, the retina classification result of the fundus original image to be classified may be determined from the at least one retina classification result according to a voting mechanism. The voting mechanism selects the result that appears most often among the retina classification results as the final retina classification result of the fundus original image to be classified; when the most frequent result is not unique, the result with the highest classification and recognition accuracy is selected as the final result. The classification and recognition accuracy is output by the retina classification model during the classification process.
For example, when the three fundus images to be classified (the fundus original image to be classified, the feature vector image corresponding to it, and the preprocessed image corresponding to it) are used for retina classification and identification and the results obtained are classification 1, classification 1 and classification 2 respectively, the final retina classification result of the fundus image to be classified is classification 1. As another example, when the results obtained are classification 1, classification 2 and classification 3 with recognition accuracies of 80%, 85% and 90% respectively, the retina classification result with the highest classification and recognition accuracy (classification 3) is determined as the retina classification result of the fundus original image to be classified according to the voting mechanism.
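A minimal sketch of this voting mechanism, assuming each classification run reports its class together with the model's classification and recognition accuracy:

```python
# Majority vote; ties are broken by the highest reported classification accuracy.
from collections import Counter

def vote(results):
    """results: list of (predicted_class, accuracy) pairs, one per input image."""
    counts = Counter(cls for cls, _ in results)
    top = counts.most_common(1)[0][1]
    tied = {cls for cls, n in counts.items() if n == top}
    return max((r for r in results if r[0] in tied), key=lambda r: r[1])[0]

assert vote([(1, 0.80), (1, 0.85), (2, 0.90)]) == 1   # classification 1 wins the vote
assert vote([(1, 0.80), (2, 0.85), (3, 0.90)]) == 3   # tie resolved by accuracy
```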
Wherein the retina classification model is generated according to the classification model generation method in the above embodiment.
In some possible implementations of the present application, the generating process of the feature vector image corresponding to the fundus original image to be classified includes:
extracting image characteristic vectors of fundus original images to be classified;
and drawing the image characteristic vector of the fundus original image to be classified into a characteristic vector image corresponding to the fundus original image to be classified.
It should be noted that, for a specific implementation process of the present implementation, reference may be made to the related description of the above step 301 to step 302, and details are not described here again.
In some possible implementations of the present application, the process of drawing the image feature vector of the fundus original image to be classified into the feature vector image corresponding to the fundus original image to be classified specifically includes:
drawing the image feature vector of the fundus original image to be classified into an original feature vector image;
and carrying out scale change processing on the original feature vector image to generate the feature vector image corresponding to the fundus original image to be classified.
It should be noted that, for the specific implementation of this process, reference may be made to the description of step A to step B above, which is not repeated here.
In some possible implementations of the present application, the image feature vectors include scale-invariant feature transform feature vectors and corner detection feature vectors.
In some possible implementations of the present application, the generating process of the pre-processed image corresponding to the fundus original image to be classified includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to be classified to generate a preprocessed image corresponding to the fundus original image to be classified.
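A hedged sketch of this preprocessing step, again assuming OpenCV; the 224 x 224 target size and the centre-crop margins are illustrative choices, not values given by the patent.

```python
import cv2

def preprocess_variants(img, size=(224, 224)):
    """Generate preprocessed variants of a fundus image by scaling,
    cropping and flipping."""
    h, w = img.shape[:2]

    # Scale change via bilinear interpolation (see the sketch below).
    scaled = cv2.resize(img, size, interpolation=cv2.INTER_LINEAR)

    # Centre crop, then rescale to the common size.
    crop = img[h // 8: h - h // 8, w // 8: w - w // 8]
    cropped = cv2.resize(crop, size, interpolation=cv2.INTER_LINEAR)

    # Horizontal flip of the scaled image.
    flipped = cv2.flip(scaled, 1)

    return [scaled, cropped, flipped]
```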
It should be noted that, for the specific implementation of this process, reference may be made to the relevant description of the foregoing embodiments, and details are not repeated here.
In some possible implementations of the present application, the scale change processing adopts a bilinear interpolation algorithm; for its specific implementation, reference may be made to the related description of the above embodiments, which is not repeated here.
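For reference, bilinear interpolation can be written out directly in NumPy. The sketch below uses the align-corners mapping convention; the patent does not fix these details, so treat it as one plausible implementation.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize an H x W (x C) image with bilinear interpolation."""
    in_h, in_w = img.shape[:2]
    # Map each output pixel to a (generally fractional) source coordinate.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, in_h - 1), np.minimum(x0 + 1, in_w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    if img.ndim == 3:                     # broadcast over colour channels
        wy, wx = wy[..., None], wx[..., None]
    # Blend the four neighbouring pixels of each source coordinate.
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return ((1 - wy) * top + wy * bottom).astype(img.dtype)
```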
It can be seen from the above embodiments that the fundus training image is one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image, and that an initial deep learning model is trained with the fundus training image and its corresponding retina classification label to generate a retina classification model capable of classifying the retina type of a fundus image. Retina types in fundus images can therefore be classified automatically and quickly, and the classification result, free of subjective influence, is more accurate. Meanwhile, using multiple kinds of images as training images effectively increases the number of training images, so the generated retina classification model is more accurate.
Referring to fig. 6, the present application further provides an embodiment of a classification model generation apparatus, which may include:
a first acquisition unit 601 for acquiring a fundus original image;
a second obtaining unit 602, configured to use one or more of the fundus original image, a feature vector image corresponding to the fundus original image, and a preprocessed image corresponding to the fundus original image as a fundus training image;
a generating unit 603, configured to train an initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image, and generate a retina classification model.
In some possible implementations of the present application, the generating process of the feature vector image corresponding to the fundus original image includes:
extracting image feature vectors of the fundus original image;
and drawing the image feature vectors of the fundus original image into a feature vector image corresponding to the fundus original image.
In some possible implementations of the present application, the drawing of the image feature vector of the fundus original image into the feature vector image corresponding to the fundus original image includes:
drawing the image feature vector of the fundus original image into an original feature vector image;
and carrying out scale change processing on the original feature vector image to generate the feature vector image corresponding to the fundus original image.
In some possible implementations of the present application, the image feature vectors include scale-invariant feature transform feature vectors and corner detection feature vectors.
In some possible implementations of the present application, the generating of the pre-processed image corresponding to the fundus original image includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to generate a preprocessed image corresponding to the fundus original image.
In some possible implementations of the present application, the scale-change processing employs a bilinear interpolation algorithm.
In some possible implementations of the present application, the generating unit 603 includes:
a first generation subunit, configured to train the initial deep learning model with a general training image set to generate a general classification model;
and a second generation subunit, configured to train the general classification model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model.
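The two subunits map naturally onto a pretrain-then-fine-tune loop. The following is a minimal sketch assuming PyTorch/torchvision, with an ImageNet-pretrained GoogLeNet standing in for the general classification model trained on a general image set; the framework, hyperparameters, and data loader are assumptions, not specified by the patent.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_retina_classifier(num_classes, fundus_loader, epochs=10):
    """Stage 1: take a GoogLeNet already trained on a general image set
    (ImageNet weights). Stage 2: fine-tune it on fundus training images
    with retina classification labels."""
    model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, num_classes)

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for images, labels in fundus_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```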
It can be seen from the above embodiments that the fundus training image is one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image, and that an initial deep learning model is trained with the fundus training image and its corresponding retina classification label to generate a retina classification model capable of classifying the retina type of a fundus image. Retina types in fundus images can therefore be classified automatically and quickly, and the classification result, free of subjective influence, is more accurate. Meanwhile, using multiple kinds of images as training images effectively increases the number of training images, so the generated retina classification model is more accurate.
Referring to fig. 7, the present application further provides an embodiment of a fundus image classification apparatus, which may include:
an acquisition unit 701 for acquiring an original image of the fundus to be classified;
an obtaining unit 702, configured to input one or more of the fundus original image to be classified, the feature vector image corresponding to the fundus original image to be classified, and the preprocessed image corresponding to the fundus original image to be classified into a retina classification model to obtain at least one retina classification result, and to determine, according to a voting mechanism, the retina classification result of the fundus original image to be classified from the at least one retina classification result, where the retina classification model is generated by the above classification model generation apparatus.
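Putting the pieces together, inference under this apparatus might look like the sketch below: each available input is pushed through the model, and the resulting (label, confidence) pairs are resolved by the vote function from the earlier sketch. This is an illustrative assumption of how the units could be wired, not the patent's prescribed code.

```python
import torch
import torch.nn.functional as F

def classify_fundus(model, inputs):
    """Run the retina classification model on each available input
    (original, feature vector image, preprocessed image) and resolve
    the results by voting."""
    model.eval()
    predictions = []
    with torch.no_grad():
        for x in inputs:                      # each x: 1 x 3 x H x W tensor
            probs = F.softmax(model(x), dim=1)
            conf, label = probs.max(dim=1)
            predictions.append((label.item(), conf.item()))
    return vote(predictions)                  # vote() defined earlier
```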
In some possible implementations of the present application, the generating process of the feature vector image corresponding to the fundus original image to be classified includes:
extracting image feature vectors of the fundus original image to be classified;
and drawing the image feature vectors of the fundus original image to be classified into a feature vector image corresponding to the fundus original image to be classified.
In some possible implementations of the present application, the drawing of the image feature vector of the fundus original image to be classified into the feature vector image corresponding to the fundus original image to be classified includes:
drawing the image feature vector of the fundus original image to be classified into an original feature vector image;
and carrying out scale change processing on the original feature vector image to generate the feature vector image corresponding to the fundus original image to be classified.
In some possible implementations of the present application, the image feature vectors include scale-invariant feature transform feature vectors and corner detection feature vectors.
In some possible implementations of the present application, the generating process of the preprocessed image corresponding to the fundus original image to be classified includes:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to be classified to generate a preprocessed image corresponding to the fundus original image to be classified.
In some possible implementations of the present application, the scale-change processing employs a bilinear interpolation algorithm.
In addition, an embodiment of the present application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a terminal device, the terminal device is caused to perform the above classification model generation method or the above fundus image classification method.
An embodiment of the present application further provides a computer program product which, when run on a terminal device, causes the terminal device to perform the above classification model generation method or the above fundus image classification method.
It can be seen from the above embodiments that the fundus training image is one or more of the fundus original image, the feature vector image corresponding to the fundus original image, and the preprocessed image corresponding to the fundus original image, and that an initial deep learning model is trained with the fundus training image and its corresponding retina classification label to generate a retina classification model capable of classifying the retina type of a fundus image. Retina types in fundus images can therefore be classified automatically and quickly, and the classification result, free of subjective influence, is more accurate. Meanwhile, using multiple kinds of images as training images effectively increases the number of training images, so the generated retina classification model is more accurate.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to one another. Since the systems and apparatuses disclosed in the embodiments correspond to the methods disclosed in the embodiments, their descriptions are brief, and relevant details can be found in the description of the method parts.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
It is further noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A classification model generation method, characterized in that the method comprises:
acquiring an original fundus image;
extracting image feature vectors of the fundus original image; the image feature vectors comprise scale-invariant feature transform feature vectors and corner detection feature vectors;
respectively drawing the image feature vectors of the fundus original image into feature vector images corresponding to the fundus original image;
taking the fundus original image, the feature vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image as fundus training images;
training an initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model, wherein the initial deep learning model is a GoogLeNet model.
2. The method according to claim 1, wherein the respectively drawing the image feature vectors of the fundus original image into feature vector images corresponding to the fundus original image comprises:
respectively drawing the image feature vectors of the fundus original image into original feature vector images;
and carrying out scale change processing on the original feature vector images to generate the feature vector images corresponding to the fundus original image.
3. The method according to claim 1, wherein the generation process of the pre-processed image corresponding to the fundus original image comprises:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to generate a preprocessed image corresponding to the fundus original image.
4. The method of claim 1, wherein training an initial deep learning model from the fundus training image and a retinal classification label corresponding to the fundus training image to generate a retinal classification model comprises:
training the initial deep learning model by using a general training image set to generate a general classification model;
and training the general classification model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model.
5. A fundus image classification method, the method comprising:
acquiring an original fundus image to be classified;
extracting image feature vectors of the fundus original image to be classified; the image feature vectors comprise scale-invariant feature transform feature vectors and corner detection feature vectors;
respectively drawing the image feature vectors of the fundus original image to be classified into feature vector images corresponding to the fundus original image to be classified;
inputting the fundus original image to be classified, the feature vector image corresponding to the fundus original image to be classified and the preprocessed image corresponding to the fundus original image to be classified into a retina classification model, obtaining at least one retina classification result, and determining the retina classification result of the fundus original image to be classified from the at least one retina classification result according to a voting mechanism, wherein the retina classification model is generated according to the classification model generation method of any one of claims 1 to 4.
6. The method according to claim 5, wherein the respectively drawing the image feature vectors of the fundus original image to be classified into the feature vector images corresponding to the fundus original image to be classified comprises:
respectively drawing the image feature vectors of the fundus original image to be classified into original feature vector images;
and carrying out scale change processing on the original feature vector images to generate the feature vector images corresponding to the fundus original image to be classified.
7. The method according to claim 5, wherein the generation process of the pre-processed image corresponding to the fundus original image to be classified comprises the following steps:
and carrying out scale change processing, cropping processing and/or flipping processing on the fundus original image to be classified to generate a preprocessed image corresponding to the fundus original image to be classified.
8. An apparatus for classification model generation, the apparatus comprising:
a first acquisition unit, configured to acquire a fundus original image; extract image feature vectors of the fundus original image, the image feature vectors comprising scale-invariant feature transform feature vectors and corner detection feature vectors; and respectively draw the image feature vectors of the fundus original image into feature vector images corresponding to the fundus original image;
a second acquisition unit, configured to take the fundus original image, the feature vector image corresponding to the fundus original image and the preprocessed image corresponding to the fundus original image as fundus training images;
and a generating unit, configured to train an initial deep learning model according to the fundus training image and the retina classification label corresponding to the fundus training image to generate a retina classification model, wherein the initial deep learning model is a GoogLeNet model.
9. A fundus image classification apparatus, characterized in that the apparatus comprises:
an acquisition unit, configured to acquire a fundus original image to be classified; extract image feature vectors of the fundus original image to be classified, the image feature vectors comprising scale-invariant feature transform feature vectors and corner detection feature vectors; and respectively draw the image feature vectors of the fundus original image to be classified into feature vector images corresponding to the fundus original image to be classified;
an obtaining unit, configured to input the fundus original image to be classified, the feature vector image corresponding to the fundus original image to be classified, and the preprocessed image corresponding to the fundus original image to be classified into a retina classification model, obtain at least one retina classification result, and determine the retina classification result of the fundus original image to be classified from the at least one retina classification result according to a voting mechanism, where the retina classification model is generated by the classification model generating apparatus according to claim 8.
10. A computer-readable storage medium characterized in that instructions are stored therein, which when run on a terminal device, cause the terminal device to execute the classification model generation method of any one of claims 1 to 4 or the fundus image classification method of any one of claims 5 to 7.
CN201810607909.1A 2018-06-13 2018-06-13 Classification model generation method, fundus image classification method and device Active CN108876776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810607909.1A CN108876776B (en) 2018-06-13 2018-06-13 Classification model generation method, fundus image classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810607909.1A CN108876776B (en) 2018-06-13 2018-06-13 Classification model generation method, fundus image classification method and device

Publications (2)

Publication Number Publication Date
CN108876776A CN108876776A (en) 2018-11-23
CN108876776B true CN108876776B (en) 2021-08-24

Family

ID=64338281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810607909.1A Active CN108876776B (en) 2018-06-13 2018-06-13 Classification model generation method, fundus image classification method and device

Country Status (1)

Country Link
CN (1) CN108876776B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108960260B (en) * 2018-07-12 2020-12-29 东软集团股份有限公司 Classification model generation method, medical image classification method and medical image classification device
CN109602391A (en) * 2019-01-04 2019-04-12 平安科技(深圳)有限公司 Automatic testing method, device and the computer readable storage medium of fundus hemorrhage point
CN110147715A (en) * 2019-04-01 2019-08-20 江西比格威医疗科技有限公司 A kind of retina OCT image Bruch film angle of release automatic testing method


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10115194B2 (en) * 2015-04-06 2018-10-30 IDx, LLC Systems and methods for feature detection in retinal images
US10405739B2 (en) * 2015-10-23 2019-09-10 International Business Machines Corporation Automatically detecting eye type in retinal fundus images
CN106530295A (en) * 2016-11-07 2017-03-22 首都医科大学 Fundus image classification method and device of retinopathy
CN108095683A (en) * 2016-11-11 2018-06-01 北京羽医甘蓝信息技术有限公司 The method and apparatus of processing eye fundus image based on deep learning
CN106934798B (en) * 2017-02-20 2020-08-21 苏州体素信息科技有限公司 Diabetic retinopathy classification and classification method based on deep learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104881639A (en) * 2015-05-14 2015-09-02 江苏大学 Method of detection, division, and expression recognition of human face based on layered TDP model
CN104850845A (en) * 2015-05-30 2015-08-19 大连理工大学 Traffic sign recognition method based on asymmetric convolution neural network
CN106408037A (en) * 2015-07-30 2017-02-15 阿里巴巴集团控股有限公司 Image recognition method and apparatus
CN108021916A (en) * 2017-12-31 2018-05-11 南京航空航天大学 Deep learning diabetic retinopathy sorting technique based on notice mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"What is Transfer Learning, and what are the history and prospects of the field?"; Institute of Data Science, Tsinghua University; Zhihu, https://www.zhihu.com/question/41979241/answer/247421889; 2017-10-20; pp. 1-14 *

Also Published As

Publication number Publication date
CN108876776A (en) 2018-11-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant