CN113421250A - Intelligent fundus disease diagnosis method based on lesion-free image training - Google Patents
- Publication number
- CN113421250A (application number CN202110756395.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- decoder
- encoder
- loss
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T 7/0012 — Biomedical image inspection
- G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N 3/045 — Combinations of networks
- G06N 3/048 — Activation functions
- G06N 3/08 — Learning methods
- G06T 2207/20081 — Training; Learning
- G06T 2207/20084 — Artificial neural networks [ANN]
- G06T 2207/30041 — Eye; Retina; Ophthalmic
- G06T 2207/30168 — Image quality inspection
Abstract
The invention relates to an intelligent diagnosis method for fundus diseases based on training with non-pathological images, and belongs to the technical field of image classification and disease diagnosis. The method comprises the following steps: (1) construct a training set and a test set and complete preprocessing of the data set; (2) construct the encoder, decoder, discriminator and restoration-decoder models for training on non-pathological images; (3) construct an agent task based on image transformation; (4) construct a weighted loss function based on reconstruction loss, discrimination loss and restoration loss; (5) train the model; and (6) test the image to be examined with the trained encoder-decoder model. Through its image-reconstruction training regime, the method is freed from the requirement that different classes of data coexist in the training set; the agent task reduces the model's demand for data; constraints in both image space and feature space strengthen the model's learning of the tissue structure in the image; together, these characteristics improve the model's ability to recognize disease images.
Description
Technical Field
The invention relates to an intelligent diagnosis method for fundus diseases based on non-pathological image training, and belongs to the technical field of image classification and disease diagnosis.
Background
Fundus images are of great significance for medical diagnosis and are routinely used by ophthalmologists to diagnose a variety of diseases. Many diseases of the eye, as well as diseases affecting the blood circulation and the brain, are visible in fundus images, including blindness-causing macular degeneration and glaucoma, and complications of systemic diseases such as diabetic retinopathy and hypertension. Compared with other medical imaging modalities, the equipment for acquiring ophthalmic images is less demanding, which suits wide-range screening at the primary-care level and provides efficient diagnostic service to primary-care patients; the approach therefore has broad application prospects and practical social value. Artificial intelligence offers high speed and high accuracy in computer-aided diagnosis of medical images, and plays an important role in helping doctors analyze and identify lesions and in improving diagnostic efficiency.
Current medical image diagnosis algorithms are mainly based on deep neural networks, are trained with both healthy and diseased samples, and require a large amount of labeled data as a training basis. Clinically, labeled lesion data are rare, and for some novel diseases it is very difficult to obtain a large number of lesion labels in a short time. When only a small number of samples are available for training, model performance degrades. On the other hand, in medical image analysis, a large number of healthy samples cannot be used effectively because of the scarcity of lesion samples. Although some research has begun to explore classification algorithms trained on a single class of samples, such algorithms cannot yet be applied clinically because of long test times and low accuracy. Combining these two problems, how to build a high-performance diagnosis system without any lesion data, i.e., using only data from healthy subjects, is an open research problem in medical image analysis.
The aim of the invention is to address clinical disease diagnosis in the lesion-free-image scenario by using an unsupervised diagnosis algorithm combined with the distribution characteristics of fundus images in feature space. It provides a deep-learning method for intelligent fundus disease diagnosis trained on non-pathological images, assisting doctors in completing high-accuracy disease diagnosis.
Disclosure of Invention
The invention aims to remedy the following two defects of existing fundus image classification and diagnosis algorithms: 1) existing algorithms rely on a large amount of labeled data, and perform poorly when data are lacking; 2) they over-depend on balanced data labels, and perform poorly when only one class of labels is available. To this end, an intelligent fundus disease diagnosis method based on non-pathological image training is provided.
In order to achieve the above object, the present invention adopts the following technical solution.
The intelligent fundus disease diagnosis method based on the non-pathological image training is realized by the following steps:
Step one: preprocess the collected images to construct a data set, specifically: screen the collected images, eliminate images of poor quality, rescale the screened images to the same dimension W × W × c, and normalize pixel values to the range [-1, 1];
where c is greater than or equal to 1;
the data set is divided into a training set and a test set of clinically acquired ophthalmic images; the training set consists of ophthalmic images of healthy individuals, while the test set is a mixture of images of healthy and diseased individuals; images of poor quality specifically include: images that are too dark, images taken at a strongly deviated viewing angle, and images blurred by camera shake;
step two: designing a network model comprising an encoder, a decoder, a discriminator 1, a discriminator 2 and a restoration decoder, and specifically comprising the following substeps:
step 2.1, constructing an encoder of the multilayer convolution layer;
the encoder comprises N serially connected groups of "down-sampling layer, regularization layer, activation function"; the encoder input is an image of dimension W × W × c produced in step one, and the output is a 1 × 1 × 2^(N+1) feature vector z expressing the essence of the image;
the value of N is less than or equal to log2(W);
the down-sampling layer is a convolution layer with kernel size n × n and stride 2, and the number of channels increases from 2^2 by successive factors of 2 up to 2^(N+1);
the activation function is a LeakyReLU activation function with negative slope L;
the value of n is in [3, 5]; L is in [0, 1];
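As an aside, the shape bookkeeping implied by step 2.1 can be sketched in plain Python (this is an illustrative helper, not part of the patent: each stride-2 stage halves the spatial side while the channel count doubles from 2^2 up to 2^(N+1)):

```python
import math

def encoder_shapes(W, N):
    """Return [(side, channels), ...] after each of the N downsampling stages."""
    assert N <= math.log2(W), "the patent requires N <= log2(W)"
    shapes = []
    side, channels = W, 2 ** 2          # first stage outputs 2^2 channels
    for stage in range(N):
        side //= 2                      # a stride-2 convolution halves the side
        shapes.append((side, channels))
        channels *= 2                   # channel count doubles each stage
    return shapes
```

For W = 256 and N = 8 (the values used in Example 1 below), the final stage yields a 1 × 1 spatial map with 512 channels, matching the 1 × 1 × 512 feature vector z.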
step 2.2, constructing a decoder of the multilayer deconvolution layer;
the decoder comprises N serially connected groups of "up-sampling layer, regularization layer, activation function"; the decoder input is the feature vector z output by the encoder in step 2.1, and the output is an image of dimension W × W × c;
the up-sampling layer is a convolution layer with kernel size n × n and stride 2, and the number of channels decreases from 2^(N+1) by successive factors of 2 down to 2^2;
the activation function is a LeakyReLU activation function with negative slope L, and the activation function of the outermost layer is a Tanh function;
step 2.3, constructing a discriminator 1 of a feature space by adopting a multi-layer perceptron structure;
where the input of discriminator 1 is the feature vector z expressing the essence of the image, output in step 2.1; discriminator 1 comprises K serially connected fully-connected layers, Dropout layers and activation functions;
the number of neurons per fully-connected layer decreases from 2^K by successive factors of 2 down to 2^0;
the random drop probability of the Dropout layer is p;
except for the last layer, whose activation is sigmoid, the remaining K - 1 activation functions are LeakyReLU with negative slope L;
where K takes values in [5, log2(W)] and p takes values in [0, 1];
Step 2.4, constructing a discriminator 2 of an image space by adopting a PatchGAN structure;
where discriminator 2 comprises P convolution layers; each of the first P - 1 layers consists of a convolution layer, a regularization layer and an activation function;
the convolution layers have kernel size n × n and stride 2, and the number of channels increases from 2^2 by successive factors of 2 up to 2^P;
the final convolution layer is output directly, and the remaining P - 1 activation functions are LeakyReLU functions with negative slope L;
where P takes values in [5, log2(W)];
Step 2.5, a restoration decoder is constructed by adopting a plurality of layers of deconvolution layers;
the restoration decoder comprises N serially connected groups of "up-sampling layer, regularization layer, activation function"; its input is the output vector of the encoder of step two, and its output is an image of dimension W × W × c;
the up-sampling unit is a convolution layer with kernel size n × n and stride 2, and the number of channels decreases from 2^(N+1) by successive factors of 2 down to 2^2;
the activation function in the first N - 1 groups of "up-sampling layer, regularization layer, activation function" is a LeakyReLU activation function with negative slope L, and the activation function in the last group is a Tanh function;
Step three: construct the agent task, specifically: using the images in the training set, construct a restoration-based agent task by means of local pixel conversion, nonlinear transformation of image brightness, and local-area patching, comprising the following substeps:
Step 3.1, local pixel conversion: randomly exchange the pixel values at different positions within local areas of the image and output the randomly converted image, specifically: randomly select M1 image blocks whose side length is an integer in [1, 7], and randomly exchange the pixels inside each block;
Step 3.2, nonlinear transformation of image brightness, outputting the nonlinearly transformed image, specifically: construct a Bezier mapping curve B(t) from three randomly given control points according to formula (1), and map the image pixel values through this curve:
B(t) = (1 - t)^2 · P0 + 2t(1 - t) · P1 + t^2 · P2,  t ∈ [0, 1]   (1)
where t denotes the pixel luminance and P0, P1, P2 are three randomly chosen control points;
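The brightness transform of equation (1) can be sketched as follows (an illustrative implementation, not patent text; the sorting of the random control points is an added assumption to keep the mapping roughly monotone):

```python
import random

def bezier_map(t, p0, p1, p2):
    """Equation (1): map one normalized luminance t through the quadratic Bezier curve."""
    return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

def random_brightness_transform(pixels, rng=random):
    """Apply a randomly parameterized Bezier mapping to a list of luminances in [0, 1]."""
    p0, p1, p2 = sorted(rng.uniform(0.0, 1.0) for _ in range(3))
    return [bezier_map(t, p0, p1, p2) for t in pixels]
```

Note that B(0) = P0 and B(1) = P2, so the control points directly set the new darkest and brightest values.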
Step 3.3, local-area patching: fill randomly selected areas of the image with random pixel values to obtain the locally patched image;
specifically, randomly select M2 image blocks with integer side lengths in the image, and fill the pixels contained in each block with random noise values drawn from a uniform distribution;
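The two block-wise transforms of steps 3.1 and 3.3 can be sketched on an image stored as a list of lists of floats in [-1, 1] (a hedged illustration; block positions and sizes are chosen by the caller, since the patent leaves M1/M2 and the exact sampling to the implementer):

```python
import random

def shuffle_block(img, top, left, size, rng=random):
    """Step 3.1: randomly exchange the pixel values inside one local block (in place)."""
    coords = [(r, c) for r in range(top, top + size) for c in range(left, left + size)]
    values = [img[r][c] for r, c in coords]
    rng.shuffle(values)                      # same pixels, permuted positions
    for (r, c), v in zip(coords, values):
        img[r][c] = v

def patch_block(img, top, left, size, rng=random):
    """Step 3.3: fill one local block with uniform random noise in [-1, 1] (in place)."""
    for r in range(top, top + size):
        for c in range(left, left + size):
            img[r][c] = rng.uniform(-1.0, 1.0)
```

Shuffling preserves the multiset of pixel values (only structure is destroyed), while patching replaces them entirely; both give the restoration decoder a different recovery problem.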
step four: constructing a total loss function, specifically a weighted sum of image reconstruction loss, feature reconstruction loss, discrimination loss of an image space and a feature space and restoration loss based on an agent task;
where the image reconstruction loss L_rec constrains the difference between the real image and the reconstructed image with an L1 loss, computed as in (2):
L_rec = ‖x - De(En(x))‖_1   (2)
where x denotes the input image, En(·) and De(·) denote the encoder and the decoder respectively, En(x) is the encoded output of the input image x, De(En(x)) is the decoded output obtained by encoding and then decoding x, and ‖·‖_1 denotes the 1-norm;
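A minimal sketch of the L1 loss of equation (2) on flat pixel lists (illustrative; whether the 1-norm is summed or averaged over pixels is an implementation choice — the mean is used here):

```python
def l1_loss(x, x_hat):
    """Mean absolute error between two equally sized flat pixel lists."""
    assert len(x) == len(x_hat)
    return sum(abs(a - b) for a, b in zip(x, x_hat)) / len(x)
```

The same function serves for the feature reconstruction loss of equation (3), applied to feature vectors instead of pixels.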
the feature reconstruction loss L_feat measures the difference between the feature-space representation of the image and that of the reconstructed image, again with an L1 loss, computed as in (3):
L_feat = ‖z - En(De(z))‖_1   (3)
where z is the feature vector of the essence of the image output by the encoder in step 2.1, De(z) is the decoded output obtained by decoding the feature vector z, and En(De(z)) is the output of decoding and then re-encoding z;
the discriminator loss over the image space and the feature space constrains the outputs of the encoder and decoder to match real images and real features, computed as in (4):
L_adv = -log D_I(x̂) - log D_F(ẑ)   (4)
where D_I and D_F are the image-space discriminator (discriminator 2) and the feature-space discriminator (discriminator 1), respectively; x̂ = De(En(x)) is the output of passing the input image x through the encoder and the decoder; ẑ = En(De(z)) is the output of passing the feature vector z through the decoder and the encoder. D_I and D_F are themselves iteratively optimized through the following loss equations (5) and (6):
L_{D_I} = -log D_I(x) - log(1 - D_I(x̂))   (5)
L_{D_F} = -log D_F(z) - log(1 - D_F(ẑ))   (6)
where D_I(x) is the output of the image-space discriminator for a real input image x, and D_F(z) is the output of the feature-space discriminator for a real feature vector z;
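Read as standard binary-cross-entropy GAN terms, equations (4)-(6) can be sketched with plain floats standing in for the sigmoid discriminator outputs (a hedged reconstruction of the garbled source, not verbatim patent code):

```python
import math

def generator_adv_loss(d_fake_image, d_fake_feature):
    """Eq. (4): push reconstructions to be scored as real by both discriminators."""
    return -math.log(d_fake_image) - math.log(d_fake_feature)

def discriminator_loss(d_real, d_fake):
    """Eqs. (5)/(6): score real samples high and reconstructed (fake) ones low."""
    return -math.log(d_real) - math.log(1.0 - d_fake)
```

Both losses reach zero only in their respective ideal cases: the generator loss when the discriminators score the reconstructions as certainly real, the discriminator loss when it separates real from fake perfectly.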
the agent-task restoration loss L_res uses the restoration agent task to strengthen the encoder's ability to extract image features, computed as in (7):
L_res = ‖x - Dr(En(x_a))‖_1   (7)
where Dr(·) denotes the restoration decoder, x_a denotes the transformed image produced by the agent task, and Dr(En(x_a)) is the output of passing x_a through the encoder and the restoration decoder;
the total loss function is computed as in (8):
L_total = a·L_rec + b·L_feat + c·L_adv + d·L_res   (8)
where a, b, c and d are the weight coefficients of the image reconstruction loss, the feature reconstruction loss, the image-space and feature-space discrimination loss, and the restoration loss, respectively;
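The weighted sum of equation (8) is trivial to state in code; the default weight values below are purely illustrative, since the patent leaves a, b, c, d unspecified:

```python
def total_loss(l_rec, l_feat, l_adv, l_res, a=1.0, b=1.0, c=0.1, d=1.0):
    """Eq. (8): weighted sum of the four loss components."""
    return a * l_rec + b * l_feat + c * l_adv + d * l_res
```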
step five: model training to obtain a trained encoder and decoder, comprising the following substeps:
Step 5.1: input a healthy fundus image of a normal subject into the encoder and propagate forward to obtain the image feature vector; input that vector into the decoder to obtain the reconstructed image in the forward pass; input the reconstructed image into the encoder again and propagate forward to obtain the reconstructed feature vector. Take the three transformed images constructed in steps 3.1 to 3.3 as inputs of the encoder-restoration-decoder, with the original image as training label, to complete learning of the agent task, specifically: randomly transform the input healthy fundus image according to the agent task, input the transformed image into the encoder and the restoration decoder, and obtain the restored image in the forward pass;
step 5.2, calculating image reconstruction loss and characteristic reconstruction loss;
Step 5.3: input the reconstructed image into the image-space discriminator (discriminator 2) and the reconstructed feature vector into the feature-space discriminator (discriminator 1), and compute the discrimination losses of the image space and the feature space;
step 5.4, calculating the recovery loss of the agent task;
Step 5.5: perform back-propagation and parameter optimization, alternately optimizing the discriminators and the encoder-decoder;
Step 5.6: repeat steps 5.1-5.5 to traverse all images in the training set once, record the total loss value during the process and plot its curve, and adjust the learning rate once the loss curve has converged stably so that the model can continue learning;
step 5.7, storing the trained encoder and decoder;
Step six: test the images of the test set with the trained encoder and decoder, select a threshold by evaluating the images in batch, and output a conclusion on whether each image is normal, specifically:
step 6.1, the input image is processed by an encoder to obtain an image characteristic vector, then processed by a decoder to obtain a reconstructed image, and the reconstructed image is input into the encoder again to obtain a reconstructed characteristic vector;
step 6.2, calculating the difference d1 between the input image and the reconstructed image, and calculating the difference d2 between the image feature vector and the reconstructed feature vector;
Step 6.3: average the two differences d1 and d2 to obtain the score of the input image; the larger the score, the higher the probability that the image is a lesion image, and vice versa. Select the optimal threshold according to the test-set labels; if the score is larger than the threshold, the input image is judged to be a pathological image, otherwise it is judged to be a normal image;
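The test-time decision of steps 6.1-6.3 reduces to a few lines (an illustrative sketch; how d1 and d2 are normalized before averaging, and how the threshold is selected, are left open by the patent):

```python
def anomaly_score(d1, d2):
    """Average the image-level and feature-level reconstruction differences."""
    return (d1 + d2) / 2.0

def is_pathological(d1, d2, threshold):
    """True when the score exceeds the selected threshold (lesion image)."""
    return anomaly_score(d1, d2) > threshold
```

Because the model is trained only on healthy images, lesion images reconstruct poorly in both spaces, so their d1 and d2 — and hence the score — tend to be large.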
The disease diagnosis method trained on lesion-free images is thus completed through steps one to six.
Advantageous effects
Compared with existing disease diagnosis algorithms, the intelligent fundus disease diagnosis method based on non-pathological image training has the following beneficial effects:
1. The method trains directly on images of healthy subjects, without requiring a class-balanced training set containing lesion data; it effectively removes the dependence of existing algorithms on disease image data and fits the current clinical situation in which disease data are rare;
2. The method constrains image reconstruction in two dimensions, image space and feature space, improving the model's perception of the features of normal images;
3. On top of reconstruction, the method introduces discrimination losses in image and feature space, further improving the feature-learning capacity of each structural part of the model (the encoder and decoder) on the healthy training images, and thereby the model's ability to recognize disease images;
4. The method introduces an agent task: for a given amount of data, the model learns deep image boundary and structure information through restoration tasks of different forms, which helps the healthy-image reconstruction task and improves the model's performance in disease recognition.
Drawings
FIG. 1 is a schematic diagram of a network structure supported by the method for intelligently diagnosing fundus diseases based on non-pathological image training of the present invention;
FIG. 2 is a schematic flow chart of an example of the method for intelligently diagnosing fundus diseases based on the training of non-pathological images according to the present invention;
FIG. 3 is a test flow of the intelligent diagnosis method for fundus diseases based on the training of non-pathological images according to the present invention;
FIG. 4 is a ROC curve chart comparing the fundus disease intelligent diagnosis method based on the non-pathological image training with the prior method.
Detailed Description
The intelligent diagnosis method for fundus diseases based on the non-pathological image training of the invention is further explained and described in detail with reference to the accompanying drawings and embodiments.
Example 1
This embodiment describes a specific implementation of the fundus disease intelligent diagnosis method based on the non-pathological image training of the present invention.
The invention can be applied to disease screening in hospitals and medical institutions of different scales: medical images are acquired from the patient to be diagnosed and classified by the model, and whether the patient is a healthy or diseased individual is judged from the image differences computed in step six. In institutions of different sizes, the uniform image size W may be adjusted according to the computing power of the local equipment.
FIG. 1 is a schematic diagram of the network structure supporting the intelligent fundus disease diagnosis method based on non-pathological image training. In FIG. 1, En is the encoder, De the decoder, Dr the restoration decoder, D_F the feature-space discriminator and D_I the image-space discriminator; x is the input image, z the feature vector of the image, x̂ the reconstructed image, and x'_a the image restored by the agent task. The model parameters are updated through the reconstruction losses of the image space and the feature space, the discrimination losses of the image space and the feature space, and the restoration loss of the agent task;
fig. 2 is a flowchart of a disease identification algorithm in an embodiment of the present invention, taking a fundus OCT image as a specific example, including the following steps:
step A: constructing a training set and a testing set;
10,000 images acquired with optical coherence tomography (OCT) are used as the implementation images, with only lesion-free healthy data used as the training set; 200 healthy images and 200 lesion images are used as the test set;
after image quality screening, data that are too blurred or too dark and do not meet the requirements are removed, the images are scaled to 256 × 256 × 1 pixels, and pixel values are normalized to the interval [-1, 1];
completing the construction of a training set and a test set through the step A;
and B: constructing a network model;
an encoder, a decoder, a restoration decoder, discriminator model 1 and discriminator model 2 are constructed; the model parameters are updated according to the loss function, and the trained models, namely the encoder and the decoder, are saved for model testing;
Step B.1: the encoder model comprises 8 serially connected groups of "down-sampling layer, regularization layer, activation function"; its input is the 256 × 256 × 1 image produced in step A, and its output is a feature vector z of dimension 1 × 1 × 512; the down-sampling layer is a convolution layer with kernel size 4 × 4 and stride 2, and the number of channels increases from 4 to 512 by successive factors of 2; the activation function is a LeakyReLU activation function with a negative slope of 0.2;
Step B.2: the decoder comprises 8 serially connected groups of "up-sampling layer, regularization layer, activation function"; the decoder input is the output vector of the encoder, and the output is an image of dimension 256 × 256 × 1; the up-sampling layer is a convolution layer with kernel size 4 × 4 and stride 2, and the number of channels decreases from 512 to 4 by successive factors of 2; the activation function is a LeakyReLU activation function with a negative slope of 0.2, and the activation function of the outermost layer is a Tanh function;
Step B.3: the restoration decoder comprises 8 serially connected groups of "up-sampling layer, regularization layer, activation function"; its input is the output vector of the encoder, and its output is an image of dimension 256 × 256 × 1; the up-sampling unit is a convolution layer with kernel size 4 × 4 and stride 2, and the number of channels decreases from 512 to 4 by successive factors of 2; the activation function is a LeakyReLU activation function with a negative slope of 0.2, and the activation function of the outermost layer is a Tanh function;
step B.4, the input of the discriminator model 1 is the feature vector z output by the encoder; the discriminator model 1 comprises 7 serially connected fully-connected layers, Dropout layers and activation functions; the number of neurons halves successively from 128 to 1; the random discard parameter of the Dropout layers is 0.5; except for the last layer, whose activation function is sigmoid, the activation functions are LeakyReLU with a negative slope of 0.2;
step B.5, the discriminator model 2 comprises 5 convolution layers; each of the first 4 layers comprises a convolution layer, a regularization layer and an activation function; the convolution layers have a 4 × 4 kernel and a stride of 2, and the number of channels doubles successively from 4 to 32; the activation function is LeakyReLU with a negative slope of 0.2, and the output of the final convolution layer is used directly;
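Each output unit of a PatchGAN discriminator such as discriminator model 2 judges one image patch, whose size is the receptive field of the stacked convolutions. A sketch of the receptive-field arithmetic for 5 layers of 4 × 4 / stride-2 convolutions (assuming, since the text does not say, that the final layer also uses stride 2):

```python
def receptive_field(n_layers, kernel=4, stride=2):
    """Receptive field of n stacked conv layers with equal kernel/stride."""
    rf, jump = 1, 1           # field size and input-pixel step ("jump")
    for _ in range(n_layers):
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# One layer sees a 4x4 patch; the full 5-layer stack sees a 94x94 patch.
patch = receptive_field(5)
```

So each scalar in the discriminator's output map scores a roughly 94 × 94 region of the 256 × 256 input, which is what makes the image-space discrimination local rather than global.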
step C, constructing an agent task;
in the training images of step A, transformations among local pixel conversion, nonlinear luminance transformation and local area patching are randomly applied to each image;
the local pixel conversion randomly selects 6 image blocks whose side length is an integer in [1, 6], and the pixels within each selected block are randomly exchanged;
the nonlinear luminance transformation adjusts the pixel values of the image using the Bezier mapping curve constructed by formula (1), completing the transformation of the pixel values;
the local area patching randomly selects 6 image blocks in the image whose side length is an integer in [42, 51], and the pixels within these blocks are filled with random noise values drawn from a uniform distribution;
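The three agent-task transformations of step C can be sketched with NumPy as follows; this is an illustrative reading of the text (the block positions, the [0, 1] pixel range for the Bezier mapping, the control-point values, and the fixed random seed are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def local_pixel_shuffle(img, n_blocks=6, max_size=6):
    """Local pixel conversion: randomly permute pixels inside small blocks."""
    out = img.copy()
    h, w = out.shape
    for _ in range(n_blocks):
        s = int(rng.integers(1, max_size + 1))        # side length in [1, 6]
        y = int(rng.integers(0, h - s + 1))
        x = int(rng.integers(0, w - s + 1))
        block = out[y:y + s, x:x + s].ravel()         # copy of the block
        rng.shuffle(block)
        out[y:y + s, x:x + s] = block.reshape(s, s)
    return out

def bezier_brightness(t, p0, p1, p2):
    """Nonlinear luminance transformation via the quadratic Bezier of (1)."""
    return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

def local_patch(img, n_blocks=6, lo=42, hi=51):
    """Local area patching: overwrite random blocks with uniform noise."""
    out = img.copy()
    h, w = out.shape
    for _ in range(n_blocks):
        s = int(rng.integers(lo, hi + 1))             # side length in [42, 51]
        y = int(rng.integers(0, h - s + 1))
        x = int(rng.integers(0, w - s + 1))
        out[y:y + s, x:x + s] = rng.uniform(0.0, 1.0, (s, s))
    return out

img = rng.uniform(0.0, 1.0, (256, 256))               # stand-in fundus image
shuffled = local_pixel_shuffle(img)
brightened = bezier_brightness(img, 0.0, 0.8, 1.0)    # control points assumed
patched = local_patch(img)
```

Shuffling only reorders pixel values, so global statistics such as the image sum are preserved, while patching replaces content outright; this difference in severity is what gives the restoration decoder restoration tasks of different difficulty.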
step D: loss function construction and model training;
step D.1, constructing the model loss function; the data are denoted x, where x represents a healthy training image and x_a represents the image processed by the agent task; the method updates the parameters by optimizing a weighted loss function L, specifically including the following contents:
wherein a = 1, b = 10, c = 10 and d = 10 are the weights of the image reconstruction loss, the feature reconstruction loss, the discrimination loss and the restoration loss, respectively;
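Under the reading that the weights a, b, c, d of step D.1 multiply the image reconstruction, feature reconstruction, discrimination and restoration losses in that order (the ordering follows claim 10; the function and symbol names are illustrative), the weighted loss amounts to:

```python
import numpy as np

def l1(x, y):
    """L1 loss used by the reconstruction terms: mean absolute difference."""
    return float(np.mean(np.abs(x - y)))

def total_loss(x, x_rec, z, z_rec, l_dis, l_res,
               a=1.0, b=10.0, c=10.0, d=10.0):
    """Weighted total loss of step D.1 with a = 1 and b = c = d = 10."""
    return a * l1(x, x_rec) + b * l1(z, z_rec) + c * l_dis + d * l_res

# Toy example: perfect feature reconstruction, no adversarial/restoration term,
# so only the image reconstruction term (weight a = 1) contributes.
example = total_loss(np.ones((4, 4)), np.zeros((4, 4)),
                     np.ones(8), np.ones(8), l_dis=0.0, l_res=0.0)
# example -> 1.0
```

The tenfold weighting of the feature, discrimination and restoration terms relative to the plain image reconstruction reflects the method's emphasis on feature-space fidelity over raw pixel agreement.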
step E: model training, parameter updating and storage; as shown in FIG. 2, the processed normal images are first passed through the encoder-decoder-encoder in sequence to obtain the feature vector, the reconstructed image and the reconstructed feature vector, and the image reconstruction loss and the feature reconstruction loss are calculated; the reconstructed feature vector and the reconstructed image are then input into the discriminator model 1 and the discriminator model 2, respectively, to calculate the discriminator loss; the images of step A.2 are passed through the encoder-restoration decoder in sequence to obtain the restored feature vector and the reconstructed restored image, and the image restoration loss is calculated; the losses are weighted into the final loss according to step D.1, and back propagation and parameter optimization are performed, wherein Adam is selected as the optimizer, the batch size is 32, for each batch the discriminator is optimized once and then the generator is optimized once, and the learning rate is set to 2e-4; finally, the trained model is stored;
wherein the generator comprises an encoder, a decoder and a restoration decoder;
step F: model testing and image classification;
as shown in FIG. 3, the image is input into the model, the difference between the input image and the reconstructed image and the difference between the image features and the reconstructed image features are calculated, the average of the two is taken as the probability that the image is a lesion image, and lesion images are identified by threshold screening.
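The test-time scoring of step F reduces to averaging two L1 differences; a minimal sketch (the array values and the threshold are illustrative, with the threshold chosen from labelled test data as in step 6.3):

```python
import numpy as np

def anomaly_score(x, x_rec, z, z_rec):
    """Step F score: mean of image-space and feature-space L1 differences."""
    d1 = float(np.mean(np.abs(x - x_rec)))    # input vs reconstructed image
    d2 = float(np.mean(np.abs(z - z_rec)))    # feature vs reconstructed feature
    return 0.5 * (d1 + d2)

def classify(score, threshold):
    """Threshold screening: a high score marks a lesion image."""
    return "lesion" if score > threshold else "normal"

# An input whose reconstruction differs by 0.2 per pixel in image space
# (and not at all in feature space) scores 0.5 * (0.2 + 0.0) = 0.1 ...
score_bad = anomaly_score(np.zeros(4), np.full(4, 0.2),
                          np.zeros(4), np.zeros(4))
# ... which exceeds an illustrative threshold of 0.05, so it is flagged.
verdict = classify(score_bad, threshold=0.05)
```

Because the model is trained only on healthy images, reconstruction stays accurate for healthy inputs and degrades for pathological ones, which is why this score separates the two classes.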
Thus, the whole process of the disease diagnosis method based on non-pathological images is realized. The ROC curves of the experimental results are shown in FIG. 4; the curve of the method lies well above those of the other methods, showing that the method can distinguish healthy images from pathological images when trained without pathological images, and solving the problem that existing algorithms cannot perform clinical classification tasks when disease categories are missing.
The method trains directly on images of healthy people, without requiring a class-balanced training set containing pathological images; it effectively removes the dependence of existing algorithms on disease image data and suits the current clinical situation in which disease data are rare. Specifically: the image transformations of step A reduce the algorithm's demand for data volume; the training process of step C removes the dependence of existing algorithms on multi-class training images;
The method constrains the image reconstruction in two dimensions, the image space and the feature space, improving the model's perception of the features of normal images. Steps D.2 and D.3 compare the original input with the reconstructed output in the two spaces using the L1 loss; constraining this difference improves the model's perception of image features;
On the basis of reconstruction, the method introduces discrimination losses in the image and feature spaces, further improving each structural part of the model: the encoder and the decoder gain feature learning capability for the healthy training images, and the model's ability to handle disease images is further improved, embodied as follows: step D.4 uses the discriminator models with adversarial discrimination losses, improving the encoder's ability to encode images and the decoder's ability to reconstruct images from feature vectors; the model responds strongly to non-training images (disease images), improving its disease recognition capability.
The method proposes an agent task: given a certain amount of data, the model learns deep image information such as boundaries and structure through restoration tasks of different forms, which helps the healthy-image reconstruction task and improves the model's performance on disease recognition. Step C applies the agent-task transformations to the existing healthy images and, combined with the restoration loss of step D.6, promotes the model's learning of boundary and structural information according to the different tasks, strengthening the model's accuracy on the reconstruction task and further improving its recognition of disease images.
While the foregoing is directed to the preferred embodiment of the present invention, the invention is not limited to the embodiment and the drawings disclosed herein. Equivalents and modifications made without departing from the spirit of the disclosure are considered to be within the scope of the invention.
Claims (10)
1. An intelligent diagnosis method for fundus diseases based on non-pathological image training is characterized in that: the method is realized by the following steps:
the method comprises the following steps: preprocessing the collected images to construct a data set; wherein, the data set is divided into a training set and a testing set;
step two: designing a network model comprising an encoder, a decoder, a discriminator 1, a discriminator 2 and a restoration decoder, and specifically comprising the following substeps:
step 2.1, constructing an encoder of the multilayer convolution layer;
the encoder comprises N groups of down-sampling layers, a regularization layer and an activation function which are connected in series;
step 2.2, constructing a decoder of the multilayer deconvolution layer;
the decoder comprises N groups of up-sampling layers, a regularization layer and an activation function which are connected in series;
step 2.3, constructing a discriminator 1 of a feature space by adopting a multi-layer perceptron structure;
step 2.4, constructing a discriminator 2 of an image space by adopting a PatchGAN structure;
wherein, the discriminator 2 comprises P convolution layers; the front P-1 convolutional layers comprise convolutional layers, regularization layers and activation functions;
step 2.5, a restoration decoder is constructed by adopting a plurality of layers of deconvolution layers;
the recovery decoder comprises N groups of up-sampling layers, a regularization layer and an activation function which are connected in series;
step three: constructing an agent task, specifically: a reconstruction-based agent task is constructed using the images in the training set, namely by means of local pixel conversion, nonlinear image luminance transformation and local region patching, comprising the following substeps:
step 3.1, local pixel conversion, namely, randomly exchanging pixel values of different positions of a local area in an image and outputting the image after random conversion;
step 3.2, carrying out nonlinear transformation on the image brightness, and outputting the image subjected to the nonlinear transformation;
step 3.3, local area patching, namely, carrying out random pixel value filling on the randomly selected area in the image to obtain the patched image of the local area;
step four: constructing a total loss function, specifically a weighted sum of image reconstruction loss, feature reconstruction loss, discrimination loss of an image space and a feature space and restoration loss based on an agent task;
step five: model training to obtain a trained encoder and decoder, comprising the following substeps:
step 5.1, inputting the healthy fundus image of a normal person into the encoder and propagating forward to obtain the image feature vector; inputting the feature vector into the decoder to obtain the reconstructed image in the forward pass; inputting the reconstructed image into the encoder and propagating forward to obtain the reconstructed feature vector; taking the three images constructed in steps 3.1 to 3.3 as the input of the encoder-restoration decoder and the original image as the training label to complete the learning of the agent task, specifically: randomly transforming the input healthy fundus images according to the agent task, inputting the transformed images into the encoder and the restoration decoder, and obtaining the restored image in the forward pass;
step 5.2, calculating image reconstruction loss and characteristic reconstruction loss;
step 5.3, inputting the reconstructed image and the reconstructed feature vector into discriminators 1 and 2 of an image space and a feature space respectively, and calculating discrimination loss of the image space and the feature space;
step 5.4, calculating the recovery loss of the agent task;
step 5.5, performing back propagation and parameter optimization, and optimizing the discriminator and the encoder-decoder by adopting a mode of alternately optimizing the discriminator and the encoder-decoder;
step 5.6, repeating the steps 5.1-5.5, traversing all images in the training set once, recording the value of the total loss function in the process, drawing a curve, and adjusting the learning rate after the loss curve is converged stably so as to facilitate the model to continue learning;
step 5.7, storing the trained encoder and decoder;
step six: testing the images of the test set with the trained encoder and decoder, selecting a threshold for judging images in large batches, and outputting a conclusion on whether each image is normal, specifically:
step 6.1, the input image is processed by an encoder to obtain an image characteristic vector, then processed by a decoder to obtain a reconstructed image, and the reconstructed image is input into the encoder again to obtain the characteristic vector of the reconstructed image;
step 6.2, calculating the difference d1 between the input image and the reconstructed image, and calculating the difference d2 between the image feature vector and the reconstructed feature vector;
step 6.3, averaging the differences d1 and d2 as the score of the input image, wherein the larger the score, the higher the probability that the image is a lesion image, and vice versa; an optimal threshold is selected according to the labels of the test set; if the score is larger than the threshold, the input image is judged to be a lesion image, otherwise a normal image.
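The optimal-threshold selection of step 6.3 can be sketched as a sweep over candidate thresholds on the labelled test scores; maximizing accuracy is one reasonable criterion (the text does not fix the criterion, so this is an assumption):

```python
import numpy as np

def best_threshold(scores, labels):
    """Pick the score threshold that maximizes accuracy on a labelled set.
    labels: 1 = lesion, 0 = normal; score > threshold predicts lesion."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    best_t, best_acc = float(scores.min()) - 1.0, -1.0
    for t in np.sort(scores):                 # each observed score is a candidate
        pred = (scores > t).astype(int)
        acc = float(np.mean(pred == labels))
        if acc > best_acc:
            best_t, best_acc = float(t), acc
    return best_t, best_acc

# Healthy images score low, lesion images score high:
thr, acc = best_threshold([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
# thr -> 0.2, acc -> 1.0 (everything scoring above 0.2 is called a lesion)
```

Sweeping every observed score as a candidate threshold is the same enumeration an ROC curve performs, which matches the ROC evaluation shown in FIG. 4.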
2. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 1, wherein step one comprises: screening the collected images, eliminating images of poor quality, unifying the resolution of the screened images to the same dimension W × W × c, and normalizing the pixel values to the range [-1, 1].
3. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 2, wherein c is greater than or equal to 1.
4. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 3, wherein: both the training set and the test set adopt clinically collected ophthalmic images; the images in the training set consist of ophthalmic images of healthy individuals, and the images in the test set are a mixture of images of healthy and diseased individuals; the images of poor quality specifically include: images that are too dark, images with a large deviation in shooting angle, and images blurred by camera shake.
5. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 4, wherein: in step 2.1 the encoder input is the image of dimension W × W × c obtained in step one, and the output is a 1 × 2^(N+1) feature vector z expressing the essence of the image; the value of N satisfies N ≤ log2(W); the down-sampling layer comprises convolution layers with an n × n kernel and a stride of 2, and the number of channels doubles successively from 2^2 to 2^(N+1); the activation function is the LeakyReLU activation function with a negative slope of L; n ∈ [3, 5]; L ∈ [0, 1].
6. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 5, wherein: in step 2.2, the decoder input is the feature vector z output by the encoder in step two, and the output is an image of dimension W × W × c; the up-sampling layer comprises convolution layers with an n × n kernel and a stride of 2, and the number of channels halves successively from 2^(N+1) to 2^2; the activation function is the ReLU activation function, and the outermost activation function is the Tanh function.
7. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 6, wherein: in step 2.3, the input of the discriminator 1 is the feature vector z expressing the essence of the image output in step 2.1; the discriminator 1 comprises K serially connected fully-connected layers, Dropout layers and activation functions; the number of neurons per fully-connected layer halves successively from 2^K to 2^0; the random discard parameter of the Dropout layers is p; except for the last layer, whose activation function is sigmoid, the remaining K-1 activation functions are LeakyReLU with a negative slope of L; wherein K ∈ [5, log2(W)]; p ∈ [0, 1].
8. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 7, wherein: in step 2.4, the convolution layers have an n × n kernel and a stride of 2, and the number of channels doubles successively from 2^2 to 2^P; the final convolution layer outputs directly, and the remaining P-1 activation functions are LeakyReLU functions with a negative slope of L; wherein P ∈ [5, log2(W)];
in step 2.5, the restoration decoder input is the output vector of the encoder in step two, and the output is an image of dimension W × W × c; the up-sampling unit comprises convolution layers with an n × n kernel and a stride of 2, and the number of channels halves successively from 2^(N+1) to 2^2; the activation function in the first N-1 groups of up-sampling layer, regularization layer and activation function is the ReLU activation function, and in the last group it is the Tanh function.
9. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 8, wherein step 3.1 specifically comprises: randomly selecting M1 image blocks in the image whose side length is an integer in [1, 7], and randomly exchanging the pixels within each block;
step 3.2 specifically comprises: constructing the Bezier mapping curve B(t) based on formula (1) from three randomly given control points, and mapping the image pixel values based on the curve:
B(t) = (1 - t)^2 · P0 + 2t(1 - t) · P1 + t^2 · P2, t ∈ [0, 1] (1)
where t denotes the luminance of the pixel, and P0, P1, P2 are three randomly acquired control points;
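A quick sanity check on formula (1): a quadratic Bezier curve starts at P0 (t = 0), ends at P2 (t = 1), and is pulled toward P1 in between, which is what makes it a smooth, adjustable luminance map. The control-point values below are illustrative:

```python
def bezier(t, p0, p1, p2):
    """Quadratic Bezier mapping curve of formula (1)."""
    return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

# Endpoints are fixed by P0 and P2; the interior is pulled toward P1.
start = bezier(0.0, 0.2, 0.5, 0.9)   # -> 0.2 (equals P0)
end = bezier(1.0, 0.2, 0.5, 0.9)     # -> 0.9 (equals P2)
mid = bezier(0.5, 0.0, 1.0, 0.0)     # -> 0.5 (halfway toward P1)
```

Because the endpoints stay fixed while the interior bends, the mapping reshapes mid-tone brightness without clipping the extremes of the pixel range.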
10. The intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 9, wherein: in step four, the image reconstruction loss L_I constrains the difference between the real image and the reconstructed image using the L1 loss function, calculated as in formula (2):
L_I = || x − D(E(x)) ||_1 (2)
wherein x represents the input image; E and D represent the encoder and the decoder, respectively; E(x) represents the coded output obtained by encoding the input image x; D(E(x)) represents the output obtained by encoding the input image x and then decoding; || · ||_1 represents the 1-norm;
the feature reconstruction loss L_F constrains, with the L1 loss function, the difference between the feature expression of the image in the feature space and that of the reconstructed image, calculated as in formula (3):
L_F = || z − E(D(z)) ||_1 (3)
wherein z is the feature vector expressing the essence of the image output by the encoder in step 2.1; D(z) represents the decoded output obtained by decoding the feature vector z; E(D(z)) represents the output obtained by decoding and then encoding the feature vector z;
the discrimination loss L_D of the image space and the feature space constrains the difference between the outputs of the encoder and the decoder and the real images and features, calculated as in formula (4):
L_D = − log D_I(D(z)) − log D_F(E(x)) (4)
wherein D_I and D_F are the discriminator of the image space (discriminator 2) and the discriminator of the feature space (discriminator 1), respectively; D_F(E(x)) is the output of the input image x passed through the encoder and then the feature-space discriminator; D_I(D(z)) is the output of the feature vector z passed through the decoder and then the image-space discriminator; D_I and D_F are iteratively optimized by the following loss equations (5) and (6):
L_DI = − log D_I(x) − log(1 − D_I(D(z))) (5)
L_DF = − log D_F(z) − log(1 − D_F(E(x))) (6)
wherein D_I(x) is the output of the real input image x through the image-space discriminator, and D_F(z) is the output of the real feature vector z through the feature-space discriminator;
the restoration loss L_R based on the agent task enhances the encoder's ability to extract image features through the restoration agent task, calculated as in formula (7):
L_R = || x − R(E(x_a)) ||_1 (7)
wherein R represents the restoration decoder; x_a represents the transformed image input obtained by the agent task; R(E(x_a)) represents the output of the transformed image x_a after the encoder and the restoration decoder;
the final loss function is calculated as in formula (8):
L = a · L_I + b · L_F + c · L_D + d · L_R (8)
wherein a, b, c and d are the weight coefficients of the image reconstruction loss, the feature reconstruction loss, the discrimination loss of the image space and the feature space, and the restoration loss, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110756395.8A CN113421250A (en) | 2021-07-05 | 2021-07-05 | Intelligent fundus disease diagnosis method based on lesion-free image training |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113421250A true CN113421250A (en) | 2021-09-21 |
Family
ID=77720146
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110756395.8A Pending CN113421250A (en) | 2021-07-05 | 2021-07-05 | Intelligent fundus disease diagnosis method based on lesion-free image training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113421250A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476757A (en) * | 2020-03-10 | 2020-07-31 | 西北大学 | Coronary artery patch data detection method, system, storage medium and terminal |
CN111798464A (en) * | 2020-06-30 | 2020-10-20 | 天津深析智能科技有限公司 | Lymphoma pathological image intelligent identification method based on deep learning |
CN112598658A (en) * | 2020-12-29 | 2021-04-02 | 哈尔滨工业大学芜湖机器人产业技术研究院 | Disease identification method based on lightweight twin convolutional neural network |
Non-Patent Citations (2)
Title |
---|
CHUAN LI等: "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks", 《ARXIV》 * |
HE ZHAO等: "Anomaly Detection for Medical Images using Self-supervised and Translation-consistent Features", 《IEEE》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113902036A (en) * | 2021-11-08 | 2022-01-07 | 哈尔滨理工大学 | Multi-feature fusion type fundus retinal disease identification method |
CN117173543A (en) * | 2023-11-02 | 2023-12-05 | 天津大学 | Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis |
CN117173543B (en) * | 2023-11-02 | 2024-02-02 | 天津大学 | Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis |
Legal Events
Date | Code | Title | Description |
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | | Application publication date: 20210921