CN113421250A - Intelligent fundus disease diagnosis method based on lesion-free image training - Google Patents

Intelligent fundus disease diagnosis method based on lesion-free image training

Info

Publication number: CN113421250A
Application number: CN202110756395.8A
Authority: CN (China)
Prior art keywords: image, decoder, encoder, loss, training
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 赵赫 (Zhao He), 李慧琦 (Li Huiqi)
Current and original assignee: Beijing Institute of Technology (BIT)
Priority/filing date: 2021-07-05
Publication date: 2021-09-21
Application filed by Beijing Institute of Technology (BIT); priority to CN202110756395.8A

Classifications

    • G06T 7/0012 — Image analysis; inspection of images; biomedical image inspection
    • G06F 18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045 — Neural networks; architectures; combinations of networks
    • G06N 3/048 — Neural networks; activation functions
    • G06N 3/08 — Neural networks; learning methods
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/30041 — Eye; retina; ophthalmic
    • G06T 2207/30168 — Image quality inspection

Abstract

The invention relates to an intelligent fundus disease diagnosis method trained on lesion-free images, and belongs to the technical field of image classification and disease diagnosis. The method comprises the following steps: 1, construct a training set and a test set and complete the preprocessing of the data set; 2, construct the encoder, decoder, discriminator and restoration-decoder models to be trained on lesion-free images; 3, construct a proxy task based on image transformations; 4, construct a weighted loss function based on reconstruction loss, discrimination loss and restoration loss; 5, train the model; and 6, test the image under examination with the trained encoder-decoder model. Through a training regime based on image reconstruction, the method removes the dependence on both classes of data coexisting in the training set; the introduction of the proxy task reduces the model's demand for data; the constraints in image and feature space strengthen the model's learning of image structure; together, these characteristics improve the model's ability to recognize disease images.

Description

Intelligent fundus disease diagnosis method based on lesion-free image training
Technical Field
The invention relates to an intelligent fundus disease diagnosis method trained on lesion-free images, and belongs to the technical field of image classification and disease diagnosis.
Background
Fundus images are of great significance for medical diagnosis and are routinely used by ophthalmologists to diagnose a variety of conditions. Many diseases of the eye, as well as diseases affecting blood circulation and the brain, are visible in fundus images, including blinding macular degeneration, glaucoma, and complications of systemic diseases such as diabetic retinopathy and hypertension. Compared with other medical imaging modalities, fundus imaging equipment is relatively undemanding, which makes it suitable for wide-range screening at the primary-care level and for providing efficient diagnostic services to primary-care patients; it therefore has broad application prospects and practical social value. Artificial intelligence offers advantages such as high speed and high accuracy in medical-image-assisted diagnosis, and plays an important role in helping doctors analyze and identify lesions and in improving diagnostic efficiency.
Current medical image diagnosis algorithms are mainly based on deep neural networks trained on both healthy and diseased samples, and they require large amounts of labeled data. Clinically, labeled lesion data are scarce, and for some novel diseases it is very difficult to obtain a large number of lesion labels in a short time; when only a few samples are available for training, model performance degrades. On the other hand, in medical image analysis the large pool of healthy samples cannot be exploited effectively because the number of lesion samples is the limiting factor. Although some studies have begun to explore classification with single-class samples, such methods cannot yet be applied clinically because of long test times, low accuracy, and similar problems. Combining these two issues, how to build a high-performance diagnosis system without any lesion data, i.e., using only data from healthy subjects, is an open research problem in medical image analysis.
The aim of the invention is to address clinical disease diagnosis in the lesion-free-image scenario by combining an unsupervised diagnosis algorithm with the distribution characteristics of fundus images in feature space, to provide a deep-learning intelligent fundus disease diagnosis method trained on lesion-free images, and to assist doctors in completing high-accuracy disease diagnosis.
Disclosure of Invention
The invention aims to overcome the following two shortcomings of existing fundus image classification and diagnosis algorithms: 1) existing algorithms rely on large amounts of labeled data, and their performance is poor when data are lacking; 2) they depend excessively on balanced data labels, and their performance is deficient when only one class of labels is available. The invention provides an intelligent fundus disease diagnosis method trained on lesion-free images.
To achieve the above object, the present invention adopts the following technical solution.
The intelligent fundus disease diagnosis method based on lesion-free image training is realized through the following steps:
Step one: preprocess the collected images to construct a data set, specifically: screen the collected images, discard images of poor quality, unify the resolution of the remaining images to the same dimension W × W × c, and normalize the pixel values to the range [−1, 1];
wherein c is greater than or equal to 1;
the data set is divided into a training set and a test set, both drawn from clinically acquired ophthalmic images; the training set consists of ophthalmic images of healthy individuals, while the test set is a mixture of images of healthy and diseased individuals; images of poor quality specifically include: pictures that are too dark, pictures with a large deviation in shooting angle, and pictures blurred by motion during capture;
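As a concrete illustration of step one, the following is a minimal preprocessing sketch (Python with OpenCV and NumPy assumed; the quality screening itself is done upstream, and the function name and defaults here are hypothetical):

    import cv2
    import numpy as np

    def preprocess(path: str, W: int = 256, c: int = 1) -> np.ndarray:
        """Load an image, resize it to W x W x c, and normalize pixels to [-1, 1]."""
        flag = cv2.IMREAD_GRAYSCALE if c == 1 else cv2.IMREAD_COLOR
        img = cv2.imread(path, flag)
        if img is None:
            raise ValueError("unreadable image: " + path)
        img = cv2.resize(img, (W, W), interpolation=cv2.INTER_AREA)
        img = img.astype(np.float32) / 127.5 - 1.0   # [0, 255] -> [-1, 1]
        if c == 1:
            img = img[..., np.newaxis]               # H x W -> H x W x 1
        return img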
step two: designing a network model comprising an encoder, a decoder, a discriminator 1, a discriminator 2 and a restoration decoder, and specifically comprising the following substeps:
Step 2.1: construct an encoder from multiple convolutional layers (a code sketch of the encoder and decoder is given after step 2.2);
the encoder comprises N serially connected groups of "down-sampling layer + regularization layer + activation function"; its input is a W × W × c image from step one, and its output is a 1 × 2^(N+1) feature vector z expressing the essence of the image;
the value of N satisfies N ≤ log2(W);
each down-sampling layer is a convolutional layer with kernel size n × n and stride 2; the number of channels doubles layer by layer from 2^2 up to 2^(N+1);
the activation function is a LeakyReLU with negative slope L;
n takes a value in [3, 5]; L takes a value in [0, 1];
Step 2.2: construct a decoder from multiple deconvolutional layers;
the decoder comprises N serially connected groups of "up-sampling layer + regularization layer + activation function"; its input is the feature vector z output by the encoder, and its output is an image of dimension W × W × c;
each up-sampling layer is a deconvolutional layer with kernel size n × n and stride 2; the number of channels halves layer by layer from 2^(N+1) down to 2^2;
the activation function is ReLU, except for the outermost layer, which uses Tanh;
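The following is a minimal PyTorch sketch of the encoder and decoder of steps 2.1 and 2.2, assuming the embodiment's values W = 256, N = 8, n = 4, c = 1; the patent only says "regularization layer", so BatchNorm is an assumption, as is omitting normalization before the final Tanh:

    import torch
    import torch.nn as nn

    def enc_block(cin, cout, slope=0.2):
        # one group of "down-sampling layer + regularization layer + activation function"
        return nn.Sequential(
            nn.Conv2d(cin, cout, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(cout),                # "regularization layer" (assumed BatchNorm)
            nn.LeakyReLU(slope, inplace=True),
        )

    def dec_block(cin, cout, last=False):
        # one group of "up-sampling layer + regularization layer + activation function"
        layers = [nn.ConvTranspose2d(cin, cout, kernel_size=4, stride=2, padding=1)]
        if last:
            layers.append(nn.Tanh())             # outermost activation, matching [-1, 1] pixels
        else:
            layers += [nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
        return nn.Sequential(*layers)

    class Encoder(nn.Module):
        """N down-sampling groups: W x W x c image -> 1 x 2^(N+1) feature vector z."""
        def __init__(self, c=1, N=8):
            super().__init__()
            chans = [c] + [2 ** (i + 2) for i in range(N)]       # c, 4, 8, ..., 2^(N+1)
            self.net = nn.Sequential(*(enc_block(a, b) for a, b in zip(chans, chans[1:])))

        def forward(self, x):                                    # x: (B, c, W, W), W = 2^N
            return self.net(x).flatten(1)                        # z: (B, 2^(N+1))

    class Decoder(nn.Module):
        """N up-sampling groups: feature vector z -> W x W x c image (the restoration
        decoder of step 2.5 uses the same design)."""
        def __init__(self, c=1, N=8):
            super().__init__()
            chans = [2 ** (N + 1 - i) for i in range(N)] + [c]   # 2^(N+1), ..., 4, c
            self.net = nn.Sequential(*(dec_block(a, b, last=(i == N - 1))
                                       for i, (a, b) in enumerate(zip(chans, chans[1:]))))

        def forward(self, z):                                    # z: (B, 2^(N+1))
            return self.net(z.view(z.size(0), -1, 1, 1))         # (B, c, W, W)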
Step 2.3: construct discriminator 1 for the feature space using a multi-layer perceptron structure;
the input of discriminator 1 is the feature vector z expressing the essence of the image output in step 2.1; discriminator 1 comprises K serially connected groups of "fully-connected layer + Dropout layer + activation function";
the number of neurons per fully-connected layer halves from 2^K down to 2^0;
the random-dropout parameter of the Dropout layers is p;
the activation function of the last layer is a sigmoid, and the other K − 1 activation functions are LeakyReLU with negative slope L;
K takes a value in [5, log2(W)]; p takes a value in [0, 1];
Step 2.4: construct discriminator 2 for the image space using a PatchGAN structure (a code sketch of both discriminators follows this step);
discriminator 2 comprises P convolutional layers; the first P − 1 layers each comprise a convolutional layer, a regularization layer, and an activation function;
each convolutional layer has kernel size n × n and stride 2; the number of channels doubles layer by layer from 2^2 up to 2^P;
the final convolutional layer outputs directly, and the remaining P − 1 activation functions are LeakyReLU with negative slope L;
P takes a value in [5, log2(W)];
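A minimal PyTorch sketch of the two discriminators of steps 2.3 and 2.4 follows, using the embodiment's values K = 7, P = 5 and a 512-dimensional z; the initial projection from z down to 2^K neurons and the use of BatchNorm are assumptions, since the source leaves both unspecified:

    import torch.nn as nn

    class FeatureDiscriminator(nn.Module):
        """Discriminator 1 (feature space): fully-connected layers with Dropout;
        neuron counts halve from 2^K to 2^0, ending in a sigmoid."""
        def __init__(self, zdim=512, K=7, p=0.5, slope=0.2):
            super().__init__()
            widths = [zdim] + [2 ** (K - i) for i in range(K + 1)]  # zdim, 2^K, ..., 2^0
            layers = []
            for a, b in zip(widths, widths[1:]):
                layers.append(nn.Linear(a, b))
                if b == 1:
                    layers.append(nn.Sigmoid())                     # last layer: sigmoid
                else:
                    layers += [nn.Dropout(p), nn.LeakyReLU(slope, inplace=True)]
            self.net = nn.Sequential(*layers)

        def forward(self, z):                  # z: (B, zdim)
            return self.net(z)                 # (B, 1): probability that z is "real"

    class ImageDiscriminator(nn.Module):
        """Discriminator 2 (image space): PatchGAN with P convolutional layers;
        channels double from 2^2 to 2^P, and the last convolution outputs directly."""
        def __init__(self, c=1, P=5, slope=0.2):
            super().__init__()
            chans = [c] + [2 ** (i + 2) for i in range(P - 1)]      # c, 4, ..., 2^P
            layers = []
            for a, b in zip(chans, chans[1:]):                      # first P-1 blocks
                layers += [nn.Conv2d(a, b, 4, stride=2, padding=1),
                           nn.BatchNorm2d(b),
                           nn.LeakyReLU(slope, inplace=True)]
            layers.append(nn.Conv2d(chans[-1], 1, 4, stride=2, padding=1))
            self.net = nn.Sequential(*layers)

        def forward(self, x):                  # x: (B, c, W, W)
            return self.net(x)                 # patch-wise realness map (logits)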
Step 2.5, a restoration decoder is constructed by adopting a plurality of layers of deconvolution layers;
the recovery decoder comprises N groups of up-sampling layers, a regularization layer and an activation function which are connected in series; the input of the decoder is the output vector of the encoder in the second step, and the output is an image with the dimension of W x c;
the up-sampling unit comprises convolution layers with convolution kernel size of n x n and step length of 2, and the number of channels is from 2N+1Sequentially decreasing to 2 by multiple of 22
The activation function in the first N-1 groups of 'up-sampling layer, regularization layer and activation function' is a ReLU activation function with a negative slope L, and the activation function in the last group of 'up-sampling layer, regularization layer and activation function' is a Tanh function;
Step three: construct a proxy task, specifically: build a reconstruction-based proxy task from the images in the training set using three transformations, namely local pixel shuffling, nonlinear transformation of image brightness, and local-region patching (a combined sketch of the three transformations is given after step 3.3), through the following substeps:
Step 3.1: local pixel shuffling, i.e., randomly exchange the pixel values at different positions within local regions of the image and output the randomly shuffled image, specifically: randomly select M1 image blocks whose side length is an integer in [1, 7], and randomly exchange the pixels inside each block;
Step 3.2: nonlinearly transform the image brightness and output the nonlinearly transformed image, specifically: construct the Bezier mapping curve B(t) of equation (1) from three randomly given control points, and map the image pixel values through this curve:
B(t) = (1 − t)^2 · P0 + 2t(1 − t) · P1 + t^2 · P2,  t ∈ [0, 1]   (1)
where t denotes the pixel luminance and P0, P1, P2 are three randomly chosen control points;
Step 3.3: local-region patching, i.e., fill randomly selected regions of the image with random pixel values to obtain the locally patched image, specifically: randomly select M2 image blocks whose side length is an integer in [W/6, W/5], and fill the pixels of each block with random noise values drawn from a uniform distribution;
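The following sketch makes steps 3.1-3.3 concrete for a single grayscale image in [−1, 1] (NumPy assumed; the embodiment's M1 = M2 = 6 are used as defaults, and the mapping of the pixel value onto the Bezier parameter t is an assumption, as the source does not spell it out):

    import numpy as np

    rng = np.random.default_rng()

    def local_pixel_shuffle(img, m1=6, s_max=7):
        """Step 3.1: randomly permute the pixels inside m1 small square blocks."""
        out = img.copy()
        H, W = out.shape
        for _ in range(m1):
            s = int(rng.integers(1, s_max + 1))          # block side length in [1, s_max]
            y = int(rng.integers(0, H - s + 1))
            x = int(rng.integers(0, W - s + 1))
            block = out[y:y + s, x:x + s].copy().ravel()
            rng.shuffle(block)                           # in-place permutation
            out[y:y + s, x:x + s] = block.reshape(s, s)
        return out

    def bezier_brightness(img):
        """Step 3.2: nonlinear intensity mapping through the quadratic Bezier curve (1)."""
        p0, p1, p2 = rng.uniform(-1.0, 1.0, size=3)      # three random control points
        t = (img + 1.0) / 2.0                            # pixel value in [-1, 1] -> t in [0, 1]
        return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

    def local_patching(img, m2=6):
        """Step 3.3: fill m2 random blocks, side length an integer in [W/6, W/5],
        with noise drawn from a uniform distribution."""
        out = img.copy()
        H, W = out.shape
        lo, hi = W // 6, W // 5
        for _ in range(m2):
            s = int(rng.integers(lo, hi + 1))
            y = int(rng.integers(0, H - s + 1))
            x = int(rng.integers(0, W - s + 1))
            out[y:y + s, x:x + s] = rng.uniform(-1.0, 1.0, size=(s, s))
        return out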
Step four: construct the total loss function, specifically a weighted sum of the image reconstruction loss, the feature reconstruction loss, the discrimination losses of the image space and the feature space, and the restoration loss based on the proxy task;
the image reconstruction loss L_rec-I constrains the difference between the real image and the reconstructed image with an L1 loss, calculated as in equation (2):
L_rec-I = ‖ x − 𝒟(ℰ(x)) ‖_1   (2)
where x denotes the input image, ℰ and 𝒟 denote the encoder and the decoder respectively, ℰ(x) is the encoded output of the input image x, 𝒟(ℰ(x)) is the result of encoding the input image x and then decoding it, and ‖·‖_1 denotes the 1-norm;
the feature reconstruction loss L_rec-F constrains the difference between the feature representation of the image in feature space and that of the reconstructed image, again with an L1 loss, as in equation (3):
L_rec-F = ‖ z − ℰ(𝒟(z)) ‖_1   (3)
where z is the feature vector expressing the essence of the image output by the encoder in step 2.1, 𝒟(z) is the decoded output of the feature vector z, and ℰ(𝒟(z)) is the result of decoding z and then encoding it again;
the discrimination loss of the image space and the feature space L_adv constrains the difference between the outputs of the encoder and decoder and the real images and features, as in equation (4):
L_adv = log(1 − D_I(𝒟(ℰ(x)))) + log(1 − D_F(ℰ(𝒟(z))))   (4)
where D_F is discriminator 1 of the feature space and D_I is discriminator 2 of the image space; D_I(𝒟(ℰ(x))) is the output of the image-space discriminator for the reconstructed image, and D_F(ℰ(𝒟(z))) is the output of the feature-space discriminator for the reconstructed feature vector; D_I and D_F are iteratively optimized through the loss equations (5) and (6):
L_DI = −[ log D_I(x) + log(1 − D_I(𝒟(ℰ(x)))) ]   (5)
L_DF = −[ log D_F(z) + log(1 − D_F(ℰ(𝒟(z)))) ]   (6)
where D_I(x) is the output of the image-space discriminator for the input image x, and D_F(z) is the output of the feature-space discriminator for the feature vector z;
the restoration loss based on the proxy task L_res uses the restoration proxy task to strengthen the encoder's ability to extract image features, as in equation (7):
L_res = ‖ x − 𝒟_r(ℰ(x_a)) ‖_1   (7)
where 𝒟_r denotes the restoration decoder, x_a denotes the transformed image produced by the proxy task, and 𝒟_r(ℰ(x_a)) is the output of the transformed image x_a after the encoder and the restoration decoder;
the total loss function is calculated as in equation (8):
L = a·L_rec-I + b·L_rec-F + c·L_adv + d·L_res   (8)
where a, b, c and d are the weight coefficients of the image reconstruction loss, the feature reconstruction loss, the discrimination loss of the image and feature spaces, and the restoration loss, respectively;
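Under the architectural sketches above, the losses of equations (2)-(8) could be assembled as follows (a sketch only: the sign convention of (5)-(6), the mean-reduced L1 distances, and the sigmoid applied to the PatchGAN logits are assumptions, and the weights default to the embodiment's a = 1, b = c = d = 10):

    import torch
    import torch.nn.functional as F

    EPS = 1e-8  # numerical guard inside the logarithms

    def generator_loss(x, x_a, enc, dec, rdec, D_I, D_F,
                       a=1.0, b=10.0, c=10.0, d=10.0):
        """Weighted total loss of eq. (8) for the encoder/decoder/restoration decoder."""
        z     = enc(x)                                   # feature vector of the real image
        x_hat = dec(z)                                   # reconstructed image
        z_hat = enc(x_hat)                               # reconstructed feature vector
        l_rec_i = F.l1_loss(x_hat, x)                                      # eq. (2)
        l_rec_f = F.l1_loss(z_hat, z)                                      # eq. (3)
        l_adv = (torch.log(1 - torch.sigmoid(D_I(x_hat)) + EPS).mean()     # eq. (4)
                 + torch.log(1 - D_F(z_hat) + EPS).mean())
        l_res = F.l1_loss(rdec(enc(x_a)), x)                               # eq. (7)
        return a * l_rec_i + b * l_rec_f + c * l_adv + d * l_res

    def discriminator_loss(x, enc, dec, D_I, D_F):
        """Eqs. (5)-(6): maximize log D(real) + log(1 - D(fake)); the generator
        outputs are computed without grad so only D_I / D_F receive gradients."""
        with torch.no_grad():
            z     = enc(x)
            x_hat = dec(z)
            z_hat = enc(x_hat)
        pI_real = torch.sigmoid(D_I(x));  pI_fake = torch.sigmoid(D_I(x_hat))
        pF_real = D_F(z);                 pF_fake = D_F(z_hat)
        l_DI = -(torch.log(pI_real + EPS) + torch.log(1 - pI_fake + EPS)).mean()
        l_DF = -(torch.log(pF_real + EPS) + torch.log(1 - pF_fake + EPS)).mean()
        return l_DI + l_DF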
step five: model training to obtain a trained encoder and decoder, comprising the following substeps:
Step 5.1: input a healthy fundus image of a normal subject into the encoder ℰ and propagate forward to obtain the image feature vector; input this vector into the decoder 𝒟 and propagate forward to obtain the reconstructed image; input the reconstructed image into the encoder and propagate forward to obtain the reconstructed feature vector; take the three transformed images constructed in steps 3.1 to 3.3 as inputs to the encoder-restoration decoder, with the original image as the training label, to complete the learning of the proxy task, specifically: randomly transform the input healthy fundus image according to the proxy task, input the transformed image into the encoder and the restoration decoder, and obtain the restored image by forward propagation;
Step 5.2: calculate the image reconstruction loss and the feature reconstruction loss;
Step 5.3: input the reconstructed image and the reconstructed feature vector into the discriminators of the image space and the feature space respectively, and calculate the discrimination losses of the image space and the feature space;
Step 5.4: calculate the restoration loss of the proxy task;
Step 5.5: perform back-propagation and parameter optimization, alternately optimizing the discriminators and the encoder-decoder;
Step 5.6: repeat steps 5.1 to 5.5 to traverse all images in the training set once; record the total loss value during this process and plot its curve; after the loss curve has converged stably, adjust the learning rate so that the model can continue learning;
step 5.7, storing the trained encoder and decoder;
Step six: test the images of the test set with the trained encoder and decoder, select a threshold for judging images in batch, and output a conclusion as to whether each image is normal, specifically:
Step 6.1: pass the input image through the encoder to obtain the image feature vector, then through the decoder to obtain the reconstructed image, and feed the reconstructed image into the encoder again to obtain the reconstructed feature vector;
Step 6.2: compute the difference d1 between the input image and the reconstructed image, and the difference d2 between the image feature vector and the reconstructed feature vector;
Step 6.3: average the two differences d1 and d2 to obtain the score of the input image; the larger the score, the higher the probability that the image is a lesion image, and vice versa; select the optimal threshold according to the labels of the test set; if the score exceeds the threshold, the input image is judged to be a lesion image, otherwise it is judged to be a normal image;
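A minimal sketch of the scoring of steps 6.1-6.3, reusing the Encoder/Decoder sketches above (the patent does not name the distance used for d1 and d2 or the threshold-selection criterion; mean L1 distances and ROC-based selection are assumptions):

    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def anomaly_score(x, enc, dec):
        """Steps 6.1-6.3: average the image-space and feature-space differences."""
        z     = enc(x)
        x_hat = dec(z)                     # reconstructed image
        z_hat = enc(x_hat)                 # reconstructed feature vector
        d1 = F.l1_loss(x_hat, x).item()    # difference between input and reconstruction
        d2 = F.l1_loss(z_hat, z).item()    # difference between feature and its reconstruction
        return 0.5 * (d1 + d2)             # larger score -> more likely a lesion image

    # Threshold selection on the labeled test set, e.g. the point of the ROC curve
    # maximizing Youden's J (an assumed criterion); then:
    # is_lesion = anomaly_score(x, enc, dec) > threshold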
Steps one to six together complete the disease diagnosis method trained on lesion-free images.
Advantageous effects
Compared with existing disease diagnosis algorithms, the intelligent fundus disease diagnosis method based on lesion-free image training has the following beneficial effects:
1. The method trains directly on images of healthy subjects without a lesion-balanced training set, effectively resolving the dependence of existing algorithms on disease image data and matching the current clinical situation in which disease data are rare;
2. The method constrains image reconstruction in two dimensions, image space and feature space, improving the model's perception of the features of normal images;
3. On top of reconstruction, the method introduces discrimination losses in the image and feature spaces, further improving the feature-learning capability of each structural part of the model, namely the encoder and the decoder, on the healthy training images, and thereby the model's ability to recognize disease images;
4. The method introduces a proxy task: for a given amount of data, the model learns deep image boundary and structure information through restoration tasks of different forms, which helps the healthy-image reconstruction task and improves the model's performance in disease recognition.
Drawings
FIG. 1 is a schematic diagram of the network structure underlying the intelligent fundus disease diagnosis method based on lesion-free image training;
FIG. 2 is a schematic flowchart of an example of the method;
FIG. 3 shows the test flow of the method;
FIG. 4 is an ROC curve comparing the method with existing methods.
Detailed Description
The intelligent fundus disease diagnosis method based on lesion-free image training is further explained and described in detail below with reference to the accompanying drawings and embodiments.
Example 1
This embodiment describes a specific implementation of the intelligent fundus disease diagnosis method based on lesion-free image training.
The invention can be applied to disease screening in hospitals and medical institutions of different scales: medical images are acquired from the patient to be diagnosed and classified by the model, and whether the patient is a healthy or diseased individual is judged from the image differences of step six. In institutions of different scales, the unified image size W may be adjusted to the computing power available at each site.
FIG. 1 is a schematic diagram of the network structure underlying the intelligent fundus disease diagnosis method based on lesion-free image training. In FIG. 1, ℰ is the encoder, 𝒟 is the decoder, 𝒟_r is the restoration decoder, D_F is the discriminator of the feature space, D_I is the discriminator of the image space, x is the input image, z is the feature vector of the image, x̂ is the reconstructed image, and x′_a is the image restored by the proxy task; the parameters of the model are updated through the reconstruction losses of the image and feature spaces, the discrimination losses of the image and feature spaces, and the restoration loss of the proxy task.
FIG. 2 is a flowchart of the disease identification algorithm in an embodiment of the invention, taking fundus OCT images as a specific example; it comprises the following steps:
Step A: construct the training and test sets;
10,000 images acquired by optical coherence tomography (OCT) are used as the implementation images, and only lesion-free healthy data are used as the training set; 200 healthy images and 200 lesion images are taken as the test set;
after image-quality screening, images that are too blurred or too dark and do not meet the requirements are removed; the images are resized to 256 × 256 × 1 pixels and normalized to the pixel-value interval [−1, 1];
the construction of the training and test sets is completed through step A;
Step B: construct the network model;
an encoder, a decoder, a restoration decoder, discriminator model 1 and discriminator model 2 are constructed; the model parameters are updated according to the loss function, and the trained models, i.e. the encoder and the decoder, are saved for testing;
Step B.1: the encoder model comprises 8 serially connected groups of "down-sampling layer + regularization layer + activation function"; its input is the 256 × 256 × 1 image processed in step A, and its output is a feature vector z of dimension 1 × 512; each down-sampling layer is a convolutional layer with kernel size 4 × 4 and stride 2, with the number of channels doubling from 4 up to 512; the activation function is LeakyReLU with negative slope 0.2;
Step B.2: the decoder comprises 8 serially connected groups of "up-sampling layer + regularization layer + activation function"; its input is the output vector of the encoder, and its output is an image of dimension 256 × 256 × 1; each up-sampling layer is a convolutional layer with kernel size 4 × 4 and stride 2, with the number of channels halving from 512 down to 4; the activation function is ReLU, with Tanh at the outermost layer;
Step B.3: the restoration decoder comprises 8 serially connected groups of "up-sampling layer + regularization layer + activation function"; its input is the output vector of the encoder, and its output is an image of dimension 256 × 256 × 1; each up-sampling unit is a convolutional layer with kernel size 4 × 4 and stride 2, with the number of channels halving from 512 down to 4; the activation function is ReLU, with Tanh at the outermost layer;
Step B.4: the input of discriminator model 1 is the feature vector z output by the encoder; it comprises 7 serially connected groups of "fully-connected layer + Dropout layer + activation function"; the number of neurons halves from 128 down to 1; the random-dropout parameter of the Dropout layers is 0.5; the activation function of the last layer is a sigmoid, and the others are LeakyReLU with negative slope 0.2;
Step B.5: discriminator model 2 comprises 5 convolutional layers; the first 4 each comprise a convolutional layer, a regularization layer and an activation function; each convolutional layer has kernel size 4 × 4 and stride 2, with the number of channels doubling from 4 up to 32; the activation function is LeakyReLU with negative slope 0.2, and the final convolutional layer outputs directly;
Step C: construct the proxy task;
each training image from step A is randomly subjected to some of the transformations of local pixel shuffling, nonlinear brightness transformation and local-region patching;
local pixel shuffling randomly selects 6 image blocks whose side length is an integer in [1, 6] and randomly exchanges the pixels inside each selected block;
the nonlinear brightness transformation regulates the image pixel values through the Bezier mapping curve constructed in equation (1), completing the transformation of the pixel values;
local-region patching randomly selects 6 image blocks whose side length is an integer in [42, 51] and fills the pixels of each block with random noise values drawn from a uniform distribution;
Step D: loss function and model training;
Step D.1: construct the model loss function; the data are denoted x, a healthy training image, and x_a, the image processed by the proxy task; the method updates the parameters by optimizing the weighted loss function L, which consists of the following terms:
Step D.2: image reconstruction loss L_rec-I = ‖ x − 𝒟(ℰ(x)) ‖_1;
Step D.3: feature reconstruction loss L_rec-F = ‖ z − ℰ(𝒟(z)) ‖_1, where z = ℰ(x);
Step D.4: discriminator loss of the image and feature spaces L_adv = log(1 − D_I(𝒟(ℰ(x)))) + log(1 − D_F(ℰ(𝒟(z))));
Step D.5: restoration loss of the proxy task L_res = ‖ x − 𝒟_r(ℰ(x_a)) ‖_1;
Step D.6: total loss function L = a·L_rec-I + b·L_rec-F + c·L_adv + d·L_res, where a = 1, b = 10, c = 10 and d = 10;
Step E: model training; the parameters are updated and saved; as shown in FIG. 2, the processed normal images are first passed in sequence through the encoder-decoder-encoder to obtain the feature vector, the reconstructed image and the reconstructed feature vector, and the image reconstruction loss L_rec-I and feature reconstruction loss L_rec-F are computed; the reconstructed feature vector and the reconstructed image are fed into feature discriminator model 1 and image discriminator model 2 respectively to compute the discriminator losses; the images transformed in step C are passed through the encoder-restoration decoder to obtain the restored image, and the restoration loss L_res is computed; the losses are weighted into the final loss L according to step D.1, followed by back-propagation and parameter optimization; Adam is chosen as the optimizer, the batch size is 32, each batch of images optimizes the discriminators once and then the generator once, and the learning rate is set to 2 × 10^-4; finally, the trained model is saved;
wherein the generator comprises an encoder, a decoder and a restoration decoder;
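Putting the pieces together, step E could look like the following sketch, reusing the model and loss sketches above (a hypothetical assembly; batching, data loading and learning-rate adjustment are omitted):

    import itertools
    import torch

    enc, dec, rdec = Encoder(), Decoder(), Decoder()   # restoration decoder: same design
    D_I, D_F = ImageDiscriminator(), FeatureDiscriminator()
    opt_d = torch.optim.Adam(itertools.chain(D_I.parameters(), D_F.parameters()), lr=2e-4)
    opt_g = torch.optim.Adam(itertools.chain(enc.parameters(), dec.parameters(),
                                             rdec.parameters()), lr=2e-4)

    def train_step(x, x_a):        # x: batch of healthy images, x_a: proxy-transformed batch
        opt_d.zero_grad()
        discriminator_loss(x, enc, dec, D_I, D_F).backward()
        opt_d.step()               # 1) optimize the discriminators once per batch
        opt_g.zero_grad()
        loss = generator_loss(x, x_a, enc, dec, rdec, D_I, D_F)
        loss.backward()
        opt_g.step()               # 2) then optimize the generator once
        return loss.item()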
Step F: model testing and image classification;
as shown in FIG. 3, the image is input to the model, the difference between the input image and the reconstructed image and the difference between the image feature and the reconstructed image feature are computed, and their average is taken as the probability that the image is a lesion image; lesion images are then identified by threshold screening.
This realizes the whole flow of the disease diagnosis method based on lesion-free images. The ROC curves of the experimental results are shown in FIG. 4; the curve of the method lies well above those of the other methods, showing that under training conditions lacking lesion images the method can still distinguish healthy from lesion images well, and solving the problem that existing algorithms cannot perform clinical classification tasks when a class is missing.
The method trains directly on images of healthy subjects without a lesion-balanced training set, effectively resolving the dependence of existing algorithms on disease image data and matching the current clinical situation of scarce disease data; concretely, the image transformations of step A reduce the algorithm's demand for data volume, and the training process of step C removes the dependence of existing algorithms on multi-class training images.
The method constrains image reconstruction in the two dimensions of image space and feature space, improving the model's perception of the features of normal images: steps D.2 and D.3 compare the original inputs and reconstructed outputs of the two spaces with L1 losses, and constraining the differences improves the model's perception of image features.
On top of reconstruction, the method introduces discrimination losses in the image and feature spaces, further improving each structural part of the model, namely the feature-learning capability of the encoder and decoder on the healthy training images, and thereby the model's ability to recognize disease images; concretely, step D.4 employs adversarial discrimination losses through the discriminator models, improving the encoder's ability to encode images and the decoder's ability to reconstruct images from feature vectors; the model responds strongly to non-training (disease) images, improving its ability to recognize diseases.
The method introduces a proxy task: for a given amount of data, the model learns deep image boundary and structure information through restoration tasks of different forms, which helps the healthy-image reconstruction task and improves disease-recognition performance; concretely, step C applies the proxy-task transformations to the existing healthy images and, combined with the restoration loss of step D.5, encourages the model to learn boundary and structural information according to the different tasks, strengthening the accuracy of the reconstruction task and, in turn, the model's ability to recognize disease images.
While the foregoing is directed to the preferred embodiment of the present invention, it is not intended that the invention be limited to the embodiment and the drawings disclosed herein. Equivalents and modifications may be made without departing from the spirit of the disclosure, which is to be considered as within the scope of the invention.

Claims (10)

1. An intelligent fundus disease diagnosis method based on lesion-free image training, characterized in that the method is realized through the following steps:
Step one: preprocess the collected images to construct a data set, the data set being divided into a training set and a test set;
Step two: design a network model comprising an encoder, a decoder, a discriminator 1, a discriminator 2 and a restoration decoder, specifically comprising the following substeps:
Step 2.1: construct an encoder from multiple convolutional layers;
the encoder comprises N serially connected groups of "down-sampling layer + regularization layer + activation function";
Step 2.2: construct a decoder from multiple deconvolutional layers;
the decoder comprises N serially connected groups of "up-sampling layer + regularization layer + activation function";
Step 2.3: construct discriminator 1 for the feature space using a multi-layer perceptron structure;
Step 2.4: construct discriminator 2 for the image space using a PatchGAN structure;
wherein discriminator 2 comprises P convolutional layers, the first P − 1 of which each comprise a convolutional layer, a regularization layer and an activation function;
Step 2.5: construct a restoration decoder from multiple deconvolutional layers;
the restoration decoder comprises N serially connected groups of "up-sampling layer + regularization layer + activation function";
Step three: construct a proxy task, specifically: build a reconstruction-based proxy task from the images in the training set using local pixel shuffling, nonlinear transformation of image brightness, and local-region patching, through the following substeps:
Step 3.1: local pixel shuffling, i.e., randomly exchange the pixel values at different positions within local regions of the image and output the randomly shuffled image;
Step 3.2: nonlinearly transform the image brightness and output the nonlinearly transformed image;
Step 3.3: local-region patching, i.e., fill randomly selected regions of the image with random pixel values to obtain the locally patched image;
Step four: construct the total loss function, specifically a weighted sum of the image reconstruction loss, the feature reconstruction loss, the discrimination losses of the image space and the feature space, and the restoration loss based on the proxy task;
step five: model training to obtain a trained encoder and decoder, comprising the following substeps:
Step 5.1: input a healthy fundus image of a normal subject into the encoder ℰ and propagate forward to obtain the image feature vector; input this vector into the decoder 𝒟 and propagate forward to obtain the reconstructed image; input the reconstructed image into the encoder and propagate forward to obtain the reconstructed feature vector; take the three transformed images constructed in steps 3.1 to 3.3 as inputs to the encoder-restoration decoder, with the original image as the training label, to complete the learning of the proxy task, specifically: randomly transform the input healthy fundus image according to the proxy task, input the transformed image into the encoder and the restoration decoder, and obtain the restored image by forward propagation;
Step 5.2: calculate the image reconstruction loss and the feature reconstruction loss;
Step 5.3: input the reconstructed image and the reconstructed feature vector into the discriminators of the image space and the feature space respectively, and calculate the discrimination losses of the image space and the feature space;
Step 5.4: calculate the restoration loss of the proxy task;
Step 5.5: perform back-propagation and parameter optimization, alternately optimizing the discriminators and the encoder-decoder;
Step 5.6: repeat steps 5.1 to 5.5 to traverse all images in the training set once, record the total loss value during this process and plot its curve, and after the loss curve has converged stably, adjust the learning rate so that the model can continue learning;
step 5.7, storing the trained encoder and decoder;
Step six: test the images of the test set with the trained encoder and decoder, select a threshold for judging images in batch, and output a conclusion as to whether each image is normal, specifically:
Step 6.1: pass the input image through the encoder to obtain the image feature vector, then through the decoder to obtain the reconstructed image, and feed the reconstructed image into the encoder again to obtain the reconstructed feature vector;
Step 6.2: compute the difference d1 between the input image and the reconstructed image, and the difference d2 between the image feature vector and the reconstructed feature vector;
Step 6.3: average the two differences d1 and d2 to obtain the score of the input image; the larger the score, the higher the probability that the image is a lesion image, and vice versa; select the optimal threshold according to the labels of the test set; if the score exceeds the threshold, the input image is judged to be a lesion image, otherwise it is judged to be a normal image.
2. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 1, wherein step one specifically comprises: screening the collected images, discarding images of poor quality, unifying the resolution of the remaining images to the same dimension W × W × c, and normalizing the pixel values to the range [−1, 1].
3. An intelligent diagnosis method for fundus diseases based on non-pathological image training as claimed in claim 2, wherein: c is 1 or more.
4. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 3, wherein both the training set and the test set use clinically collected ophthalmic images; the training set consists of ophthalmic images of healthy individuals, while the test set is a mixture of images of healthy and diseased individuals; images of poor quality specifically include: pictures that are too dark, pictures with a large deviation in shooting angle, and pictures blurred by motion during capture.
5. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 4, wherein in step 2.1 the encoder input is the W × W × c image from step one and the output is a 1 × 2^(N+1) feature vector z expressing the essence of the image; the value of N satisfies N ≤ log2(W); each down-sampling layer is a convolutional layer with kernel size n × n and stride 2, with the number of channels doubling from 2^2 up to 2^(N+1); the activation function is LeakyReLU with negative slope L; n takes a value in [3, 5]; L takes a value in [0, 1].
6. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 5, wherein in step 2.2 the decoder input is the feature vector z output by the encoder in step two and the output is an image of dimension W × W × c; each up-sampling layer is a deconvolutional layer with kernel size n × n and stride 2, with the number of channels halving from 2^(N+1) down to 2^2; the activation function is ReLU, with Tanh at the outermost layer.
7. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 6, wherein in step 2.3 the input of discriminator 1 is the feature vector z expressing the essence of the image output in step 2.1; discriminator 1 comprises K serially connected groups of "fully-connected layer + Dropout layer + activation function"; the number of neurons per fully-connected layer halves from 2^K down to 2^0; the random-dropout parameter of the Dropout layers is p; the activation function of the last layer is a sigmoid, and the other K − 1 activation functions are LeakyReLU with negative slope L; K takes a value in [5, log2(W)] and p takes a value in [0, 1].
8. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 7, wherein in step 2.4 each convolutional layer has kernel size n × n and stride 2, with the number of channels doubling from 2^2 up to 2^P; the final convolutional layer outputs directly, and the remaining P − 1 activation functions are LeakyReLU with negative slope L; P takes a value in [5, log2(W)];
in step 2.5, the restoration decoder input is the output vector of the encoder of step two and the output is an image of dimension W × W × c; each up-sampling unit is a deconvolutional layer with kernel size n × n and stride 2, with the number of channels halving from 2^(N+1) down to 2^2; the activation function in the first N − 1 groups of "up-sampling layer + regularization layer + activation function" is ReLU, and the activation function in the last group is Tanh.
9. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 8, wherein step 3.1 specifically comprises: randomly selecting M1 image blocks whose side length is an integer in [1, 7] and randomly exchanging the pixels inside each block;
step 3.2 specifically comprises: constructing the Bezier mapping curve B(t) of equation (1) from three randomly given control points and mapping the image pixel values through this curve:
B(t) = (1 − t)^2 · P0 + 2t(1 − t) · P1 + t^2 · P2,  t ∈ [0, 1]   (1)
where t denotes the pixel luminance and P0, P1, P2 are three randomly chosen control points;
step 3.3 specifically comprises: randomly selecting M2 image blocks whose side length is an integer in [W/6, W/5] and filling the pixels of each block with random noise values drawn from a uniform distribution.
10. The intelligent fundus disease diagnosis method based on lesion-free image training according to claim 9, wherein in step four the image reconstruction loss L_rec-I constrains the difference between the real image and the reconstructed image with an L1 loss, calculated as in equation (2):
L_rec-I = ‖ x − 𝒟(ℰ(x)) ‖_1   (2)
where x denotes the input image, ℰ and 𝒟 denote the encoder and the decoder respectively, ℰ(x) is the encoded output of the input image x, 𝒟(ℰ(x)) is the result of encoding the input image x and then decoding it, and ‖·‖_1 denotes the 1-norm;
the feature reconstruction loss L_rec-F constrains the difference between the feature representation of the image in feature space and that of the reconstructed image, again with an L1 loss, as in equation (3):
L_rec-F = ‖ z − ℰ(𝒟(z)) ‖_1   (3)
where z is the feature vector expressing the essence of the image output by the encoder in step 2.1, 𝒟(z) is the decoded output of the feature vector z, and ℰ(𝒟(z)) is the result of decoding z and then encoding it again;
the discrimination loss of the image space and the feature space L_adv constrains the difference between the outputs of the encoder and decoder and the real images and features, as in equation (4):
L_adv = log(1 − D_I(𝒟(ℰ(x)))) + log(1 − D_F(ℰ(𝒟(z))))   (4)
where D_F is discriminator 1 of the feature space and D_I is discriminator 2 of the image space; D_I(𝒟(ℰ(x))) is the output of the image-space discriminator for the reconstructed image, and D_F(ℰ(𝒟(z))) is the output of the feature-space discriminator for the reconstructed feature vector; D_I and D_F are iteratively optimized through the loss equations (5) and (6):
L_DI = −[ log D_I(x) + log(1 − D_I(𝒟(ℰ(x)))) ]   (5)
L_DF = −[ log D_F(z) + log(1 − D_F(ℰ(𝒟(z)))) ]   (6)
where D_I(x) is the output of the image-space discriminator for the input image x, and D_F(z) is the output of the feature-space discriminator for the feature vector z;
the restoration loss based on the proxy task L_res uses the restoration proxy task to strengthen the encoder's ability to extract image features, as in equation (7):
L_res = ‖ x − 𝒟_r(ℰ(x_a)) ‖_1   (7)
where 𝒟_r denotes the restoration decoder, x_a denotes the transformed image produced by the proxy task, and 𝒟_r(ℰ(x_a)) is the output of the transformed image x_a after the encoder and the restoration decoder;
the final loss function is calculated as in equation (8):
L = a·L_rec-I + b·L_rec-F + c·L_adv + d·L_res   (8)
where a, b, c and d are the weight coefficients of the image reconstruction loss, the feature reconstruction loss, the discrimination loss of the image and feature spaces, and the restoration loss, respectively.
CN202110756395.8A (filed 2021-07-05) — Intelligent fundus disease diagnosis method based on lesion-free image training — Pending — CN113421250A

Priority Applications (1)

Application Number: CN202110756395.8A — Priority date: 2021-07-05 — Filing date: 2021-07-05 — Title: Intelligent fundus disease diagnosis method based on lesion-free image training

Publications (1)

Publication number: CN113421250A — Publication date: 2021-09-21

Family

ID=77720146

Country Status (1)

CN (1): CN113421250A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
CN111476757A * (priority 2020-03-10, published 2020-07-31) — Coronary artery plaque data detection method, system, storage medium and terminal
CN111798464A * (priority 2020-06-30, published 2020-10-20) — Lymphoma pathological image intelligent identification method based on deep learning
CN112598658A * (priority 2020-12-29, published 2021-04-02) — Disease identification method based on lightweight twin convolutional neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Chuan Li et al., "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks", arXiv *
He Zhao et al., "Anomaly Detection for Medical Images using Self-supervised and Translation-consistent Features", IEEE *

Cited By (3)

* Cited by examiner, † Cited by third party
CN113902036A * (priority 2021-11-08, published 2022-01-07) — Multi-feature fusion type fundus retinal disease identification method
CN117173543A * (priority 2023-11-02, published 2023-12-05) — Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis
CN117173543B * (priority 2023-11-02, published 2024-02-02) — Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis

Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination
WD01 — Invention patent application deemed withdrawn after publication (application publication date: 2021-09-21)