CN115471701A - Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning - Google Patents

Info

Publication number
CN115471701A
Authority
CN
China
Prior art keywords
lung adenocarcinoma
image
original
histological
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211139740.4A
Other languages
Chinese (zh)
Inventor
张景航
王现伟
张燕飞
刘冬玲
朱彦艳
秦宇
王来
庞婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
First Affiliated Hospital of Xinxiang Medical University
Original Assignee
First Affiliated Hospital of Xinxiang Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by First Affiliated Hospital of Xinxiang Medical University
Priority to CN202211139740.4A
Publication of CN115471701A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03 Recognition of patterns in medical or anatomical images
    • G06V2201/032 Recognition of patterns in medical or anatomical images of protuberances, polyps nodules, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of medical image classification, and in particular to a lung adenocarcinoma histological subtype classification method based on deep learning and transfer learning. The method comprises: acquiring a lung adenocarcinoma image to be detected; determining, by means of a lung adenocarcinoma histological subtype classification network, the target lung adenocarcinoma histological subtype corresponding to the image to be detected; and outputting the target lung adenocarcinoma histological subtype. Transfer learning addresses the scarcity of available training data and thereby improves classification performance; compared with traditional methods, the method can provide necessary support for the pathological diagnosis of lung adenocarcinoma histological type. Compared with other advanced methods, the scheme obtains a better classification result, with an accuracy of 0.91, a sensitivity of 0.89, and a specificity of 0.88. This performance indicates that the proposed radiomics model has the potential to be applied to the pathological typing diagnosis of lung adenocarcinoma.

Description

Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning
Technical Field
The invention relates to the technical field of medical image classification, in particular to a lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning.
Background
Lung adenocarcinoma is the most common histological type of lung cancer and a leading cause of cancer death worldwide. Despite significant advances in targeted diagnosis and treatment based on the histological and genetic characteristics of tumors, lung adenocarcinoma causes, on average, more than 500,000 deaths worldwide each year. As pathologists' studies of the morphological classification of lung adenocarcinoma have progressed, attention has increasingly turned to treatment decisions based on histological subtype classification. Histological classification of lung adenocarcinoma is of great interest because it is a reliable prognostic marker and can also predict survival. In 2015, the WHO (World Health Organization) identified five histological subtypes of lung adenocarcinoma, namely the acinar, solid, lepidic, micropapillary and papillary types, each with a unique growth pattern. In acinar adenocarcinoma, the cancer tissue forms acinar or tubular structures. In papillary adenocarcinoma, the prominent morphological feature is papillae containing fibrovascular cores, covered by cancer cells with marked atypia and prominent nucleoli. In micropapillary adenocarcinoma, micropapillary tufts without fibrovascular cores float in the alveolar spaces, or small papillae are densely arranged within fine fibrous septa. In solid adenocarcinoma, the cancer tissue forms larger solid sheets or nests with little gland formation. In lepidic adenocarcinoma, the cancer tissue grows along the pre-existing alveolar wall structure, and the prognosis is better.
Pathologists play an important role in determining the histological form of lung cancer. However, classification of the histological subtypes of lung adenocarcinoma is challenging. In addition to their variable shape, texture and size, the morphological patterns of adenocarcinomas mostly appear mixed. Moreover, qualified pathologists are in short supply, and viewing pathological sections for long periods may reduce diagnostic efficiency. Accurate identification of lung adenocarcinoma is therefore a time-consuming and laborious task for a pathologist. Fortunately, a CAD (Computer Aided Diagnosis) system can improve the efficiency of biomedical image diagnosis and help provide precise treatment for patients. Radiomics, which extracts a large number of quantitative features from medical images, is of great significance for better cancer CAD. Deep learning built on multi-layer CNNs (Convolutional Neural Networks) has become a widespread solution in the field of medical image research. DLR (Deep Learning-based Radiomics), which can extract deep features from cancer regions, is an emerging field.
A number of scholars have explored the application of deep learning to lung adenocarcinoma. In 2016, a pre-trained CNN was applied to extract deep features from 40 CT images of non-small cell lung cancer; these were combined with traditional image features to train classifiers that predict short-term and long-term survivors. Compared with the conventional method, this architecture increased the AUC (Area Under Curve) of the receiver operating characteristic. As deep learning spread, in 2018 a sparse deep autoencoder was used to evaluate phenotypic features and cell distribution over a wider area in order to classify transcriptome subtypes of lung adenocarcinoma from pathological images, which paved the way for applying deep learning to predict tumor molecular subtypes from pathological features. In 2019, a deep learning method was developed for automatically classifying histological subtypes on surgical resection slides of lung adenocarcinoma. However, these approaches focus only on implementing deep learning and overlook a major challenge: the small number of medical images available for training. In DLR, small data sets are typically mitigated by transfer learning, in which a deep learning model pre-trained on a large natural-image data set is fine-tuned. To date, however, no study has applied transfer learning to address the small data sets in histological subtype classification of lung adenocarcinoma.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In order to solve the technical problem that the accuracy of classifying the histological subtype of the lung adenocarcinoma is low when training data are few, the invention provides a histological subtype classification method of the lung adenocarcinoma based on deep learning and transfer learning.
The invention provides a lung adenocarcinoma histological subtype classification method based on deep learning and transfer learning, which comprises the following steps:
acquiring a lung adenocarcinoma image to be detected;
classifying the lung adenocarcinoma image to be detected by utilizing the trained lung adenocarcinoma histological subtype classification network, and determining a target lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected;
outputting the target lung adenocarcinoma histological subtype through a lung adenocarcinoma histological subtype classification network.
Further, the training process of the lung adenocarcinoma histological subtype classification network comprises the following steps:
constructing a lung adenocarcinoma histological subtype classification network, wherein the weights of convolutional layers included in the lung adenocarcinoma histological subtype classification network before training are the same as the weights of convolutional layers included in a pre-trained network after training;
acquiring a lung adenocarcinoma original pathological image set, wherein a lung adenocarcinoma histological subtype corresponding to a lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set is known;
preprocessing the lung adenocarcinoma original pathological images in the lung adenocarcinoma original pathological image set by a feature-based method to obtain a target data set;
and training the lung adenocarcinoma histological subtype classification network by using the target data set to obtain the trained lung adenocarcinoma histological subtype classification network.
Further, preprocessing the lung adenocarcinoma original pathological images in the lung adenocarcinoma original pathological image set by the feature-based method to obtain a target data set includes:
for each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set, segmenting the lung adenocarcinoma original pathological image to obtain an image of interest corresponding to the lung adenocarcinoma original pathological image;
for each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set, horizontally and vertically flipping the image of interest corresponding to the lung adenocarcinoma original pathological image to obtain a horizontal image and a vertical image corresponding to the lung adenocarcinoma original pathological image;
determining the image of interest, the horizontal image and the vertical image corresponding to each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set as target data to obtain the target data set.
Further, the histological subtype classification network for lung adenocarcinoma is a dense convolutional network.
Further, the training process of the pre-training network includes:
constructing a pre-training network;
acquiring an original image set, wherein the labels corresponding to the original images in the original image set are categories;
and training the pre-training network by using the original image set and the labels corresponding to all the original images in the original image set to obtain the pre-training network after training.
The invention has the following beneficial effects:
Through transfer learning, the invention addresses the scarcity of available training data when using a CNN, thereby improving classification performance. The relatively small data set used in the invention reflects the size of the data sets available to most researchers in the medical imaging field. Transfer learning relies on the similarity between the original image data set and the target image data set. If the target data set is similar to the original data set, the original classification model may be applied, and the pre-trained CNN model may process the target data set directly. If the target data set differs from the original data set, the pre-trained network is adapted to the target data set by fine-tuning. Fine-tuning allows the CNN model to handle differences between the original data set and the target data set more robustly and flexibly, so that the corresponding lung adenocarcinoma typing diagnosis results can be derived. The invention can also reduce the repetitive diagnostic work of pathologists, improving the efficiency of lung adenocarcinoma typing diagnosis and promoting the development of pathological artificial intelligence while ensuring the accuracy of pathological diagnosis.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow diagram of some embodiments of a method for histological subtype classification of lung adenocarcinoma based on deep learning and transfer learning according to the present invention;
FIG. 2 is a schematic representation of lung adenocarcinoma from five histological subtypes of lung adenocarcinoma according to the present invention;
FIG. 3 is a schematic view of an inverted ROI (Region of Interest) image according to the present invention;
FIG. 4 is a diagram illustrating a classification result image according to the present invention;
FIG. 5 is a diagram of classification performance visualized with t-SNE (t-Distributed Stochastic Neighbor Embedding) according to the present invention;
FIG. 6 is a diagram of training loss and training accuracy in accordance with the present invention;
FIG. 7 is a schematic diagram comparing the present invention with three other methods.
Wherein the reference numerals in fig. 2 include: a first lung adenocarcinoma image 201, a second lung adenocarcinoma image 202, a third lung adenocarcinoma image 203, a fourth lung adenocarcinoma image 204, and a fifth lung adenocarcinoma image 205.
The reference numerals in fig. 3 include: an image of interest 301, a horizontal image 302, and a vertical image 303.
The reference numerals in FIG. 4 include: acinar types 401 and 402, solid types 403 and 404, lepidic types 405 and 406, micropapillary types 407 and 408, and papillary types 409 and 410.
Detailed Description
To further explain the technical means and effects of the present invention adopted to achieve the predetermined objects, the following detailed description of the embodiments, structures, features and effects of the technical solutions according to the present invention will be given with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a lung adenocarcinoma histological subtype classification method based on deep learning and transfer learning, which comprises the following steps:
acquiring a lung adenocarcinoma image to be detected;
classifying the lung adenocarcinoma image to be detected by utilizing the trained lung adenocarcinoma histological subtype classification network, and determining a target lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected;
outputting the target lung adenocarcinoma histological subtype through a lung adenocarcinoma histological subtype classification network.
The steps are detailed below.
Referring to FIG. 1, a flow diagram of some embodiments of the lung adenocarcinoma histological subtype classification method based on deep learning and transfer learning according to the present invention is shown. The method comprises the following steps:
S1, acquiring a lung adenocarcinoma image to be detected.
In some embodiments, a lung adenocarcinoma image to be detected can be obtained.
The lung adenocarcinoma image to be detected can be a lung adenocarcinoma image whose histological subtype is to be determined. Histological subtypes of lung adenocarcinoma may include: acinar type, solid type, lepidic type, micropapillary type and papillary type.
As an example, as shown in FIG. 2, the histological subtype of lung adenocarcinoma corresponding to the first lung adenocarcinoma image 201 may be the acinar type. The histological subtype corresponding to the second lung adenocarcinoma image 202 may be the solid type. The histological subtype corresponding to the third lung adenocarcinoma image 203 may be the micropapillary type. The histological subtype corresponding to the fourth lung adenocarcinoma image 204 may be the papillary type. The histological subtype corresponding to the fifth lung adenocarcinoma image 205 may be the lepidic type. The first lung adenocarcinoma image 201, the second lung adenocarcinoma image 202, the third lung adenocarcinoma image 203, the fourth lung adenocarcinoma image 204, and the fifth lung adenocarcinoma image 205 may be lung adenocarcinoma images of different lung adenocarcinoma histological subtypes.
S2, classifying the lung adenocarcinoma image to be detected by utilizing the trained lung adenocarcinoma histological subtype classification network, and determining the target lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected.
In some embodiments, the trained lung adenocarcinoma histological subtype classification network may be utilized to classify the lung adenocarcinoma image to be detected, and determine the target lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected.
The target lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected can be the lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected. The lung adenocarcinoma histological subtype classification network may be a dense convolutional network (DenseNet). The lung adenocarcinoma histological subtype classification network can be used for classifying lung adenocarcinomas and determining lung adenocarcinoma histological subtypes corresponding to the lung adenocarcinomas.
As an example, the lung adenocarcinoma image to be detected may be input into the lung adenocarcinoma histological subtype classification network, which determines the target lung adenocarcinoma histological subtype corresponding to the image. The network yields a probability for each lung adenocarcinoma histological subtype, and the image is classified according to these probabilities: the target lung adenocarcinoma histological subtype may be the subtype with the highest predicted probability.
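The decision rule above can be sketched as follows. This is a minimal illustration, assuming the final layer of the network produces one raw score (logit) per subtype that is converted to probabilities by a softmax; the label order and the example scores are hypothetical and are not taken from the source.

```python
import math

# Assumed label order; the source does not state which index maps to which subtype.
SUBTYPES = ["acinar", "solid", "lepidic", "micropapillary", "papillary"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_subtype(logits):
    """Return the most probable subtype and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SUBTYPES[best], probs[best]


# Hypothetical network output for one image.
label, prob = predict_subtype([0.3, 4.1, -1.2, 0.8, 0.5])
print(label, round(prob, 3))
```

The probability returned alongside the label corresponds to the kind of per-image score shown next to each subtype name in the classification result figure.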
Optionally, the training process of the lung adenocarcinoma histological subtype classification network may include the following steps:
First, constructing a lung adenocarcinoma histological subtype classification network.
Wherein, the weights of the convolutional layers included in the lung adenocarcinoma histological subtype classification network before training may be the same as the weights of the convolutional layers included in the pre-trained network after training. The pre-training network may be a dense convolutional network (DenseNet). The pre-trained network may be used for classification.
For example, the fully connected layer and the classifier layer (softmax layer) of the pre-trained network after training can be replaced with a new fully connected layer and a new classifier layer, neither of which has been trained. The pre-trained network with these layers replaced may be determined as the lung adenocarcinoma histological subtype classification network. The weights of the new classifier layer may be initialized with random weights.
CNNs are often considered the best machine learning method for medical image analysis. A CNN typically consists of multiple simple but nonlinear elements: one or more convolutional layers, activation layers, pooling layers, and fully connected layers. A CNN provides an end-to-end processing method that can extract and learn deep features from images. DenseNet is one such CNN model. In DenseNet, each layer obtains the collective knowledge of all preceding layers: the feature maps of all preceding layers are used as its input, and its own feature maps are used as input to all subsequent layers. With fewer feature maps per layer, the model becomes thinner and more compact. There are several versions of DenseNet, such as DenseNet-121, DenseNet-169, and DenseNet-201. Balancing classification accuracy against algorithmic complexity, DenseNet-169 is used in this deep learning architecture for histological type classification of lung adenocarcinoma.
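The dense connectivity described above can be illustrated by counting the channels that each layer receives. This is a sketch only: the growth rate of 32, the 64 initial channels, the halving in transition layers, and the per-block layer counts are assumptions taken from the commonly published DenseNet configurations, not from the source, and the function names are illustrative.

```python
def dense_block_channels(in_channels, num_layers, growth_rate):
    """Return the input channel count seen by each layer and the block output.

    Layer i receives the block input concatenated with the feature maps of
    all i preceding layers, each of which adds growth_rate channels.
    """
    layer_inputs = [in_channels + i * growth_rate for i in range(num_layers)]
    return layer_inputs, in_channels + num_layers * growth_rate


def densenet_feature_channels(block_config, growth_rate=32,
                              init_channels=64, compression=0.5):
    """Channel count after the last dense block for a given block layout."""
    channels = init_channels
    for idx, num_layers in enumerate(block_config):
        _, channels = dense_block_channels(channels, num_layers, growth_rate)
        if idx < len(block_config) - 1:
            # A transition layer between blocks reduces the channel count.
            channels = int(channels * compression)
    return channels


# Assumed DenseNet-169 layout: four dense blocks of 6, 12, 32 and 32 layers.
layer_inputs, block_out = dense_block_channels(64, 6, 32)
print(layer_inputs, block_out)
print(densenet_feature_channels((6, 12, 32, 32)))
```

Each layer adds only a small, fixed number of feature maps, which is why the network stays narrow even though every layer sees all earlier outputs.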
The lack of training data is the most common problem when applying CNNs to medical images, because busy, experienced clinicians often have difficulty collecting and annotating data. Transfer learning is a common solution to this limitation. More specifically, the network parameters (e.g., weights) learned by a CNN model pre-trained on a large data set (e.g., ImageNet) are used to train a new medical image analysis model, which alleviates the problem of small data sets. Although medical data sets differ from natural-image data sets, low-level features (e.g., lines and curves) are common to most image analysis tasks. Since the learned parameters serve as a powerful set of features, the need for large data sets is reduced, as are training time and memory costs. The convolutional layers of the pre-trained CNN model are fine-tuned, and the fully connected layer is trained from scratch on the medical data set. Fine-tuning replaces the final classifier layer with a new layer. During training, the weights of the convolutional layers of the pre-trained CNN model may be updated. The pre-trained weights are used to initialize the convolutional layers, while random weights are used to initialize the classifier layer.
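One way this fine-tuning setup might look in TensorFlow's Keras API is sketched below. It is an illustration under stated assumptions, not the applicant's implementation: the 128 × 128 input size, the five classes, the learning rate of 1e-4 and the Adam optimizer come from the source, while the three color channels, the global average pooling, the loss function, and the name `build_classifier` are assumptions.

```python
NUM_CLASSES = 5              # five histological subtypes
INPUT_SHAPE = (128, 128, 3)  # ROI size from the source; 3 color channels assumed
LEARNING_RATE = 1e-4         # learning rate reported in the source


def build_classifier(weights="imagenet"):
    """DenseNet-169 backbone with pre-trained weights and a new classifier."""
    # Imported lazily so the constants above can be read without TensorFlow.
    import tensorflow as tf

    # Convolutional layers start from ImageNet weights; the original top
    # (fully connected + softmax over ImageNet classes) is dropped.
    backbone = tf.keras.applications.DenseNet169(
        include_top=False, weights=weights,
        input_shape=INPUT_SHAPE, pooling="avg")
    backbone.trainable = True  # convolutional weights are fine-tuned, not frozen

    # New classifier layer; Keras initializes its weights randomly by default.
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(
        backbone.output)
    model = tf.keras.Model(backbone.input, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"])
    return model
```

Calling `build_classifier()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random initialization, which can serve as a no-transfer baseline. Inputs would normally be passed through `tf.keras.applications.densenet.preprocess_input` before training or prediction.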
Second, acquiring a lung adenocarcinoma original pathological image set.
The lung adenocarcinoma histological subtype corresponding to each lung adenocarcinoma original pathological image in the set can be known; that is, the images in the set can be lung adenocarcinoma images of known histological subtypes.
The invention uses 135 whole glass slides and 1198 lung adenocarcinoma original pathological images: 282 of the acinar type, 257 of the solid type, 204 of the lepidic type, 218 of the micropapillary type, and 237 of the papillary type. These data were examined by experienced pathologists. Of the 1198 images, 959 (80%) were used as training data and 239 (20%) as test data. The training data and the test data may be as shown in Table 1.
TABLE 1
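[Table 1, showing the training data and the test data, appears as an image in the original publication (Figure BDA0003853039610000061) and is not reproduced in this text.]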
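Because several images can come from the same slide or patient, splitting at the image level could place images of one patient in both training and evaluation data. The source states that its 5-fold cross-validation separates cases at the patient level. A minimal sketch of such a grouping follows; the image and patient identifiers, the greedy balancing strategy, and the function name are hypothetical.

```python
from collections import defaultdict


def patient_level_folds(image_to_patient, num_folds=5):
    """Assign whole patients to folds so that no patient spans two folds."""
    by_patient = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        by_patient[patient_id].append(image_id)

    folds = [[] for _ in range(num_folds)]
    # Place patients with the most images first, always into the fold that
    # currently holds the fewest images, to keep fold sizes roughly equal.
    ordered = sorted(by_patient, key=lambda p: (-len(by_patient[p]), p))
    for patient_id in ordered:
        smallest = min(range(num_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_patient[patient_id])
    return folds


# Hypothetical mapping: 12 images from 6 patients.
image_to_patient = {f"img{i:02d}": f"patient{i % 6}" for i in range(12)}
folds = patient_level_folds(image_to_patient)
for k, fold in enumerate(folds):
    print(k, sorted(fold))
```

In each cross-validation round, one fold would serve as validation data and the remaining four as training data.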
Third, preprocessing the lung adenocarcinoma original pathological images in the lung adenocarcinoma original pathological image set by a feature-based method to obtain a target data set.
For example, this step may include the following substeps:
A first substep: segmenting each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set to obtain the image of interest corresponding to that lung adenocarcinoma original pathological image.
The image of interest corresponding to the lung adenocarcinoma original pathological image can be an image corresponding to an ROI area in the lung adenocarcinoma original pathological image.
For example, in order to balance the resolution of the lung adenocarcinoma original pathological image and the computational complexity of the deep learning model, the lung adenocarcinoma original pathological image may be cropped to 128 × 128 pixels as ROIs (one ROI for each lung adenocarcinoma original pathological image).
A second substep: for each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set, horizontally and vertically flipping the corresponding image of interest to obtain a horizontal image and a vertical image corresponding to the lung adenocarcinoma original pathological image.
Wherein the horizontal image may be the image of interest after being horizontally flipped. The vertical image may be the image of interest after being vertically flipped.
For example, as shown in fig. 3, an image of interest 301 corresponding to an original pathological image of lung adenocarcinoma can be horizontally and vertically flipped to obtain a horizontal image 302 and a vertical image 303 corresponding to the original pathological image of lung adenocarcinoma.
As a first remedy for the relatively limited number of lung adenocarcinoma original pathological images in the set, a traditional data augmentation method can be adopted: each ROI in the training data is flipped horizontally and vertically, which triples the number of images in the data set.
A third substep: determining the image of interest, the horizontal image and the vertical image corresponding to each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set as target data to obtain the target data set.
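The three substeps can be sketched in plain Python as follows. The source does not say how the ROI is located within an image, so a center crop is assumed here purely for illustration; images are represented as lists of pixel rows, and the example images are synthetic.

```python
def center_crop(image, size=128):
    """Crop a size x size window from the center of a 2D image (list of rows)."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the requested crop")
    top, left = (height - size) // 2, (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return image[::-1]


def build_target_data(originals, size=128):
    """One ROI per original image, plus its horizontal and vertical flips."""
    target = []
    for image in originals:
        roi = center_crop(image, size)  # assumed ROI selection
        target.extend([roi, flip_horizontal(roi), flip_vertical(roi)])
    return target


# Two synthetic 200 x 160 grayscale images filled with a simple pattern.
originals = [[[(r * 160 + c + k) % 256 for c in range(160)]
              for r in range(200)] for k in range(2)]
target = build_target_data(originals)
print(len(target), len(target[0]), len(target[0][0]))
```

Each original image contributes three entries to the target data set, which matches the tripling described in the text.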
Fourth, training the lung adenocarcinoma histological subtype classification network by using the target data set to obtain the trained lung adenocarcinoma histological subtype classification network.
Training is repeated, and ends when the number of training passes over the target data set reaches a preset maximum.
For example, the specific implementation manner of this step can be implemented by the prior art.
Optionally, the training process of the pre-training network may include the following steps:
First, constructing a pre-training network.
The specific implementation manner of this step can be implemented by the prior art, and is not described herein again.
Second, acquiring an original image set.
The original image set may be a large data set (e.g., ImageNet). The labels corresponding to the original images in the original image set may be categories.
Third, training the pre-training network by using the original image set and the labels corresponding to the original images in the original image set to obtain the pre-training network after training.
The specific implementation manner of this step can be implemented by the prior art, and is not described herein again.
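The way a pre-trained network seeds the classification network can be sketched as follows. This is a framework-free illustration under stated assumptions: the layer names, the tiny kernel shapes, the 1000-category source head and the freshly initialized five-class head are all hypothetical. The patent states only that the convolution-layer weights of the classification network before training equal those of the trained pre-training network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-trained network: convolution kernels plus a source-task head
pretrained = {
    "conv1": rng.normal(size=(3, 3, 3, 8)),
    "conv2": rng.normal(size=(3, 3, 8, 16)),
    "classifier": rng.normal(size=(16, 1000)),
}

def init_from_pretrained(pretrained_weights, num_classes, rng):
    """Copy convolution weights and create a new head for the target classes."""
    network = {}
    for name, weights in pretrained_weights.items():
        if name.startswith("conv"):
            network[name] = weights.copy()  # transferred, later fine-tuned
    in_features = pretrained_weights["classifier"].shape[0]
    # Assumption: the source-task head is discarded and re-initialized
    network["classifier"] = rng.normal(scale=0.01, size=(in_features, num_classes))
    return network

subtype_net = init_from_pretrained(pretrained, num_classes=5, rng=rng)
```

Copying rather than sharing the arrays means that fine-tuning on the target data set would leave the pre-trained weights untouched.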
Optionally, the invention uses 5-fold cross-validation with cases separated at the patient level. The experiments can be run on a single NVIDIA RTX 2070 GPU using the Python language, and the CNN can be implemented in the TensorFlow framework with a batch size of 20, a learning rate of 1e-4, 200 epochs, and the Adam (adaptive moment estimation) optimizer. The classification performance of the lung adenocarcinoma histological subtype classification network is verified with three indices, AC (Accuracy), SE (Sensitivity) and SP (Specificity), together with the area under the ROC (Receiver Operating Characteristic) curve (AUC). They are defined as follows:
$$\mathrm{AC} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{SE} = \frac{TP}{TP + FN}$$
$$\mathrm{SP} = \frac{TN}{TN + FP}$$
where AC represents accuracy. TP indicates true positive. TN indicates true negatives. FP indicates false positive. FN indicates false negatives. SE denotes sensitivity. SP represents specificity. AUC is the area under ROC, which is a curve with (1-specificity) on the x-axis and sensitivity on the y-axis.
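The evaluation protocol can be sketched in a few lines of Python. The sketch uses the usual confusion-matrix definitions of AC, SE and SP and one simple way of keeping all images of a patient in the same fold. The function names, the round-robin fold assignment, the patient identifiers and the counts are assumptions for illustration, not values from the patent.

```python
from collections import defaultdict

def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of true positives recovered
    specificity = tn / (tn + fp)   # fraction of true negatives recovered
    return accuracy, sensitivity, specificity

def patient_level_folds(patient_ids, n_folds=5):
    """Group image indices into folds so that no patient spans two folds."""
    unique_patients = sorted(set(patient_ids))
    fold_of_patient = {p: i % n_folds for i, p in enumerate(unique_patients)}
    folds = defaultdict(list)
    for image_index, patient in enumerate(patient_ids):
        folds[fold_of_patient[patient]].append(image_index)
    return [folds[k] for k in range(n_folds)]

# Made-up counts and patient identifiers, for illustration only
ac, se, sp = classification_metrics(tp=40, tn=45, fp=5, fn=10)
folds = patient_level_folds(["p1", "p1", "p2", "p3", "p3", "p4", "p5", "p6"])
```

Splitting by patient rather than by image avoids placing flipped or neighbouring patches of the same case in both the training and the validation fold.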
The final classification result can be obtained accurately by using the network training model (lung adenocarcinoma histological subtype classification network) provided by the invention. FIG. 4 shows acinar types 401 and 402, solid types 403 and 404, lepidic types 405 and 406, micropapillary types 407 and 408, and papillary types 409 and 410. Among them, "acina: 0.988862" indicates the acinar type with a probability of 0.988862. "solid: 0.999982" indicates the solid type with a probability of 0.999982. "lepidic: 0.998788" indicates the lepidic type with a probability of 0.998788. "micropillary: 0.9" indicates the micropapillary type with a probability of 0.9. "papillary: 0.987772" indicates the papillary type with a probability of 0.987772. Furthermore, as shown in FIG. 5, the invention applies the t-SNE (t-distributed stochastic neighbor embedding) algorithm to visualize classification performance. FIG. 5 shows that the five histological subtypes of lung adenocarcinoma are well separated and that each type forms its own cluster.
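The "subtype: probability" labels discussed above could be produced as in the following sketch. It assumes that the network ends in a softmax layer over the five subtypes, which the text does not state explicitly; the logits, the class order and the function names are hypothetical.

```python
import numpy as np

SUBTYPES = ["acinar", "solid", "lepidic", "micropapillary", "papillary"]

def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def describe_prediction(logits):
    """Format the most probable subtype as 'name: probability'."""
    probs = softmax(np.asarray(logits, dtype=float))
    best = int(np.argmax(probs))
    return f"{SUBTYPES[best]}: {probs[best]:.6f}"

# Hypothetical network outputs for one image
label = describe_prediction([0.2, 9.5, 0.1, -1.0, 0.3])
```

The reported probability is the largest softmax output, so the probabilities of the five subtypes for a given image always sum to one.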
FIG. 6 shows the training loss and training accuracy in each epoch for the CNN with TL (Transfer Learning), denoted CNN-TL, and for the CNN without transfer learning (CNN). As can be seen from FIG. 6, the training loss and training accuracy of both models converge as the number of epochs increases, but the CNN with transfer learning is more robust and does not settle prematurely into a local optimum, which significantly improves its training performance.
The classification results, measured by AUC, were compared with three advanced methods: a hand-crafted radiomics method, GLCM (Gray-Level Co-occurrence Matrix) with SVM (Support Vector Machine); a modified CNN model, ResNet-TL (ResNet with transfer learning); and a model without transfer learning, DenseNet. As shown in FIG. 7, the invention, which combines deep learning and transfer learning, is significantly superior to the other three methods. In addition, Table 2 compares the performance in terms of AC, SE and SP. As can be seen from Table 2, the method provided by the invention achieves the highest accuracy of 0.91, sensitivity of 0.89 and specificity of 0.88 on the validation data.
TABLE 2
Figure BDA0003853039610000091
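For reference, the AUC used in this comparison can be computed as in the sketch below, which builds the ROC curve from sorted scores and integrates it with the trapezoidal rule. It covers the binary case only; how the five-class problem is reduced to binary curves (for example one subtype versus the rest) is an assumption, and the labels and scores are made up.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    labels, scores = labels[order], scores[order]
    # Keep one ROC point per distinct threshold so tied scores are handled
    last_of_group = np.r_[np.where(np.diff(scores))[0], len(scores) - 1]
    tps = np.cumsum(labels == 1)[last_of_group]
    fps = np.cumsum(labels == 0)[last_of_group]
    tpr = np.concatenate(([0.0], tps / tps[-1]))  # sensitivity (y-axis)
    fpr = np.concatenate(([0.0], fps / fps[-1]))  # 1 - specificity (x-axis)
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))

# Made-up example: one positive is ranked below one negative
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 1.0 means every positive is scored above every negative, while 0.5 corresponds to random ranking.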
Through transfer learning, the invention can address the shortage of available training data when using a CNN and thereby improve classification performance. The method provided by the invention obtains a classification model with an accuracy of 0.91, a sensitivity of 0.89 and a specificity of 0.88. The relatively small data set used in the invention reflects the size of the data sets available to most researchers in the medical imaging field. Transfer learning tends to rely on the similarity between the original image data set and the target image data set. If the target data set is similar to the original data set, the original classification model can be applied, and the pre-trained CNN model can process the target data set directly. If the target data set differs from the original data set, the pre-trained network is adapted to the target data set by fine-tuning. Fine-tuning allows the CNN model to handle differences between the original data set and the target data set more robustly and flexibly. The method can deeply extract high-level features of lung adenocarcinoma histopathological images and mitigate the shortage of training data.
The algorithm of the invention has certain advantages over other advanced methods. The deep learning classification framework proposed by the invention adopts transfer learning and thereby obtains the best performance among the compared methods. Because data normalization is important, the images in the data set were diagnosed by an experienced pathologist strictly according to the diagnostic criteria for lung adenocarcinoma typing. In addition, the pre-trained CNN can process domain-specific images. Once the weights and biases are determined, the model can be used to analyze new images. The deep-learning-based automatic CAD system for lung adenocarcinoma histological subtype classification can be widely applied in clinical practice. The model of the invention can highlight the major areas of tumor cells and areas where high-level patterns are elusive. Therefore, the model can help clinicians speed up tumor diagnosis and can promote the development of intelligent clinical pathology.
And S3, outputting the target lung adenocarcinoma histological subtype through a lung adenocarcinoma histological subtype classification network.
In some embodiments, the target lung adenocarcinoma histological subtype may be output by a lung adenocarcinoma histological subtype classification network.
The invention provides a deep learning network with transfer learning to accurately classify histological subtypes of lung adenocarcinoma. The method mainly comprises two parts: preprocessing and classification. In the preprocessing step, feature intelligent processing (FWP) can be applied in advance to reduce the time the deep learning model spends processing the original image; its input is the original image and its output is the processed image. Classification focuses on deep radiomics feature extraction. Training has two phases: transfer learning with fine-tuning, and training of the proposed model on the target data set. The model is then evaluated by testing. The input to classification is the processed image, and the output is the classification result.
Through transfer learning, the invention can address the shortage of available training data when using a CNN and thereby improve classification performance. The relatively small data set used in the invention reflects the size of the data sets available to most researchers in the medical imaging field. Transfer learning relies on the similarity between the original image data set and the target image data set. If the target data set is similar to the original data set, the original classification model can be applied, and the pre-trained CNN model can process the target data set directly. If the target data set differs from the original data set, the pre-trained network is adapted to the target data set by fine-tuning. Fine-tuning allows the CNN model to handle differences between the original and target data sets more robustly and flexibly, so that the corresponding lung adenocarcinoma typing diagnosis can be made. The invention can also reduce repetitive diagnostic work for pathologists and, while ensuring the accuracy of pathological diagnosis, improve the efficiency of lung adenocarcinoma typing diagnosis and promote the development of artificial intelligence in pathology.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; the modifications or substitutions do not make the essence of the corresponding technical solutions deviate from the technical solutions of the embodiments of the present application, and are included in the protection scope of the present application.

Claims (5)

1. A lung adenocarcinoma histological subtype classification method based on deep learning and transfer learning, characterized by comprising the following steps:
acquiring a lung adenocarcinoma image to be detected;
classifying the lung adenocarcinoma image to be detected by using the trained lung adenocarcinoma histological subtype classification network, and determining a target lung adenocarcinoma histological subtype corresponding to the lung adenocarcinoma image to be detected;
outputting the target lung adenocarcinoma histological subtype through a lung adenocarcinoma histological subtype classification network.
2. The method for classifying histological subtypes of lung adenocarcinoma based on deep learning and transfer learning according to claim 1, wherein the training process of the lung adenocarcinoma histological subtype classification network comprises:
constructing a lung adenocarcinoma histological subtype classification network, wherein the weights of convolution layers included in the lung adenocarcinoma histological subtype classification network before training are the same as the weights of convolution layers included in a pre-trained network after training;
acquiring a lung adenocarcinoma original pathological image set, wherein a lung adenocarcinoma histological subtype corresponding to a lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set is known;
preprocessing the lung adenocarcinoma original pathological images in the lung adenocarcinoma original pathological image set by a feature-based method to obtain a target data set;
and training the lung adenocarcinoma histological subtype classification network by using the target data set to obtain the trained lung adenocarcinoma histological subtype classification network.
3. The method for classifying histological subtypes of lung adenocarcinoma based on deep learning and transfer learning according to claim 2, wherein the preprocessing of the lung adenocarcinoma original pathological images in the lung adenocarcinoma original pathological image set by the feature-based method to obtain the target data set comprises:
for each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set, segmenting the lung adenocarcinoma original pathological image to obtain an image of interest corresponding to the lung adenocarcinoma original pathological image;
for each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set, horizontally and vertically flipping the image of interest corresponding to the lung adenocarcinoma original pathological image to obtain a horizontal image and a vertical image corresponding to the lung adenocarcinoma original pathological image;
respectively determining the image of interest, the horizontal image and the vertical image corresponding to each lung adenocarcinoma original pathological image in the lung adenocarcinoma original pathological image set as target data to obtain a target data set.
4. The method for classifying histological subtypes of lung adenocarcinoma based on deep learning and transfer learning according to claim 2, wherein the lung adenocarcinoma histological subtype classification network is a dense convolutional network.
5. The lung adenocarcinoma histological subtype classification method based on deep learning and transfer learning according to claim 2, wherein the training process of the pre-training network comprises:
constructing a pre-training network;
acquiring an original image set, wherein the labels corresponding to the original images in the original image set are categories;
and training the pre-training network by using the original image set and the labels corresponding to all the original images in the original image set to obtain the pre-training network after training.
CN202211139740.4A 2022-09-19 2022-09-19 Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning Pending CN115471701A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211139740.4A CN115471701A (en) 2022-09-19 2022-09-19 Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211139740.4A CN115471701A (en) 2022-09-19 2022-09-19 Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning

Publications (1)

Publication Number Publication Date
CN115471701A true CN115471701A (en) 2022-12-13

Family

ID=84333369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211139740.4A Pending CN115471701A (en) 2022-09-19 2022-09-19 Lung adenocarcinoma histology subtype classification method based on deep learning and transfer learning

Country Status (1)

Country Link
CN (1) CN115471701A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116468690A (en) * 2023-04-17 2023-07-21 北京透彻未来科技有限公司 Subtype analysis system of invasive non-mucous lung adenocarcinoma based on deep learning
CN116468690B (en) * 2023-04-17 2023-11-14 北京透彻未来科技有限公司 Subtype analysis system of invasive non-mucous lung adenocarcinoma based on deep learning
CN116825363A (en) * 2023-08-29 2023-09-29 济南市人民医院 Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network
CN116825363B (en) * 2023-08-29 2023-12-12 济南市人民医院 Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination