CN113850328A - Non-small cell lung cancer subtype classification system based on multi-view deep learning - Google Patents


Info

Publication number
CN113850328A
Authority
CN
China
Prior art keywords
deep learning
small cell
lung cancer
cell lung
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111128553.1A
Other languages
Chinese (zh)
Inventor
张光磊
宋凡
田哲源
范广达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhiyuan Medical Technology Co ltd
Original Assignee
Beijing Zhiyuan Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhiyuan Medical Technology Co ltd filed Critical Beijing Zhiyuan Medical Technology Co ltd
Priority to CN202111128553.1A priority Critical patent/CN113850328A/en
Publication of CN113850328A publication Critical patent/CN113850328A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30061 Lung
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30096 Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention discloses a non-small cell lung cancer subtype classification system based on multi-view deep learning that distinguishes lung adenocarcinoma from lung squamous cell carcinoma on CT images. From the volume data produced by a lung CT scan of a subject, the system generates and displays image data of the examined region and outputs a pathological subtype classification for the subject's non-small cell lung cancer lesion. The system performs automatic lung-field segmentation; the multi-view model makes full use of the inter-slice and three-dimensional information carried by the CT images; and, through automatic feature extraction and classification training, the resulting classification model can serve as an aid to physicians. The system is therefore highly automated and practical.

Description

Non-small cell lung cancer subtype classification system based on multi-view deep learning
Technical Field
The invention belongs to the technical fields of artificial-intelligence deep learning and bioinformatics, and in particular relates to a non-small cell lung cancer subtype classification system based on multi-view deep learning.
Background
Worldwide, lung cancer accounts for 11.6% of new tumor cases and 18.4% of all malignant-tumor deaths, making it one of the malignant tumors whose incidence and mortality are rising fastest; it poses a great threat to human health and life. Non-small cell lung cancer (NSCLC) accounts for 85-90% of lung cancers.
According to the 2015 World Health Organization criteria, non-small cell lung cancer is further classified into adenocarcinoma (ADC), squamous cell carcinoma (SCC), large cell carcinoma (LCC), and not-otherwise-specified (NOS) tumors. The incidence of lung adenocarcinoma rises year by year; it is the most common NSCLC subtype, accounting for nearly 60% of cases, and its overall survival rate is low. Squamous cell lung carcinoma is second in incidence only to adenocarcinoma and is one of the most important histological types of lung cancer, accounting for about 30% of non-small cell cases. Different tumor subtypes differ markedly in morphology and histology and also show different sensitivities to the various treatment methods. Cancer subtyping, particularly distinguishing squamous carcinoma from adenocarcinoma, therefore has great guiding significance for clinical steps such as choosing a treatment plan and assessing prognosis.
Pathological diagnosis is the current clinical gold standard for NSCLC subtype classification; however, it requires an invasive biopsy or a pathological tissue section, which often causes the patient severe pain. In practice, because of the spatiotemporal heterogeneity of lung cancer, a sample can rarely describe the whole tumor, so the accuracy of the result is often questioned. In addition, because pathological tissue is prepared after surgery and diagnosis requires a series of molecular-biology steps, the associated time cost may delay the patient's treatment.
Radiomics research can relate medical images to the tumor phenotype and thereby extract tumor information; the information obtained from a medical image is not limited to the local tumor region, and imaging is both faster and safer than surgery. Thanks to its end-to-end nature and automatic feature extraction, deep learning has in recent years been deeply integrated with radiology, eliminating the laborious process of hand-designing and extracting features, and it is now widely applied to lesion detection, lesion-region segmentation, disease classification, image registration and similar tasks. When deep learning is applied to CT images, classical networks use 2D data: each scan slice is input as an image, and the results over all slices are aggregated into a final result. The 3D approach interpolates the CT data, scales it, fills in missing slices, and feeds the whole volume directly into a 3D convolutional neural network to obtain a per-case classification. A multi-view model uses data from several views, such as the coronal, sagittal and transverse planes; it can be regarded as a 2.5D realization (between 2D and 3D) that exploits the inter-slice information of the CT data while avoiding the heavy computation and large data requirements of training a 3D network.
Disclosure of Invention
To remedy the defects of the prior art, the invention provides a two-subtype (squamous carcinoma/adenocarcinoma) classification system for non-small cell lung cancer based on multi-view deep learning. Using a non-invasive and fast deep learning method, it extracts depth features of non-small cell lung cancer squamous carcinoma/adenocarcinoma from CT images to classify the two subtypes and thereby overcome the limitations of current classification. The specific technical scheme of the invention is as follows:
a non-small cell lung cancer subtype classification system based on multi-view deep learning generates and displays image data of the examined region from the volume data produced by a lung CT scan of a subject and outputs a pathological subtype classification of the subject's non-small cell lung cancer lesion; it comprises an information acquisition module, a network module and a training module, wherein:
the information acquisition module is used for acquiring n physician-annotated lung CT images and, after coarse segmentation and edge cleaning of each image, obtaining CT image data containing only the lung fields; from these lung-field data, in each of the transverse, coronal and sagittal views, a region of interest of a fixed size is extracted centered on the physician-annotated region;
the network module is used for constructing the multi-view deep learning network model;
the training module is used for training the multi-view deep learning network model independently in the coronal, sagittal and transverse views with the region-of-interest data from the information acquisition module, the model weights being initialized from weights pre-trained on the ImageNet data set, to obtain a deep learning network model that classifies non-small cell lung cancer subtypes in each of the three views; the final classification result is obtained by hard voting over the three per-view results, thereby establishing the deep learning network model for non-small cell lung cancer subtype classification;
the information acquisition module, the network module and the training module process the lung CT images to obtain the deep learning network model for classifying non-small cell lung cancer subtypes; a subject's lung CT image is then processed by the information acquisition module and input directly into the model to obtain the subtype label it predicts.
Further, in the information acquisition module, the process of acquiring image data containing only lung fields from each CT image is as follows:
S1: interpolate by approximate point sampling (nearest-neighbor copying) according to the slice thickness of the particular CT scanner;
S2: normalize and binarize each image based on the CT pixel values to obtain a coarse segmentation region;
S3: clean the edges of the coarsely segmented region with an edge-cleaning component;
S4: detect the two largest regions of the binary coarse segmentation, i.e. the two lung regions, by a largest-connected-component method;
S5: fill holes in, and smooth the edges of, the detected lung regions by morphological opening and closing;
S6: denoise the image with a Gaussian blur.
Further, in the information acquisition module, the process of extracting the region of interest in the three views from the lung-field-only CT data is as follows:
step 1: read the physician's annotation to obtain the edge-point coordinates of the tumor region, compute the minimum and maximum of these coordinates along the x, y and z directions, and use the six resulting values to construct the tumor region's bounding cuboid;
step 2: average the minimum and maximum edge-point coordinates along each of the x, y and z directions; the three mean coordinates give the body center of the bounding cuboid;
step 3: wrap the tumor's bounding cuboid in a cube of 100-pixel sides with the two centers aligned, then, in each of the three views, remove the slices containing no tumor region to obtain the final region-of-interest data.
Further, the multi-view deep learning network model comprises a first-part network, a second-part network, and an output layer of 2 neurons built to meet the classification function required of the model; the first-part network migrates, from a deep residual network pre-trained on the ImageNet data set, all layer structures and layer parameters except the fully connected layer, and the second-part network is a fully connected layer with randomly initialized parameters appended after the layers of the first-part network.
Further, the specific procedure of the training module is as follows:
in each of the three views, using that view's data, scale each CT image up to 224 × 224 pixels, input the images one by one into the multi-view deep learning network model, set a learning rate, and train; the classification results of the CT images belonging to the same region of interest in the same view are stored, and hard voting yields the lesion classification result for that view;
after each of the three views has been trained, the classification results of the same region of interest in the three views are collected and hard-voted to give the final subtype classification result, yielding the deep learning network model for classifying non-small cell lung cancer subtypes.
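The two-stage hard voting described above can be sketched as a simple majority vote, first over the slices within a view and then over the three per-view labels (the toy label lists below are illustrative):

```python
from collections import Counter

def hard_vote(labels):
    """Majority vote over a list of predicted class labels."""
    return Counter(labels).most_common(1)[0][0]

# Per-slice votes in each view -> one label per view -> final vote across views.
axial, coronal, sagittal = [0, 1, 1, 1], [1, 1, 0], [1, 0, 0]
per_view = [hard_vote(v) for v in (axial, coronal, sagittal)]  # [1, 1, 0]
final = hard_vote(per_view)                                    # 1
```

With an odd number of views (three here), ties across views cannot occur, which is one practical reason to vote over exactly three planes.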
The invention has the following beneficial effects:
1. Compared with input from a single fixed view, the three-view input integrates information from all three views and maximally exploits the information carried by the CT data, such as the inter-slice relations that conventional 2D input cannot use, while preserving a relatively efficient computation speed and a degree of generalization ability; it therefore achieves higher classification accuracy than traditional methods.
2. By migrating all layers of a pre-trained deep residual network except the fully connected layer, together with their parameters, the invention alleviates the shortage of labeled high-quality data required by deep learning, improves the generalization ability of the model, and trains a better-performing model from the same amount of data.
3. The invention provides a preprocessing scheme for annotated non-small cell lung cancer CT images that automatically segments the lung-field region, removes unnecessary interference from outside regions, and applies interpolation, normalization and other processing adapted to the multi-view task; this effectively improves the model's learning efficiency and classification accuracy.
4. When the system is used for subsequent clinical classification tasks, the original CT image serves as input and the classification result is output automatically, with no further training or preprocessing required, making the system convenient and efficient.
Drawings
To describe the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below, so that the features and advantages of the invention can be understood more clearly by reference to them. The drawings are schematic and should not be construed as limiting the invention in any way; a person skilled in the art can obtain other drawings from them without inventive effort. In the drawings:
FIG. 1 is a flow chart of the construction of a non-small cell lung cancer squamous carcinoma/adenocarcinoma two-subtype classification system based on multi-view deep learning according to the present invention;
FIG. 2 is a schematic flow chart of the present invention for obtaining lung field segmentation data;
fig. 3 is a schematic diagram of a multi-view deep learning network according to the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
The invention provides a multi-view approach that jointly exploits the 3D information carried by CT images and the inter-slice information of the CT slices, improving the accuracy of two-subtype classification of non-small cell lung cancer. With the technical means provided by the invention, better classification performance and generalization ability than other deep learning models can be ensured even with a limited amount of data, giving clinicians a subtype classification aid of higher accuracy.
As shown in figs. 1-2, a non-small cell lung cancer subtype classification system based on multi-view deep learning generates and displays image data of the examined region from the volume data produced by a lung CT scan of a subject and outputs a pathological subtype classification of the subject's non-small cell lung cancer lesion. It comprises an information acquisition module, a network module and a training module, wherein:
the information acquisition module acquires n physician-annotated lung CT images and, after coarse segmentation and edge cleaning of each image, obtains CT image data containing only the lung fields; from these lung-field data it then extracts, in each of the transverse, coronal and sagittal views, a region of interest of a fixed size centered on the physician-annotated region;
the network module constructs the multi-view deep learning network model;
the training module trains the multi-view deep learning network model independently in the coronal, sagittal and transverse views using the region-of-interest data from the information acquisition module, with the model weights initialized from weights pre-trained on the ImageNet data set, yielding a deep learning network model that classifies non-small cell lung cancer subtypes in each of the three views; the final classification result is obtained by hard voting over the three per-view results, establishing the deep learning network model for non-small cell lung cancer subtype classification;
after the information acquisition, network and training modules have processed the lung CT images to obtain the deep learning network model for classifying non-small cell lung cancer subtypes, a subject's lung CT image is processed by the information acquisition module and fed directly into the model to obtain the subtype label it predicts.
In some embodiments, the information acquisition module obtains image data containing only lung fields from each CT image as follows:
S1: interpolate by approximate point sampling (nearest-neighbor copying) according to the slice thickness of the particular CT scanner;
S2: normalize and binarize each image based on the CT pixel values to obtain a coarse segmentation region;
S3: clean the edges of the coarsely segmented region with an edge-cleaning component;
S4: detect the two largest regions of the binary coarse segmentation, i.e. the two lung regions, by a largest-connected-component method;
S5: fill holes in, and smooth the edges of, the detected lung regions by morphological opening and closing;
S6: denoise the image with a Gaussian blur.
In some embodiments, in the information acquisition module, the process of extracting the region of interest in the three views from the lung-field-only CT data is as follows:
step 1: read the physician's annotation to obtain the edge-point coordinates of the tumor region, compute the minimum and maximum of these coordinates along the x, y and z directions, and use the six resulting values to construct the tumor region's bounding cuboid;
step 2: average the minimum and maximum edge-point coordinates along each of the x, y and z directions; the three mean coordinates give the body center of the bounding cuboid;
step 3: wrap the tumor's bounding cuboid in a cube of 100-pixel sides with the two centers aligned, then, in each of the three views, remove the slices containing no tumor region to obtain the final region-of-interest data.
As shown in fig. 3, in some embodiments the multi-view deep learning network model comprises a first-part network, a second-part network, and an output layer of 2 neurons built to meet the classification function required of the model; the first-part network migrates, from a deep residual network pre-trained on the ImageNet data set, all layer structures except the fully connected layer together with their parameters, and the second-part network is a fully connected layer with randomly initialized parameters appended after the layers of the first-part network.
In some embodiments, the specific procedure of the training module is:
in each of the three views, using that view's data, scale each CT image up to 224 × 224 pixels, input the images one by one into the multi-view deep learning network model, set a learning rate, and train; the classification results of the CT images belonging to the same region of interest in the same view are stored, and hard voting yields the lesion classification result for that view;
after each of the three views has been trained, the classification results of the same region of interest in the three views are collected and hard-voted to give the final subtype classification result, yielding the deep learning network model for classifying non-small cell lung cancer subtypes.
For the convenience of understanding the above technical aspects of the present invention, the following detailed description will be given of the above technical aspects of the present invention by way of specific examples.
Example 1
Firstly, for each CT image, data containing only the lung fields are obtained by preprocessing steps such as coarse segmentation and edge cleaning, in preparation for storing the multi-view data later.
The data used come from The Cancer Imaging Archive (NCI-TCIA) of the National Cancer Institute. The specific method for obtaining lung-field-only data from each lung CT image is as follows. Interpolation is performed by approximate point sampling according to the slice thickness of each CT scanner; this method interpolates the matrix by copying adjacent pixels. In this way every voxel is normalized to 1 mm × 1 mm × 1 mm, which restores the lesion's original physical size and facilitates training the multi-view model.
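A minimal sketch of this nearest-neighbor resampling, assuming SciPy and a (z, y, x) volume whose voxel spacing is given in millimetres (the spacing values below are illustrative):

```python
import numpy as np
from scipy.ndimage import zoom

def resample_to_1mm(volume, spacing):
    """Resample a CT volume to 1 x 1 x 1 mm voxels by nearest-neighbour
    ("approximate point sampling") interpolation, i.e. copying adjacent voxels."""
    factors = np.asarray(spacing, dtype=float)  # mm per voxel along (z, y, x)
    return zoom(volume, factors, order=0)       # order=0 copies the nearest voxel

vol = np.zeros((100, 512, 512), dtype=np.int16)      # e.g. 2.5 mm slices, 0.7 mm pixels
iso = resample_to_1mm(vol, spacing=(2.5, 0.7, 0.7))  # shape scales with the spacing
```

`order=0` is what makes this a point-sampling method; higher orders (linear, cubic) would blend HU values across tissue boundaries.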
Based on the CT pixel values, each slice is normalized with a window width of 2000 and a window level of 0 (unit: HU) and binarized with -400 as the threshold to obtain a coarse segmentation region. The edges of the coarsely segmented region are cleaned with an edge-cleaning component. The two largest regions of the binary coarse segmentation, i.e. the two lung regions, are detected by a largest-connected-component method; because the HU values of the lesion region are higher than those of the rest of the lung field, the lesion may be segmented away in this step, so an OR operation is performed once to add the lesion region back. Holes in the detected lung regions are filled and their edges smoothed with morphological opening and closing, the specific sequence being erosion first, then dilation followed by a further erosion. The image is denoised with a Gaussian blur, and finally the remaining gaps are filled with a hole-filling algorithm to obtain CT data containing only the lung fields.
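The slice-wise pipeline above might look as follows with NumPy/SciPy. The window (width 2000, level 0) and the -400 HU threshold come from the text; the structuring parameters, the border-based edge cleaning, and the toy slice are assumptions:

```python
import numpy as np
from scipy import ndimage

def segment_lung_fields(ct_slice):
    """Rough lung-field extraction for one axial slice of HU values."""
    # Window width 2000, level 0: clip to [-1000, 1000] HU.
    windowed = np.clip(ct_slice, -1000, 1000)
    # Binarise at -400 HU: lung parenchyma and air fall below the threshold.
    binary = windowed < -400
    # Edge cleaning: drop components touching the image border (outside air).
    lab, _ = ndimage.label(binary)
    border = np.unique(np.concatenate([lab[0], lab[-1], lab[:, 0], lab[:, -1]]))
    binary &= ~np.isin(lab, border[border > 0])
    # Keep the two largest connected components: the two lung regions.
    lab, n = ndimage.label(binary)
    sizes = ndimage.sum(binary, lab, range(1, n + 1))
    keep = np.argsort(sizes)[-2:] + 1
    mask = np.isin(lab, keep)
    # Closing and hole filling: smooth edges, fill holes inside the lungs.
    mask = ndimage.binary_closing(mask, iterations=2)
    mask = ndimage.binary_fill_holes(mask)
    # Gaussian blur of the masked image for noise reduction.
    denoised = ndimage.gaussian_filter(windowed * mask, sigma=1)
    return mask, denoised

# Toy slice: 0 HU body with two low-attenuation "lungs".
ct = np.zeros((64, 64))
ct[10:30, 8:24] = -800
ct[10:30, 40:56] = -800
mask, denoised = segment_lung_fields(ct)
```

The OR-style lesion restoration mentioned in the text would be a further `mask |= lesion_mask` step once a physician-annotated lesion mask is available.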
Secondly, in the three views (transverse, coronal and sagittal planes), the region of interest is extracted from the lung-field-only CT data at a fixed size, centered on the region delineated by the physician.
The 3D position and size of the tumor region are obtained by reading the physician's annotation. Counting the sizes of all lesion cross-sections in the three views, the distribution lies essentially between 50 and 2000 mm², with lesions rarely reaching 8000 mm²; in theory, a cube of 100 mm sides can therefore wrap almost all lesions. It should be noted that the size and shape of the bounding box should be adapted to the specific data set. By searching the lesion edges, the lesion's bounding cuboid and its body center are computed; the lesion is wrapped in the cube with the two body centers aligned, giving a 3D VOI whose every slice can serve as a 2D ROI. Rotating the 3D CT data matrix yields data referenced to the transverse, coronal and sagittal planes respectively, and removing the slices without lesion gives the final region-of-interest data.
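The bounding-cuboid, body-center and cube-wrapping steps can be sketched as below; the helper name and the toy lesion mask are illustrative, and clamping the cube at the lower volume border is an added assumption:

```python
import numpy as np

def extract_voi(lesion_mask, size=100):
    """Return (z, y, x) slices for a cube of `size` voxels (100 px = 100 mm
    after 1 mm resampling) centred on the lesion's bounding cuboid."""
    zs, ys, xs = np.nonzero(lesion_mask)
    # Bounding cuboid: per-axis min/max of the annotated edge points.
    lo = np.array([zs.min(), ys.min(), xs.min()])
    hi = np.array([zs.max(), ys.max(), xs.max()])
    # Body centre: per-axis mean of min and max.
    centre = (lo + hi) // 2
    # Align the cube's centre with the cuboid's, clamping at the border.
    start = np.maximum(centre - size // 2, 0)
    return tuple(slice(s, s + size) for s in start)

lesion = np.zeros((120, 120, 120), dtype=bool)
lesion[40:60, 50:70, 55:80] = True       # toy annotated tumour region
voi = extract_voi(lesion)                # slices defining the 3D VOI
```

Each 2D plane of the resulting VOI is an ROI; slices of the VOI containing no lesion voxels are then discarded per view, as the text describes.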
Thirdly, the multi-view deep learning network model is constructed. As shown in fig. 3, the leftmost part consists of the three separately trained per-view networks; each migrates the convolutional layers of a deep residual network, ResNet50, pre-trained on ImageNet, while the original fully connected layer is not used. The second part is the replacement fully connected head of two layers: an intermediate layer of 2048 neurons and an output layer of 2 neurons. The outputs are fed to the third part, where hard voting produces the classification result.
In this network, all three views are used both for training the model and for the final subtype classification, so that, within the limits of data volume and computation, the 3D information and inter-slice information carried by the CT data are exploited to the greatest extent and the classification accuracy of the network model is improved. Meanwhile, the migrated convolutional layers of the ImageNet-pre-trained ResNet50 have already learned, to some extent, the feature-extraction rules of ordinary images, so the network is easier to train and converges faster than with random initialization. The model discards the fully connected layer pre-trained on ImageNet and replaces it with a randomly initialized one, meeting the task's two-subtype requirement with 2 output neurons; since fully connected weights fluctuate strongly across tasks, they are comparatively unsuited to transfer training from pre-trained weights.
Because the non-small cell lung cancer CT data and the network model are sensitive to class proportions, weighted random sampling is used to partly compensate for the data imbalance caused by the natural pathological distribution: the class proportions of the data are computed, all data are randomly sampled with the inverse proportions as weights, and the sampled data form the batches sent to model training.
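The inverse-proportion sampling might be implemented with PyTorch's `WeightedRandomSampler` (the class counts below are illustrative):

```python
import torch
from torch.utils.data import WeightedRandomSampler

labels = torch.tensor([0] * 60 + [1] * 30)     # e.g. 60 ADC vs 30 SCC samples
class_counts = torch.bincount(labels).float()  # tensor([60., 30.])
sample_weights = 1.0 / class_counts[labels]    # inverse of each sample's class count
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(labels),
                                replacement=True)
# loader = DataLoader(dataset, batch_size=16, sampler=sampler)
```

Sampling with replacement and inverse-frequency weights makes each batch roughly class-balanced in expectation, even though the underlying data set is not.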
Model training: the multi-view deep learning network models, initialized from pre-trained weights, are trained at a low learning rate with the obtained region-of-interest data, yielding a deep learning network model for two-subtype classification of non-small cell lung cancer in each of the three views; hard voting over the three per-view results gives the final subtype classification, establishing the deep learning network model for the two-subtype classification of non-small cell lung cancer.
During training the data are augmented: each image is expanded to 224 × 224, and random rotation together with random horizontal and vertical flips enlarges the training set. In addition, medical images from different sources may contain noise of different distributions and forms owing to acquisition equipment, surrounding environment and other factors, and the boundaries produced by segmentation may introduce high-frequency information; both can strongly affect learning and performance and must be removed by filtering. Since ResNet expects a 3 × 224 × 224 input while a CT image is a single-channel grayscale image, a copy operation turns the single channel into three channels before input. Training is performed separately in the three views, and finally the outputs of the three views are collected and voted on to obtain the final subtype classification result.
In this embodiment, the training network uses the Adam optimizer. Practical experience shows that the non-small cell lung cancer CT data are sensitive to the learning rate, and better classification accuracy is obtained with a low learning rate: the learning rate is set to 1 × 10⁻⁷ and the weight decay to 1 × 10⁻³. Training runs for 200 epochs with an early-stopping strategy.
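The early-stopping strategy is not detailed in the patent; a common loss-based variant can be sketched as below (the class name, the patience value and the minimum-improvement threshold are all assumptions):

```python
class EarlyStopper:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Inside the 200-epoch loop, `if stopper.step(val_loss): break` would terminate training early once the validation loss plateaus.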
The loss function used for training is label smoothing cross entropy. With conventional cross entropy, the loss curve and the accuracy curve can be observed to rise simultaneously during training: influenced by the data source or other factors, the model becomes over-confident about a small cluster of samples on which it is in fact wrong, which drives the loss up, while its generalization on the majority of the data remains good, so the accuracy also rises. Smoothing the label distribution is equivalent to adding noise to the true distribution; it prevents the model from becoming over-confident in the correct labels, keeps the difference between the output values for positive and negative samples small, avoids overfitting, and improves the generalization ability of the model.
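A minimal numpy implementation of label smoothing cross entropy for a single sample, using one common smoothing convention (1 − ε on the true class, ε spread over the remaining classes; the patent does not state which convention or ε value it uses):

```python
import numpy as np

def label_smoothing_ce(logits, target, eps=0.1):
    """Cross entropy against a label distribution smoothed away from one-hot."""
    n_classes = logits.shape[-1]
    # Softened target: 1 - eps on the true class, eps spread over the others.
    smooth = np.full(n_classes, eps / (n_classes - 1) if n_classes > 1 else 0.0)
    smooth[target] = 1.0 - eps
    m = logits.max()
    log_probs = logits - m - np.log(np.exp(logits - m).sum())  # stable log-softmax
    return float(-(smooth * log_probs).sum())
```

With ε = 0 this reduces to the conventional cross entropy; with ε > 0 a confidently correct prediction is penalized slightly, which is exactly the over-confidence damping described above.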
For any new CT image, the data are processed by the information acquisition module and fed into the established multi-view network model for two-subtype classification of non-small cell lung cancer, which outputs, as an auxiliary judgment, the two-subtype classification result for the lesion in the labeled region.
In the present invention, the terms "first", "second", "third" and "fourth" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The term "plurality" means two or more unless expressly limited otherwise.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. A non-small cell lung cancer subtype classification system based on multi-view deep learning, characterized by comprising an information acquisition module, a network module and a training module, the system being used for obtaining the pathological subtype classification result of a non-small cell lung cancer lesion of a subject on the basis of image data of the examined region, generated and displayed from volume data produced by a lung CT scan of the subject,
the information acquisition module is used for acquiring n lung CT images with doctor labels, and acquiring CT image data only containing lung fields after each image is subjected to rough segmentation and edge cleaning; under three visual angles of a transverse plane, a coronal plane and a sagittal plane, obtaining an interested area according to the same size by taking a doctor labeling area as a center for the obtained CT image data only containing the lung field;
the network module is used for constructing a multi-view deep learning network model;
the training module is used for separately training the multi-view deep learning network model under the coronal, sagittal and transverse planes using the region-of-interest data obtained by the information acquisition module, with the model weights initialized from weights pre-trained on the ImageNet data set, to obtain a deep learning network model for non-small cell lung cancer subtype classification under each of the three views; the final classification result is obtained by hard voting over the classification results of the three views, thereby establishing the deep learning network model for non-small cell lung cancer subtype classification;
the information acquisition module, the network module and the training module together process the lung CT images to obtain the deep learning network model for non-small cell lung cancer subtype classification; the information acquisition module processes a lung CT image under examination, which is then input directly into the deep learning network model to obtain the subtype classification label predicted by the model.
2. The system of claim 1, wherein the information acquisition module is configured to obtain image data only including lung fields for each CT image by:
s1: based on the scanning layer thicknesses of different CT machines, carrying out interpolation by an approximate point sampling method;
s2: based on the pixel value of the CT data, each image is normalized and binarized to obtain a coarse segmentation region;
s3: cleaning the edge of the roughly-divided area through an edge cleaning assembly;
s4: detecting the two regions with the largest areas in the binarized coarse segmentation, namely the two lung regions, by the maximum connected region method;
s5: filling holes and smoothing edges in the obtained lung regions through opening and closing operations;
s6: and carrying out noise reduction processing on the image through Gaussian blur.
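Steps S2 to S6 of claim 2 can be sketched for a single slice with scipy.ndimage (the threshold value, Gaussian sigma and function name are illustrative assumptions; the claim does not fix these parameters):

```python
import numpy as np
from scipy import ndimage

def lung_field_mask(ct_slice, threshold=0.5):
    """Sketch of S2-S6: normalise and binarise, keep the two largest connected
    regions (the lungs), smooth them by open/close, then Gaussian-denoise."""
    lo, hi = ct_slice.min(), ct_slice.max()
    norm = (ct_slice - lo) / (hi - lo + 1e-8)          # S2: normalise to [0, 1]
    binary = norm < threshold                          # S2: binarise (lungs are dark)
    labeled, n = ndimage.label(binary)                 # S4: connected regions
    if n > 2:
        sizes = ndimage.sum(binary, labeled, range(1, n + 1))
        keep = np.argsort(sizes)[-2:] + 1              # two largest = two lungs
        binary = np.isin(labeled, keep)
    binary = ndimage.binary_closing(binary)            # S5: fill holes
    binary = ndimage.binary_opening(binary)            # S5: smooth edges
    return ndimage.gaussian_filter(binary.astype(float), sigma=1)  # S6: denoise
```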
3. The non-small cell lung cancer subtype classification system based on multi-view deep learning according to claim 1 or 2, wherein the information acquisition module is configured to obtain the regions of interest at three views from the obtained CT image data only containing the lung fields by:
step 1: reading the doctor's labeling information to obtain the coordinates of the edge points of the tumor region, calculating the maximum and minimum of all edge-point coordinates in the x, y and z directions, and using the 6 values obtained in the three directions to form the circumscribed cuboid of the tumor region;
step 2: calculating, for each of the x, y and z directions, the average of the maximum and the minimum of all edge-point coordinates, the three averages giving the coordinates of the body center of the circumscribed cuboid;
step 3: enclosing the tumor with a cube whose center is aligned with the body center of the circumscribed cuboid, and removing, under each of the three views, the slices containing no tumor region to obtain the final region-of-interest data.
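Steps 1 and 2 of claim 3 reduce to an axis-aligned bounding box and its center; a numpy sketch (function names and the fixed-side cube interface are illustrative):

```python
import numpy as np

def roi_center_and_box(edge_points):
    """Steps 1-2: circumscribed cuboid and its body center from labelled edge points."""
    pts = np.asarray(edge_points, dtype=float)   # shape (n, 3): x, y, z coordinates
    lo, hi = pts.min(axis=0), pts.max(axis=0)    # the 6 values bounding the tumor
    center = (lo + hi) / 2.0                     # mean of min and max per axis
    return center, (lo, hi)

def cube_roi(center, side):
    """Step 3: an axis-aligned cube of fixed side length centred on the tumor."""
    half = side / 2.0
    return center - half, center + half
```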
4. The non-small cell lung cancer subtype classification system based on multi-view deep learning according to claim 1 or 2, wherein the multi-view deep learning network model comprises a first part network, a second part network, and an output layer with 2 output neurons constructed according to the classification function requirement of the classification model; the first part network is a deep residual network pre-trained on the ImageNet data set with its fully connected layer and that layer's parameters removed, and the second part network is a fully connected layer with randomly initialized parameters appended after all the layers of the first part network.
5. The system for classifying non-small cell lung cancer subtypes based on multi-view deep learning according to one of claims 1 to 4, wherein the specific process of the training module is as follows:
under each of the three views, using the data of the corresponding view, each CT image is scaled to 224 × 224 pixels and input into the multi-view deep learning network model, a learning rate is set for training, the classification of every CT image belonging to the same region of interest under the same view is stored, and the lesion classification result under that view is obtained by hard voting;
after the three views have been trained separately, the classification results of the same region of interest under the three views are collected and hard-voted to obtain the final subtype classification result, yielding the deep learning network model for non-small cell lung cancer subtype classification.
CN202111128553.1A 2021-09-26 2021-09-26 Non-small cell lung cancer subtype classification system based on multi-view deep learning Pending CN113850328A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111128553.1A CN113850328A (en) 2021-09-26 2021-09-26 Non-small cell lung cancer subtype classification system based on multi-view deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111128553.1A CN113850328A (en) 2021-09-26 2021-09-26 Non-small cell lung cancer subtype classification system based on multi-view deep learning

Publications (1)

Publication Number Publication Date
CN113850328A true CN113850328A (en) 2021-12-28

Family

ID=78979515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111128553.1A Pending CN113850328A (en) 2021-09-26 2021-09-26 Non-small cell lung cancer subtype classification system based on multi-view deep learning

Country Status (1)

Country Link
CN (1) CN113850328A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI805290B (en) * 2022-03-28 2023-06-11 臺北醫學大學 Method for predicting whether lung adenocarcinoma has epidermal growth factor receptor mutations
CN115861303A (en) * 2023-02-16 2023-03-28 四川大学 EGFR gene mutation detection method and system based on lung CT image
CN115861303B (en) * 2023-02-16 2023-04-28 四川大学 EGFR gene mutation detection method and system based on lung CT image
CN116468690A (en) * 2023-04-17 2023-07-21 北京透彻未来科技有限公司 Subtype analysis system of invasive non-mucous lung adenocarcinoma based on deep learning
CN116468690B (en) * 2023-04-17 2023-11-14 北京透彻未来科技有限公司 Subtype analysis system of invasive non-mucous lung adenocarcinoma based on deep learning
CN116825363A (en) * 2023-08-29 2023-09-29 济南市人民医院 Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network
CN116825363B (en) * 2023-08-29 2023-12-12 济南市人民医院 Early lung adenocarcinoma pathological type prediction system based on fusion deep learning network

Similar Documents

Publication Publication Date Title
Murugesan et al. A hybrid deep learning model for effective segmentation and classification of lung nodules from CT images
CN110996789B (en) Systems and methods for performing screening, diagnostic, or other image-based analysis tasks
CN107103187B (en) Lung nodule detection grading and management method and system based on deep learning
CN108805209B (en) Lung nodule screening method based on deep learning
CN113850328A (en) Non-small cell lung cancer subtype classification system based on multi-view deep learning
CN108898595B (en) Construction method and application of positioning model of focus region in chest image
John et al. Multilevel thresholding based segmentation and feature extraction for pulmonary nodule detection
CA2827742C (en) Method and apparatus for isolating a potential anomaly in imaging data and its application to medical imagery
CN111402254B (en) CT image lung nodule high-performance automatic detection method and device
CN106056596B (en) Full-automatic three-dimensional liver segmentation method based on local prior information and convex optimization
Tan et al. Analysis of segmentation of lung parenchyma based on deep learning methods
CN102324109A (en) Method for three-dimensionally segmenting insubstantial pulmonary nodule based on fuzzy membership model
Dey et al. Hybrid cascaded neural network for liver lesion segmentation
Jony et al. Detection of lung cancer from CT scan images using GLCM and SVM
CN110706225A (en) Tumor identification system based on artificial intelligence
Ghantasala et al. Texture recognization and image smoothing for microcalcification and mass detection in abnormal region
CN112184684A (en) Improved YOLO-v3 algorithm and application thereof in lung nodule detection
Jaffar et al. GA and morphology based automated segmentation of lungs from Ct scan images
Mastouri et al. A morphological operation-based approach for Sub-pleural lung nodule detection from CT images
Shankara et al. Artificial neural network for lung cancer detection using CT images
Dabade et al. A review paper on computer aided system for lung cancer detection
Tong et al. Computer-aided lung nodule detection based on CT images
Shaziya et al. Comprehensive review of automatic lung segmentation techniques on pulmonary CT images
Alamin et al. Improved framework for breast cancer detection using hybrid feature extraction technique and ffnn
Punithavathi et al. Detection of breast lesion using improved GLCM feature based extraction in mammogram images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination