CN112215799A - Automatic classification method and system for ground-glass lung nodules - Google Patents

Automatic classification method and system for ground-glass lung nodules

Info

Publication number
CN112215799A
Authority
CN
China
Prior art keywords
lung
image
lung nodule
dimensional
nodule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010961293.5A
Other languages
Chinese (zh)
Inventor
万涛
张宁民
秦曾昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202010961293.5A
Publication of CN112215799A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30061 Lung
    • G06T2207/30064 Lung nodule

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method and a system for automatically classifying ground-glass lung nodules. Chest CT images are processed and analyzed using image processing, feature engineering, and machine learning so that ground-glass lung nodules representing micro-invasive adenocarcinoma of the lung are effectively distinguished from those representing invasive adenocarcinoma, and the classification performance is evaluated quantitatively. The method fuses multiple image features that reflect the radiological appearance of lung nodules, enabling automatic classification of ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma; it provides auxiliary information for physicians, supports quantitative analysis, and improves working efficiency. In addition, numerical analysis of the results provides clinical guidance on the feasibility of the ground-glass lung nodule classification method.

Description

Automatic classification method and system for ground-glass lung nodules
Technical Field
The invention relates to the technical field of image processing and algorithms, and in particular to a method and a system for automatically classifying ground-glass lung nodules.
Background
Lung cancer is a lung disease with rising incidence and mortality, and it is the leading cause of cancer death in men and women worldwide. Lung nodules are one of the major signs of early lung cancer and can serve as a useful marker for lung cancer assessment. A lung nodule is a round or roughly round focal opacity in the lung less than 3 centimeters in diameter. Lung nodules can occur anywhere within the lung, and they differ in morphology, size, density, and margin. Ground-glass nodules (GGNs) are a special type of small lung nodule that appears on high-resolution computed tomography (CT) as a hazy, cloud-like area of slightly increased density resembling frosted glass. Clinical studies indicate that persistent focal GGNs in the lung are more likely to be early lung adenocarcinomas or precancerous lesions. Early detection and timely treatment of GGNs are therefore important for the early diagnosis and prognosis of lung cancer.
According to clinical classification criteria, lung adenocarcinomas are divided into pre-invasive lesions, micro-invasive (minimally invasive) adenocarcinoma (MIA), and invasive adenocarcinoma (IA). Invasiveness increases progressively across these subtypes, their prognoses differ markedly, and the clinical treatment of early lung adenocarcinoma subtypes that present as GGNs varies greatly. Accurately distinguishing ground-glass lung nodules representing micro-invasive adenocarcinoma from those representing invasive adenocarcinoma therefore provides a basis for choosing the correct treatment plan and has important clinical value for improving patient prognosis.
Because GGNs are found in a variety of lung lesions, their causes are complex, and the treatment and prognosis of benign and malignant GGNs differ considerably, distinguishing benign GGN lesions from the various subtypes of early lung adenocarcinoma is a major challenge for clinicians. At present, benign and malignant GGNs are identified mainly by imaging. In recent years, machine-learning-based image classification algorithms have developed well across different classification tasks and have achieved good results on medical image classification in particular, with supervised learning remaining the mainstream approach. Random forest (RF) is a common machine learning classification algorithm that performs well on classification problems. For an RF to classify well, the feature types must be designed according to the characteristics of the data, and the features must be extracted and supplied to the RF classifier.
Gray-scale features are among the most basic features of a lung nodule. In addition, although the shape of a lung nodule is not fixed, benign and malignant nodules differ considerably in morphology, so morphological features are also important image features. Texture features are likewise commonly used to distinguish benign from malignant lung nodules in finer detail. Because lung nodules are diverse and complex, a single feature often cannot characterize them fully; to quantify a nodule comprehensively, different types of features need to be extracted, mainly gray-scale, morphological, and texture features. The classification and diagnosis of nodule properties rely heavily on this feature extraction. A lung nodule is a three-dimensional entity: features extracted from a two-dimensional slice cannot completely express its spatial characteristics, whereas three-dimensional features describe the nodule more comprehensively and thereby improve the accuracy of benign/malignant classification. Volume rendering can reconstruct two-dimensional lung nodule images into a three-dimensional image, which helps physicians inspect and assess the nodule visually and supports subsequent feature extraction. The invention combines image processing, feature engineering, and machine learning to provide an automatic classification method based on chest CT images that accurately identifies ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma of the lung.
Disclosure of Invention
To solve the classification problem in existing image processing technology, embodiments of the present disclosure provide a method and a system for automatically classifying ground-glass lung nodules, which automatically classify ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma of the lung and improve the working efficiency of radiologists. The method mainly comprises four modules: data acquisition and lung nodule region extraction; three-dimensional modeling; feature extraction and feature selection; and random forest classifier training and performance evaluation. A lung nodule region is extracted from a chest CT image labeled by a clinician, the extracted region is modeled in three dimensions, and three-dimensional image features of the lung nodule are obtained from the model. The features retained by the maximum-relevance minimum-redundancy (mRMR) feature selection method are input into a random forest classifier, and the classifier is trained to classify lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma. By extracting the lung nodule region from the chest CT image, modeling it in three dimensions, and extracting lung nodule image features, the method completes accurate classification of ground-glass lung nodules efficiently. A numerical analysis method for lung nodule classification accuracy is also established, providing clinical guidance on the feasibility of the classification algorithm.
In a first aspect, embodiments of the present disclosure provide a method and a system for automatically classifying ground-glass lung nodules, comprising the following steps: acquiring and manually annotating high-resolution chest CT images; performing lung nodule region extraction and then three-dimensional modeling on the acquired chest CT images; extracting three-dimensional lung nodule features from the images obtained after region extraction and three-dimensional modeling, and performing feature selection on the extracted features; and inputting the selected features into a random forest classifier for training, designing an evaluation system that assesses the classification of ground-glass lung nodules as micro-invasive adenocarcinoma or invasive adenocarcinoma, and analyzing the clinical feasibility of the method.
In one example, acquiring and manually annotating the high-resolution chest CT images includes: screening high-resolution chest CT images that meet the requirements from a hospital picture archiving and communication system (PACS), and having experienced radiologists label the lung nodule region in each chest CT image.
In one example, performing lung nodule region extraction on the acquired chest CT image includes: exploiting the clear difference in gray value on the CT image between the edge of the lung nodule region marked by the radiologist and its surrounding region, and obtaining the lung nodule region image using edge detection and morphological image processing.
In one example, performing three-dimensional modeling on the acquired chest CT image includes: for the lung nodule region images of the two-dimensional slices, modeling the lung nodule in three dimensions using a volume-rendering-based multi-surface display method for three-dimensional data fields to obtain a three-dimensional model of the lung nodule. First, boundary voxels of the lung nodule image are extracted using gray-gradient weighting, opacity is assigned to the boundary voxels, and the luminance compositing calculation is performed. Next, each boundary voxel is treated as a mixture of different materials, and the intersection of the viewing ray with the isosurface inside the voxel is calculated by trilinear interpolation. Finally, to improve the display quality of the three-dimensional lung nodule, the illumination effect is calculated from the normal vector at the intersection point.
In one example, performing three-dimensional lung nodule feature extraction on the image obtained after lung nodule region extraction and three-dimensional modeling includes: extracting image features from the three-dimensional lung nodule model to obtain morphological, gray-scale, and texture features of the three-dimensional lung nodule. The morphological features include surface area, volume, surface-area-to-volume ratio, major axis length, and minor axis length; the gray-scale features include gray-level mean, gray-level variance, kurtosis, and skewness; the texture features include statistics derived from the three-dimensional local binary pattern, the gray-level co-occurrence matrix, and the gray-level run-length matrix.
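As an illustration of the morphological features listed above, the following minimal sketch computes volume and surface area from a binary voxel mask by counting voxels and exposed voxel faces. The function name `volume_and_surface`, the `[z, y, x]` array layout, and the face-counting estimate of surface area are assumptions made for this example; the source does not state how these quantities are measured.

```python
import numpy as np

def volume_and_surface(mask, spacing=(1.0, 1.0, 1.0)):
    """Return (volume, surface_area) of a binary 3-D mask indexed [z, y, x].

    spacing is the voxel size along (z, y, x); both names are hypothetical.
    """
    mask = np.asarray(mask, dtype=bool)
    sz, sy, sx = spacing
    volume = mask.sum() * sz * sy * sx
    # Pad with background so faces on the array border are counted too.
    padded = np.pad(mask, 1, constant_values=False).astype(np.int8)
    # Area of a voxel face perpendicular to the z, y and x axes respectively.
    face_area = (sy * sx, sz * sx, sz * sy)
    surface = 0.0
    for axis, area in enumerate(face_area):
        # A face is exposed wherever neighbouring voxels along this axis differ.
        exposed = np.diff(padded, axis=axis) != 0
        surface += exposed.sum() * area
    return float(volume), float(surface)
```

The surface-area-to-volume ratio then follows as `surface / volume`. Face counting overestimates the area of smooth surfaces, so a mesh-based estimate may be preferable in practice.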
In one example, performing feature selection on the extracted features includes: applying the mRMR feature selection method, which considers both the correlation between each feature and the label and the correlation among the features themselves. Using mutual information (MI) as the measure, it finds the group of features in the original feature set that are most relevant to the final output but least correlated with one another. The main steps are: to compute the joint and marginal distributions of the features, normalize the data to the range [0, 255] and store each feature dimension in a suitable data structure; compute the distributions and the mutual information among features and between each feature and the response variable; and score the features by maximum relevance and minimum redundancy, sort them by score, and select the top-ranked group as the important features.
In one example, inputting the selected features into a random forest classifier for training includes: training the random forest classifier with k-fold cross-validation, where the random forest classification algorithm proceeds as follows:
N denotes the number of training samples, and M denotes the number of features;
a feature count m, with m << M, is specified to determine how many features are considered when deciding the split at a node of a decision tree;
N samples are drawn with replacement from the N training samples to form a training set, and the samples that were not drawn are used for prediction to estimate the error;
for each node, m features are selected at random, and the best split of the node is computed from this feature subset;
each decision tree is grown fully, yielding the complete tree classifier.
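The two sampling steps of the flow above can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: the function names `bootstrap_split` and `candidate_features` are hypothetical, and tree growing itself is omitted.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    # Samples never drawn are held out to estimate the error of this tree.
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag

def candidate_features(n_features, m, rng):
    """Pick m distinct feature indices (m << n_features) for one node split."""
    return rng.sample(range(n_features), m)

rng = random.Random(0)
in_bag, out_of_bag = bootstrap_split(20, rng)
split_candidates = candidate_features(30, 5, rng)
```

Each tree would be trained on its own `in_bag` indices, and a fresh call to `candidate_features` would be made at every node.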
In one embodiment, the method further comprises: designing an evaluation system for the classification accuracy of ground-glass lung nodules based on the classification results for nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma of the lung, and evaluating the training result by accuracy, sensitivity, specificity, and the receiver operating characteristic (ROC) curve, thereby completing the clinical feasibility analysis of the radiomics-feature-based ground-glass lung nodule classification method.
According to the automatic classification method and system for ground-glass lung nodules, a lung nodule region is extracted from a chest CT image marked by a clinician, the extracted region is modeled in three dimensions, three-dimensional features of the lung nodule are obtained from the model, the features retained by mRMR-based feature selection are input into a random forest classifier, and the classifier is trained to classify lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma. By extracting the lung nodule region from the chest CT image, modeling it in three dimensions, and extracting three-dimensional image features, the method completes accurate classification of ground-glass lung nodules efficiently. A numerical analysis method for lung nodule classification accuracy is also established, providing clinical guidance on the feasibility of the classification algorithm. The method enables accurate automatic classification of ground-glass lung nodules, assists physicians in the quantitative and qualitative analysis of lung nodules, reduces workload, and improves accuracy.
In a second aspect, embodiments of the present invention provide a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the above classification method for ground-glass lung nodules.
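The threshold-based measures named above can be computed directly from the confusion counts. The sketch below assumes binary labels with 1 as the positive class (invasive adenocarcinoma) and 0 as the negative class (micro-invasive adenocarcinoma); the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Fraction of positive cases that were detected.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Fraction of negative cases that were correctly rejected.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }

metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 0])
```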
In a third aspect, an embodiment of the present invention provides a computer program product containing instructions, which when run on a computer, causes the computer to perform the method according to the first aspect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments are briefly introduced as follows:
FIG. 1 is a schematic flow chart illustrating steps of a method and system for automatically classifying ground-glass lung nodules according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating steps of a method and system for automatically classifying ground-glass lung nodules according to another embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the principles of three-dimensional modeling of lung nodules in an embodiment of the present invention.
Detailed Description
The present application will now be described in further detail with reference to the accompanying drawings and examples.
In the following description, the terms "first" and "second" are used for descriptive purposes only and are not intended to indicate or imply relative importance. The following description provides embodiments of the disclosure, which may be combined or substituted for various embodiments, and this application is therefore intended to cover all possible combinations of the same and/or different embodiments described. Thus, if one embodiment includes feature A, B, C and another embodiment includes feature B, D, then this application should also be considered to include an embodiment that includes one or more of all other possible combinations of A, B, C, D, even though this embodiment may not be explicitly recited in text below.
In order to make the objects, technical solutions and advantages of the present invention more clearly understood, the following description will be made in detail by way of embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Fig. 1 is a schematic flow chart of a method and a system for automatically classifying ground-glass lung nodules in an embodiment. The method specifically comprises the following steps:
In step 101, data acquisition and lung nodule labeling are performed on high-resolution chest CT images.
Specifically, data acquisition and lung nodule labeling for the high-resolution chest CT images include: screening high-resolution chest CT images that meet the requirements from a hospital picture archiving and communication system, and having experienced radiologists label the lung nodule region in each chest CT image.
Further, extracting the lung nodule region from the acquired high-resolution chest CT image includes: exploiting the clear difference in gray value on the CT image between the edge of the lung nodule region marked by the radiologist and its surrounding region, and obtaining the lung nodule region image using edge detection and morphological image processing.
In step 102, lung nodule three-dimensional modeling is performed on the extracted lung nodule region image.
Specifically, performing three-dimensional lung nodule modeling on the extracted lung nodule region image includes: first, extracting the boundary voxels of the three-dimensional reconstructed image using gray-gradient weighting, assigning opacity to the boundary voxels, and performing the luminance compositing calculation. Next, each boundary voxel is treated as a mixture of different materials, and the intersection of the viewing ray with the isosurface inside the voxel is calculated by trilinear interpolation. To improve the display quality of the image, the illumination effect is calculated from the normal vector at the intersection point, and the final image is displayed by projection imaging.
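The interpolation step can be illustrated with a short sketch. It assumes a volume indexed `[z, y, x]`, ray positions given as `(x, y, z)`, and a fixed-step march along the ray with linear refinement between the two samples that bracket the isovalue; the function names, the step size, and the marching scheme are assumptions of this example, not details taken from the source.

```python
import numpy as np

def _cell(coord, size):
    """Index of the lower corner of the cell containing coord, kept in range."""
    return min(max(int(np.floor(coord)), 0), size - 2)

def trilinear(volume, x, y, z):
    """Interpolate a volume indexed [z, y, x] at a fractional position."""
    z0 = _cell(z, volume.shape[0])
    y0 = _cell(y, volume.shape[1])
    x0 = _cell(x, volume.shape[2])
    dz, dy, dx = z - z0, y - y0, x - x0
    c = volume[z0:z0 + 2, y0:y0 + 2, x0:x0 + 2].astype(float)
    c = c[0] * (1 - dz) + c[1] * dz   # collapse z, leaving a (y, x) square
    c = c[0] * (1 - dy) + c[1] * dy   # collapse y, leaving an x segment
    return float(c[0] * (1 - dx) + c[1] * dx)

def first_crossing(volume, origin, direction, iso, step=0.5, n_steps=100):
    """March along a ray and return the first point where the value equals iso."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    upper = np.array(volume.shape[::-1]) - 1      # largest valid (x, y, z)
    prev_t, prev_v = 0.0, trilinear(volume, *origin)
    for k in range(1, n_steps + 1):
        t = k * step
        p = origin + t * direction
        if np.any(p < 0) or np.any(p > upper):
            break                                  # the ray left the volume
        v = trilinear(volume, *p)
        if v != prev_v and (prev_v - iso) * (v - iso) <= 0:
            # Linear refinement between the two bracketing samples.
            frac = (iso - prev_v) / (v - prev_v)
            return origin + (prev_t + frac * step) * direction
        prev_t, prev_v = t, v
    return None
```

A call such as `first_crossing(volume, (0.0, 0.5, 0.5), (1.0, 0.0, 0.0), iso=2.3)` returns the `(x, y, z)` position of the first crossing, or `None` if the ray leaves the volume without meeting the isovalue.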
In step 103, lung nodule feature extraction is performed on the lung nodule region CT image and the lung nodule three-dimensional model.
Specifically, performing lung nodule feature extraction on the lung nodule region CT image and the three-dimensional lung nodule model includes: taking the three-dimensional lung nodule model of each patient as a sample and extracting different types of image features, including morphological, gray-scale, and texture features, to quantify the lung nodule comprehensively. The morphological features include surface area, volume, surface-area-to-volume ratio, major axis length, and minor axis length; the gray-scale features include gray-level mean, gray-level variance, kurtosis, and skewness; the texture features include statistics derived from the three-dimensional local binary pattern, the gray-level co-occurrence matrix, and the gray-level run-length matrix.
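The four gray-scale statistics can be computed over the voxels inside the nodule mask as in the sketch below. It assumes population (not sample) moments, takes the third-moment statistic to be skewness, and reports non-excess kurtosis; these conventions and the function name `gray_statistics` are assumptions, since the source does not specify them.

```python
import numpy as np

def gray_statistics(volume, mask):
    """First-order gray-level statistics of the voxels selected by mask."""
    values = np.asarray(volume, dtype=float)[np.asarray(mask, dtype=bool)]
    mean = values.mean()
    variance = values.var()            # population variance
    std = np.sqrt(variance)
    if std == 0:
        skewness = kurtosis = 0.0      # undefined for a constant region
    else:
        z = (values - mean) / std
        skewness = float((z ** 3).mean())
        kurtosis = float((z ** 4).mean())   # 3.0 for a normal distribution
    return {"mean": float(mean), "variance": float(variance),
            "skewness": skewness, "kurtosis": kurtosis}
```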
In step 104, the mRMR feature selection method is applied to the extracted features, a feature matrix is constructed for modeling, and labels are assigned to the training samples.
Specifically, applying the mRMR method to the extracted features and constructing the feature matrix for modeling includes: to compute the joint and marginal distributions of the features, normalizing the data to the range [0, 255] and storing each feature dimension in a suitable data structure; computing the distributions and the mutual information among features and between each feature and the response variable; and scoring the features by maximum relevance and minimum redundancy and ranking them by score. The method considers both the correlation between each feature and the classification result and the correlation among the features. During selection, mutual information is used as the measure to find the group of features in the original feature set that are most relevant to the final output but least correlated with one another, and these features serve as the input feature vector of the classification method.
Further, assigning labels to the training samples includes: each row of the two-dimensional feature matrix obtained by the mRMR method is the two-dimensional feature representation of one lung nodule CT image, so a label is assigned to each row of the two-dimensional feature matrix, that is, to each lung nodule CT image. Likewise, each row of the three-dimensional feature matrix obtained by the mRMR method is the three-dimensional feature representation of the three-dimensional reconstructed lung nodule CT image of one patient. In both cases, ground-glass lung nodules representing micro-invasive adenocarcinoma of the lung are labeled 0 and those representing invasive adenocarcinoma are labeled 1; the labels guide classifier training and are used to evaluate the classification results.
In step 105, the training samples are input into a random forest classifier for training and the performance of the classifier is evaluated, so that ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma are classified.
Specifically, inputting the training samples into a random forest classifier for training and evaluating the performance of the classifier includes: inputting the feature matrix of three-dimensional features into the random forest classifier, training it with ten-fold cross-validation, and using a grid search algorithm during training to select the optimal parameters automatically so that the lung nodules are classified accurately. After training, the result is evaluated by accuracy, sensitivity, specificity, and the ROC curve, and the trained classifier is used to classify ground-glass lung nodules as representing micro-invasive adenocarcinoma or invasive adenocarcinoma.
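The normalization and mutual information steps can be sketched as follows. The example assumes min-max scaling to integer levels 0 to 255 and a plug-in histogram estimate of mutual information in bits; the function names `quantize` and `mutual_information` are hypothetical.

```python
import numpy as np

def quantize(values, levels=256):
    """Min-max scale a feature to integer levels 0 .. levels - 1."""
    v = np.asarray(values, dtype=float).ravel()
    span = v.max() - v.min()
    if span == 0:
        return np.zeros(v.shape, dtype=int)
    return np.round((v - v.min()) / span * (levels - 1)).astype(int)

def mutual_information(a, b):
    """Mutual information (bits) between two integer-coded 1-D arrays."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1)              # joint histogram
    joint /= joint.sum()                       # joint distribution
    pa = joint.sum(axis=1, keepdims=True)      # marginal of a
    pb = joint.sum(axis=0, keepdims=True)      # marginal of b
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())
```

A feature column `f` and the label vector `y` would be compared with `mutual_information(quantize(f), y)`. With few samples, 256 levels leave most histogram cells empty, so the estimate is biased upward and a coarser quantization may be worth considering.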
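A ten-fold split of the labeled samples can be produced as in the sketch below. Stratifying the folds by class is an assumption added here so that each fold keeps roughly the same proportion of the two nodule types; the source only states that ten-fold cross-validation is used. The function names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Partition sample indices into k folds with balanced class counts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the shuffled members of each class round-robin into the folds.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds

def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    folds = stratified_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

In a grid search, each candidate parameter setting would be scored by averaging the classifier's performance over these ten splits.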
For a clearer understanding and application of the chest CT image processing method proposed by the present invention, the following example is given. It should be noted that the scope of the present invention is not limited to the following example.
Fig. 2 is a schematic flow chart illustrating steps of a method and a system for automatically classifying ground-glass lung nodules according to another embodiment of the present invention, which specifically includes:
Step 201, data labeling and processing. The lung nodule region is first labeled manually on the high-resolution chest CT image; the clear difference in gray value between the edge of the lung nodule region and the non-nodule region is then used to extract the lung nodule region automatically by edge detection and morphological image processing; finally, the CT images of the lung nodule region are modeled in three dimensions.
Specifically, based on step 201, the lung nodule region is labeled as follows: the lung nodule regions of the high-resolution chest CT images are manually labeled by experienced radiologists.
In addition, based on step 201, the lung nodule region is extracted as follows: because the radiologist marks the lung nodule region on the high-resolution chest CT image with a green boundary, the annotated RGB image is first converted into a gray image, and the converted image is subtracted from the original image (that is, the chest CT image without the lung nodule annotation) to obtain the closed boundary of the lung nodule region. A hole-filling operation inside this closed boundary then yields the mask image of the lung nodule region. Finally, the mask image is multiplied by the original image to obtain the lung nodule region, in which the gray value of the background is set to 0 and the original gray values inside the lung nodule region are retained.
Further, based on step 201, the CT images of the lung nodule region are modeled in three dimensions as follows (Fig. 3 is a schematic diagram of the principle of three-dimensional lung nodule modeling in an embodiment of the present invention): gray-gradient weighting is used to extract the boundary voxels of the three-dimensional reconstructed image; the boundary voxels are assigned corresponding opacities and the luminance compositing calculation is performed; each boundary voxel is treated as a mixture of different materials, and direction-dependent trilinear interpolation is used to calculate the intersection of the viewing ray with the isosurface inside the voxel; the illumination effect is calculated from the normal vector at the intersection point to improve the quality of the displayed image; and the final three-dimensional lung nodule model is displayed by projection imaging.
Step 202, feature extraction and selection. Three-dimensional image features of the lung nodule are extracted from the reconstructed lung nodule, and the mRMR method is then used to select effective features and construct the feature matrix.
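The extraction steps above (difference image, hole filling, masking) can be sketched as follows. The example assumes that the annotated image and the original slice are aligned arrays of the same size, uses BT.601 luma weights for the gray conversion, and fills holes by flood-filling the background from the image border; the function names, the weights, and the tolerance `tol` are assumptions of this example.

```python
from collections import deque
import numpy as np

def rgb_to_gray(rgb):
    """Convert an RGB image to gray using BT.601 luma weights (assumed)."""
    return rgb[..., :3].astype(float) @ np.array([0.299, 0.587, 0.114])

def fill_holes(boundary):
    """Return a mask of the boundary pixels plus everything they enclose."""
    h, w = boundary.shape
    outside = np.zeros((h, w), dtype=bool)
    queue = deque((r, c) for r in range(h) for c in range(w)
                  if (r in (0, h - 1) or c in (0, w - 1)) and not boundary[r, c])
    for r, c in queue:
        outside[r, c] = True
    # Flood-fill the 4-connected background reachable from the image border.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w
                    and not boundary[nr, nc] and not outside[nr, nc]):
                outside[nr, nc] = True
                queue.append((nr, nc))
    return ~outside

def extract_nodule(annotated_rgb, original_gray, tol=1.0):
    """Return (nodule_region, mask) from an annotated and an original slice."""
    diff = np.abs(rgb_to_gray(annotated_rgb) - original_gray.astype(float))
    boundary = diff > tol            # pixels changed by the drawn contour
    mask = fill_holes(boundary)
    return original_gray * mask, mask
```

Pixels outside the mask come out as 0 and pixels inside keep their original gray values, matching the convention stated above. Note that the mask includes the contour pixels themselves.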
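The opacity assignment and compositing calculation can be illustrated with a small sketch. It assumes that opacity is proportional to the normalized gray-gradient magnitude and that samples along a viewing ray are combined front to back; both are common volume-rendering choices adopted here as assumptions, and the function names are hypothetical.

```python
import numpy as np

def gradient_opacity(volume, scale=1.0):
    """Opacity per voxel, proportional to normalized gradient magnitude."""
    gz, gy, gx = np.gradient(np.asarray(volume, dtype=float))
    magnitude = np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)
    peak = magnitude.max()
    if peak == 0:
        return np.zeros_like(magnitude)    # no boundaries in a flat volume
    return np.clip(scale * magnitude / peak, 0.0, 1.0)

def composite_ray(intensities, opacities, cutoff=1e-4):
    """Front-to-back compositing of the samples along one viewing ray."""
    result, transmittance = 0.0, 1.0
    for value, alpha in zip(intensities, opacities):
        result += transmittance * alpha * value
        transmittance *= 1.0 - alpha
        if transmittance < cutoff:
            break                          # later samples are hidden
    return result, 1.0 - transmittance
```

Voxels in homogeneous regions receive zero opacity, so only boundary voxels contribute to the rendered image.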
Specifically, based on step 202, the specific steps of extracting the three-dimensional image features of the lung nodules are as follows: the extraction of the three-dimensional features of the lung nodules is based on the reconstructed three-dimensional models of the lung nodules, and the gray features of the lung nodules are extracted and comprise a gray mean value, a gray variance, a tortuosity and a peak value. And extracting three-dimensional morphological characteristics of the lung nodule, including surface area, volume, surface area-volume ratio, compactness, rectangularity and the like. And extracting three-dimensional texture features of the lung nodules, including relevant statistics of the gray level co-occurrence matrix and the gray level run-length matrix as the lung nodules. Because the reconstructed lung nodules are represented as a group of three-dimensional data in a computer, in order to comprehensively measure the texture features of the lung nodules, a gray level co-occurrence matrix and a gray level run-length matrix in 13 directions in a lung nodule three-dimensional model need to be calculated, relevant statistics of the matrix are calculated on the basis, and finally, the average value of each statistic is calculated to serve as the three-dimensional texture features of the lung nodules. The calculation method of the gray level co-occurrence matrix comprises the following steps: the gray level co-occurrence matrixes in 13 directions are obtained through calculation, the characteristics of energy, entropy, contrast and the like of each matrix are calculated, and then the average value of each characteristic is obtained to serve as a part of the three-dimensional texture characteristic of the lung nodule. 
The gray-level run-length matrix is computed in a similar way: the run-length matrices in the 13 directions are computed; statistics such as short-run emphasis, long-run emphasis, low-gray-level run emphasis, high-gray-level run emphasis, short-run low-gray-level emphasis, and short-run high-gray-level emphasis are computed for each matrix; and the mean of each statistic is taken as part of the three-dimensional texture features of the lung nodule.
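The run-length computation can be sketched in the same spirit. The Python example below counts runs of equal gray level along one direction, derives the six statistics named above using their commonly used textbook definitions, and averages them over the 13 directions. It assumes an integer-quantised volume, a non-empty boolean nodule mask, and 1-based gray levels in the emphasis weights; all names are hypothetical, and the loop-based implementation favours clarity over speed.

```python
import itertools
import numpy as np

def run_length_matrix(volume, mask, offset, levels):
    """Rows index the gray level, columns index (run length - 1) along the given direction."""
    shape = volume.shape
    rlm = np.zeros((levels, max(shape)), dtype=float)
    offset = np.array(offset)

    def inside(p):
        # True when p lies inside the volume and inside the nodule mask.
        return all(0 <= c < n for c, n in zip(p, shape)) and mask[tuple(p)]

    for start in zip(*np.nonzero(mask)):
        start = np.array(start)
        g = volume[tuple(start)]
        prev = start - offset
        # Skip voxels that continue a run begun earlier along this direction.
        if inside(prev) and volume[tuple(prev)] == g:
            continue
        length, nxt = 1, start + offset
        while inside(nxt) and volume[tuple(nxt)] == g:
            length += 1
            nxt = nxt + offset
        rlm[g, length - 1] += 1
    return rlm

def run_length_stats(rlm):
    """Six run-length statistics; assumes the matrix contains at least one run."""
    n_runs = rlm.sum()
    g = np.arange(1, rlm.shape[0] + 1)[:, None]  # gray levels, 1-based to avoid dividing by zero
    r = np.arange(1, rlm.shape[1] + 1)[None, :]  # run lengths
    return {
        "short_run_emphasis": (rlm / r**2).sum() / n_runs,
        "long_run_emphasis": (rlm * r**2).sum() / n_runs,
        "low_gray_level_run_emphasis": (rlm / g**2).sum() / n_runs,
        "high_gray_level_run_emphasis": (rlm * g**2).sum() / n_runs,
        "short_run_low_gray_level_emphasis": (rlm / (r**2 * g**2)).sum() / n_runs,
        "short_run_high_gray_level_emphasis": (rlm * g**2 / r**2).sum() / n_runs,
    }

def glrlm_texture_features(volume, mask, levels):
    """Average each statistic over the 13 unique 3D directions."""
    dirs = [d for d in itertools.product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]
    per_dir = [run_length_stats(run_length_matrix(volume, mask, d, levels)) for d in dirs]
    return {k: float(np.mean([s[k] for s in per_dir])) for k in per_dir[0]}
```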
Further, in step 202, effective features are selected and the feature matrix is constructed with the mRMR method as follows. Using mutual information as the measure, a set of features is sought in the original feature set that has maximum relevance to the final output but minimum redundancy among its members. The relevance D of the feature set S to the class c is defined as the mean of the mutual information values I between each feature f_i and the class c:

$$D(S, c) = \frac{1}{|S|} \sum_{f_i \in S} I(f_i; c)$$

The redundancy R of the features in the feature set S is defined as the mean of the mutual information values between every pair of features f_i and f_j:

$$R(S) = \frac{1}{|S|^2} \sum_{f_i, f_j \in S} I(f_i; f_j)$$
The mRMR criterion combines D and R, as defined below:

$$\max_{S} \Phi(D, R), \qquad \Phi = D - R$$
In practice, an incremental search algorithm is used to search the feature subsets and obtain a near-optimal solution of the mRMR criterion.
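A minimal sketch of such an incremental search is given below. It assumes that the features have already been discretised into integer bins, that mutual information is estimated from simple co-occurrence counts, and that relevance and redundancy are combined as a difference (relevance minus mean redundancy with the features already selected). The function names and the choice of the difference form are assumptions made for illustration.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in bits) between two discrete 1-D arrays."""
    xs, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    ys, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xs.size, ys.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)   # joint histogram
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x
    py = joint.sum(axis=0, keepdims=True)   # marginal of y
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def mrmr_select(X, y, k):
    """Greedy mRMR: start with the most relevant feature, then repeatedly add the
    feature that maximises relevance minus mean redundancy with the chosen set."""
    n_features = X.shape[1]
    relevance = np.array([mutual_information(X[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean([mutual_information(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data: feature 1 duplicates feature 0, feature 2 is unrelated to feature 0.
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X_toy = np.column_stack([
    [0, 0, 0, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
])
chosen = mrmr_select(X_toy, y_toy, 2)
```

On the toy data the duplicated feature is penalised for its redundancy, so the search picks feature 0 and then feature 2 rather than the duplicate.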
Step 203, classifier training and classification. The feature matrix of three-dimensional features is input into a random forest classifier, which is trained with ten-fold cross-validation; during training, a grid search algorithm automatically selects the optimal classifier parameters to improve classification performance. After training, the performance of the classifier is evaluated by accuracy, sensitivity, specificity, and the ROC curve, and the trained model is used to classify ground-glass lung nodules as representing micro-invasive adenocarcinoma or invasive adenocarcinoma.
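The training procedure could be sketched with scikit-learn as follows. The patent does not name a library, a parameter grid, or a selection score, so the use of `GridSearchCV`, the two-parameter grid, the ROC AUC scoring, and the synthetic feature matrix are all assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

def train_random_forest(X, y, seed=0):
    """Grid search over a small, hypothetical parameter grid with ten-fold cross-validation."""
    param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}  # illustrative grid only
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid,
        cv=cv,
        scoring="roc_auc",  # assumed selection criterion
    )
    search.fit(X, y)
    return search

# Synthetic stand-in for the selected feature matrix: 0 = micro-invasive, 1 = invasive.
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1], 50)
X_demo = rng.normal(size=(100, 5)) + 1.5 * y_demo[:, None]

search = train_random_forest(X_demo, y_demo)
print(search.best_params_, round(search.best_score_, 3))
```

Stratified folds keep the two classes balanced in every fold, which seems a reasonable default when the number of nodules per class is small.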
Specifically, the performance of the classifier is evaluated as follows. After training and testing of the random forest classifier are complete, the confusion matrix of the samples is obtained and the ROC curve is computed; recall and precision are calculated from the confusion matrix; and recall, precision, and the area under the ROC curve are taken as the indices of classifier performance. A lung nodule representing invasive adenocarcinoma that is classified correctly is a true positive (TP), and one that is misclassified is a false negative (FN); a lung nodule representing micro-invasive adenocarcinoma that is classified correctly is a true negative (TN), and one that is misclassified is a false positive (FP). Recall and precision are calculated as follows:

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
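For concreteness, the evaluation indices can be computed as in the short Python sketch below. It takes invasive adenocarcinoma (label 1) as the positive class and uses the usual confusion-matrix definitions, in which a misclassified positive is a false negative and a misclassified negative is a false positive. The area under the ROC curve is computed with the rank-based (Mann-Whitney) formulation. The function names are hypothetical, and the sketch assumes both classes occur and at least one sample is predicted positive.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Recall (sensitivity), precision, specificity, and accuracy from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

def roc_auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Small worked example with three invasive (1) and three micro-invasive (0) nodules.
labels = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.4]
metrics = classification_metrics(labels, predicted)
auc = roc_auc(labels, scores)
```

In this example TP = 2, FN = 1, TN = 2, and FP = 1, so recall and precision are both 2/3, and 8 of the 9 positive-negative score pairs are ranked correctly.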
The embodiments of the present disclosure provide a method and a system for automatically classifying ground-glass lung nodules, which automatically classify ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma. The method can be divided into three parts. In the first part, data labeling and processing, a radiologist manually labels the lung nodule region on the high-resolution chest CT image. Because the gray values at the edge of the labeled lesion region differ markedly from those of the non-lesion region, edge points can be extracted by operations among neighboring pixels to obtain a closed, connected boundary, and a lung nodule mask image is then obtained by morphological operations. The lung nodule is then modeled in three dimensions using a volume-rendering-based multi-surface display method for three-dimensional data fields. In the second part, feature extraction and selection, three-dimensional image features are extracted from the lung nodule, and, to deal with redundant features of the lesion, the mRMR method is applied to the extracted features to construct the feature matrix used for modeling. In the third part, training and classification with the random forest classifier, the features selected by the mRMR method are input into the random forest classifier as training samples, with ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma labeled 0 and 1, respectively. The classifier is trained with ten-fold cross-validation, the ground-glass lung nodules are classified, and the performance of the classifier is finally evaluated by precision, recall, and the area under the curve.
An embodiment of the invention also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, carries out the method of Fig. 1 or Fig. 2.
An embodiment of the invention also provides a computer program product containing instructions. When the computer program product runs on a computer, it causes the computer to perform the method of Fig. 1 or Fig. 2 described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and can include the processes of the embodiments of the methods described above when the computer program is executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above embodiments express only several embodiments of the present invention; their description is specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. It is noted, however, that the advantages, effects, and the like mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the specific details disclosed above are provided for the purpose of illustration and description only and are not intended to be limiting; the invention is not limited to those specific details.
The block diagrams of devices, apparatuses, and systems involved in the present invention are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
Also, as used herein, the use of "or" in a list of items beginning with "at least one of" indicates a disjunctive list, e.g., "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. An automatic classification method and system for ground-glass lung nodules, characterized by comprising the following steps:
performing data acquisition and lung nodule labeling on high-resolution chest CT images;
sequentially performing lung nodule region extraction and lung nodule three-dimensional modeling on the acquired chest CT images;
extracting lung nodule image features from the images obtained after the lung nodule region extraction and the lung nodule three-dimensional modeling, and performing a feature selection operation on the features obtained by the feature extraction operation;
and inputting the feature vectors obtained by the feature selection operation into a random forest classifier to train the classifier, designing an evaluation system for evaluating the classification results for micro-invasive adenocarcinoma and invasive adenocarcinoma in ground-glass lung nodules, and performing a clinical feasibility analysis of the method.
2. The method and system for automatically classifying ground-glass lung nodules according to claim 1, wherein the data acquisition for high-resolution chest CT images comprises: screening high-resolution chest CT images that meet the requirements from a hospital picture archiving and communication system (PACS), and having experienced radiologists label the lung nodule region of each chest CT image.
3. The method and system for automatically classifying ground-glass lung nodules according to claim 1, wherein the operation of automatically extracting lung nodule regions from the acquired high-resolution chest CT images comprises: exploiting the fact that, on the CT image, the gray values at the edge of the lung nodule region labeled by the radiologist differ markedly from those of the surrounding region, and obtaining the lung nodule region image by edge detection and morphological image processing.
4. The method and system for automatically classifying ground-glass lung nodules according to claim 1, wherein the three-dimensional modeling operation on the lung nodule image comprises: for the lung nodule region images of the two-dimensional slices, modeling the lung nodule in three dimensions using a volume-rendering-based multi-surface display method for three-dimensional data fields to obtain a three-dimensional model of the lung nodule.
5. The method and system for automatically classifying ground-glass lung nodules according to claim 4, wherein the lung nodule three-dimensional modeling operation comprises: first, extracting boundary voxels of the lung nodule image by a gray-scale weighting method, and performing opacity assignment and photometric synthesis calculation on the boundary voxels; then, treating each boundary voxel as a mixture of different substances, and calculating the intersection point of the viewing direction and the isosurface within the voxel by trilinear interpolation; and finally, to improve the display quality of the three-dimensional lung nodule, calculating the illumination effect from the normal vector at the intersection point.
6. The method and system for automatically classifying ground-glass lung nodules according to claim 1, wherein the image feature extraction operation on the three-dimensional lung nodule model comprises: performing image feature extraction on the three-dimensional lung nodule model obtained by three-dimensional modeling to obtain the morphological, gray-level, and texture features of the three-dimensional lung nodule.
7. The method and system for automatically classifying ground-glass lung nodules according to claim 6, wherein the obtaining of the morphological, gray-level, and texture features of the lung nodule comprises: the extracted morphological features include surface area, volume, major axis length, and minor axis length; the gray-level features comprise the gray mean, gray variance, kurtosis, and skewness; and the texture features comprise statistics derived from the three-dimensional local binary pattern, the gray-level co-occurrence matrix, and the gray-level run-length matrix.
8. The method and system for automatically classifying ground-glass lung nodules according to claim 1, wherein the performing of a feature selection operation on the features obtained by the feature extraction operation comprises: adopting a maximum-relevance minimum-redundancy (mRMR) method, which in the feature selection process considers not only the correlation between the features and the labels but also the correlation among the features; using mutual information (MI) as the measure, finding in the original feature set a group of features that are most relevant to the final output but least correlated with each other; and finally selecting the group of feature vectors with the best result through a cross-validation test.
9. The method and system for automatically classifying ground-glass lung nodules according to claim 1, wherein the step of inputting the feature vectors obtained by the feature selection operation into a random forest classifier to train the classifier comprises: adopting a k-fold cross-validation method, in which the classifier automatically selects the optimal parameters during training, thereby improving classification performance; in the method, the value of k is 10.
10. The method and system for automatically classifying ground-glass lung nodules according to claim 1, further comprising: designing an evaluation system for evaluating the classification accuracy for ground-glass lung nodules by combining the classification results for ground-glass lung nodules representing micro-invasive adenocarcinoma and invasive adenocarcinoma of the lung, and completing the clinical feasibility analysis of the automatic classification method for ground-glass lung nodules.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010961293.5A CN112215799A (en) 2020-09-14 2020-09-14 Automatic classification method and system for grinded glass lung nodules

Publications (1)

Publication Number Publication Date
CN112215799A true CN112215799A (en) 2021-01-12

Family

ID=74049494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010961293.5A Pending CN112215799A (en) 2020-09-14 2020-09-14 Automatic classification method and system for grinded glass lung nodules

Country Status (1)

Country Link
CN (1) CN112215799A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1403057A (en) * 2001-09-13 2003-03-19 田捷 3D Euclidean distance transformation process for soft tissue display in CT image
CN104751178A (en) * 2015-03-31 2015-07-01 上海理工大学 Pulmonary nodule detection device and method based on shape template matching and combining classifier
CN106650830A (en) * 2017-01-06 2017-05-10 西北工业大学 Deep model and shallow model decision fusion-based pulmonary nodule CT image automatic classification method
CN108815721A (en) * 2018-05-18 2018-11-16 山东省肿瘤防治研究院(山东省肿瘤医院) A kind of exposure dose determines method and system
WO2018223066A1 (en) * 2017-06-02 2018-12-06 Veracyte, Inc. Methods and systems for identifying or monitoring lung disease

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767393A (en) * 2021-03-03 2021-05-07 常州市第一人民医院 Machine learning-based bimodal imaging omics ground glass nodule classification method
CN113506289A (en) * 2021-07-28 2021-10-15 中山仰视科技有限公司 Method for classifying false positive of pulmonary nodule by using double-flow network
CN113506289B (en) * 2021-07-28 2024-03-29 中山仰视科技有限公司 Method for classifying false positives of lung nodules by using double-flow network
CN113888519A (en) * 2021-10-14 2022-01-04 四川大学华西医院 Prediction system for predicting lung solid nodule malignancy
CN113888532A (en) * 2021-11-09 2022-01-04 推想医疗科技股份有限公司 Medical image analysis method and device based on flat scanning CT data
CN114266774A (en) * 2022-03-03 2022-04-01 中日友好医院(中日友好临床医学研究所) Method, equipment and system for diagnosing pulmonary embolism based on flat-scan CT image
CN116206756A (en) * 2023-05-06 2023-06-02 中国医学科学院北京协和医院 Lung adenocarcinoma data processing method, system, equipment and computer readable storage medium
CN116206756B (en) * 2023-05-06 2023-10-27 中国医学科学院北京协和医院 Lung adenocarcinoma data processing method, system, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210112