WO2020199692A1 - Method, device and storage medium for screening image features for cancer metastasis prediction - Google Patents

Method, device and storage medium for screening image features for cancer metastasis prediction

Info

Publication number
WO2020199692A1
Authority
WO
WIPO (PCT)
Prior art keywords
image feature
image
random forest
forest classifier
sample set
Prior art date
Application number
PCT/CN2019/130831
Other languages
English (en)
French (fr)
Inventor
赵源深
李志成
梁栋
骆荣辉
刘磊
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2020199692A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30084 Kidney; Renal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30096 Tumor; Lesion

Definitions

  • This application relates to the technical field of medical image processing, and in particular to a method, device and storage medium for screening image features used to predict cancer metastasis.
  • Metastasis of clear cell renal cell carcinoma is a major cause of the extremely poor prognosis of patients.
  • Because the risk of metastasis in patients with clear cell renal cell carcinoma cannot be diagnosed effectively before surgery, doctors are hindered in developing targeted treatment plans.
  • The embodiments of the present application provide a method, device and storage medium for screening image features for cancer metastasis prediction, which can supply efficient imaging features to a cancer metastasis prediction model and thereby support the diagnosis and treatment of cancer metastasis in patients.
  • A first aspect of the embodiments of the present application provides a method for screening image features for cancer metastasis prediction, the method including:
  • Step 1: Obtain a first CT image feature set of tumor patients, wherein the first CT image feature set includes CT image feature information of several tumor patients, and the CT image feature information includes several CT image features;
  • Step 2: Perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the CT image feature information of the tumor patients, obtaining a second CT image feature set;
  • Step 3: Obtain an image feature sample set from the second CT image feature set;
  • Step 4: Input the image feature sample set into a preset random forest classifier, and use the random forest classifier to score each type of CT image feature in the image feature sample set, where the score indicates the contribution of that type of CT image feature to the accurate prediction of cancer metastasis;
  • Step 5: Determine whether the random forest classifier meets the iteration end condition; if so, use the CT image features in the image feature sample set that meet the preset conditions as the cancer metastasis prediction image features; if not, delete the CT image features whose scores are lower than the score threshold from the image feature sample set to obtain a new image feature sample set, and return to Step 4 to input the new image feature sample set into the preset random forest classifier.
  • A second aspect of the embodiments of the present application provides a device for screening image features for cancer metastasis prediction, the device including:
  • a first acquisition module, used to acquire a first CT image feature set of tumor patients, wherein the first CT image feature set includes CT image feature information of several tumor patients, and the CT image feature information includes several CT image features;
  • a preprocessing module, configured to perform preset processing on the CT image feature information in the first CT image feature set so as to increase the randomness of the CT image feature information of the tumor patients, obtaining a second CT image feature set;
  • a second acquisition module, configured to acquire an image feature sample set from the second CT image feature set;
  • a classification module, used to input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, where the score indicates the contribution of that type of CT image feature to the accurate prediction of cancer metastasis;
  • a loop module, used to determine, after each scoring by the classification module, whether the random forest classifier meets the iteration end condition; if so, to use the CT image features in the image feature sample set that meet the preset conditions as the cancer metastasis prediction image features; if not, to delete the CT image features whose scores are lower than the score threshold from the image feature sample set to obtain a new image feature sample set, and to control the classification module to input the new image feature sample set into the preset random forest classifier and use the random forest classifier to score each type of CT image feature in that set.
  • A third aspect of the embodiments of the present application provides a device for screening image features for cancer metastasis prediction, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method provided in the first aspect of the embodiments of the present application.
  • A fourth aspect of the embodiments of the present application provides a storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, the steps of the method provided in the first aspect of the embodiments of the present application are implemented.
  • The embodiments of the present invention provide a method, device and storage medium for screening image features for cancer metastasis prediction, in which: a first CT image feature set of tumor patients is obtained; preset processing is performed on the CT image feature information in the first CT image feature set to obtain a second CT image feature set; an image feature sample set is obtained from the second CT image feature set; a preset random forest classifier is used to score each type of CT image feature in the image feature sample set; and it is determined whether the random forest classifier meets the iteration end condition. If so, the CT image features in the image feature sample set that meet the preset conditions are used as the cancer metastasis prediction image features; if not, the CT image features with scores lower than the score threshold are deleted from the image feature sample set to obtain a new image feature sample set, which is input into the preset random forest classifier for re-scoring.
  • This embodiment adds randomness to a given image feature set and then uses a random forest classifier to delete the CT image features that perform poorly in each iteration, which minimizes the error of the classifier and extracts, from the large pool of imaging features, the image features that are effective for predicting cancer metastasis.
  • FIG. 1 is a schematic flowchart of a method for screening cancer metastasis prediction image features according to the first embodiment of the application;
  • FIG. 2 is a schematic structural diagram of a screening device for predicting image features of cancer metastasis according to the second embodiment of this application;
  • FIG. 3 is a schematic structural diagram of another apparatus for screening cancer metastasis prediction image features provided by the second embodiment of the application.
  • The present invention provides a method for screening image features for cancer metastasis prediction.
  • By randomly reordering the CT image feature matrices of tumor patients, the method adds randomness to a given set of CT image features; it then uses a random forest classifier with optimized parameters to extract, from the large pool of imaging features, those that are effective for predicting cancer metastasis.
  • An embodiment of the present invention provides a method for screening image features for cancer metastasis prediction.
  • The screening method includes:
  • Step 101: Obtain a first CT image feature set of tumor patients, where the first CT image feature set includes CT image feature information of several tumor patients, and the CT image feature information includes several CT image features;
  • Tumor patients include, but are not limited to, patients with clear cell renal cell carcinoma.
  • The screening method of this embodiment can be used to screen image features for predicting metastasis of clear cell renal cell carcinoma.
  • The CT image feature information of each patient in step 101 includes several CT image features.
  • These CT image features can come from CT images of any phase, including but not limited to the plain-scan, arterial, venous and parenchymal phases.
  • The CT image features in a piece of CT image feature information include, but are not limited to, morphological, first-order statistical, texture, gray-scale, wavelet and other types of image features extracted from the CT image.
  • In this embodiment, in order to screen out the best possible CT image features, as many different CT image features as possible may be acquired for each tumor patient. For example, in one example, 2336 CT image features are acquired for each tumor patient, and an appropriate number of well-performing CT image features are selected from these 2336 features in the subsequent steps.
  • Before CT image features are extracted from the CT images, the CT images may also be registered, so that they are matched on a uniform scale and the CT images of different phases of the same patient are consistent in number of slices and resolution.
  • The CT image feature information may be a CT image feature matrix, that is, a matrix composed of multiple CT image features of the same tumor patient.
  • Step 102: Perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the CT image feature information of the tumor patients, obtaining a second CT image feature set;
  • In one example, the CT image feature information is a CT image feature matrix.
  • In that case, performing the preset processing to obtain the second CT image feature set includes: randomly reordering the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and combining the random matrix with the original CT image feature matrix of the same tumor patient to form that patient's new CT image feature matrix, thereby obtaining the second CT image feature set.
  • In another example, performing the preset processing to obtain the second CT image feature set includes: randomly reordering the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and using the random matrix alone as that patient's new CT image feature matrix, thereby obtaining the second CT image feature set.
  • Either approach increases the randomness of the CT image feature information of the tumor patients.
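As a rough illustration of the two preprocessing variants of step 102, the following Python sketch treats each patient's CT image feature matrix as a flat list of values (an assumption made for brevity), shuffles a copy of it, and either appends the shuffled copy to the original or uses it alone. The function name `build_second_feature_set` and its `keep_original` flag are hypothetical and do not come from the application, which does not specify how the random reordering is carried out.

```python
import random


def build_second_feature_set(first_set, seed=0, keep_original=True):
    """Return a new patient -> features mapping with added randomness.

    first_set maps a patient id to that patient's list of CT image
    feature values (a flattened stand-in for the feature matrix).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    second_set = {}
    for patient_id, features in first_set.items():
        shuffled = list(features)
        rng.shuffle(shuffled)  # the "random matrix" for this patient
        if keep_original:
            # Variant 1: original features followed by their shuffled copy.
            second_set[patient_id] = list(features) + shuffled
        else:
            # Variant 2: the shuffled copy replaces the original.
            second_set[patient_id] = shuffled
    return second_set
```

The input mapping is left unmodified, so the first CT image feature set remains available for comparison with the second.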
  • Step 103: Obtain an image feature sample set from the second CT image feature set;
  • Obtaining the image feature sample set from the second CT image feature set includes: selecting a preset number of tumor patients from the second CT image feature set, and using the CT image feature information of these tumor patients to form the image feature sample set.
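Step 103 can be sketched as follows. The application only says that a preset number of tumor patients is determined; drawing them uniformly at random, as done here, is an assumption, and `draw_sample_set` is a hypothetical name.

```python
import random


def draw_sample_set(second_set, preset_count, seed=0):
    """Pick preset_count patients and return their feature information."""
    if preset_count > len(second_set):
        raise ValueError("preset_count exceeds the number of available patients")
    rng = random.Random(seed)
    # Sort the ids first so the draw depends only on the seed, not on dict order.
    chosen_ids = rng.sample(sorted(second_set), preset_count)
    return {patient_id: second_set[patient_id] for patient_id in chosen_ids}
```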
  • Step 104: Input the image feature sample set into a preset random forest classifier, and use the random forest classifier to score each type of CT image feature in the image feature sample set, where the score indicates the contribution of that type of CT image feature to the accurate prediction of cancer metastasis;
  • In step 104, the higher the score, the greater the contribution to the accurate prediction of cancer metastasis; the lower the score, the smaller the contribution.
  • Step 105: Determine whether the random forest classifier meets the iteration end condition; if yes, go to step 106; otherwise, go to step 107;
  • Step 106: Use the CT image features in the image feature sample set that meet the preset conditions as the cancer metastasis prediction image features;
  • Step 107: Delete the CT image features with scores lower than the score threshold from the image feature sample set to obtain a new image feature sample set, and return to step 104 to input the new image feature sample set into the preset random forest classifier.
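The control flow of steps 104 to 107 can be sketched in Python as below. This is a minimal outline under stated assumptions, not the applicants' implementation: the random forest scoring is abstracted behind a caller-supplied `score_fn` (hypothetical) that returns one score per remaining feature, the end condition is the iteration-count variant, and the final selection keeps the `top_k` highest-scoring features.

```python
def screen_features(feature_names, score_fn, score_threshold, max_iterations, top_k):
    """Iteratively drop low-scoring features, then return the best top_k.

    score_fn(names) must return a dict mapping each name to its score;
    in practice it would train a random forest and read feature importances.
    """
    remaining = list(feature_names)
    iterations = 0
    while True:
        scores = score_fn(remaining)  # step 104: score every remaining feature
        iterations += 1
        if iterations > max_iterations:  # step 105: iteration-count end condition
            ranked = sorted(remaining, key=lambda name: scores[name], reverse=True)
            return ranked[:top_k]  # step 106: keep the highest-scoring features
        # Step 107: delete features whose score is below the threshold.
        remaining = [name for name in remaining if scores[name] >= score_threshold]
        if not remaining:
            raise RuntimeError("all features were deleted; check the classifier parameters")
```

Passing the scorer in as a function keeps the loop independent of any particular random forest library, so the same skeleton can be exercised with a fixed score table before a real classifier is plugged in.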
  • the parameters of the random forest classifier in this embodiment include but are not limited to: Ntree is set to 615, featurenum is set to 11, mtry is set to 4, and the number of iterations is set to 10000.
  • the aforementioned parameters can also be modified to meet user requirements.
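The parameter values quoted above could be held in a small configuration object such as the one sketched here. The class name `ForestScreeningConfig` is hypothetical, and the comments on what `featurenum` and `mtry` mean are assumptions: the application lists the values without defining them. (In R's randomForest package, `ntree` and `mtry` denote the number of trees and the number of variables tried at each split; whether the applicants used that package is not stated.)

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ForestScreeningConfig:
    ntree: int = 615         # number of trees in the forest
    featurenum: int = 11     # assumed: number of features kept at the end
    mtry: int = 4            # assumed: features tried at each split
    iterations: int = 10000  # iteration budget for the screening loop

    def __post_init__(self):
        # Reject non-positive values early instead of failing deep in training.
        for name in ("ntree", "featurenum", "mtry", "iterations"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")


default_config = ForestScreeningConfig()
# Users can override individual values without touching the defaults.
quick_config = replace(default_config, ntree=100, iterations=50)
```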
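  • The first CT image feature set and the second CT image feature set include not only the CT image feature information of each tumor patient but also the patient's medical record data; the medical record data of a tumor patient includes the patient's age and sex.
  • Before step 104, an image feature verification set may also be obtained from the second CT image feature set, and the image feature verification set and the image feature sample set are compared for differences in the CT image feature information of tumor patients of the same age and of tumor patients of the same sex. If either difference does not meet the preset condition, the parameters of the random forest classifier are adjusted. The ratio of the number of tumor patients in the image feature verification set to the number in the image feature sample set is within a preset ratio range.
  • The preset ratio range is 1:3 to 1:5.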
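A check that the verification set and the sample set respect the 1:3 to 1:5 ratio can be written with integer arithmetic only. The function name and its `low`/`high` parameters are hypothetical.

```python
def ratio_within_preset_range(n_validation, n_sample, low=3, high=5):
    """True if n_validation : n_sample lies between 1:low and 1:high inclusive."""
    if n_validation <= 0 or n_sample <= 0:
        raise ValueError("both sets must contain at least one patient")
    # 1:3 means n_sample == 3 * n_validation; 1:5 means n_sample == 5 * n_validation.
    return low * n_validation <= n_sample <= high * n_validation
```

For example, 25 verification patients against 100 sample patients is a 1:4 ratio and passes, while 50 against 100 (1:2) does not.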
  • adjusting the parameters of the random forest classifier includes:
  • The hypothesis to be tested is that the CT image feature information of tumor patients of the same age differs.
  • the parameters of the random forest classifier are adjusted.
  • The score threshold in this embodiment may be obtained with the aforementioned random forest classifier. Before step 104, the method further includes: scoring each type of CT image feature in the image feature sample set with the random forest classifier, taking the highest score among the scoring results as the score threshold, and then continuing with step 104.
  • In one example, determining whether the random forest classifier meets the iteration end condition includes: determining whether the number of iterations the random forest classifier has performed on the image feature sample set exceeds a preset number threshold; if so, the random forest classifier meets the iteration end condition; otherwise, it does not.
  • The preset number threshold may be any integer greater than 1, for example 10000.
  • The foregoing process of obtaining the score threshold need not be counted as part of the iterative process.
  • In another example, determining whether the random forest classifier meets the iteration end condition includes: determining the types of CT image features whose scores in the scoring result of the random forest classifier are higher than the score threshold, and determining whether the number of such types meets a preset number requirement; if so, the random forest classifier meets the iteration end condition; otherwise, it does not.
  • The preset number requirement can be set according to the desired number of cancer metastasis prediction image features. For example, if 11 cancer metastasis prediction image features are wanted, the preset number requirement can be set to the range 11 to 20. If, after a round of scoring by the random forest classifier, the number of CT image feature types with scores higher than the score threshold falls within 11 to 20, the number of types meets the preset number requirement, the random forest classifier is determined to meet the iteration end condition, and the iterative process ends.
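The count-based end condition can be expressed as a small predicate. The name `meets_count_end_condition` is hypothetical; the default range of 11 to 20 follows the example given in the text.

```python
def meets_count_end_condition(scores, score_threshold, min_count=11, max_count=20):
    """True if the number of features scoring above the threshold is in range.

    scores maps each CT image feature type to its score from the latest round.
    """
    n_above = sum(1 for value in scores.values() if value > score_threshold)
    return min_count <= n_above <= max_count
```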
  • In one example, using the CT image features in the image feature sample set that meet the preset conditions as the cancer metastasis prediction image features includes:
  • selecting a preset number of the highest-scoring CT image features as the cancer metastasis prediction image features.
  • For example, the 11 CT image features with the highest scores are selected as the cancer metastasis prediction image features.
  • In another example, using the CT image features in the image feature sample set that meet the preset conditions as the cancer metastasis prediction image features includes:
  • using all the CT image features remaining in the image feature sample set as the cancer metastasis prediction image features.
  • The method also includes: while the random forest classifier iterates on the image feature sample set, if the number of consecutive rounds in which all CT image features in the image feature sample set are retained exceeds a preset maximum number, or if all CT image features in the image feature sample set are deleted, determining that the parameters of the random forest classifier are set incorrectly and stopping the current screening of cancer metastasis prediction image features.
  • The preset maximum number may be an integer such as 100 or 150, which is not limited in this embodiment.
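The stop rule above amounts to tracking how many consecutive rounds left the feature set unchanged. A minimal sketch, with the hypothetical class name `StopController`, might look like this:

```python
class StopController:
    """Detects the two failure cases that suggest bad classifier parameters."""

    def __init__(self, max_unchanged_rounds=100):
        self.max_unchanged_rounds = max_unchanged_rounds
        self.unchanged_rounds = 0  # consecutive rounds with no feature deleted

    def should_stop(self, n_before, n_after):
        """Call once per round with the feature counts before and after deletion."""
        if n_after == 0:
            return True  # every feature was deleted
        if n_after == n_before:
            self.unchanged_rounds += 1
        else:
            self.unchanged_rounds = 0  # a deletion breaks the streak
        return self.unchanged_rounds > self.max_unchanged_rounds
```

The counter resets whenever a round deletes at least one feature, so only an uninterrupted run of no-op rounds triggers the stop.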
  • The embodiment of the present invention provides a method for screening image features for cancer metastasis prediction, in which: a first CT image feature set of tumor patients is obtained; preset processing is performed on the CT image feature information in the first CT image feature set to increase the randomness of the CT image feature information of the tumor patients, obtaining a second CT image feature set; an image feature sample set is obtained from the second CT image feature set; a preset random forest classifier is used to score each type of CT image feature in the image feature sample set; and it is determined whether the random forest classifier meets the iteration end condition. If so, the CT image features in the image feature sample set that meet the preset conditions are used as the cancer metastasis prediction image features; if not, the CT image features with scores lower than the score threshold are deleted from the image feature sample set to obtain a new image feature sample set, and the method returns to step 104 to input the new image feature sample set into the preset random forest classifier.
  • This embodiment adds randomness to a given image feature set and then uses a random forest classifier to delete the CT image features that perform poorly in each iteration, which minimizes the error of the classifier and selects, from the large pool of imaging features, the image features that are effective for predicting cancer metastasis.
  • This embodiment provides a device for screening image features for cancer metastasis prediction.
  • The device includes:
  • a first acquisition module 201, configured to acquire a first CT image feature set of tumor patients, where the first CT image feature set includes CT image feature information of several tumor patients, and the CT image feature information includes several CT image features;
  • a preprocessing module 202, configured to perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the CT image feature information of the tumor patients, obtaining a second CT image feature set;
  • a second acquisition module 203, configured to acquire an image feature sample set from the second CT image feature set;
  • a classification module 204, configured to input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, where the score indicates the contribution of that type of CT image feature to the accurate prediction of cancer metastasis;
  • a loop module 205, configured to determine, each time the classification module 204 finishes scoring, whether the random forest classifier meets the iteration end condition; if so, to use the CT image features in the image feature sample set that meet the preset conditions as the cancer metastasis prediction image features; if not, to delete the CT image features whose scores are lower than the score threshold from the image feature sample set to obtain a new image feature sample set, and to control the classification module 204 to input the new image feature sample set into the preset random forest classifier and use the random forest classifier to score each type of CT image feature in that set.
  • The optimized parameters of the random forest classifier in this embodiment include: Ntree set to 615, featurenum set to 11, mtry set to 4, and the number of iterations set to 10000.
  • The preprocessing module 202 is configured to randomly reorder the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and to combine the random matrix with the original CT image feature matrix of the same tumor patient as that patient's new CT image feature matrix, obtaining the second CT image feature set.
  • The screening device further includes a correlation control module, which is used, before the classification module 204 inputs the image feature sample set into the preset random forest classifier and uses the random forest classifier to score each type of CT image feature in the image feature sample set, to: obtain an image feature verification set from the second CT image feature set; analyze the differences between the image feature verification set and the image feature sample set in the CT image feature information of tumor patients of the same age and in the CT image feature information of tumor patients of the same sex; and, if either of the two differences does not meet the preset condition, adjust the parameters of the random forest classifier. The ratio of the number of tumor patients in the image feature verification set to the number in the image feature sample set is within the preset ratio range.
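  • The screening device also includes a score threshold obtaining module, which is used, before the classification module 204 inputs the image feature sample set into the preset random forest classifier and uses the random forest classifier to score each type of CT image feature in the image feature sample set, to score each type of CT image feature in the image feature sample set with the random forest classifier and take the highest score among the scoring results as the score threshold.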
  • In one example, the loop module 205 is used to determine whether the number of iterations the random forest classifier has performed on the image feature sample set exceeds a preset number threshold; if so, it determines that the random forest classifier meets the iteration end condition; otherwise, it determines that the random forest classifier does not meet the iteration end condition.
  • In another example, the loop module 205 is used to determine the types of CT image features whose scores in the scoring result of the random forest classifier are higher than the score threshold, and to determine whether the number of such types meets the preset number requirement; if so, it determines that the random forest classifier meets the iteration end condition; otherwise, it determines that the random forest classifier does not meet the iteration end condition.
  • The above-mentioned screening device further includes a stop control module, which is used, while the random forest classifier iterates on the image feature sample set, to determine that the parameters of the random forest classifier are set incorrectly and to stop the current screening of cancer metastasis prediction image features if the number of consecutive rounds in which all CT image features in the image feature sample set are retained exceeds the preset maximum number, or if all CT image features in the image feature sample set are deleted.
  • This embodiment also provides another device for screening image features for cancer metastasis prediction.
  • The screening device includes: a memory 301, a processor 302, and a computer program stored in the memory 301 and executable on the processor 302.
  • The processor 302 implements the steps of the method of the first embodiment when it executes the computer program.
  • An embodiment of the present application also provides a storage medium.
  • The storage medium may be provided in the device for screening cancer metastasis prediction image features of any of the foregoing embodiments.
  • For example, the storage medium may be the memory in the embodiment shown in FIG. 3.
  • A computer program is stored on the storage medium, and when the program is executed by a processor, the steps of the method described in the first embodiment are implemented.
  • The storage medium may also be a USB flash drive, a removable hard disk, a read-only memory (ROM), a RAM, a magnetic disk, an optical disk, or any other medium that can store program code.
  • The screening device of this embodiment can screen out several optimal image features from the CT image features extracted for each tumor patient, and these can be used to construct a prediction model of clear cell renal cell carcinoma metastasis. Because the parameters of the random forest classifier are optimized and the image features that perform poorly in each iteration are deleted, the error of the classifier is minimized and the selected image features form a minimal, optimal feature set.
  • The disclosed device and method may be implemented in other ways.
  • The device embodiments described above are merely illustrative. For example, the division into modules is only a division by logical function, and other divisions are possible in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • The mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be electrical, mechanical or of another form.
  • The modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • The functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
  • The above-mentioned integrated modules can be implemented in the form of hardware or of software functional modules.
  • If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The medium includes a number of instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the various embodiments of the present application.
  • The aforementioned readable storage medium includes: a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, an optical disk, or any other medium that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Radiology & Medical Imaging (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Epidemiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

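A method, device and storage medium for screening image features for cancer metastasis prediction. Preset processing is performed on an acquired first CT image feature set to obtain a second CT image feature set (102); an image feature sample set is obtained from the second CT image feature set (103); a random forest classifier is used to score each type of CT image feature in the sample set (104); it is determined whether the random forest classifier meets the iteration end condition (105); if so, the CT image features that perform well in the sample set are used as the cancer metastasis prediction image features (106); if not, the poorly performing CT image features are deleted from the image feature sample set, which is then input into the random forest classifier again (107). This scheme adds randomness to the image feature set and then uses the random forest classifier to delete the CT image features that perform poorly in each iteration, which greatly reduces the error of the classifier and extracts, from the large pool of imaging features, the image features that are effective for predicting cancer metastasis.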

Description

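Method, device and storage medium for screening image features for cancer metastasis prediction
Technical Field
This application relates to the technical field of medical image processing, and in particular to a method, device and storage medium for screening image features for cancer metastasis prediction.
Background
At present, many cancer patients face a risk of metastasis, and this risk has an important bearing on the treatment plan chosen by the physician.
Take metastasis of clear cell renal cell carcinoma as an example. Metastasis is a major cause of the very poor prognosis of these patients. Because the metastasis risk of a clear cell renal cell carcinoma patient cannot be effectively diagnosed before surgery, physicians are hindered in formulating a targeted treatment plan.
Clinical studies have found that more than 17% of clear cell renal cell carcinoma patients develop distant metastasis. Conventional radical surgery cannot deal effectively with metastatic disease, which can only be treated with immunotherapy or targeted drugs. If metastatic clear cell renal cell carcinoma is treated by surgical resection alone, the median survival is only 12 months. Effective prediction of distant metastasis in these patients is therefore a precondition for formulating a personalized treatment plan.
Building a prediction model from a patient's imaging information using radiomics is currently the most common technical approach. However, once radiomics has extracted high-throughput image features, selecting the effective features from this large and heterogeneous pool becomes a key part of building the prediction model. Feature selection not only reduces the difficulty of system modelling but also excludes the influence of noise from correlated features on predictive performance, improving model accuracy.
Technical Problem
The embodiments of this application provide a method, device and storage medium for screening image features for cancer metastasis prediction, which can supply a cancer metastasis prediction model with efficient radiomic features and thereby support the diagnosis and treatment of cancer metastasis in patients.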
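Technical Solution
A first aspect of the embodiments of this application provides a method for screening image features for cancer metastasis prediction, the method comprising:
Step 1: acquire a first CT image feature set of tumor patients, wherein the first CT image feature set contains CT image feature information of a number of tumor patients, and the CT image feature information contains a number of CT image features;
Step 2: perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set;
Step 3: obtain an image feature sample set from the second CT image feature set;
Step 4: input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, the score indicating the contribution of each type of CT image feature to accurately predicting cancer metastasis;
Step 5: determine whether the random forest classifier satisfies an iteration end condition; if so, take the CT image features in the image feature sample set that satisfy a preset condition as the cancer metastasis prediction image features; if not, delete the CT image features whose score is below a score threshold from the image feature sample set to obtain a new image feature sample set, and return to step 4 to input the new image feature sample set into the preset random forest classifier.
A second aspect of the embodiments of this application provides a device for screening image features for cancer metastasis prediction, the device comprising:
a first acquisition module, configured to acquire a first CT image feature set of tumor patients, wherein the first CT image feature set contains CT image feature information of a number of tumor patients, and the CT image feature information contains a number of CT image features;
a preprocessing module, configured to perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set;
a second acquisition module, configured to obtain an image feature sample set from the second CT image feature set;
a classification module, configured to input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, the score indicating the contribution of each type of CT image feature to accurately predicting cancer metastasis;
a loop module, configured to determine, each time the classification module finishes scoring, whether the random forest classifier satisfies an iteration end condition; if so, to take the CT image features in the image feature sample set that satisfy a preset condition as the cancer metastasis prediction image features; if not, to delete the CT image features whose score is below a score threshold from the image feature sample set to obtain a new image feature sample set, and to control the classification module to input the new image feature sample set into the preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set.
A third aspect of the embodiments of this application provides a device for screening image features for cancer metastasis prediction, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method provided in the first aspect of the embodiments of this application.
A fourth aspect of the embodiments of this application provides a storage medium storing a computer program which, when executed by a processor, implements the steps of the method provided in the first aspect of the embodiments of this application.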
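Beneficial Effects
The embodiments of the present invention provide a method, device and storage medium for screening image features for cancer metastasis prediction: a first CT image feature set of tumor patients is acquired; preset processing is performed on the CT image feature information in the first CT image feature set to obtain a second CT image feature set; an image feature sample set is obtained from the second CT image feature set; a preset random forest classifier scores each type of CT image feature in the image feature sample set; it is determined whether the random forest classifier satisfies the iteration end condition; if so, the CT image features in the image feature sample set that satisfy a preset condition are taken as the cancer metastasis prediction image features; if not, the CT image features whose score is below the score threshold are deleted from the image feature sample set to obtain a new image feature sample set, which is input into the preset random forest classifier for rescoring. The embodiments add randomness to a given image feature set and then use the random forest classifier to delete the CT image features that perform poorly in each iteration, which reduces classifier error as far as possible and extracts, from a large and heterogeneous pool of radiomic features, the image features that are effective for predicting cancer metastasis.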
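Brief Description of the Drawings
Figure 1 is a schematic flowchart of a method for screening image features for cancer metastasis prediction provided by the first embodiment of this application;
Figure 2 is a schematic structural diagram of a device for screening image features for cancer metastasis prediction provided by the second embodiment of this application;
Figure 3 is a schematic structural diagram of another device for screening image features for cancer metastasis prediction provided by the second embodiment of this application.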
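Embodiments of the Invention
To make the purpose, features and advantages of this application clearer and easier to understand, the technical solutions in the embodiments of this application are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those skilled in the art on the basis of the embodiments in this application without creative effort fall within the scope of protection of this application.
First embodiment:
To enable accurate assessment of the cancer metastasis risk of tumor patients, the present invention provides a method for screening image features for cancer metastasis prediction. The method adds randomness to a given CT image feature set by randomly reordering each tumor patient's CT image feature matrix, and then uses a random forest classifier with optimized parameters to extract, from a large and heterogeneous pool of radiomic features, those that are effective for predicting cancer metastasis.
Referring to Figure 1, an embodiment of the present invention provides a method for screening image features for cancer metastasis prediction, the screening method comprising:
Step 101: acquire a first CT image feature set of tumor patients, wherein the first CT image feature set contains CT image feature information of a number of tumor patients, and the CT image feature information contains a number of CT image features;
In this embodiment, tumor patients include, but are not limited to, clear cell renal cell carcinoma patients; the screening method of this embodiment can be used to screen image features for predicting clear cell renal cell carcinoma metastasis.
Optionally, each patient's CT image feature information in step 101 contains a number of CT image features. These features may come from CT images of any phase, including but not limited to the non-contrast (plain scan), arterial, venous and parenchymal phases. Further, the CT image features in one item of CT image feature information include, but are not limited to, morphological, first-order statistical, texture, gray-level and wavelet features extracted from the CT images. Optionally, in order to screen out the best possible CT image features, as many different CT image features as possible may be acquired for each tumor patient. In one example, 2336 CT image features are acquired for each tumor patient, and the subsequent steps select a suitable number of well-performing features from these 2336.
Further, because CT images of different phases may not be mutually consistent, in this embodiment the CT images may also be registered before the CT image features are extracted. Registration matches the CT images on a common scale and ensures that the different-phase CT images of the same patient agree in slice count and resolution.
Optionally, in one example, the CT image feature information may be a CT image feature matrix, that is, a matrix composed of multiple CT image features of the same tumor patient.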
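Step 102: perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set;
In this embodiment, when the CT image feature information is a CT image feature matrix, there are many ways to increase the randomness of the matrix; any existing scheme for increasing the randomness of a matrix can be used in this embodiment.
Optionally, performing the preset processing on the CT image feature information in the first CT image feature set to obtain the second CT image feature set includes: randomly reordering the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and combining each tumor patient's random matrix with that patient's CT image feature matrix to form the patient's new CT image feature matrix, thereby obtaining the second CT image feature set.
In another embodiment, performing the preset processing on the CT image feature information in the first CT image feature set to obtain the second CT image feature set includes: randomly reordering the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and using each tumor patient's random matrix as that patient's new CT image feature matrix, thereby obtaining the second CT image feature set.
After this processing, the randomness of the tumor patients' CT image feature information is increased.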
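The first variant of step 102 (reorder, then combine with the original) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: each patient's feature matrix is simplified to a flat list, "combining" is read as concatenation, and the function name `add_random_copy` and the `seed` parameter are hypothetical.

```python
import random


def add_random_copy(feature_rows, seed=0):
    """Append a randomly reordered copy of each patient's features to that patient's row.

    feature_rows: one list of feature values per patient (the first feature set).
    Returns a new list of rows (the second feature set); the input is left unchanged.
    """
    rng = random.Random(seed)
    augmented = []
    for row in feature_rows:
        shuffled = list(row)      # copy so the original order is preserved
        rng.shuffle(shuffled)     # the "random matrix" for this patient
        augmented.append(list(row) + shuffled)
    return augmented


first_set = [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]]
second_set = add_random_copy(first_set)
```

Each row of `second_set` is twice as long as the input row: the original values followed by the same values in a random order.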
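Step 103: obtain an image feature sample set from the second CT image feature set;
Optionally, obtaining the image feature sample set from the second CT image feature set includes: selecting a preset number of tumor patients from the second CT image feature set and forming the image feature sample set from the CT image feature information of these patients.
Step 104: input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, the score indicating the contribution of each type of CT image feature to accurately predicting cancer metastasis;
It will be understood that, in step 104, the higher the score, the greater the contribution to accurately predicting cancer metastasis, and the lower the score, the smaller the contribution.
Step 105: determine whether the random forest classifier satisfies the iteration end condition; if so, go to step 106; otherwise, go to step 107;
Step 106: take the CT image features in the image feature sample set that satisfy a preset condition as the cancer metastasis prediction image features;
Step 107: delete the CT image features whose score is below the score threshold from the image feature sample set to obtain a new image feature sample set, and return to step 104 to input the new image feature sample set into the preset random forest classifier.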
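The loop formed by steps 104 to 107 can be sketched as below. This is a hedged outline only: the random forest is replaced by a pluggable `score_fn` callable so the control flow stays self-contained, and `screen_features`, `toy_score_fn` and the toy weights are hypothetical names and values introduced for illustration. In practice the scorer would be the importance output of a trained random forest.

```python
def screen_features(features, score_fn, score_threshold, max_iterations):
    """Score features repeatedly, dropping those below score_threshold each round.

    score_fn maps a list of feature names to {name: score}; it stands in for
    the random forest classifier's scoring in step 104.
    """
    current = list(features)
    for iteration in range(1, max_iterations + 1):
        scores = score_fn(current)                    # step 104: score every feature
        if iteration == max_iterations:               # step 105: end condition met
            break                                     # step 106: keep what remains
        current = [name for name in current
                   if scores[name] >= score_threshold]  # step 107: delete low scorers
        if not current:                               # nothing left to score
            break
    return current


def toy_score_fn(names):
    # Hypothetical fixed weights, normalised over the features still present.
    weights = {"shape": 5.0, "texture": 3.0, "wavelet": 1.0, "noise": 1.0}
    total = sum(weights[name] for name in names)
    return {name: weights[name] / total for name in names}


selected = screen_features(
    ["shape", "texture", "wavelet", "noise"],
    toy_score_fn,
    score_threshold=0.2,
    max_iterations=5,
)
```

With these toy weights the first round removes `wavelet` and `noise` (score 0.1 each), and later rounds leave `shape` and `texture` in place.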
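Optionally, the parameters of the random forest classifier in this embodiment include, but are not limited to: Ntree set to 615, featurenum set to 11, mtry set to 4, and the number of iterations set to 10000. In other embodiments, these parameters may be modified to meet user requirements.
Optionally, in one example, the first CT image feature set and the second CT image feature set include not only each tumor patient's CT image feature information but also the patient's medical record data; further, the medical record data includes the patient's age and sex.
Before the image feature sample set is input into the preset random forest classifier and the random forest classifier scores each type of CT image feature in the image feature sample set, the method further includes:
obtaining an image feature validation set from the second CT image feature set;
analysing, between the image feature validation set and the image feature sample set, the difference in CT image feature information of tumor patients in the same age group and the difference in CT image feature information of tumor patients of the same sex; if either of the two differences fails to satisfy a preset condition, adjusting the parameters of the random forest classifier; wherein the ratio of the number of tumor patients in the image feature validation set to that in the image feature sample set lies within a preset ratio range. Optionally, in one example, the preset ratio range is 1:3 to 1:5.
The difference comparison above helps eliminate correlation between the CT image features of different patients.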
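Drawing the validation set and the sample set from the same pool at a ratio between 1:3 and 1:5 can be sketched as follows. This is an assumption-laden illustration: the text does not say how patients are assigned, so a seeded random shuffle is used here, and `split_patients` and `sample_per_validation` are hypothetical names.

```python
import random


def split_patients(patient_ids, sample_per_validation=4, seed=0):
    """Split patients so that validation:sample is roughly 1:sample_per_validation."""
    if not 3 <= sample_per_validation <= 5:
        raise ValueError("ratio outside the 1:3 to 1:5 range given in the text")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_validation = len(ids) // (sample_per_validation + 1)
    # First slice is the sample set, second slice is the validation set.
    return ids[n_validation:], ids[:n_validation]


sample_ids, validation_ids = split_patients(range(50))
```

For 50 patients and the default 1:4 ratio this yields 40 sample patients and 10 validation patients.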
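Further, analysing these two differences and adjusting the parameters of the random forest classifier if either of them fails to satisfy the preset condition includes:
performing a hypothesis test on the CT image feature information of tumor patients in the same age group in the image feature validation set and the image feature sample set, wherein the hypothesis under test is that the CT image feature information of tumor patients in the same age group differs, and the P value of the hypothesis test is set to sex_p.value>=0.5;
and performing a hypothesis test on the CT image feature information of tumor patients of the same sex in the image feature validation set and the image feature sample set, wherein the hypothesis under test is that the CT image feature information of tumor patients of the same sex differs, and the P value of the hypothesis test is set to age_p.value>=0.5;
if at least one of the two hypothesis tests fails, adjusting the parameters of the random forest classifier.
Optionally, in another embodiment, the P-value criterion for the two hypothesis tests may be relaxed, for example to sex_p.value>=0.3 (or 0.2) and age_p.value>=0.3 (or 0.2), to increase the probability that the hypothesis tests pass.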
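The gating logic above can be sketched as follows. The text does not name a specific statistical test, so this sketch assumes a two-sided permutation test on the mean of a single feature; `permutation_p_value`, `needs_retuning` and their parameters are hypothetical names, and only the 0.5 threshold comes from the text.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


def needs_retuning(p_age_groups, p_sex_groups, threshold=0.5):
    """True if either comparison falls below the P-value criterion."""
    return p_age_groups < threshold or p_sex_groups < threshold


p_same = permutation_p_value([1, 2, 3], [1, 2, 3])
p_apart = permutation_p_value([0] * 10, [10] * 10)
```

A high P value means the validation and sample subgroups look alike on that feature; if either comparison drops below the criterion, the classifier parameters would be adjusted before scoring begins.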
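Optionally, the score threshold in this embodiment may be obtained with the random forest classifier described above. Before step 104, the method further includes: scoring each type of CT image feature in the image feature sample set with the random forest classifier, taking the highest score among the scoring results as the score threshold, and then proceeding to step 104.
In one example, determining whether the random forest classifier satisfies the iteration end condition includes:
determining whether the number of iterations the random forest classifier has performed on the image feature sample set exceeds a preset count threshold; if so, the random forest classifier is judged to satisfy the iteration end condition; otherwise, it is judged not to satisfy the iteration end condition.
The preset count threshold may be any integer greater than 1, for example 10000. Optionally, in one example, the process of obtaining the score threshold described above need not be counted as part of the iteration.
In another example, determining whether the random forest classifier satisfies the iteration end condition includes:
determining, in the scoring result of the random forest classifier, the types of CT image features whose score is higher than the score threshold, and determining whether the number of such types satisfies a preset number requirement; if so, the random forest classifier is judged to satisfy the iteration end condition; otherwise, it is judged not to satisfy the iteration end condition.
In this example, the preset number requirement may be set according to the number of cancer metastasis prediction image features needed. For example, if 11 cancer metastasis prediction image features are needed, the preset number requirement may be set to the range 11-20. If, after a given round of scoring, the number of CT image feature types scoring above the score threshold falls within 11-20, the number of types is determined to satisfy the preset number requirement, the random forest classifier is judged to satisfy the iteration end condition, and the iteration ends.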
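The two alternative end conditions can be written as small predicates. This is a minimal sketch; the function names are hypothetical, while the defaults (10000 iterations, a range of 11 to 20) are the example values given in the text.

```python
def iteration_limit_reached(iterations_done, max_iterations=10000):
    """End condition 1: the iteration count exceeds the preset count threshold."""
    return iterations_done > max_iterations


def feature_count_in_range(scores, score_threshold, low=11, high=20):
    """End condition 2: the number of features scoring above the threshold is in [low, high]."""
    n_above = sum(1 for score in scores.values() if score > score_threshold)
    return low <= n_above <= high


# Twelve hypothetical features score above 0.5 and three score below it.
example_scores = {f"f{i}": 0.9 for i in range(12)}
example_scores.update({"g0": 0.1, "g1": 0.2, "g2": 0.3})
done = feature_count_in_range(example_scores, score_threshold=0.5)
```

Either predicate can serve as the check in step 105; the text presents them as alternatives rather than as a combined condition.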
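In the two examples above, after the iteration ends, taking the CT image features in the image feature sample set that satisfy the preset condition as the cancer metastasis prediction image features includes:
selecting a preset number of top-scoring CT image features as the cancer metastasis prediction image features.
For example, the 11 top-scoring CT image features are selected as the cancer metastasis prediction image features.
Alternatively, in another example, after the iteration ends, taking the CT image features in the image feature sample set that satisfy the preset condition as the cancer metastasis prediction image features includes:
taking the CT image features remaining in the image feature sample set after the iteration ends as the cancer metastasis prediction image features.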
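Selecting the top-scoring features after the final round is a plain sort. The sketch below assumes the scores are held in a dictionary keyed by feature name; `top_features` is a hypothetical helper and the example scores are made up.

```python
def top_features(scores, k=11):
    """Return the k feature names with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


final_scores = {"shape": 0.1, "texture": 0.5, "wavelet": 0.3}
best_two = top_features(final_scores, k=2)
```

The alternative described in the text, keeping every feature that survives the iteration, corresponds to returning all keys of the final score dictionary.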
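Optionally, this embodiment further includes: during the iteration of the random forest classifier on the image feature sample set, if the number of consecutive iterations in which all CT image features in the image feature sample set are retained exceeds a preset maximum, or if all CT image features in the image feature sample set are deleted, the parameters of the random forest classifier are determined to be set incorrectly and the current screening of cancer metastasis prediction image features is stopped. The preset maximum may be an integer such as 100 or 150; this embodiment places no limit on it.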
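The abort rule can be tracked with a small counter object. This is an illustrative sketch: the class name `StallGuard` and its method are hypothetical, and the default of 100 is one of the example values from the text.

```python
class StallGuard:
    """Signals a misconfigured classifier: features never drop, or all drop at once."""

    def __init__(self, max_all_retained=100):
        self.max_all_retained = max_all_retained
        self.consecutive_all_retained = 0

    def should_abort(self, n_before, n_after):
        """Call once per iteration with the feature counts before and after deletion."""
        if n_after == 0:
            return True                      # every feature was deleted
        if n_after == n_before:
            self.consecutive_all_retained += 1
        else:
            self.consecutive_all_retained = 0  # a deletion breaks the streak
        return self.consecutive_all_retained > self.max_all_retained


guard = StallGuard(max_all_retained=2)
results = [guard.should_abort(5, 5) for _ in range(3)]
```

With a maximum of 2, the third consecutive round that retains everything triggers the abort, so `results` ends with `True`.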
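In this embodiment, the screening method described above was applied to 2336 CT image features of clear cell renal cell carcinoma patients, and 11 CT image features were selected as the cancer metastasis prediction image features. A prediction model was then built and tested on a validation set of 45 clear cell renal cell carcinoma patients; the metastasis prediction results are shown in Table 1.
Table 1. Metastasis prediction results
Test result    AUC       ACC
Value          0.8516    0.8261
Table 1 shows that the prediction model built from the 11 extracted CT image features achieves an accuracy (ACC) of 0.8261 and an AUC of 0.8516, indicating that the image features selected by this screening method give strong predictive performance on this validation set.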
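For readers unfamiliar with the two metrics in Table 1, the sketch below shows how accuracy and AUC are conventionally computed from binary labels and model scores. The data are made-up toy values, not the study's results, and the text does not state how the authors computed their figures.

```python
def accuracy(labels, predictions):
    """Fraction of cases where the predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


toy_labels = [1, 1, 0, 0]                 # 1 = metastasis, 0 = no metastasis
toy_scores = [0.9, 0.4, 0.6, 0.1]         # hypothetical model outputs
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]
toy_acc = accuracy(toy_labels, toy_predictions)
toy_auc = auc(toy_labels, toy_scores)
```

Accuracy depends on the chosen decision cut-off (0.5 here), whereas AUC depends only on how the scores rank positives against negatives.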
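The embodiment of the present invention provides a method for screening image features for cancer metastasis prediction: a first CT image feature set of tumor patients is acquired; preset processing is performed on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set; an image feature sample set is obtained from the second CT image feature set; a preset random forest classifier scores each type of CT image feature in the image feature sample set; it is determined whether the random forest classifier satisfies the iteration end condition; if so, the CT image features in the image feature sample set that satisfy a preset condition are taken as the cancer metastasis prediction image features; if not, the CT image features whose score is below the score threshold are deleted from the image feature sample set to obtain a new image feature sample set, and the method returns to step 104 to input the new image feature sample set into the preset random forest classifier. This embodiment adds randomness to a given image feature set and then uses the random forest classifier to delete the CT image features that perform poorly in each iteration, which reduces classifier error as far as possible and screens out, from a large and heterogeneous pool of radiomic features, the image features that are effective for predicting cancer metastasis.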
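Second embodiment:
This embodiment provides a device for screening image features for cancer metastasis prediction. Referring to Figure 2, the device includes:
a first acquisition module 201, configured to acquire a first CT image feature set of tumor patients, wherein the first CT image feature set contains CT image feature information of a number of tumor patients, and the CT image feature information contains a number of CT image features;
a preprocessing module 202, configured to perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set;
a second acquisition module 203, configured to obtain an image feature sample set from the second CT image feature set;
a classification module 204, configured to input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, the score indicating the contribution of each type of CT image feature to accurately predicting cancer metastasis;
a loop module 205, configured to determine, each time the classification module 204 finishes scoring, whether the random forest classifier satisfies the iteration end condition; if so, to take the CT image features in the image feature sample set that satisfy a preset condition as the cancer metastasis prediction image features; if not, to delete the CT image features whose score is below the score threshold from the image feature sample set to obtain a new image feature sample set, and to control the classification module 204 to input the new image feature sample set into the preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set.
Optionally, the optimized parameters of the random forest classifier in this embodiment include: Ntree set to 615, featurenum set to 11, mtry set to 4, and the number of iterations set to 10000.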
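Further, the preprocessing module 202 is configured to randomly reorder the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and to combine each tumor patient's random matrix with that patient's CT image feature matrix to form the patient's new CT image feature matrix, thereby obtaining the second CT image feature set.
Further, the screening device also includes a correlation control module, configured to, before the classification module 204 inputs the image feature sample set into the preset random forest classifier and uses the random forest classifier to score each type of CT image feature in the image feature sample set, obtain an image feature validation set from the second CT image feature set; analyse, between the image feature validation set and the image feature sample set, the difference in CT image feature information of tumor patients in the same age group and the difference in CT image feature information of tumor patients of the same sex; and, if either of the two differences fails to satisfy a preset condition, adjust the parameters of the random forest classifier; wherein the ratio of the number of tumor patients in the image feature validation set to that in the image feature sample set lies within a preset ratio range.
Further, the correlation control module is specifically configured to perform a hypothesis test on the CT image feature information of tumor patients in the same age group in the image feature validation set and the image feature sample set, wherein the hypothesis under test is that the CT image feature information of tumor patients in the same age group differs, and the P value of the hypothesis test is set to sex_p.value>=0.5; and to perform a hypothesis test on the CT image feature information of tumor patients of the same sex in the image feature validation set and the image feature sample set, wherein the hypothesis under test is that the CT image feature information of tumor patients of the same sex differs, and the P value of the hypothesis test is set to age_p.value>=0.5; and, when at least one of the two hypothesis tests fails, to adjust the parameters of the random forest classifier.
Further, the screening device also includes a score threshold acquisition module, configured to, before the classification module 204 inputs the image feature sample set into the preset random forest classifier and uses the random forest classifier to score each type of CT image feature in the image feature sample set, score each type of CT image feature in the image feature sample set with the random forest classifier, take the highest score among the scoring results as the score threshold, and then control the classification module 204 to proceed with the step of inputting the image feature sample set into the preset random forest classifier and scoring each type of CT image feature in the image feature sample set.
In one example, the loop module 205 is configured to determine whether the number of iterations the random forest classifier has performed on the image feature sample set exceeds a preset count threshold; if so, the random forest classifier is judged to satisfy the iteration end condition; otherwise, it is judged not to satisfy the iteration end condition;
In another example, the loop module 205 is configured to determine, in the scoring result of the random forest classifier, the types of CT image features whose score is higher than the score threshold, and to determine whether the number of such types satisfies a preset number requirement; if so, the random forest classifier is judged to satisfy the iteration end condition; otherwise, it is judged not to satisfy the iteration end condition.
Further, the screening device also includes a stop control module, configured to, during the iteration of the random forest classifier on the image feature sample set, determine that the parameters of the random forest classifier are set incorrectly and stop the current screening of cancer metastasis prediction image features if the number of consecutive iterations in which all CT image features in the image feature sample set are retained exceeds a preset maximum, or if all CT image features in the image feature sample set are deleted.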
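Further, this embodiment also provides a device for screening image features for cancer metastasis prediction. Referring to Figure 3, the screening device includes: a memory 301, a processor 302, and a computer program stored in the memory 301 and executable on the processor 302; when the processor 302 executes the computer program, the steps of the method of the first embodiment are implemented.
Further, the embodiments of this application also provide a storage medium. The storage medium may be provided in the screening device of any of the above embodiments, and may be the memory in the embodiment shown in Figure 3. A computer program is stored on the storage medium, and when the program is executed by a processor, the steps of the method described in the first embodiment are implemented. Further, the storage medium may also be a USB flash drive, removable hard disk, read-only memory (ROM), RAM, magnetic disk, optical disc or any other medium that can store program code.
The screening device of this embodiment can select a number of the most effective image features from the CT image features extracted for each tumor patient, and these can be used to build a prediction model for clear cell renal cell carcinoma metastasis. Because the optimized parameters of the random forest classifier are used and the poorly performing image features are deleted in each iteration, classifier error is reduced as far as possible, so that the selected image features form a minimal optimal feature subset.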
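In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be electrical, mechanical or of another form.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented as a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a readable storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods of the embodiments of this application. The readable storage medium includes a USB flash drive, removable hard disk, ROM, RAM, magnetic disk, optical disc and other media that can store program code.
It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions, but those skilled in the art should understand that this application is not limited by the described order of actions, because according to this application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by this application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.
The above is a description of the method, device and storage medium for screening image features for cancer metastasis prediction provided by this application. Those skilled in the art may, following the ideas of the embodiments of this application, make changes to the specific implementation and scope of application. In summary, the content of this specification should not be construed as limiting this application.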

Claims (10)

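  1. A method for screening image features for cancer metastasis prediction, characterized by comprising:
    Step 1: acquiring a first CT image feature set of tumor patients, wherein the first CT image feature set contains CT image feature information of a number of tumor patients, and the CT image feature information contains a number of CT image features;
    Step 2: performing preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set;
    Step 3: obtaining an image feature sample set from the second CT image feature set;
    Step 4: inputting the image feature sample set into a preset random forest classifier and using the random forest classifier to score each type of CT image feature in the image feature sample set, the score indicating the contribution of each type of CT image feature to accurately predicting cancer metastasis;
    Step 5: determining whether the random forest classifier satisfies an iteration end condition; if so, taking the CT image features in the image feature sample set that satisfy a preset condition as the cancer metastasis prediction image features; if not, deleting the CT image features whose score is below a score threshold from the image feature sample set to obtain a new image feature sample set, and returning to step 4 to input the new image feature sample set into the preset random forest classifier.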
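  2. The method for screening image features for cancer metastasis prediction according to claim 1, characterized in that the CT image feature information is a CT image feature matrix, and performing preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set, comprises:
    randomly reordering the CT image feature matrix of each tumor patient in the first CT image feature set to obtain a random matrix, and combining each tumor patient's random matrix with that patient's CT image feature matrix to form the patient's new CT image feature matrix, thereby obtaining the second CT image feature set.
  3. The method for screening image features for cancer metastasis prediction according to claim 1, characterized in that, before inputting the image feature sample set into the preset random forest classifier and using the random forest classifier to score each type of CT image feature in the image feature sample set, the method further comprises:
    obtaining an image feature validation set from the second CT image feature set;
    analysing, between the image feature validation set and the image feature sample set, the difference in CT image feature information of tumor patients in the same age group and the difference in CT image feature information of tumor patients of the same sex; if either of the two differences fails to satisfy a preset condition, adjusting the parameters of the random forest classifier; wherein the ratio of the number of tumor patients in the image feature validation set to that in the image feature sample set lies within a preset ratio range.
  4. The method for screening image features for cancer metastasis prediction according to claim 3, characterized in that analysing, between the image feature validation set and the image feature sample set, the difference in CT image feature information of tumor patients in the same age group and the difference in CT image feature information of tumor patients of the same sex, and adjusting the parameters of the random forest classifier if either of the two differences fails to satisfy the preset condition, comprises:
    performing a hypothesis test on the CT image feature information of tumor patients in the same age group in the image feature validation set and the image feature sample set, wherein the hypothesis under test is that the CT image feature information of tumor patients in the same age group differs, and the P value of the hypothesis test is set to sex_p.value>=0.5;
    and performing a hypothesis test on the CT image feature information of tumor patients of the same sex in the image feature validation set and the image feature sample set, wherein the hypothesis under test is that the CT image feature information of tumor patients of the same sex differs, and the P value of the hypothesis test is set to age_p.value>=0.5;
    when at least one of the two hypothesis tests fails, adjusting the parameters of the random forest classifier.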
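  5. The method for screening image features for cancer metastasis prediction according to any one of claims 1-4, characterized in that, before step 4, the method further comprises:
    scoring each type of CT image feature in the image feature sample set with the random forest classifier, taking the highest score among the scoring results as the score threshold, and then proceeding to step 4.
  6. The method for screening image features for cancer metastasis prediction according to any one of claims 1-4, characterized in that determining whether the random forest classifier satisfies the iteration end condition comprises:
    determining whether the number of iterations the random forest classifier has performed on the image feature sample set exceeds a preset count threshold; if so, judging that the random forest classifier satisfies the iteration end condition; otherwise, judging that the random forest classifier does not satisfy the iteration end condition;
    or, determining, in the scoring result of the random forest classifier, the types of CT image features whose score is higher than the score threshold, and determining whether the number of such types satisfies a preset number requirement; if so, judging that the random forest classifier satisfies the iteration end condition; otherwise, judging that the random forest classifier does not satisfy the iteration end condition.
  7. The method for screening image features for cancer metastasis prediction according to any one of claims 1-4, characterized in that, during the iteration of the random forest classifier on the image feature sample set, if the number of consecutive iterations in which all CT image features in the image feature sample set are retained exceeds a preset maximum, or if all CT image features in the image feature sample set are deleted, the parameters of the random forest classifier are determined to be set incorrectly and the current screening of cancer metastasis prediction image features is stopped.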
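  8. A device for screening image features for cancer metastasis prediction, characterized by comprising:
    a first acquisition module, configured to acquire a first CT image feature set of tumor patients, wherein the first CT image feature set contains CT image feature information of a number of tumor patients, and the CT image feature information contains a number of CT image features;
    a preprocessing module, configured to perform preset processing on the CT image feature information in the first CT image feature set to increase the randomness of the tumor patients' CT image feature information, obtaining a second CT image feature set;
    a second acquisition module, configured to obtain an image feature sample set from the second CT image feature set;
    a classification module, configured to input the image feature sample set into a preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set, the score indicating the contribution of each type of CT image feature to accurately predicting cancer metastasis;
    a loop module, configured to determine, each time the classification module finishes scoring, whether the random forest classifier satisfies an iteration end condition; if so, to take the CT image features in the image feature sample set that satisfy a preset condition as the cancer metastasis prediction image features; if not, to delete the CT image features whose score is below a score threshold from the image feature sample set to obtain a new image feature sample set, and to control the classification module to input the new image feature sample set into the preset random forest classifier and use the random forest classifier to score each type of CT image feature in the image feature sample set.
  9. A device for screening image features for cancer metastasis prediction, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1-7.
  10. A storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-7.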
PCT/CN2019/130831 2019-04-04 2019-12-31 Method, device and storage medium for screening image features for cancer metastasis prediction WO2020199692A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910272005.2 2019-04-04
CN201910272005.2A CN110148115A (zh) 2019-04-04 2019-04-04 Method, device and storage medium for screening image features for cancer metastasis prediction

Publications (1)

Publication Number Publication Date
WO2020199692A1 (zh) 2020-10-08

Family

ID=67588576

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130831 WO2020199692A1 (zh) 2019-04-04 2019-12-31 Method, device and storage medium for screening image features for cancer metastasis prediction

Country Status (2)

Country Link
CN (1) CN110148115A (zh)
WO (1) WO2020199692A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110148115A (zh) * 2019-04-04 2019-08-20 中国科学院深圳先进技术研究院 Method, device and storage medium for screening image features for cancer metastasis prediction

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
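CN105931224A (zh) * 2016-04-14 2016-09-07 浙江大学 Lesion recognition method for plain-scan liver CT images based on the random forest algorithm
CN108269012A (zh) * 2018-01-12 2018-07-10 中国平安人寿保险股份有限公司 Method, device, storage medium and terminal for constructing a risk scoring model
CN109166564A (zh) * 2018-07-19 2019-01-08 平安科技(深圳)有限公司 Method, device and computer-readable storage medium for generating music for lyric text
CN110148115A (zh) * 2019-04-04 2019-08-20 中国科学院深圳先进技术研究院 Method, device and storage medium for screening image features for cancer metastasis prediction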

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
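CN106197424B (zh) * 2016-06-28 2019-03-22 哈尔滨工业大学 Telemetry-data-driven method for recognizing the flight state of an unmanned aerial vehicle
CN106815481B (zh) * 2017-01-19 2020-07-17 中国科学院深圳先进技术研究院 Radiomics-based survival prediction method and device
CN107220966A (zh) * 2017-05-05 2017-09-29 郑州大学 Radiomics-based glioma grading prediction method
CN108509982A (zh) * 2018-03-12 2018-09-07 昆明理工大学 Method for processing imbalanced binary-classification medical data
CN109543747A (zh) * 2018-11-20 2019-03-29 厦门大学 Data feature selection method and device based on a stratified random forest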

Also Published As

Publication number Publication date
CN110148115A (zh) 2019-08-20

Similar Documents

Publication Publication Date Title
JP7383010B2 (ja) 医用画像認識方法及びシステム、並びに、モデルトレーニング方法、コンピュータ装置、及びプログラム
KR101811028B1 (ko) 인공지능 기반 의료기기의 임상적 유효성 평가 방법 및 시스템
CN109272048B (zh) 一种基于深度卷积神经网络的模式识别方法
CN108464840B (zh) 一种乳腺肿块自动检测方法及系统
He et al. Image segmentation algorithm of lung cancer based on neural network model
CN111291825B (zh) 病灶分类模型训练方法、装置、计算机设备和存储介质
US20210312242A1 (en) Synthetically Generating Medical Images Using Deep Convolutional Generative Adversarial Networks
CN112070119A (zh) 超声切面图像质量控制方法、装置和计算机设备
WO2024065987A1 (zh) 一种基于影像、病理和基因多组学的肺癌预后预测系统
Jiang et al. Deep learning for COVID-19 chest CT (computed tomography) image analysis: A lesson from lung cancer
Tang et al. Cmu-net: a strong convmixer-based medical ultrasound image segmentation network
Wankhade et al. A novel hybrid deep learning method for early detection of lung cancer using neural networks
CN113239755B (zh) 一种基于空谱融合深度学习的医学高光谱图像分类方法
CN114359629B (zh) 一种基于深度迁移学习的肺炎x胸片分类识别方法
WO2021120587A1 (zh) 基于oct的视网膜分类方法、装置、计算机设备及存储介质
CN116579982A (zh) 一种肺炎ct图像分割方法、装置及设备
JP2023532292A (ja) 機械学習ベースの医療データチェッカ
TWI723312B (zh) 電腦輔助直腸癌治療反應預測系統、方法及電腦程式產品
CN111275103A (zh) 多视角信息协作的肾脏良恶性肿瘤分类方法
WO2020199692A1 (zh) 一种癌转移预测影像特征的筛选方法、装置和存储介质
WO2020007026A1 (zh) 分割模型训练方法、装置及计算机可读存储介质
CN116403701A (zh) 一种非小细胞肺癌患者tmb水平的预测方法及装置
Sari et al. Best performance comparative analysis of architecture deep learning on ct images for lung nodules classification
Mathina Kani et al. Classification of skin lesion images using modified Inception V3 model with transfer learning and augmentation techniques
CN110705570B (zh) 一种图像特征识别方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19922989

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19922989

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.04.2022)
