CN115337000B - Machine learning method for evaluating brain aging caused by diseases based on brain structure images - Google Patents


Publication number
CN115337000B
Authority
CN
China
Prior art keywords
brain
model
age
features
aging
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211276691.9A
Other languages
Chinese (zh)
Other versions
CN115337000A (en)
Inventor
张瑜
王凯凯
孙超良
张欢
王志超
钱浩天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab
Priority to CN202211276691.9A
Publication of CN115337000A
Application granted
Publication of CN115337000B
Active
Anticipated expiration

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/05: Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves
    • A61B5/055: Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves involving electronic [EMR] or nuclear [NMR] magnetic resonance, e.g. magnetic resonance imaging
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/24: Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316: Modalities, i.e. specific diagnostic methods
    • A61B5/369: Electroencephalography [EEG]
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B2576/00: Medical imaging apparatus involving image processing or analysis
    • A61B2576/02: Medical imaging apparatus involving image processing or analysis specially adapted for a particular organ or body part
    • A61B2576/026: Medical imaging apparatus involving image processing or analysis specially adapted for a particular organ or body part for the brain

Abstract

The invention discloses a machine learning method for evaluating brain aging caused by diseases based on brain structure images. Structural features of different brain regions, such as cortical thickness and volume, are extracted from structural brain magnetic resonance images. Because not all features help model prediction, the features are screened, and the retained features, which generalize across different training subsets and are more concise and effective, are used to construct a brain age prediction model based on ridge regression. K-fold cross-validation is used to find the features that are repeatedly identified across the k models, which locates the brain-region structural features most relevant to brain age prediction. Finally, the trained model is applied to patient data to evaluate the degree to which the disease affects brain aging.

Description

Machine learning method for evaluating brain aging caused by diseases based on brain structure images
Technical Field
The invention relates to the technical field of neuro-image data analysis, in particular to a machine learning method for evaluating brain aging caused by diseases based on brain structure images.
Background
Brain aging is a natural process, but changes in brain volume, cortical thickness and similar measures differ significantly between individuals. During normal aging the brain follows a characteristic pattern of change, which means that age can be predicted from that pattern. Brain age prediction has both scientific significance and broad clinical value. Studies have shown that a range of neurological and metabolic conditions, such as schizophrenia, metabolic disorders, diabetes and cardiac decline, are associated with brain aging. The brain age produced by a brain age prediction model is well suited to evaluating how these diseases correlate with brain aging. If a subject's predicted brain age is higher than the actual chronological age, the subject may have an older brain, that is, a higher degree of brain aging that deviates from the normal aging trajectory, with a higher risk of the associated diseases. Conversely, if the predicted brain age is lower than the actual age, the subject may have a younger brain. The difference between the predicted brain age and the actual age of a given individual is referred to as the "brain age difference". This value is believed to reflect diffuse, multivariate morphological changes throughout the brain and may be a marker of overall brain health. The degree to which an individual's brain aging trajectory deviates from the average trajectory of a healthy brain can reflect the subject's risk of developing neurodegenerative disease in the future. Building a model from the brain aging patterns contained in neuroimaging data and tracking the aging trajectory of an individual brain can therefore provide a new way to study how the brain changes during aging and how brain diseases affect normal brain aging.
Structural magnetic resonance imaging (sMRI) can study brain structure non-invasively, and its image data show different structural features in healthy subjects and in patients. These features can be used to predict a subject's brain age and thereby estimate the biological age of the brain. Different brain atlases affect brain age prediction differently, because each atlas divides the whole brain on a different basis.
In recent years, with the development of artificial intelligence, machine-learning-based brain age prediction from imaging data has become one of the key clinical research directions. It can effectively improve the diagnosis rate of diseases and also provides useful guidance for planning treatment. Imaging-based brain age studies have conventionally used various regression models, such as Support Vector Regression (SVR), ridge regression and Random Forest (RF), to predict brain age. Ridge regression is a biased-estimation regression method designed for collinear data. It is essentially a modified least-squares estimator: by giving up the unbiasedness of ordinary least squares, at the cost of losing some information and some precision, it obtains regression coefficients that are more stable and reliable in practice. These studies confirm that brain age prediction algorithms based on magnetic resonance brain structure information are of great help for clinical diagnosis. However, many features related to brain age, such as cortical thickness, cortical area and curvature, can be extracted from structural magnetic resonance images, and combining them yields a high-dimensional feature set that requires feature selection. At present, imaging-based brain age studies commonly reduce dimensionality with principal component analysis, partial least squares and similar methods, but these methods are ambiguous and poorly interpretable, so the choice of feature selection algorithm is important for brain age prediction. In addition, current machine-learning brain age prediction models rarely locate the brain-region features related to brain age, although locating the key brain regions that affect brain age matters for both model interpretability and clinical diagnosis. Meanwhile, measuring the degree of brain aging by the difference between brain age and actual age helps with the early detection and identification of diseases. A new prediction method is therefore needed to solve the above problems.
Disclosure of Invention
The invention aims to address the shortcomings of the prior art by providing a machine learning method for evaluating brain aging caused by diseases based on brain structure images. The method screens out features that generalize across different training subsets and are more concise and effective, and it locates the brain-region structural features that contribute most to brain age prediction. The trained model is then applied to patient data to assess the extent to which the disease affects brain aging.
The invention is realized by the following technical scheme: a machine learning method for evaluating brain aging caused by diseases based on brain structure images, comprising the following steps:
S1: preprocessing the structural brain magnetic resonance images and extracting features;
S2: screening the brain structure features extracted in step S1 using the Bootstrap resampling method;
S3: constructing a ridge regression brain age prediction model from the screening result of S2;
S4: locating the brain regions that contribute most to the predicted brain age using k-fold cross-validation;
S5: training and testing the constructed brain age prediction model;
S6: testing on an independent patient data set to assess the extent to which the disease affects brain aging.
Preferably, the step S1 includes the following substeps:
S1.1: dividing the acquired clinical data into type 0 patients, type 1 patients and healthy people according to clinical manifestations, and removing non-brain structures from the structural brain magnetic resonance images;
S1.2: extracting the magnetic resonance image of the target tissue according to the anatomical structure of the brain;
S1.3: using an image segmentation algorithm to segment the magnetic resonance image of the target tissue into 3 different tissues: gray matter, white matter and cerebrospinal fluid;
S1.4: performing cortical reconstruction on the segmented tissue with the FreeSurfer software package, quantifying the functional, connectional and structural properties of the human brain, reconstructing the structural image in three dimensions to generate a flattened (unfolded) image, and obtaining the anatomical parameters of cortical thickness, curvature, area and gray matter volume for different brain regions using different brain atlases.
Preferably, the step S2 comprises the following substeps:
S2.1: for the healthy group data set, setting the proportion of the data set that each sampled subset takes;
S2.2: setting the number of sampling rounds and the sampling mode, i.e. how many rounds of sampling with replacement are performed;
S2.3: sampling according to the proportion set in S2.1 and the number and mode set in S2.2; for each sampled subset, calculating the Pearson correlation between features such as cortical thickness and cortical surface area and brain age, retaining the structural features with a significant correlation as candidate features, and counting how often each feature appears across the sampling rounds;
S2.4: setting a frequency threshold for features extracted from the sampled subsets, and taking the features that meet the threshold as the final features of the model.
Preferably, the step S3 comprises the following substeps:
S3.1: randomly dividing the healthy group data set into a training set and a test set in a given proportion;
S3.2: standardizing the feature data of the training set and test set, which places data from different sources on a common scale for comparison and, for most models, speeds up convergence when the program runs;
S3.3: defining the value range of the alpha parameter of the ridge regression model;
S3.4: defining R² as the cross-validation evaluation metric, and using k-fold cross-validation to search the alpha value range of the ridge regression model for the optimal parameter, i.e. the one giving the highest model accuracy;
S3.5: taking the optimal parameter as the model parameter under the brain template.
Preferably, the step S4 includes the following substeps:
S4.1: performing k-fold cross-validation on the training set, with R² as the cross-validation evaluation metric;
S4.2: obtaining the feature weights of the k models from the coef_ attribute of the ridge regression models, sorting the weights of each model from largest to smallest, and obtaining the structural features corresponding to the top h feature weights of each of the k models;
S4.3: identifying the features that recur across the k models, thereby locating the brain-region structural features that contribute most to the predicted brain age.
Preferably, the step S5 includes the following substeps:
S5.1: selecting the model with the best test score in the k-fold cross-validation under the n brain templates, and retraining it on the whole training set;
S5.2: testing the model on the test set to obtain the predicted brain age;
S5.3: calculating the MAE, R², Pearson correlation coefficient and average error between the real age and the predicted brain age, using them as the evaluation metrics of the model, and finally selecting the brain age prediction model built on one brain atlas as the optimal model.
Preferably, the step S6 includes the following substeps:
S6.1: normalizing the independent patient group data, loading the trained brain age prediction model, and testing the patient test set with the trained optimal model to obtain the predicted brain age;
S6.2: calculating the average error between the real age and the predicted brain age and comparing it with the average error on the healthy test set; if the average error of the patient test set is higher than that of the healthy group, the disease may cause brain aging in the patients;
S6.3: generating fit lines between the real and predicted values for the healthy group data set and the patient group data set, and comparing the slopes of the two fit lines to show that the patients' brains deviate from the aging trajectory of the healthy brain.
Preferably, the brain atlases used in step S1.4 include AAL, DKT, Destrieux and Brainnetome.
The invention provides a machine learning method for evaluating brain aging caused by diseases based on brain structure images. Structural features of different brain regions, such as cortical thickness and volume, are extracted from structural brain magnetic resonance images. Because not all features help model prediction, the features are screened, and the retained features, which generalize across different training subsets and are more concise and effective, are used to construct a brain age prediction model based on ridge regression. K-fold cross-validation is used to find the features that are repeatedly identified across the k models, which locates the brain-region structural features most relevant to brain age prediction. Finally, the trained model is applied to patient data. The method can thus both locate the brain-region structural features that contribute most to brain age prediction and evaluate the degree to which the disease affects brain aging.
Drawings
FIG. 1 is a flow chart of a machine learning method for assessing disease-induced brain aging based on brain structure images;
FIG. 2 is a schematic diagram of feature screening by Bootstrap resampling;
FIG. 3 shows the results of the brain age prediction model on the test set.
Detailed Description
To make the technical solutions of the present application easier to understand, the invention is further described below with reference to the accompanying drawings. The embodiments described are only some of the embodiments of the present application, not all of them. Other embodiments that a person skilled in the art can derive from the specific embodiments described herein without inventive effort fall within the scope of the present inventive concept.
The invention relates to a machine learning method for evaluating brain aging caused by diseases based on brain structure images, which comprises the following steps:
S1: preprocessing the structural brain magnetic resonance images and extracting features;
S2: screening the brain structure features extracted in step S1 using the Bootstrap resampling method;
S3: constructing a ridge regression brain age prediction model from the screening result of S2;
S4: locating the brain regions that contribute most to the predicted brain age using k-fold cross-validation;
S5: training and testing the constructed brain age prediction model;
S6: testing on an independent patient group data set to assess the extent to which the disease affects brain aging.
In step S1, preprocessing and feature extraction for the structural brain magnetic resonance images proceed as follows. First, the original structural image contains some non-brain structures, such as the skull. Because the skull signal is not used in the subsequent analysis and the signal-to-noise ratio at the image edges is poor, non-brain structures such as the skull are removed during image preprocessing. Then, since magnetic resonance image processing is sometimes concerned only with certain specific regions, the tissue of the target region is extracted according to the anatomical structure of the brain. Because gray matter, white matter and cerebrospinal fluid have different functions in the brain, an image segmentation algorithm is used at this stage to segment the brain image into these 3 tissues. Finally, cortical reconstruction is performed with the FreeSurfer software package, which quantifies the functional, connectional and structural properties of the human brain; the structural image is reconstructed in three dimensions to generate a flattened (unfolded) image, and anatomical parameters such as cortical thickness, curvature, area and gray matter volume are obtained. Four commonly used brain atlases, AAL, DKT, Destrieux and Brainnetome, are each used to obtain the structural features of the brain. The step specifically comprises the following substeps:
S1.1: dividing the acquired clinical data into type 0 patients, type 1 patients and healthy people according to clinical manifestations, and removing non-brain structures from the structural brain magnetic resonance images;
S1.2: extracting the magnetic resonance image of the target tissue according to the anatomical structure of the brain;
S1.3: using an image segmentation algorithm to segment the magnetic resonance image of the target tissue into 3 different tissues: gray matter, white matter and cerebrospinal fluid;
S1.4: performing cortical reconstruction on the segmented tissue with the FreeSurfer software package, quantifying the functional, connectional and structural properties of the human brain, reconstructing the structural image in three dimensions to generate a flattened (unfolded) image, and obtaining the anatomical parameters of cortical thickness, curvature, area and gray matter volume for different brain regions using different brain atlases.
In step S2, feature screening by the Bootstrap resampling method proceeds as follows. First, for the healthy group data set, the proportion of the data set taken by each sampled subset is set to r%, i.e. r% of the samples are used to build each sampled set. Second, the number of sampling rounds and the sampling mode are set, i.e. s rounds of sampling without replacement are performed. Then the Pearson correlation between each feature and brain age is calculated, the structural features with a significant correlation (p < 0.05) are retained as candidate features, and the number of times each feature appears across the s sampling rounds is counted. Finally, the frequency threshold for features extracted from the sampled subsets is set to t, i.e. features that are selected t times in the s sampling rounds become the final features of the model. The final feature set has size m × n, where m is the number of samples and n is the dimension of the structural features of each sample. The screened features generalize across different training subsets and are more concise and effective. The step specifically comprises the following substeps:
S2.1: for the healthy group data set, setting the proportion of the data set that each sampled subset takes;
S2.2: setting the number of sampling rounds and the sampling mode, i.e. how many rounds of sampling with replacement are performed;
S2.3: sampling according to the proportion set in S2.1 and the number and mode set in S2.2; for each sampled subset, calculating the Pearson correlation between features such as cortical thickness and cortical surface area and brain age, retaining the structural features with a significant correlation as candidate features, and counting how often each feature appears across the sampling rounds;
S2.4: setting a frequency threshold for features extracted from the sampled subsets, and taking the features that meet the threshold as the final features of the model.
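The screening in S2.1 to S2.4 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `bootstrap_screen` and all parameter names are hypothetical, and it assumes features are held in a NumPy array with one row per healthy subject. The text describes the sampling mode both as with and as without replacement, so the sketch exposes that choice as a `replace` flag rather than fixing it, and it reads "selected t times" as "at least t times".

```python
import numpy as np
from scipy.stats import pearsonr


def bootstrap_screen(X, age, ratio=0.8, n_draws=100, min_count=90,
                     p_threshold=0.05, replace=False, seed=0):
    """Keep features that correlate significantly with age in at least
    `min_count` of `n_draws` resampled subsets.

    ratio, n_draws and min_count play the roles of r%, s and t in the text;
    the default values here are placeholders, not values from the patent.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    subset_size = int(round(ratio * n_samples))
    counts = np.zeros(n_features, dtype=int)

    for _ in range(n_draws):
        # Draw one subset of subjects (S2.1, S2.2).
        idx = rng.choice(n_samples, size=subset_size, replace=replace)
        for j in range(n_features):
            # Pearson correlation between feature j and age on this subset (S2.3).
            _, p_value = pearsonr(X[idx, j], age[idx])
            if p_value < p_threshold:
                counts[j] += 1

    # Features that pass the frequency threshold (S2.4).
    selected = np.flatnonzero(counts >= min_count)
    return selected, counts
```

Calling `selected, counts = bootstrap_screen(X_healthy, age_healthy)` returns the column indices to keep, so `X_healthy[:, selected]` would be the reduced feature matrix passed to the regression step.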
In step S3, the ridge regression brain age prediction model is constructed as follows. First, the healthy group data set is randomly divided into a training set and a test set in the ratio a:b. Second, the feature data of the training set and test set are standardized so that the standardized data have a mean of 0 and a standard deviation of 1. Then the value range of the alpha parameter of the ridge regression model is defined (0.01, 0.1, 1, 3, ..., 60), R² is defined as the cross-validation evaluation metric, and k-fold cross-validation is used to search the value range for the optimal parameter. Finally, the optimal parameters are taken as the model parameters under the 4 brain templates. The step specifically comprises the following substeps:
S3.1: randomly dividing the healthy group data set into a training set and a test set in a given proportion;
S3.2: standardizing the feature data of the training set and test set, which places data from different sources on a common scale for comparison and, for most models, speeds up convergence when the program runs;
S3.3: defining the value range of the alpha parameter of the ridge regression model;
S3.4: defining R² as the cross-validation evaluation metric, and using k-fold cross-validation to search the alpha value range of the ridge regression model for the optimal parameter, i.e. the one giving the highest model accuracy;
S3.5: taking the optimal parameter as the model parameter under the brain template.
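One way to realize S3.1 to S3.5 is with scikit-learn, which the `coef_` attribute mentioned in step S4 suggests but the text does not name explicitly. The sketch below is an assumption-laden illustration: the function name `fit_ridge_brain_age` is hypothetical, and wrapping the scaler in a pipeline (so that it is fitted on the training folds only) is a design choice of this sketch rather than something the text specifies.

```python
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_ridge_brain_age(X, age, alphas, test_size=0.2, k=5, seed=0):
    """Split the healthy data, then pick the ridge alpha by k-fold CV on R^2."""
    # S3.1: random train/test split (test_size stands in for the a:b ratio).
    X_train, X_test, y_train, y_test = train_test_split(
        X, age, test_size=test_size, random_state=seed)

    # S3.2: standardize to zero mean and unit variance, then ridge regression.
    pipeline = make_pipeline(StandardScaler(), Ridge())

    # S3.3, S3.4: search the alpha grid with k-fold cross-validation, scoring R^2.
    search = GridSearchCV(pipeline, {"ridge__alpha": list(alphas)},
                          scoring="r2", cv=k)
    search.fit(X_train, y_train)

    # S3.5: search.best_params_ holds the chosen alpha; by default GridSearchCV
    # also refits the best pipeline on the whole training set.
    return search, (X_test, y_test)
```

The alpha grid is passed in by the caller because the text abbreviates its grid with an ellipsis. A call such as `fit_ridge_brain_age(X, age, alphas=[0.01, 0.1, 1, 3, 10, 60])` uses illustrative values only, and the function would be run once per brain atlas.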
In step S4, the brain regions that contribute most to predicting brain age are located with k-fold cross-validation as follows. First, k-fold cross-validation is performed on the training set, with R² as the cross-validation evaluation metric. Then the feature weights of the k models are obtained from the coef_ attribute of the ridge regression models, the weights of each model are sorted from largest to smallest, and the structural features corresponding to the top h feature weights of each of the k models are obtained. Finally, the features that recur across the k models are identified, which locates the brain-region structural features most relevant to brain age prediction. The step specifically comprises the following substeps:
S4.1: performing k-fold cross-validation on the training set, with R² as the cross-validation evaluation metric;
S4.2: obtaining the feature weights of the k models from the coef_ attribute of the ridge regression models, sorting the weights of each model from largest to smallest, and obtaining the structural features corresponding to the top h feature weights of each of the k models;
S4.3: identifying the features that recur across the k models, thereby locating the brain-region structural features that contribute most to the predicted brain age.
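The localization in S4.1 to S4.3 might look like the sketch below. Two points are assumptions of this sketch and not stated in the text: weights are ranked by absolute value (so that strongly negative coefficients also count as large contributions), and "recurring" is read as "present in the top h of all k models". The function name `recurring_top_features` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler


def recurring_top_features(X, age, feature_names, alpha=1.0, k=5, h=10, seed=0):
    """Return the features ranked in the top h of every one of the k fold models."""
    top_sets = []
    folds = KFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, _ in folds.split(X):
        # Fit one ridge model per fold on standardized training data (S4.1).
        scaler = StandardScaler().fit(X[train_idx])
        model = Ridge(alpha=alpha).fit(scaler.transform(X[train_idx]),
                                       age[train_idx])
        # Rank features by absolute weight and keep the top h (S4.2).
        order = np.argsort(-np.abs(model.coef_))[:h]
        top_sets.append({feature_names[i] for i in order})
    # Features shared by all k top-h sets (S4.3).
    return sorted(set.intersection(*top_sets))
```

A looser reading of "recur", for example "appears in at least half of the k models", would replace the set intersection with a frequency count.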
In step S5, the constructed brain age prediction model is trained and tested as follows. First, the model with the best test score in the k-fold cross-validation under the n brain templates is selected and retrained on the whole training set. Second, the model is tested on the test set to obtain the predicted brain age. Finally, the MAE, R², Pearson correlation coefficient and average error between the real age and the predicted brain age are calculated and used as the evaluation metrics of the model, and the brain age prediction model built on one brain atlas is selected as the optimal model. The step specifically comprises the following substeps:
S5.1: selecting the model with the best test score in the k-fold cross-validation under the n brain templates, and retraining it on the whole training set;
S5.2: testing the model on the test set to obtain the predicted brain age;
S5.3: calculating the MAE, R², Pearson correlation coefficient and average error between the real age and the predicted brain age, using them as the evaluation metrics of the model, and finally selecting the brain age prediction model built on one brain atlas as the optimal model.
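The four evaluation metrics in S5.3 can be computed as in this short sketch. The text does not define "average error"; the sketch assumes it is the signed mean of predicted minus real age, which is one common reading, and the function name `brain_age_metrics` is hypothetical.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error, r2_score


def brain_age_metrics(age_true, age_pred):
    """MAE, R^2, Pearson r and signed mean error between real and predicted age."""
    age_true = np.asarray(age_true, dtype=float)
    age_pred = np.asarray(age_pred, dtype=float)
    r, _ = pearsonr(age_true, age_pred)
    return {
        "mae": mean_absolute_error(age_true, age_pred),
        "r2": r2_score(age_true, age_pred),
        "pearson_r": r,
        # Assumption: "average error" = mean(predicted - real), in years.
        "mean_error": float(np.mean(age_pred - age_true)),
    }
```

Computing this dictionary once per atlas on the healthy test set gives a like-for-like basis for choosing the optimal model.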
In step S6, testing on the independent patient group data set to assess the extent to which the disease affects brain aging proceeds as follows. First, the patient group data set has size p × n, where p is the number of patient samples and n is the structural feature dimension of each sample. The patient data are normalized, the trained brain age prediction model is loaded, and the patient test set is tested with the trained optimal model to obtain the predicted brain age. Then the average error between the real age and the predicted brain age is calculated and compared with the average error on the healthy test set; if the average error of the patient test set is higher than that of the healthy group, the disease may cause brain aging in the patients. Finally, to further verify that the brains of the patient group show aging, fit lines between the real and predicted values are generated for the healthy group and the patient group, and the slopes of the two fit lines are compared to show that the patients' brains deviate from the aging trajectory of the healthy brain. The step specifically comprises the following substeps:
S6.1: normalizing the independent patient group data, loading the trained brain age prediction model, and testing the patient test set with the trained optimal model to obtain the predicted brain age;
S6.2: calculating the average error between the real age and the predicted brain age and comparing it with the average error on the healthy test set; if the average error of the patient test set is higher than that of the healthy group, the disease may cause brain aging in the patients;
S6.3: generating fit lines between the real and predicted values for the healthy group data set and the patient group data set, and comparing the slopes of the two fit lines to show that the patients' brains deviate from the aging trajectory of the healthy brain.
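The comparison in S6.2 and S6.3 can be sketched as below, assuming the predictions for both groups have already been produced by the trained model. The function name `compare_groups` is hypothetical, "average error" is again assumed to be the signed mean of predicted minus real age, and the fit line is an ordinary least-squares line of predicted age against real age.

```python
import numpy as np


def compare_groups(age_healthy, pred_healthy, age_patient, pred_patient):
    """Compare mean error and fit-line slope between healthy and patient groups."""

    def summarize(age, pred):
        age = np.asarray(age, dtype=float)
        pred = np.asarray(pred, dtype=float)
        # Least-squares line pred = slope * age + intercept (S6.3).
        slope, intercept = np.polyfit(age, pred, deg=1)
        return {
            "mean_error": float(np.mean(pred - age)),  # S6.2
            "slope": float(slope),
            "intercept": float(intercept),
        }

    healthy = summarize(age_healthy, pred_healthy)
    patient = summarize(age_patient, pred_patient)
    return {
        "healthy": healthy,
        "patient": patient,
        # Positive gap: patients are predicted older, on average, than healthy people.
        "mean_error_gap": patient["mean_error"] - healthy["mean_error"],
    }
```

For the normalization in S6.1, a natural choice would be to reuse the scaler fitted on the healthy training data rather than fitting a new one on the patients, although the text does not say which is intended.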
In general, the present invention provides a machine learning method for evaluating brain aging caused by diseases based on brain structure images. The method can locate the brain-region structural features that contribute most to brain age prediction and, by applying the trained model to patient data, can evaluate the degree to which the disease affects brain aging. As shown in FIG. 1, the overall flow is as follows. First, the structural magnetic resonance data are preprocessed and features are extracted: the skull and other non-brain tissue are removed from the brain image, the image is segmented into gray matter, white matter and cerebrospinal fluid, and cortical reconstruction is performed on the structural image to generate a flattened image. After preprocessing, structural features such as cortical thickness, area, curvature and volume are computed for the different brain regions and combined into brain-region feature sets. Second, the combined features of the different brain regions are screened with the Bootstrap resampling method, i.e. features selected t times in s sampling rounds are taken as the final features of the model; these features generalize across different training subsets and are more concise and effective. Third, a ridge regression brain age prediction model is constructed, and the optimal ridge regression alpha parameter is obtained by k-fold cross-validation. Fourth, k-fold cross-validation is used to identify the features that recur across the k models, which locates the brain-region structural features most relevant to brain age prediction. The model is then trained and tested on the healthy group data set. Finally, the independent patient group data set is tested with the trained brain age prediction model to show that the patients' brains exhibit signs of aging.
Example 1
This example uses clinical data collected by hospitals, classified into type 0 patients, type 1 patients and healthy people according to clinical manifestations. After collation and quality control, the final cohort comprises 138 type 0 patients, 94 type 1 patients and 109 healthy people.
The specific implementation process of the method comprises the following steps:
(1) Preprocessing the brain structural magnetic resonance images and extracting features: First, the original structural image of the magnetic resonance data contains some non-brain structures, such as the skull. Since the skull signal is not used in the subsequent analysis and the signal-to-noise ratio at the image edge is poor, non-brain structures such as the skull are removed during image preprocessing. Next, magnetic resonance image processing is sometimes concerned only with certain specific regions, which requires extracting the tissue of the target region according to the anatomical structure of the brain. Because gray matter, white matter and cerebrospinal fluid have different functions in the brain, an image segmentation algorithm is used in this step to segment the brain image into these 3 tissues. Finally, cortical reconstruction is performed with the FreeSurfer software package, which quantifies the functional, connectional and structural properties of the human brain; the structural image is reconstructed in three dimensions to generate a flattened or unfolded image, and anatomical parameters such as cortical thickness, curvature, area and gray matter volume are obtained. Four commonly used brain atlases, namely AAL, DKT, Destrieux and Brainnetome, are used to obtain the structural features of the brain.
(2) As shown in Fig. 2, feature screening is performed with the Bootstrap method: First, for the healthy group dataset, the proportion of the dataset used for each sampling subset is set to r% (80% in this example), i.e., r% of the samples are used to construct each sampled set. Second, the number and mode of sampling are set, i.e., s rounds of sampling without replacement are executed. Then, the Pearson correlation between each feature and brain age is calculated, the structural features with a significant correlation (p < 0.05) are retained as candidate features, and the number of times each feature appears across the s samplings is counted. Finally, the occurrence frequency required of the features extracted from the sampling subsets is set to t, i.e., features that are selected t times in the s samplings are taken as the final features of the model. The final feature set has size m × n, where m is the number of samples and n is the structural feature dimension of each sample. The features screened in this way generalize across different training subsets and form a more compact and effective set.
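The screening in step (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name bootstrap_screen, the default values of s and t, the rule of keeping features that reach at least t occurrences, and the synthetic data are all assumptions made for the example. NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy.stats import pearsonr


def bootstrap_screen(X, age, r=0.8, s=100, t=80, p_thresh=0.05, seed=0):
    """Keep features whose Pearson correlation with age is significant
    (p < p_thresh) in at least t of s subsamples drawn without replacement.

    X is an (m, n) feature matrix and age an (m,) vector. The function name
    and the defaults for s and t are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    age = np.asarray(age, dtype=float)
    rng = np.random.default_rng(seed)
    m, n = X.shape
    subset_size = int(round(r * m))   # r = 0.8 mirrors the 80% in step (2)
    counts = np.zeros(n, dtype=int)   # times each feature was significant
    for _ in range(s):
        # One round of sampling without replacement.
        idx = rng.choice(m, size=subset_size, replace=False)
        for j in range(n):
            _, p_value = pearsonr(X[idx, j], age[idx])
            if p_value < p_thresh:
                counts[j] += 1
    selected = np.flatnonzero(counts >= t)
    return selected, counts


# Synthetic demonstration data (not from the source).
rng = np.random.default_rng(42)
age = rng.uniform(20, 80, size=120)
X = rng.normal(size=(120, 6))
# Feature 0 is made to decline with age; the others are pure noise.
X[:, 0] = 3.0 - 0.02 * age + rng.normal(scale=0.1, size=120)
selected, counts = bootstrap_screen(X, age, s=50, t=45)
print("selected feature indices:", selected)
print("significance counts:", counts)
```

Counting significance per subsample, rather than testing once on the full healthy group, is what makes the retained features stable across training subsets.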
(3) Constructing the ridge regression brain age prediction model: First, the healthy group dataset is randomly divided into a training set and a test set in the ratio a:b. Second, the feature data of the training set and the test set are standardized so that the standardized data have a mean of 0 and a standard deviation of 1. Then, the value range of the alpha parameter of the ridge regression model is defined as (0.01, 0.1, 1, 3, ..., 60), the cross-validation evaluation metric is defined as R², and the optimal parameter within this range is searched for by k-fold cross-validation. Finally, the optimal parameters are used as the model parameters under the 4 brain templates. The ridge regression brain age prediction results are shown in Fig. 3.
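Step (3) maps naturally onto scikit-learn, whose Ridge estimator exposes the alpha parameter named in the text. The sketch below is one possible arrangement, not the implementation of the invention. The alpha grid is an illustrative assumption, because the source elides part of its range. The 80/20 split, k = 5, the function name fit_brain_age_ridge and the synthetic data are likewise assumptions. The scaler is fitted inside a pipeline so that each cross-validation fold is standardized with training statistics only, which is a design choice of this sketch.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative grid only: the source gives (0.01, 0.1, 1, 3, ..., 60)
# without listing the intermediate values.
ALPHAS = [0.01, 0.1, 1, 3, 10, 30, 60]


def fit_brain_age_ridge(X, age, alphas=ALPHAS, test_size=0.2, k=5, seed=0):
    """Split the healthy group, standardize, and pick alpha by k-fold CV on R^2."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, age, test_size=test_size, random_state=seed)
    # make_pipeline names the Ridge step "ridge", hence "ridge__alpha".
    pipeline = make_pipeline(StandardScaler(), Ridge())
    search = GridSearchCV(
        pipeline, {"ridge__alpha": list(alphas)}, scoring="r2", cv=k)
    search.fit(X_train, y_train)              # refits the best model on all training data
    test_r2 = search.score(X_test, y_test)    # R^2 on the held-out test set
    return search, test_r2


# Synthetic demonstration data (not from the source).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 8))
weights = np.array([5.0, -4.0, 3.0, -2.0, 1.0, 0.0, 0.0, 0.0])
age = 50 + X @ weights + rng.normal(scale=2.0, size=150)

search, test_r2 = fit_brain_age_ridge(X, age)
print("best alpha:", search.best_params_["ridge__alpha"])
print("test R^2:", round(test_r2, 3))
```

In the setting of the text, this search would be repeated once per brain atlas, giving one set of optimal parameters per template.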
(4) Locating the brain regions that contribute most to the predicted brain age using k-fold cross-validation: First, k-fold cross-validation is performed on the training set, i.e., the training set is divided into k parts, and in turn k-1 parts are used as training data and 1 part as test data, with R² as the cross-validation evaluation metric. Then, the feature weights of the k models are obtained from the coef_ attribute of the ridge regression model, the weights of each model are sorted in descending order, and the structural features of the different brain regions corresponding to the first h feature weights of each of the k models are obtained. Finally, the features that recur across the k models are identified, locating the brain-region structural features that contribute strongly to brain age prediction; the larger the feature weight, the greater the contribution of that brain-region structural feature to brain age prediction.
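The fold-wise ranking in step (4) can be sketched as below. The text only says that the weights are sorted from large to small; ranking by absolute value, so that strongly negative weights also count, is an assumption of this sketch. The function name recurring_top_features, the values of alpha, k and h, and the synthetic data are also assumptions.

```python
from collections import Counter

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler


def recurring_top_features(X, age, alpha=1.0, k=5, h=3, seed=0):
    """Return the features that rank in the top h of every one of k fold models,
    together with the R^2 score of each fold."""
    X = np.asarray(X, dtype=float)
    age = np.asarray(age, dtype=float)
    counts = Counter()
    fold_r2 = []
    folds = KFold(n_splits=k, shuffle=True, random_state=seed)
    for train_idx, test_idx in folds.split(X):
        scaler = StandardScaler().fit(X[train_idx])
        model = Ridge(alpha=alpha).fit(scaler.transform(X[train_idx]), age[train_idx])
        fold_r2.append(model.score(scaler.transform(X[test_idx]), age[test_idx]))
        # Assumption: rank by absolute weight, largest first.
        top_h = np.argsort(-np.abs(model.coef_))[:h]
        counts.update(top_h.tolist())
    recurring = sorted(j for j, c in counts.items() if c == k)
    return recurring, fold_r2


# Synthetic demonstration data (not from the source).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 8))
weights = np.array([5.0, -4.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0])
age = 50 + X @ weights + rng.normal(scale=1.0, size=150)

features, scores = recurring_top_features(X, age, h=3)
print("features in the top h of all folds:", features)
print("fold R^2 scores:", [round(v, 3) for v in scores])
```

Each returned index would then be mapped back to its atlas region and structural measure, for example the cortical thickness of one region.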
(5) Training and testing the constructed brain age prediction model: First, under each of the 4 brain templates, the model with the best test score in the k-fold cross-validation is selected and retrained on the whole training set. Second, the model is evaluated on the test set to obtain the predicted brain age. Finally, the MAE, R², Pearson correlation coefficient and mean error between the actual age and the predicted brain age are calculated and used as the evaluation metrics of the model, and the brain age prediction model built on the best-performing brain atlas is selected as the optimal model.
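The four evaluation metrics in step (5) can be computed as in the following sketch. The sign convention for the mean error (predicted brain age minus actual age) is an assumption, since the text does not state it; the function name brain_age_metrics and the sample values are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr


def brain_age_metrics(age_true, age_pred):
    """MAE, R^2, Pearson r and mean error between actual and predicted age."""
    age_true = np.asarray(age_true, dtype=float)
    age_pred = np.asarray(age_pred, dtype=float)
    error = age_pred - age_true   # assumed convention: positive means "older" brain
    ss_res = np.sum(error ** 2)
    ss_tot = np.sum((age_true - age_true.mean()) ** 2)
    r, _ = pearsonr(age_true, age_pred)
    return {
        "mae": float(np.mean(np.abs(error))),
        "r2": float(1.0 - ss_res / ss_tot),
        "pearson_r": float(r),
        "mean_error": float(np.mean(error)),
    }


# Illustrative values (not from the source).
metrics = brain_age_metrics([60, 70, 80], [62, 71, 83])
print(metrics)
```

Computing the same dictionary for each atlas-specific model gives a direct basis for choosing the optimal model.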
(6) Testing on the independent patient dataset to assess the extent to which the disease affects brain aging: First, the patient group dataset has size p × n, where p is the number of patient samples and n is the structural feature dimension of each sample. The patient data are normalized, the trained brain age prediction model is loaded, and the patient test set is tested with the trained optimal model to obtain the brain age predicted by the model. Then, the mean error between the actual age and the predicted brain age is calculated and compared with the mean error obtained on the healthy test set; if the mean error of the patient test set is higher than that of the healthy group, this indicates that the disease causes brain aging in the patients. Finally, to further verify that the brains of the patient group show aging, fitted lines between the actual values and the predicted values are generated for the healthy group and for the patient group, and the slopes of the two fitted lines are compared to show that the patients' brains deviate from the aging trajectory of healthy brains.
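The group comparison in step (6) reduces to two quantities per group: the mean error and the slope of a straight line fitted to predicted age against actual age. The sketch below assumes the predictions have already been produced by the trained model, that the error is taken as predicted minus actual age, and that an ordinary least-squares line is an acceptable fit; the function name compare_groups and the numbers are illustrative.

```python
import numpy as np


def compare_groups(age_healthy, pred_healthy, age_patient, pred_patient):
    """Mean error and fitted-line slope for the healthy and the patient group."""
    result = {}
    groups = {
        "healthy": (age_healthy, pred_healthy),
        "patient": (age_patient, pred_patient),
    }
    for name, (age, pred) in groups.items():
        age = np.asarray(age, dtype=float)
        pred = np.asarray(pred, dtype=float)
        # Degree-1 polyfit returns [slope, intercept].
        slope, intercept = np.polyfit(age, pred, 1)
        result[name + "_mean_error"] = float(np.mean(pred - age))
        result[name + "_slope"] = float(slope)
        result[name + "_intercept"] = float(intercept)
    return result


# Illustrative, noise-free values (not from the source).
ages = [30, 40, 50, 60, 70]
res = compare_groups(
    ages, [30, 40, 50, 60, 70],   # healthy: predicted equals actual age
    ages, [32, 44, 56, 68, 80],   # patients: predicted = 1.2 * age - 4
)
print(res)
```

In this toy case the patient group has the larger mean error and the steeper slope, which is the pattern the text reads as deviation from the healthy aging trajectory.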
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made without departing from the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A machine learning method for evaluating brain aging caused by diseases based on brain structure images, characterized by comprising the following steps:
S1: preprocessing brain structural magnetic resonance images and extracting features from them;
S2: screening the brain structural features extracted in step S1 using the Bootstrap method;
S3: constructing a ridge regression brain age prediction model according to the screening result of S2;
S4: locating the brain regions that contribute most to the predicted brain age using k-fold cross-validation;
S5: training and testing the constructed brain age prediction model;
S6: testing on an independent patient group dataset to assess the extent to which the disease affects brain aging;
step S2 comprises the following sub-steps:
S2.1: for the healthy group dataset, setting the proportion of the healthy group dataset used for each sampling subset;
S2.2: setting the number of samplings and the sampling mode;
S2.3: sampling according to the proportion set in S2.1 and the number and mode set in S2.2, calculating the Pearson correlation between the cortical thickness and cortical surface area features of the sampled subset and brain age, retaining the structural features with a significant correlation as candidate features, and counting how often each feature appears across the samplings;
S2.4: setting a frequency value for the features extracted from the sampling subsets, and taking the features screened according to this frequency value as the final features of the model;
step S3 comprises the following sub-steps:
S3.1: randomly dividing the healthy group dataset into a training set and a test set in a set proportion;
S3.2: standardizing the feature data of the training set and the test set;
S3.3: defining the value range of the alpha parameter of the ridge regression model;
S3.4: defining the cross-validation evaluation metric as R², and using k-fold cross-validation to search the value range of the alpha parameter of the ridge regression model for the optimal parameter, i.e., the parameter giving the highest model accuracy;
S3.5: taking the optimal parameters as the model parameters under a brain template;
step S4 comprises the following sub-steps:
S4.1: performing k-fold cross-validation on the training set, with R² as the cross-validation evaluation metric;
S4.2: obtaining the feature weights of the k models from the coef_ attribute of the ridge regression model, sorting the weights of each model in descending order, and obtaining the structural features corresponding to the first h feature weights of each of the k models;
S4.3: identifying the features that recur across the k models, and locating the brain-region structural features that contribute most to brain age prediction;
step S5 comprises the following sub-steps:
S5.1: under each of the n brain templates, selecting the model with the best test score in the k-fold cross-validation and retraining it on the whole training set;
S5.2: evaluating the model on the test set to obtain the predicted brain age;
S5.3: calculating the MAE, R², Pearson correlation coefficient and mean error between the actual age and the predicted brain age, using them as the evaluation metrics of the model, and selecting the brain age prediction model built on the best-performing brain atlas as the optimal model.
2. The machine learning method for evaluating brain aging caused by diseases based on brain structure images according to claim 1, wherein step S1 comprises the following sub-steps:
S1.1: dividing the acquired clinical data into type 0 patients, type 1 patients and healthy persons according to clinical manifestations, and removing the non-brain structures from the brain structural magnetic resonance images;
S1.2: extracting the magnetic resonance image of the target tissue according to the anatomical structure of the brain;
S1.3: using an image segmentation algorithm to segment the magnetic resonance image of the target tissue into 3 different tissues, namely gray matter, white matter and cerebrospinal fluid;
S1.4: performing cortical reconstruction on the segmented tissues with the FreeSurfer software package, quantifying the functional, connectional and structural properties of the human brain, reconstructing the structural image in three dimensions to generate a flattened or unfolded image, and using different brain atlases to obtain the anatomical parameters of cortical thickness, curvature, area and gray matter volume for the different brain regions.
3. The machine learning method for evaluating brain aging caused by diseases based on brain structure images according to claim 1, wherein step S6 comprises the following sub-steps:
S6.1: normalizing the independent patient group data, loading the trained brain age prediction model, and testing the patient test set with the trained optimal model to obtain the brain age predicted by the model;
S6.2: calculating the mean error between the actual age and the predicted brain age and comparing it with the mean error obtained on the healthy test set, wherein a mean error of the patient test set higher than that of the healthy group indicates that the disease causes brain aging in the patients;
S6.3: generating fitted lines between the actual values and the predicted values for the healthy group dataset and for the patient group dataset, and comparing the slopes of the two fitted lines to show that the patients' brains deviate from the aging trajectory of healthy brains.
4. The machine learning method for evaluating brain aging caused by diseases based on brain structure images according to claim 2, wherein the brain atlases used in step S1.4 include AAL, DKT, Destrieux and Brainnetome.
CN202211276691.9A 2022-10-19 2022-10-19 Machine learning method for evaluating brain aging caused by diseases based on brain structure images Active CN115337000B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211276691.9A CN115337000B (en) 2022-10-19 2022-10-19 Machine learning method for evaluating brain aging caused by diseases based on brain structure images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211276691.9A CN115337000B (en) 2022-10-19 2022-10-19 Machine learning method for evaluating brain aging caused by diseases based on brain structure images

Publications (2)

Publication Number Publication Date
CN115337000A CN115337000A (en) 2022-11-15
CN115337000B CN115337000B (en) 2022-12-20

Family

ID=83957453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211276691.9A Active CN115337000B (en) 2022-10-19 2022-10-19 Machine learning method for evaluating brain aging caused by diseases based on brain structure images

Country Status (1)

Country Link
CN (1) CN115337000B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117558452B (en) * 2024-01-11 2024-03-26 北京大学人民医院 MODS risk assessment model construction method, device, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8628331B1 (en) * 2010-04-06 2014-01-14 Beth Ann Wright Learning model for competency based performance
CN105046709A (en) * 2015-07-14 2015-11-11 华南理工大学 Nuclear magnetic resonance imaging based brain age analysis method
CN113616184A (en) * 2021-06-30 2021-11-09 北京师范大学 Brain network modeling and individual prediction method based on multi-mode magnetic resonance image
CN113892936A (en) * 2021-09-24 2022-01-07 天津大学 Interpretable brain age prediction method based on full convolution neural network
CN115049629A (en) * 2022-06-27 2022-09-13 太原理工大学 Multi-mode brain hypergraph attention network classification method based on line graph expansion

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220122250A1 (en) * 2020-10-19 2022-04-21 Northwestern University Brain feature prediction using geometric deep learning on graph representations of medical image data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
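Estimation of brain age delta from brain imaging; Stephen M. Smith et al.; ScienceDirect; 20190612; full text *
Factors influencing brain age prediction based on functional connectivity; Zhou Zhen et al.; Wanfang Data; 20210823; full text *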

Also Published As

Publication number Publication date
CN115337000A (en) 2022-11-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant