CN113707324A - Analysis method for evaluating correlation between brain functional connectivity and clinical symptoms

Publication number
CN113707324A
Authority
CN
China
Prior art keywords
data
correlation
analysis
brain
age
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111095382.7A
Other languages
Chinese (zh)
Inventor
李飞
龚启勇
顾实
李斌
李倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
West China Hospital of Sichuan University
Original Assignee
University of Electronic Science and Technology of China
West China Hospital of Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China and West China Hospital of Sichuan University

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/16 Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
    • A61B5/165 Evaluating the state of mind, e.g. depression, anxiety
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/40 Detecting, measuring or recording for evaluating the nervous system
    • A61B5/4058 Detecting, measuring or recording for evaluating the nervous system for evaluating the central nervous system
    • A61B5/4064 Evaluating the brain
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/40 Detecting, measuring or recording for evaluating the nervous system
    • A61B5/4076 Diagnosing or monitoring particular conditions of the nervous system
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Abstract

The invention discloses an analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms, which comprises the steps of (1) screening high-dimensional data; (2) calculating an optimal sparsity coefficient; and (3) processing the data using sparse canonical correlation analysis. The method is used to analyze the canonical correlation between MR brain functional connectivity and the clinical symptom scores of schizophrenia patients, and gives a comprehensive view of the linear correlation between brain function and symptom severity, thereby helping to explain the pathogenesis of schizophrenia.

Description

Analysis method for evaluating correlation between brain functional connectivity and clinical symptoms
Technical Field
The invention relates to the field of medical data analysis methods, in particular to an analysis method for evaluating correlation between brain functional connectivity and clinical symptoms.
Background
Schizophrenia is a severe psychotic disorder characterized by hallucinations, delusions, disorganized thinking, lack of emotion and movement, and deterioration of social functioning. Onset typically occurs between the ages of 16 and 25; the prevalence in the population is about 1%, and the lifetime prevalence exceeds 1%. Schizophrenia is a serious, chronic and disabling mental disorder: it impairs the mood, mental state and mental health of the individual, disrupts family, interpersonal and social relationships, and carries a heavy disease burden. In China, schizophrenia ranks second among mental disorders in years lived with disability (YLDs); globally, it ranks first among mental disorders in YLDs. However, the neuropathophysiological mechanisms of schizophrenia are not yet fully understood.
At present, schizophrenia is diagnosed mainly through a psychiatrist's subjective interview and a qualitative evaluation of the patient's behavioral symptoms using diagnostic scales. An objective biomarker that can be used for clinical diagnosis is lacking, and diagnostic accuracy depends heavily on the ability and experience of the psychiatrist. If the patient also presents symptoms of other mental disorders, such as major depression, mania or compulsive behavior, misdiagnosis and missed diagnosis become more likely. In recent years, magnetic resonance imaging (MRI), being noninvasive, multi-modal and high-resolution, has been widely used to explore the neuropathophysiological mechanisms of mental disorders and to search for objective indices for clinical diagnosis and treatment. Previous brain imaging studies have found that abnormalities of brain structure and function in schizophrenia patients correlate with clinical symptoms. For example, reduced thickness of the orbitofrontal cortex correlates with the severity of negative symptoms; reduced striatal volume is associated with apathy symptoms; enhanced functional connectivity between the medial prefrontal lobe and other brain areas of the default mode network is positively correlated with positive symptoms; and reduced clustering coefficients and local efficiency of the brain functional network are associated with poor performance on working memory tasks.
However, most conventional correlation analyses are univariate: they compute correlations only for variables that differ between groups. Pearson correlation analysis, for example, is a simple model in which the relationships among variables are not refined and fixed into a model, so the correlation cannot be used to predict data. Such methods are not suited to effectively mining the correlation information between the high-dimensional MRI brain functional connectivity data of schizophrenia patients and multiple clinical symptom scores.
Canonical correlation analysis (CCA) is a multivariate statistical method that uses the correlation between pairs of projection vectors to reflect the overall correlation between two groups of variables measured in the same individuals, and it can effectively extract the correlation information between the two groups. However, functional connectivity matrices based on the whole brain are typically high-dimensional, and traditional canonical correlation analysis tends to overfit them. Sparse methods, in the broad sense, cover not only sparsity of model parameters (generally by regularization) but also sparsity of features (that is, feature selection). They can screen multi-feature, high-dimensional data, remove redundant information and increase the interpretability of the result, in that the retained information better reflects the characteristics of the original data, while avoiding the overfitting and increased learning difficulty caused by the "curse of dimensionality".
In 2018, Xia et al. first sparsified the features using the median absolute deviation (MAD) and then sparsified the parameters using sparse canonical correlation analysis (SCCA). They found that four dimensions of psychopathology were associated with distinct patterns of brain network connectivity: the mood dimension (including depressive symptoms, suicidality, irritability from the mania section, and self-harm) was associated with enhanced connectivity between the ventral attention network and the salience network; the psychosis dimension (including psychosis spectrum symptoms and mania) was associated with connectivity between the default mode network and the executive networks (including the salience network and the frontoparietal network); the fear dimension (including social phobia and agoraphobia symptoms) was associated with enhanced connectivity within the frontoparietal network; and the externalizing behavior dimension (including attention deficit disorder, oppositional defiant disorder symptoms, and irritability from the depression section) was associated with reduced connectivity between the salience network and the default mode network and with enhanced connectivity between the salience network and the frontoparietal network. Weakened connectivity between the default mode network and the executive networks was a feature common to all dimensions of psychopathology. These findings help explain the high comorbidity of mental disorders and the large heterogeneity within individual diagnoses. Therefore, studying the canonical correlation between MR brain functional connectivity and clinical symptoms in schizophrenia patients with a sparse multivariate correlation analysis method gives a comprehensive view of the correlation between brain function and clinical symptoms, which helps to clarify the neurobiological pathogenesis of schizophrenia.
Xia et al. extract features with the MAD method, compute the correlation between the v sets of the canonical vector pairs generated from each bootstrap sample set and the v sets of the canonical vector pairs of the previously obtained overall result, and sort by correlation magnitude; the v sets with the largest correlation are regarded as belonging to the same canonical vector pair. When the canonical vector pairs of a bootstrap sample set do not correspond one-to-one with those of the overall result, that set of results is discarded. Judging pair identity by the v sets is a strict criterion, but with small samples it often yields no correlation result, which increases the probability of false negatives.
By contrast, extracting features with the Relief method works better, and matching on the u sets always yields an interpretable result; because u corresponds to the high-dimensional data, the results are to some extent more accurate, as shown in fig. 2.
Disclosure of Invention
The invention aims to provide an analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms, which is used to analyze the canonical correlation between MR brain functional connectivity and the clinical symptom scores of schizophrenia patients and to give a comprehensive view of the linear correlation between brain function and symptom severity, thereby helping to explain the pathogenesis of schizophrenia.
To achieve this purpose, the invention adopts the following technical scheme:
The invention discloses an analysis method for evaluating correlation between brain functional connectivity and clinical symptoms, which comprises the following steps:
S1, extracting features from the brain network functional connectivity data to obtain key brain network data;
S2, regressing the key brain network data and the clinical symptom scale score data on confounding factors, including age, sex and the like, to eliminate their influence;
S3, calculating the optimal sparsity coefficient from the brain network data and the clinical symptom scale score data, by training or by permutation;
S4, inputting the brain network data, the clinical symptom scale score data and the optimal sparsity coefficient into sparse canonical correlation analysis to obtain the corresponding canonical vector pairs;
S5, generating sample sets by bootstrap sampling, inputting each sample set and the optimal sparsity coefficient into sparse canonical correlation analysis to obtain the corresponding canonical vector pairs, matching these with the canonical vector pairs obtained in step S4, and then calculating the mean and standard deviation of the correlation between brain functional connectivity and clinical symptoms;
S6, obtaining the statistically significant canonical vector pairs using a permutation test;
S7, applying the corresponding confidence intervals to the brain network data and the clinical symptom scale score data to find the functional connections and clinical symptom scale scores that are consistently significant in each canonical vector pair;
S8, performing regression analysis on age and sex using a generalized additive model;
and S9, repeating steps S1 to S7 to carry out an analysis stratified by age and an analysis grouped by sex.
Preferably, in step S1, the extraction method includes dimension reduction analysis and feature selection;
dimension reduction analysis linearly combines the features to generate new features that replace the original ones; principal component analysis is used, in order to verify the correctness of the correlation results of the canonical vectors obtained after feature selection;
feature selection selects a subset of the original feature set to replace the original set.
Preferably, feature selection uses either the median absolute deviation approach or the Relief approach.
Preferably, in step S2, the regression formula is
residuals.glm(glm(data~sex+age+head_movement))
The formula is R code that applies a generalized linear model to control for the influence of age, sex and head movement on the functional connectivity data or the clinical symptom score data;
wherein residuals denotes the residuals; glm (generalized linear model) denotes the generalized linear model; data denotes the functional connectivity data or the clinical symptom score data; sex denotes the sex category data; age denotes the age data; head_movement denotes the head movement data; + joins the terms of the model; and ~ means "is explained by".
Preferably, in step S3, the optimal sparsity coefficient is calculated either by training or by permutation;
in the training approach, the parameters that give the best result are obtained by training on samples, and are then applied as the optimal parameters to the whole data set to obtain the final model;
in the permutation approach, the corresponding features of the brain network functional connectivity data and the clinical symptom scale score data are permuted multiple times, and sparse canonical correlation analysis is performed on the permuted data.
Preferably, in step S4, the sparse canonical correlation analysis is as follows: given X and Z, two sets of features measured on the same samples, find the linear coefficients u and v that maximize the correlation cor(u′X, v′Z); the magnitudes of the entries of u and v correspond to the weights of the respective features in each canonical vector pair.
Preferably, the matching method in step S5 includes:
(1) calculating the correlation between the v sets of the canonical vector pairs generated from a bootstrap sample set and the v sets of the canonical vector pairs of the previously obtained overall result, sorting by correlation magnitude, and regarding the v sets with the largest correlation as belonging to the same canonical vector pair; when the canonical vector pairs of a bootstrap sample set do not correspond one-to-one with those of the overall result, that set of results is discarded;
(2) calculating the correlation between the u sets of the canonical vector pairs generated from a bootstrap sample set and the u sets of the overall result, and sorting in descending order of correlation magnitude; the u sets with the largest correlation are regarded as belonging to the same canonical vector pair, and a canonical vector that has already been matched is not matched again.
Preferably, in step S7, the confidence level is 99.5% for the functional connectivity data and 95% for the clinical symptom scale score data; different confidence levels may be set according to the characteristics of the data.
Preferably, in step S8, steps S1 and S3 to S7 are repeated in sequence, and the relationship between the different canonical vectors and age and sex is examined using a generalized additive model:
ConnectivityScore_i = u_i′X
ConnectivityScore_i ~ Sex + s(Age)
where ConnectivityScore_i is the score obtained by projection onto the i-th dimension; u_i is the i-th projection vector; X is the functional connectivity data; ~ means "is explained by"; s (smooth) denotes a smoothing function; Sex is the sex category data; and Age is the age data.
The invention has the beneficial effects that:
The method uses sparse canonical correlation analysis with added regularization parameters and selects among the features effectively through sparsification. This avoids the overfitting risk caused by high-dimensional data, improves the generalization performance of the model and makes the result to some extent more accurate, so the method is suitable for analyzing high-dimensional data such as brain network functional connectivity.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a comparison chart of feature extraction effects.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms comprises the following steps:
S1, extracting features from the brain network functional connectivity data to obtain key brain network data;
S2, regressing the key brain network data and the clinical symptom scale score data on confounding factors, including age, sex and the like, to eliminate their influence;
S3, calculating the optimal sparsity coefficient from the brain network data and the clinical symptom scale score data, by training or by permutation;
S4, inputting the brain network data, the clinical symptom scale score data and the optimal sparsity coefficient into sparse canonical correlation analysis to obtain the corresponding canonical vector pairs;
S5, generating sample sets by bootstrap sampling, and inputting each sample set and the optimal sparsity coefficient into sparse canonical correlation analysis to obtain the corresponding canonical vector pairs. After these canonical vector pairs are matched with those obtained in step S4, the mean and standard deviation of the correlation between brain functional connectivity and clinical symptoms are calculated;
S6, obtaining the statistically significant canonical vector pairs using a permutation test;
S7, applying the corresponding confidence intervals to the brain network data and the clinical symptom scale score data to find the functional connections and clinical symptom scale scores that are consistently significant in each canonical vector pair;
S8, performing regression analysis on age and sex using a generalized additive model;
and S9, repeating steps S1 to S7 to carry out an analysis stratified by age and an analysis grouped by sex.
Further, in step S1, the extraction method includes dimension reduction analysis and feature selection;
Dimension reduction analysis: the features are linearly combined to generate new features that replace the original ones; principal component analysis is used, in order to verify the correctness of the correlation results of the canonical vectors obtained after feature selection.
Principal component analysis is a statistical method that converts a group of possibly correlated variables, through an orthogonal transformation, into a group of linearly uncorrelated variables called principal components. The classical approach measures information by the variance of F1 (the first linear combination, that is, the first composite index): the larger Var(F1), the more information F1 contains. F1 is therefore chosen to have the largest variance among all linear combinations and is called the first principal component. If the first principal component is not sufficient to represent the information in the original P indices, a second linear combination F2 is selected. To reflect the original information efficiently, information already carried by F1 should not reappear in F2; mathematically, this requires Cov(F1, F2) = 0. F2 is then called the second principal component, and the third, fourth, ..., P-th principal components can be constructed in the same way:
$$F_k = a_{1k} Z_{x_1} + a_{2k} Z_{x_2} + \cdots + a_{pk} Z_{x_p}, \qquad k = 1, 2, \ldots, p$$
where $(a_{1k}, a_{2k}, \ldots, a_{pk})$ is the eigenvector corresponding to the $k$-th eigenvalue of the covariance matrix $\Sigma$ of X, and $Z_{x_1}, Z_{x_2}, \ldots, Z_{x_p}$ are the standardized values of the original variables. In practical applications the indices have different units, so the original data are standardized before calculation to eliminate the influence of scale.
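As an illustration of the construction described above, the following Python sketch computes the loadings of the first principal component F1 by power iteration on the covariance matrix of the standardized variables. Power iteration, the function name `first_principal_component` and the toy data are assumptions made for this example; the text does not say how the components are computed in practice.

```python
import math
from statistics import mean, pstdev

def first_principal_component(rows, n_iter=200):
    """Loadings of F1 on the standardized variables (illustrative helper)."""
    p, n = len(rows[0]), len(rows)
    cols = [[row[j] for row in rows] for j in range(p)]
    # Standardize each variable so that differing units do not dominate.
    std_cols = []
    for col in cols:
        m, s = mean(col), pstdev(col)
        std_cols.append([(x - m) / s for x in col])
    # Covariance matrix of the standardized variables (a correlation matrix).
    C = [[sum(a * b for a, b in zip(std_cols[i], std_cols[j])) / n
          for j in range(p)] for i in range(p)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(C[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w

# Toy data: the second variable is twice the first; the third is nearly unrelated.
rows = [[1, 2, 1], [2, 4, -1], [3, 6, -1], [4, 8, 1], [5, 10, 1], [6, 12, -1]]
loadings = first_principal_component(rows)
```

In this toy data the first two variables carry the same information after standardization, so they receive equal loadings, and the third variable receives a smaller one.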
Feature selection: feature selection chooses a subset of the original feature set to replace the original set; the median absolute deviation approach or the Relief approach is mainly used.
The median absolute deviation approach selects the features with a large median absolute deviation for subsequent calculation, while the Relief approach selects the features that differ strongly between samples of different classes. Different feature selection approaches reflect different aspects of the data and of the problem, so different schemes can be adopted in this step, yielding different concrete interpretations of the correlation results; this step is therefore highly extensible.
Median absolute deviation (MAD): feature selection in conventional machine learning typically uses the variance. Here the median absolute deviation is used instead; it is defined as the median of the absolute deviations of the data points from their median, and is calculated as follows:
MAD = median(|X_i - median(X)|)
where median denotes the median; X denotes the training data; and i denotes the i-th training data point.
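The MAD formula above can be sketched in Python as follows. The helper names `mad` and `select_top_mad`, the choice of keeping a fixed number `k` of features, and the toy matrix are assumptions for illustration; the text does not state how many features are retained.

```python
from statistics import median

def mad(values):
    """Median absolute deviation: median(|x_i - median(x)|)."""
    center = median(values)
    return median(abs(x - center) for x in values)

def select_top_mad(rows, k):
    """Indices of the k columns with the largest MAD, in descending MAD order."""
    n_features = len(rows[0])
    scores = [mad([row[j] for row in rows]) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

# Rows are samples, columns are features (hypothetical values).
rows = [
    [0.10, 0.5, 1.0],
    [0.11, 0.1, 3.0],
    [0.09, 0.9, 5.0],
    [0.10, 0.3, 7.0],
]
top = select_top_mad(rows, 2)
```

Here the first column barely varies across samples, so it is the one dropped when two features are kept.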
Relief analysis: the Relief algorithm is a feature weighting algorithm. It assigns each feature a weight according to the relevance between the feature and the class, and removes the features whose weight falls below a threshold. In Relief, the relevance between a feature and the class is based on the feature's ability to discriminate between nearby samples. The algorithm randomly selects a sample R from the training set D, finds the nearest neighbor H among the samples of the same class as R (called NearHit) and the nearest neighbor M among the samples of a different class (called NearMiss), and then updates the weight of each feature by the following rule: if, on a given feature, the distance between R and NearHit is smaller than the distance between R and NearMiss, the feature helps to distinguish same-class from different-class nearest neighbors, and its weight is increased; conversely, if the distance between R and NearHit is greater than the distance between R and NearMiss, the feature hinders that distinction, and its weight is reduced. This process is repeated m times to obtain the average weight of each feature. The larger the weight of a feature, the stronger its classification ability; the smaller the weight, the weaker its classification ability.
The weight value $\delta_j$ contributed to feature j by the i-th sample is:
$$\delta_j = -\operatorname{diff}(x_i^j, x_h^j)^2 + \operatorname{diff}(x_i^j, x_m^j)^2$$
For discrete features:
$$\operatorname{diff}(x_a^j, x_b^j) = \begin{cases} 0, & x_a^j = x_b^j \\ 1, & \text{otherwise} \end{cases}$$
For continuous features:
$$\operatorname{diff}(x_a^j, x_b^j) = |x_a^j - x_b^j|$$
where $x_i^j$ is the j-th feature value of the i-th sample; $x_h^j$ is the j-th feature value of sample h, the nearest neighbor of sample i among the samples of the same class; $x_m^j$ is the j-th feature value of sample m, the nearest neighbor of sample i among the samples of a different class; diff (difference) denotes the distance; and a and b denote sample a and sample b.
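A minimal Python sketch of the Relief weighting for continuous features follows. It makes several assumptions that are not stated in the text: every sample is visited once instead of drawing m random samples, features are min-max scaled to [0, 1], the per-feature differences are squared, and neighbors are found with a city-block distance. The function name `relief_weights` and the toy data are hypothetical.

```python
def relief_weights(X, y):
    """Average Relief weight per feature for continuous features and labels y."""
    n, p = len(X), len(X[0])
    lo = [min(row[j] for row in X) for j in range(p)]
    hi = [max(row[j] for row in X) for j in range(p)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(p)]
    # Assumption: scale every feature to [0, 1] so that diffs are comparable.
    S = [[(row[j] - lo[j]) / span[j] for j in range(p)] for row in X]

    def dist(a, b):
        return sum(abs(S[a][j] - S[b][j]) for j in range(p))

    weights = [0.0] * p
    for i in range(n):
        same = [k for k in range(n) if k != i and y[k] == y[i]]
        other = [k for k in range(n) if y[k] != y[i]]
        if not same or not other:
            continue
        hit = min(same, key=lambda k: dist(i, k))    # NearHit
        miss = min(other, key=lambda k: dist(i, k))  # NearMiss
        for j in range(p):
            weights[j] += -(S[i][j] - S[hit][j]) ** 2 + (S[i][j] - S[miss][j]) ** 2
    return [w / n for w in weights]

# Feature 0 separates the two classes; feature 1 does not.
X = [[0.0, 0.2], [0.1, 0.9], [0.2, 0.5], [0.8, 0.3], [0.9, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 1]
w = relief_weights(X, y)
```

With this toy data the discriminative feature receives a positive weight and the uninformative one a negative weight, so a threshold at zero would keep only the first.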
Further, in step S2, before the subsequent analysis, the functional connectivity data and the clinical symptom scale score data are regressed on the confounding factors to eliminate the influence of age and sex on the subsequent analysis. The regression formula is:
residuals.glm(glm(data~sex+age+head_movement))
The formula is R code that applies a generalized linear model to control for the influence of age, sex and head movement on the functional connectivity data or the clinical symptom score data.
Here residuals denotes the residuals; glm (generalized linear model) denotes the generalized linear model; data denotes the functional connectivity data or the clinical symptom score data; sex denotes the sex category data; age denotes the age data; head_movement denotes the head movement data; + joins the terms of the model; and ~ means "is explained by".
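For readers who do not use R, the following Python sketch shows what the formula computes under R's default settings, where `glm` uses a Gaussian family with identity link and therefore reduces to ordinary least squares. The helper names `solve` and `residualize` and the subject values are hypothetical; this is an illustration under those assumptions, not the patent's implementation.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def residualize(y, covariates):
    """Residuals of y after least-squares regression on intercept + covariates."""
    D = [[1.0] + list(row) for row in covariates]  # design matrix with intercept
    k = len(D[0])
    XtX = [[sum(r[a] * r[b] for r in D) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(D, y)) for a in range(k)]
    beta = solve(XtX, Xty)
    return [yi - sum(bj * xj for bj, xj in zip(beta, r)) for r, yi in zip(D, y)]

# Hypothetical subjects: (sex, age, head_movement) and one connectivity value each.
cov = [(0, 20, 0.10), (1, 25, 0.20), (0, 30, 0.15),
       (1, 35, 0.30), (0, 40, 0.25), (1, 45, 0.12)]
y = [0.50, 0.62, 0.55, 0.71, 0.60, 0.66]
res = residualize(y, cov)
```

The residuals are uncorrelated with sex, age and head movement by construction, which is the property the later correlation analysis relies on. In practice the same regression would be applied to every functional connection and every symptom score.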
Further, in step S3,
in conventional machine learning, the parameters that give the best result are generally obtained by training on samples (if the sample size is sufficient, bootstrap sampling can be used to obtain training and test sets for conventional training and testing); these parameters are then applied as the optimal parameters to the whole data set to obtain the final model.
Obtaining the sparsity coefficient by permutation: denote the brain network functional connectivity data set as X and the clinical symptom scale score data set as Z; in both data sets, rows correspond to samples and columns to features. For each candidate set of sparsity coefficients, the following analysis is performed:
(1) permuting X n times to obtain new matrices X_1, X_2, X_3, ...;
(2) performing sparse canonical correlation analysis on each of the permuted data sets X_1, X_2, X_3, ... with Z to obtain the correlations c_1, c_2, c_3, ..., and performing the same analysis on the original X and Z to obtain the correlation c;
(3) applying the Fisher transform to convert the correlations into approximately normally distributed random variables; the Fisher transform of c is written Fisher(c);
(4) calculating the z-statistic of Fisher(c) relative to the Fisher-transformed permuted correlations; the larger the z-statistic, the better the corresponding sparsity coefficient.
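The four steps above can be sketched in Python as follows. The correlation routine is passed in as `corr_fn`; in the toy usage it is a plain Pearson correlation between the first columns of X and Z, standing in for a full sparse canonical correlation analysis at one candidate sparsity coefficient. Computing the z-statistic as (Fisher(c) minus the mean of the permuted Fisher values) divided by their standard deviation is an assumption about a step the text leaves implicit, and all names are hypothetical.

```python
import math
import random
from statistics import mean, stdev

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def fisher(r):
    """Fisher z-transform of a correlation coefficient."""
    return math.atanh(r)

def permutation_z(corr_fn, X, Z, n_perm=200, seed=0):
    """z-statistic of the observed correlation against row-permuted copies of X."""
    rng = random.Random(seed)
    c = corr_fn(X, Z)                       # correlation on the original data
    permuted = []
    for _ in range(n_perm):
        X_perm = X[:]
        rng.shuffle(X_perm)                 # shuffle rows: breaks the X-Z pairing
        permuted.append(fisher(corr_fn(X_perm, Z)))
    return (fisher(c) - mean(permuted)) / stdev(permuted)

# Stand-in for SCCA at one candidate sparsity coefficient (assumption).
first_column_corr = lambda X, Z: pearson([r[0] for r in X], [r[0] for r in Z])

X = [[float(i)] for i in range(20)]
Z = [[i + 2.0 * (-1) ** i] for i in range(20)]
z_stat = permutation_z(first_column_corr, X, Z)
```

To choose among candidates, the same function would be called once per candidate sparsity coefficient and the candidate with the largest z-statistic kept.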
Further, in step S4,
sparse canonical correlation analysis: given X and Z, two sets of features measured on the same samples, compute the linear coefficients u and v that maximize the correlation cor(u′X, v′Z) (this is the definition of canonical correlation analysis; sparse canonical correlation analysis adds an elastic net penalty to it). The magnitudes of the entries of u and v correspond to the weights of the respective features in each canonical vector pair.
Further, in step S5,
canonical vector matching: sample sets are generated by the bootstrap method, and sparse canonical correlation analysis is performed on each of them. The canonical vector pairs obtained for each sample set must then be matched with the canonical vector pairs previously obtained from the overall data, because the pairs at corresponding positions do not necessarily carry the same information. Two matching schemes are provided, suited to different data characteristics:
1. Calculate the correlation between the v vectors of the canonical vector pairs generated from the sample set and the v vectors of the canonical vector pairs in the previously obtained overall result, and rank them by correlation magnitude; the pairs with the highest correlation are regarded as the same canonical vector pair. When the canonical vector pairs from a bootstrap sample set do not correspond one-to-one with those in the overall result, that set of results is deleted. This scheme uses the v vectors to decide whether two canonical vector pairs are the same and is the stricter criterion, but for small-sample data it may yield no results.
2. Calculate the correlation between the u vectors of the canonical vector pairs generated from the sample set and the u vectors in the overall result, and rank them by correlation magnitude; the pairs with the highest correlation are regarded as the same canonical vector pair. The canonical vectors are processed in descending order of correlation, and a canonical vector that has already been matched is excluded from subsequent matching, so an interpretable result is always obtained. Because u corresponds to the high-dimensional data, the result may also be somewhat more accurate.
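The second matching scheme can be sketched as a greedy assignment. This is a minimal sketch assuming that both results contain the same number of canonical vectors (stored as columns) and that the absolute correlation is used so that a sign flip between a bootstrap vector and its reference does not prevent a match; the function name and the returned sign vector are illustrative choices, not part of the source.

```python
import numpy as np

def match_canonical_vectors(U_boot, U_ref):
    """Greedily match bootstrap canonical vectors to reference vectors.

    U_boot, U_ref: arrays of shape (n_features, k), one canonical vector per column.
    Returns (order, signs) such that U_boot[:, order] * signs lines up with U_ref.
    """
    k = U_ref.shape[1]
    # C[i, j] = correlation between bootstrap vector i and reference vector j
    C = np.corrcoef(U_boot.T, U_ref.T)[:k, k:]
    A = np.abs(C)
    order = np.full(k, -1)
    for _ in range(k):
        # Take the largest remaining correlation, then retire both vectors
        i, j = np.unravel_index(np.argmax(A), A.shape)
        order[j] = i
        A[i, :] = -1.0
        A[:, j] = -1.0
    signs = np.sign(C[order, np.arange(k)])
    return order, signs
```

Because every matched vector is removed from further consideration, each reference vector always receives exactly one partner, which is why this scheme always produces a result.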
Further, the operation of step S6 is broadly the same as that described for step S3, and the canonical vector pair matching problem noted above arises here as well.
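Once the permuted correlations have been matched to the canonical vector pairs, a permutation p-value per pair can be computed as sketched below. The text does not state the p-value formula; the add-one correction used here is a common convention and is an assumption, as is the function name.

```python
import numpy as np

def permutation_p_values(c_obs, c_perm):
    """Permutation p-value for each canonical vector pair.

    c_obs:  shape (k,), correlations obtained on the original data.
    c_perm: shape (n_perm, k), matched correlations from the permuted data.
    """
    c_obs = np.asarray(c_obs, dtype=float)
    c_perm = np.asarray(c_perm, dtype=float)
    # Count permutations at least as extreme as the observed value, per pair
    exceed = np.sum(c_perm >= c_obs, axis=0)
    return (1 + exceed) / (1 + c_perm.shape[0])
```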
Further, in step S7, a stricter confidence interval is generally used for the functional connectivity data. Here, 99.5% is used for the functional connectivity data and 95% for the clinical symptom assessment scores.
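One way to apply these confidence levels is sketched below. It assumes percentile bootstrap intervals over the matched, sign-aligned weights and treats a feature as consistently significant when its interval excludes zero; the text does not specify how the intervals are constructed, so both choices and the function name are assumptions.

```python
import numpy as np

def consistently_significant(boot_weights, level):
    """Flag features whose bootstrap confidence interval excludes zero.

    boot_weights: shape (n_boot, n_features), weights from the bootstrap fits.
    level: confidence level, e.g. 0.995 for connectivity or 0.95 for symptom scores.
    """
    alpha = 1.0 - level
    lower = np.quantile(boot_weights, alpha / 2, axis=0)
    upper = np.quantile(boot_weights, 1 - alpha / 2, axis=0)
    return (lower > 0) | (upper < 0)
```

For example, `consistently_significant(u_boot, 0.995)` would be applied to the functional connectivity weights and `consistently_significant(v_boot, 0.95)` to the symptom score weights, where `u_boot` and `v_boot` are hypothetical arrays holding the bootstrap results.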
Further, in step S8, to explore the relationship between the canonical vector pairs and age and gender, steps one to seven are repeated with the operation of step two omitted, and a generalized additive model is then used to examine the relationship between the different canonical vector pairs and age and gender. The model is:
ConnectivityScore_i = u_i′X
ConnectivityScore_i ~ Sex + s(Age)
where ConnectivityScore_i is the score after projection onto the i-th dimension; u_i is the i-th projection vector; X is the functional connectivity data; ~ means "is explained by"; s() denotes a smooth function; Sex is the categorical gender data; Age is the age data.
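The model above can be approximated with a short sketch. It is deliberately simplified: instead of a penalized smoother as used in generalized additive model software, s(Age) is represented by an unpenalized cubic regression spline and fitted by ordinary least squares. The function names, the number of knots, and the knot placement at age quantiles are all assumptions made for illustration.

```python
import numpy as np

def cubic_spline_basis(age, knots):
    """Truncated-power cubic basis: a simple, unpenalized stand-in for s(Age)."""
    cols = [age, age ** 2, age ** 3]
    cols += [np.maximum(age - k, 0.0) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_score_model(X, u, sex, age, n_knots=3):
    """Fit ConnectivityScore ~ Sex + s(Age) for one projection vector u.

    X: (n_samples, n_features) connectivity data; u: (n_features,) weights;
    sex: 0/1 coded array; age: numeric array.
    Returns (coefficients, fitted values, R squared).
    """
    score = X @ u                           # one connectivity score per sample
    a = (age - age.mean()) / age.std()      # standardize age for numerical stability
    knots = np.quantile(a, np.linspace(0, 1, n_knots + 2)[1:-1])
    design = np.column_stack([np.ones_like(a), sex, cubic_spline_basis(a, knots)])
    coef, *_ = np.linalg.lstsq(design, score, rcond=None)
    fitted = design @ coef
    r2 = 1 - np.sum((score - fitted) ** 2) / np.sum((score - score.mean()) ** 2)
    return coef, fitted, float(r2)
```

In the returned coefficient vector, the second entry is the estimated sex effect and the remaining entries describe the smooth age curve.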
The present invention is capable of other embodiments, and various changes and modifications may be made by one skilled in the art without departing from the spirit and scope of the invention.

Claims (10)

1. An analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms, comprising the steps of:
S1, extracting brain network functional connectivity data to obtain key brain network data;
S2, performing regression on the key brain network data and the clinical symptom scale score data to eliminate the influence of confounding factors, wherein the confounding factors include age, gender and the like;
S3, calculating the optimal sparse coefficient from the brain network data and the clinical symptom scale score data using a training mode or a permutation mode;
S4, inputting the brain network data, the clinical symptom scale score data and the optimal sparse coefficient into sparse canonical correlation analysis to obtain the corresponding canonical vector pairs;
S5, generating sample sets by the bootstrap method, inputting each sample set and the optimal sparse coefficient into sparse canonical correlation analysis to obtain the corresponding canonical vector pairs, matching these canonical vector pairs with the canonical vector pairs obtained in step S4, and then calculating the mean and standard deviation of the correlation between brain functional connectivity and clinical symptoms;
S6, obtaining the statistically significant canonical vector pairs by using a permutation test;
S7, applying the corresponding confidence intervals to the brain network data and the clinical symptom scale score data to find the consistently significant functional connections and clinical symptom scale scores in each canonical vector pair;
S8, performing regression analysis on age and gender by using a generalized additive model;
and S9, repeating steps S1 to S7 to perform an analysis stratified by age and an analysis classified by gender.
2. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein in step S1, the extraction method comprises dimension reduction analysis and feature selection:
the dimension reduction analysis generates new features from linear combinations of the original features to replace the original features;
the feature selection selects a subset of the original feature set to replace the original set.
3. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 2, wherein: the dimension reduction analysis is principal component analysis, in which a group of possibly correlated variables is converted into a group of linearly uncorrelated variables through an orthogonal transformation.
4. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 2, wherein: the feature selection adopts two approaches, the median absolute deviation (MAD) and Relief analysis.
5. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein in step S2, the regression formula is:
residuals.glm(glm(data~sex+age+head_movement))
the formula is R language code that applies a generalized linear model to control for the influence of age, sex and head movement on the functional connectivity data or the clinical symptom score data;
wherein residuals denotes the residuals; glm (generalized linear model) denotes the generalized linear model; data denotes the functional connectivity data or the clinical symptom score data; sex denotes the categorical gender data; age denotes the age data; head_movement denotes the head movement data; + joins the explanatory variables; ~ means "is explained by".
6. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein the calculation of the optimal sparse coefficient in step S3 comprises a training mode and a permutation mode:
in the training mode, the parameters that give the best result are obtained by training on samples, and these parameters are then used as the optimal parameters in the computation on the whole data set to obtain the final model;
in the permutation mode, sparse canonical correlation analysis is performed after permuting the corresponding features of the brain network functional connectivity data and the clinical symptom scale score data multiple times.
7. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein: in step S4, given X and Z, which represent two groups of features on the same samples, namely the brain functional connectivity features and the clinical symptom features, linear coefficients u and v are obtained that maximize the correlation cor(u′X, v′Z), wherein the magnitudes of u and v correspond to the weights of the individual features in each canonical vector pair.
8. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein the matching method in step S5 comprises:
(1) calculating the correlation between the v vectors of the canonical vector pairs generated from the sample set and the v vectors of the canonical vector pairs in the previously obtained overall result, ranking them by correlation magnitude, regarding the pairs with the highest correlation as the same canonical vector pair, and deleting that set of results when the canonical vector pairs from a sample set generated by the bootstrap method do not correspond one-to-one with the canonical vector pairs in the overall result;
(2) calculating the correlation between the u vectors of the canonical vector pairs generated from the sample set and the u vectors in the overall result, and ranking them in descending order of correlation magnitude, wherein the pairs with the highest correlation are regarded as the same canonical vector pair, and a canonical vector that has already been matched is excluded from subsequent matching.
9. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein: in step S7, the confidence interval for the functional connectivity data is 99.5% and the confidence interval for the clinical symptom scale score data is 95%.
10. The analysis method for evaluating the correlation between brain functional connectivity and clinical symptoms according to claim 1, wherein in step S8, steps S1 and S3 to S7 are repeated in sequence, and a generalized additive model is used to explore the relationship between the different canonical vector pairs and age and gender, wherein the model is:
ConnectivityScore_i = u_i′X
ConnectivityScore_i ~ Sex + s(Age)
ConnectivityScore_i denotes the score after projection onto the i-th dimension; u_i denotes the i-th projection vector; X denotes the functional connectivity data; ~ means "is explained by"; s() denotes a smooth function; Sex denotes the categorical gender data; Age denotes the age data.
CN202111095382.7A 2021-09-17 2021-09-17 Analysis method for evaluating correlation between brain functional connectivity and clinical symptoms Pending CN113707324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111095382.7A CN113707324A (en) 2021-09-17 2021-09-17 Analysis method for evaluating correlation between brain functional connectivity and clinical symptoms

Publications (1)

Publication Number Publication Date
CN113707324A true CN113707324A (en) 2021-11-26

Family

ID=78661102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111095382.7A Pending CN113707324A (en) 2021-09-17 2021-09-17 Analysis method for evaluating correlation between brain functional connectivity and clinical symptoms

Country Status (1)

Country Link
CN (1) CN113707324A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150272461A1 (en) * 2013-05-01 2015-10-01 Advanced Telecommunications Research Institute International Brain activity analyzing apparatus, brain activity analyzing method and biomarker apparatus
CN105512454A (en) * 2015-07-28 2016-04-20 Southeast University Depression patient suicide risk objective assessment model based on functional nuclear magnetic resonance
CN108257657A (en) * 2016-12-28 2018-07-06 Huashan Hospital Affiliated to Fudan University The data analysing method of magnetic resonance detection based on the prediction of disturbance of consciousness patient consciousness recovery
CN112768072A (en) * 2021-01-12 2021-05-07 Harbin Medical University Cancer clinical index evaluation system constructed based on imaging omics qualitative algorithm

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
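CEDRIC HUCHUAN XIA et al.: "Linked Dimensions of Psychopathology and Connectivity in Functional Brain Networks", NATURE COMMUNICATIONS *
YU Renping et al.: "Classification of brain network features in schizophrenia based on resting-state functional magnetic resonance imaging", Journal of Biomedical Engineering *
SUN Yeting et al.: "Research progress on objective biomarkers of depression based on psychoradiology and artificial intelligence", Progress in Biochemistry and Biophysics *
LI Bin: "Analysis of clinical symptom features of schizophrenia based on functional magnetic resonance imaging", China Master's Theses Full-text Database, Medicine and Health Sciences *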

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20211126