CN115620808A - Cancer gene prognosis screening method and system based on improved Cox model - Google Patents
- Publication number
- CN115620808A (application number CN202211631423.4A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- cox
- message
- patient
- regression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
- G16B35/20—Screening of libraries
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a cancer gene prognosis screening method and system based on an improved Cox model. The method comprises the following steps: S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, organizing the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix; S2, inputting the survival data and the second matrix into a preset Cox regression model and solving it to obtain regression coefficients; S3, evaluating, according to each patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk; and S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis. Compared with conventional techniques, the regression step gains accuracy by adding a prior and updating its parameters automatically, and the result provides guidance for predicting prognosis, recurrence and metastasis.
Description
Technical Field
The invention relates to the technical field of survival analysis Cox model regression, in particular to a cancer gene prognosis screening method and system based on an improved Cox model.
Background
With the advent and development of DNA microarray technology, the expression levels of thousands of genes can be monitored simultaneously to study the effects of particular treatments, diseases and developmental stages on gene expression. A common scenario is as follows: gene expression in the cancer cells of a number of cancer patients is measured, the patients' survival data are obtained through follow-up, the collected data are analyzed statistically using survival analysis, and genes related to prognosis are screened out. Studying the relationship between prognostic genes and tumors can provide information for predicting prognosis, recurrence and metastasis, and even for guiding treatment; the ultimate aim is to support individualized treatment of patients and to enable breakthroughs in cancer treatment.
The collected survival data and gene expression levels must undergo systematic survival analysis in which roughly a dozen key prognostic genes are screened from tens of thousands of genes. This step is an indispensable link in the overall prognostic analysis: the gene set formed by these genes can be used to evaluate the risk of cancer patients and thereby provide more information for treatment.
The Cox regression model is widely used in medical follow-up studies and is, to date, the most frequently used multivariate method in survival analysis. The model takes the survival outcome and the survival time as dependent variables, can analyze the influence of many factors on survival time simultaneously, can handle data with censored survival times, and does not require the form of the survival distribution to be estimated. Because of these properties it plays an important role in cancer prognostic gene screening.
According to the open literature, the most commonly used way of solving the Cox regression model is the coordinate descent method proposed by Noah Simon et al., which fits the model along a regularization path using warm starts, with the sum of the ℓ1 norm and the ℓ2 norm as the penalty term. However, the coefficient of the penalty term is determined by cross-validation, so it cannot be solved accurately and automatically. Moreover, because the fit is computed by an optimization method and is a point estimate, no posterior distribution is obtained, and the prior parameters (i.e., the penalty coefficients) cannot be solved automatically in combination with the expectation-maximization (EM) algorithm. As a result, the prognostic genes finally screened by that algorithm may not be well associated with the cancer.
Cox regression is a survival analysis method and an important link in prognostic gene screening. The regression coefficients solved from the Cox regression model weight the risk contributed by each corresponding gene, so the subsequent risk calculation for each patient is accurate only if the regression coefficients are accurate. A more accurate method for solving the Cox regression model is therefore needed.
To this end, in combination with the above needs and deficiencies of the prior art, the present application proposes a method and system for cancer gene prognosis screening based on an improved Cox model.
Disclosure of Invention
The invention provides a cancer gene prognosis screening method and system based on an improved Cox model. The regression step gains accuracy by adding a prior and updating its parameters automatically; genes whose regression coefficients have large absolute values are screened out as prognostic genes, providing information for subsequently predicting prognosis, recurrence and metastasis, and even for guiding treatment.
The primary objective of the present invention is to solve the above technical problems. The technical solution of the present invention is as follows:
In a first aspect, the invention provides a cancer gene prognosis screening method based on an improved Cox model, comprising the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, organizing the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix X.
S2, inputting the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solving it to obtain regression coefficients.
S3, evaluating, according to each patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk.
S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis.
In the first matrix, the rows represent patients and the columns represent gene segments of the cancer cells; each element of the first matrix is the expression level of the gene of the corresponding column in the patient of the corresponding row.
The survival data comprise: the covariate matrix (i.e., the second matrix) X, the survival time y, and the censoring indicator c.
The genes corresponding to the components of the regression coefficients with larger absolute values have a greater influence on patient survival time, so the prognostic gene set associated with high patient risk can be screened out by evaluating the regression coefficients.
The preprocessing in step S1 specifically comprises: removing irrelevant genes by bioinformatic statistical means to obtain a second matrix X with fewer columns.
Further, in step S2, a third matrix formed by combining the raw data with the second matrix is first input into the preset Cox regression model. The third matrix is denoted [X, y, c], where X is the covariate matrix (i.e., the second matrix), y is the survival time, and c is the censoring indicator. The survival data of the i-th patient are $(x_i, y_i, c_i)$.
Further, the risk function of the i-th patient is specifically:
$$h_i(t) = h_0(t)\exp\left(x_i^{T}\beta\right)$$
where $h_0(t)$ is the shared baseline risk function, $\beta$ is the regression coefficient vector obtained by solving the Cox regression model, and $x_i$ is the gene expression level of the i-th patient.
Once the regression coefficients $\beta$ have been fitted with the Cox regression model, patient risk can be assessed from the patient's gene expression levels $x_i$. The components of $\beta$ with larger absolute values have a greater influence on patient survival time, and the genes corresponding to those components form the prognostic gene set to be screened out.
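The use of the fitted coefficients described above can be sketched in a few lines. This is only an illustration, assuming the standard Cox form in which the risk of patient i is proportional to exp(x_i^T β); the function names `relative_risk` and `top_genes` and the toy numbers are hypothetical and do not come from the patent.

```python
import numpy as np

def relative_risk(X, beta):
    """Relative risk exp(x_i^T beta) of each patient (one row of X per patient).

    The shared baseline risk h_0(t) is omitted because it is identical
    for all patients and does not change their ranking.
    """
    return np.exp(X @ beta)

def top_genes(beta, k):
    """Column indices of the k coefficients with the largest absolute value."""
    order = np.argsort(-np.abs(beta))
    return order[:k]

# Toy data (assumed): 2 patients, 3 genes.
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
beta = np.array([0.5, -2.0, 0.0])

risk = relative_risk(X, beta)   # one relative risk per patient
selected = top_genes(beta, 2)   # candidate prognostic genes
```

Ranking by absolute value treats strongly protective genes (large negative coefficients) and strongly harmful genes (large positive coefficients) alike, which matches the screening criterion stated in the text.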
Further, solving the Cox regression model in step S2 to obtain the regression coefficients specifically comprises the following steps:
S21, combining the existing survival data into a third matrix, sorting it by survival time, constructing the Cox regression model from the sorted data, and initializing the prior parameters and the message passing parameters.
S22, according to the determinant vector factor graph of the Cox regression model, using the expectation propagation (EP) algorithm to project high-dimensional messages onto independent Gaussian distributions by the moment matching rule, solving the model by cyclic iteration, and outputting the regression coefficients and the approximate posterior probability.
S23, inputting the regression coefficients and the approximate posterior probability into the expectation-maximization (EM) algorithm, and updating the prior parameters.
S24, judging whether the regression coefficients satisfy a preset iteration ending condition; if so, outputting the regression coefficients obtained in the current iteration; if not, returning to step S22 for the next iteration.
Here the third matrix is [X, y, c], where X is the covariate matrix, y is the survival time, and c is the censoring indicator.
The regression coefficient estimation problem is solved by a fully Bayesian analysis: the penalized maximum likelihood estimate is converted into a minimum mean square error (MMSE) estimate from the Bayesian point of view. Using a factor graph as the tool, the messages passed between nodes are computed by a message passing method based on expectation propagation to obtain the approximate posterior probability of the regression coefficients, which is essentially the probability distribution that the regression coefficients are inferred to follow approximately.
Further, the prior parameters comprise a mean, a variance and a sparsity ratio; the message passing parameters comprise the mean and variance of the positive-direction messages. Step S21 specifically comprises: normalizing the covariate matrix X; sorting the third matrix [X, y, c] in descending order of survival time y, the sorted third matrix still being denoted [X, y, c]; substituting it into the Cox partial likelihood function; and initializing the prior parameters and the message transfer function.
The prior on the regression coefficients is a Gaussian-Bernoulli distribution, which makes the coefficients sparse.
The projection operation at the likelihood function node is approximated using the Laplace method and the moment generating function, which simplifies the otherwise complex calculation and yields more accurate regression coefficients with little loss.
Further, normalizing the covariate matrix X specifically comprises:
$$X \leftarrow \frac{X - \mathrm{mean}(X)}{\sqrt{\mathrm{var}(X)}}$$
where mean(X) is the mean of all elements of the matrix X and var(X) is the variance of all elements of the matrix X.
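The data preparation of step S21 (global normalization of X followed by a descending sort on survival time) can be sketched as follows. This is a minimal sketch under assumptions: the normalization is taken to be a z-score over all elements of X, dividing by the square root of var(X), and the function names are hypothetical.

```python
import numpy as np

def normalize_global(X):
    """Center and scale X using the mean and variance of ALL its elements.

    Assumption: scaling divides by the standard deviation sqrt(var(X)).
    """
    return (X - X.mean()) / np.sqrt(X.var())

def sort_by_survival_desc(X, y, c):
    """Reorder the rows of [X, y, c] so that survival time y is descending."""
    order = np.argsort(-y, kind="stable")
    return X[order], y[order], c[order]

# Toy data (assumed): 3 patients, 2 genes.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([5.0, 9.0, 7.0])   # survival times
c = np.array([1, 0, 1])         # censoring indicator

X_sorted, y_sorted, c_sorted = sort_by_survival_desc(normalize_global(X), y, c)
```

After a descending sort, the patients still at risk at the event time of patient i are exactly the patients listed at or before position i, which is what makes a cumulative sum sufficient when the partial likelihood is evaluated.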
The Cox partial likelihood function is specified as follows. The function is written as a transition probability, which is normalized with respect to its output; the Cox partial likelihood itself is not normalized, so the two are related by direct proportionality. The function is taken as a function of its vector argument, and the subscript i denotes the i-th element of a vector.
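For orientation, the textbook Cox partial likelihood can be evaluated as sketched below. The patent's own expression did not survive extraction, so this is the standard form and not necessarily the exact one used there. Assumptions: z = Xβ is the linear predictor, the data are already sorted in descending order of survival time, c_i = 1 marks an observed event and c_i = 0 a censored one, and there are no tied event times. The function name is hypothetical.

```python
import numpy as np

def cox_log_partial_likelihood(z, c):
    """Log of prod_i [exp(z_i) / sum_{j<=i} exp(z_j)]^{c_i}.

    With survival times sorted in descending order, the risk set of
    patient i is patients 0..i, so the denominator is a running sum.
    np.logaddexp.accumulate computes that running sum in log space,
    which avoids overflow for large z.
    """
    z = np.asarray(z, dtype=float)
    log_risk_set = np.logaddexp.accumulate(z)
    return float(np.sum(np.asarray(c) * (z - log_risk_set)))

# Toy check (assumed data): equal predictors, all events observed.
value = cox_log_partial_likelihood(np.zeros(3), np.ones(3))
```

With three patients, equal predictors and all events observed, the three factors are 1/1, 1/2 and 1/3, so the log partial likelihood equals -log 6.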
The prior parameters are initialized as follows. The regression coefficients follow a Gaussian-Bernoulli distribution, which is expressed in terms of the Dirac delta function and a Gaussian distribution with the prior mean and the prior variance, taken as a function of the regression coefficient. The prior mean, the prior variance and the sparsity ratio are then initialized.
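A Gaussian-Bernoulli prior of the kind described here is commonly written as below. The symbols are assumptions, since the patent's own notation was lost in extraction: ρ is the sparsity ratio, μ and σ² are the prior mean and variance, δ is the Dirac delta function, and p is the number of genes.

```latex
p(\beta_j) = (1-\rho)\,\delta(\beta_j) + \rho\,\mathcal{N}\!\left(\beta_j;\,\mu,\,\sigma^{2}\right),
\qquad j = 1,\dots,p
```

Under this form each coefficient is exactly zero with probability 1 − ρ and Gaussian otherwise, which is how the prior encodes the expectation that only a few genes are prognostic.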
The message transfer function is initialized as follows. The message transfer function of the positive-direction message is initialized, where $0_n$ denotes the n-dimensional column vector whose elements are all 0 and $1_n$ denotes the n-dimensional column vector whose elements are all 1, the subscript giving the vector dimension; the message variable is a random variable following a multidimensional Gaussian distribution with independent components of identical variance. The mean and variance of the positive-direction message are then initialized.
In the determinant vector factor graph of the Cox regression model, four multidimensional random variables represent the messages passed on the factor graph; that is, each message is regarded as a multidimensional Gaussian probability density function, and the moment matching process requires each message to follow such a distribution. Each message variable is a random variable following a multidimensional Gaussian distribution with independent components of identical variance; $1_n$ is the n-dimensional column vector whose elements are all 1 and $1_p$ is the p-dimensional column vector whose elements are all 1, the subscript giving the vector dimension. When the elements of a multidimensional Gaussian random variable are mutually independent, i.e., the off-diagonal elements of the covariance matrix are 0, the diagonal covariance matrix can be represented by a vector.
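Because every message is a Gaussian with independent components, multiplying and dividing messages reduces to elementwise arithmetic on precisions (inverse variances). The sketch below shows these two operations for means and variances stored as scalars or NumPy vectors; the function names are hypothetical and the code illustrates the general expectation propagation bookkeeping, not the patent's specific update equations.

```python
import numpy as np

def gauss_multiply(m1, v1, m2, v2):
    """Mean and variance of N(x; m1, v1) * N(x; m2, v2), up to a constant.

    Precisions add, and precision-weighted means add.
    """
    v = 1.0 / (1.0 / v1 + 1.0 / v2)
    m = v * (m1 / v1 + m2 / v2)
    return m, v

def gauss_divide(m1, v1, m2, v2):
    """Mean and variance of N(x; m1, v1) / N(x; m2, v2), up to a constant.

    Precisions subtract. The result is a proper Gaussian only when
    v1 < v2; practical implementations usually guard against a
    non-positive precision here.
    """
    v = 1.0 / (1.0 / v1 - 1.0 / v2)
    m = v * (m1 / v1 - m2 / v2)
    return m, v

# Round trip on toy numbers (assumed): multiply, then divide the factor out.
m, v = gauss_multiply(1.0, 2.0, 3.0, 4.0)
m_back, v_back = gauss_divide(m, v, 3.0, 4.0)
```

Dividing the projected result by the incoming message is what turns a posterior-like quantity back into an outgoing message, so these two helpers are the core of each update step in such a scheme.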
Further, step S22 specifically performs message passing on the determinant vector factor graph of the Cox regression model based on the moment matching rule, and comprises the following steps:
S221, updating the corresponding message according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
at the corresponding node, the incoming message is multiplied by the node's factor, the product is projected onto a multidimensional Gaussian distribution with independent components of identical variance, and the projection is divided by the incoming message to obtain the outgoing message.
Here the projection operation determines the mean vector and the variance vector of the product with respect to its variable; because the target is a multidimensional Gaussian with independent components of identical variance, all elements of the variance vector are equal and the off-diagonal elements of the covariance matrix are 0. The mean vector and the variance vector are output.
S222, updating the corresponding message according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
at the corresponding node, the incoming message is multiplied by the node's factor, the variable to be eliminated is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components of identical variance, and the projection is divided by the incoming message to obtain the outgoing message; the node's factor here is a Dirac delta function.
S223, updating the corresponding message according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
at the corresponding node, the incoming message is multiplied by the node's factor, the product is projected onto a multidimensional Gaussian distribution with independent components of identical variance, and the projection is divided by the incoming message to obtain the outgoing message; the mean obtained by this projection operation is the Cox regression coefficient vector and is the output result.
S224, updating the corresponding message according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
at the corresponding node, the incoming message is multiplied by the node's factor, the variable to be eliminated is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components of identical variance, and the projection is divided by the incoming message to obtain the outgoing message.
Because the likelihood function has an extremely complex form, the cumulant generating function and the Laplace method are used instead to carry out the projection operation at that node.
Further, the projection operation in step S223 yields the approximate posterior probability of the regression coefficients; the mean obtained by the projection is the Cox regression coefficient vector output by the model.
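For a single coefficient, a projection of this kind has a closed form when the prior is Gaussian-Bernoulli and the incoming message is Gaussian. The sketch below computes the moment-matched mean and variance. All symbol names are assumptions introduced for illustration: the incoming message is N(β; r, τ), and the prior is (1 − ρ)δ(β) + ρN(β; μ, σ²); the patent's own projection formula was lost in extraction and may differ in notation.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def bg_posterior_moments(r, tau, rho, mu, sigma2):
    """Moments of the posterior proportional to prior(beta) * N(beta; r, tau).

    Returns (mean, var, pi, m, gamma), where pi is the posterior
    probability that beta is nonzero and (m, gamma) are the mean and
    variance of the Gaussian (nonzero) component.
    """
    slab = rho * normal_pdf(r, mu, sigma2 + tau)      # evidence if beta != 0
    spike = (1.0 - rho) * normal_pdf(r, 0.0, tau)     # evidence if beta == 0
    pi = slab / (slab + spike)
    gamma = 1.0 / (1.0 / sigma2 + 1.0 / tau)
    m = gamma * (mu / sigma2 + r / tau)
    mean = pi * m                                     # MMSE estimate of beta
    var = pi * (m ** 2 + gamma) - mean ** 2
    return mean, var, pi, m, gamma

# Toy numbers (assumed): a fairly sparse prior and a message centred at 2.
mean, var, pi, m, gamma = bg_posterior_moments(2.0, 1.0, 0.1, 0.0, 1.0)
```

The posterior mean is the minimum mean square error estimate of the coefficient, which is consistent with the text's statement that the projected mean is the regression coefficient output. For very large |r| relative to τ both densities can underflow, so a log-domain version may be preferable in practice.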
Further, step S23 specifically comprises: using the regression coefficients and the approximate posterior probability output by step S22, together with the expectation-maximization algorithm, to update the prior parameters automatically.
The prior parameters are thus self-learned: they are updated automatically as the whole algorithm iterates, without manual tuning, which avoids the uncertainty introduced by cross-validation.
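One common form of such an update, for a Gaussian-Bernoulli prior, is sketched below. The patent's own update expressions were lost in extraction, so this is the generic M-step for a spike-and-Gaussian mixture and not necessarily the exact rule used there. The inputs are assumptions: for each coefficient j, `pi[j]` is the posterior probability that it is nonzero, and `m[j]`, `gamma[j]` are the posterior mean and variance of its Gaussian component. The function name is hypothetical.

```python
import numpy as np

def em_update_prior(pi, m, gamma):
    """M-step for sparsity ratio rho, prior mean mu and prior variance sigma2."""
    pi = np.asarray(pi, dtype=float)
    m = np.asarray(m, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    total = np.sum(pi)
    rho = float(np.mean(pi))                      # expected fraction of nonzeros
    mu = float(np.sum(pi * m) / total)            # weighted mean of active coefficients
    sigma2 = float(np.sum(pi * ((m - mu) ** 2 + gamma)) / total)
    return rho, mu, sigma2

# Toy posterior summaries (assumed): two active coefficients out of four.
rho, mu, sigma2 = em_update_prior([1.0, 1.0, 0.0, 0.0],
                                  [1.0, 3.0, 0.0, 0.0],
                                  [0.5, 0.5, 0.0, 0.0])
```

Because the update only needs per-coefficient posterior summaries, it can be run once per outer iteration at negligible cost, which is the practical reason it can replace a cross-validation search over penalty coefficients.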
Further, the preset iteration ending condition in step S24 is specifically defined by a criterion value Crit, which is computed using a norm. Whether the iteration has finished is determined by whether the Crit value starts to rise: if it does, the iterative process stops and the regression coefficients of the final iteration are output; if it does not, the iteration continues.
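A stopping rule of this kind can be sketched as follows. The exact definition of Crit was lost in extraction, so the sketch assumes it is the norm of the change in the regression coefficients between consecutive iterations, and it returns the last iterate accepted before the rise; both choices are assumptions. `step` is a hypothetical callable standing in for one pass of steps S22 and S23.

```python
import numpy as np

def iterate_until_crit_rises(step, beta0, max_iter=100):
    """Repeat beta <- step(beta) until the criterion starts to rise.

    Crit is assumed to be ||beta_new - beta_old||. The first time Crit
    exceeds its previous value the loop stops and the last accepted
    beta is returned; max_iter bounds the loop if Crit never rises.
    """
    beta = np.asarray(beta0, dtype=float)
    prev_crit = np.inf
    for _ in range(max_iter):
        new_beta = np.asarray(step(beta), dtype=float)
        crit = np.linalg.norm(new_beta - beta)
        if crit > prev_crit:
            break
        beta, prev_crit = new_beta, crit
    return beta

# Toy update (assumed): a contraction whose step size shrinks every iteration.
result = iterate_until_crit_rises(lambda b: 0.5 * b, [1.0])
```

Stopping on the first rise rather than on a fixed tolerance guards against the divergence that approximate message passing schemes can show after an initial period of convergence.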
In a second aspect, the invention provides a cancer gene prognosis screening system based on an improved Cox model, comprising a memory and a processor, wherein the memory stores a cancer gene prognosis screening program based on the improved Cox model which, when executed by the processor, implements the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, organizing the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix X.
S2, inputting the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solving it to obtain regression coefficients.
S3, evaluating, according to each patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk.
S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis.
Compared with the prior art, the technical solution of the invention has the following beneficial effects:
The invention provides a cancer gene prognosis screening method and system based on an improved Cox model. Using a factor graph as the tool, the approximate posterior probability of the Cox regression coefficients is inferred by a moment-matching message passing method based on expectation propagation; minimum mean square error estimation is used to estimate the regression coefficients accurately; the prior parameters are solved automatically by the expectation-maximization algorithm, which removes the need for cross-validation and makes the estimated regression coefficients more accurate; and, in the implementation, the Laplace method and the cumulant generating function are used to simplify the complex likelihood term so that it can be multiplied by a Gaussian and projected successfully, allowing the iteration to proceed. The problem of regression accuracy is thereby addressed, genes whose regression coefficients have large absolute values are screened out as prognostic genes, and information is provided for subsequently predicting prognosis, recurrence and metastasis, and even for guiding treatment.
Drawings
FIG. 1 is a flow chart of the method for screening cancer gene prognosis based on the improved Cox model of the present invention.
FIG. 2 is a flow chart of solving a Cox model in the cancer gene prognosis screening method based on the improved Cox model of the present invention.
FIG. 3 is a flow chart of an embodiment of the invention for solving the Cox model.
FIG. 4 is a diagram of a determinant vector factor graph in an embodiment of the present invention.
FIG. 5 is a diagram illustrating the moment-matching message passing method based on expectation propagation in an embodiment of the present invention.
FIG. 6 is a graph illustrating performance of regression performed on simulated data in an embodiment of the present invention.
FIG. 7 is a schematic structural diagram of a cancer gene prognosis screening system based on an improved Cox model according to the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein and, therefore, the scope of the present invention is not limited by the specific embodiments disclosed below.
Example 1
As shown in FIG. 1, the present invention provides a method for screening cancer gene prognosis based on an improved Cox model, which comprises the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, organizing the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix X.
S2, inputting the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solving it to obtain regression coefficients.
S3, evaluating, according to each patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk.
S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis.
In the first matrix, the rows represent patients and the columns represent gene segments of the cancer cells; each element of the first matrix is the expression level of the gene of the corresponding column in the patient of the corresponding row.
The survival data comprise: the covariate matrix (i.e., the second matrix) X, the survival time y, and the censoring indicator c.
The genes corresponding to the components of the regression coefficients with larger absolute values have a greater influence on patient survival time, so the prognostic gene set associated with high patient risk can be screened out by evaluating the regression coefficients.
The preprocessing in step S1 specifically comprises: removing irrelevant genes by bioinformatic statistical means to obtain a second matrix X with fewer columns.
Further, in step S2, a third matrix formed by combining the raw data with the second matrix is first input into the preset Cox regression model. The third matrix is denoted [X, y, c], where X is the covariate matrix (i.e., the second matrix), y is the survival time, and c is the censoring indicator. The survival data of the i-th patient are $(x_i, y_i, c_i)$.
Further, the risk function of the i-th patient is specifically:
$$h_i(t) = h_0(t)\exp\left(x_i^{\top}\beta\right)$$
where $h_0(t)$ is the shared baseline risk function, $\beta$ is the regression coefficient vector obtained by solving the Cox regression model, and $x_i$ denotes the gene expression levels of the i-th patient.
Once the regression coefficients $\beta$ have been fitted with the Cox regression model, patient risk can be assessed from the patient's gene expression levels $x_i$. Components of $\beta$ with larger absolute values have a greater influence on patient survival time, and the genes corresponding to those components form the prognostic gene set to be screened out.
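The screening rule described above can be illustrated with a minimal sketch. It assumes the standard Cox form in which the relative risk of a patient is the exponential of the inner product between the expression vector and the regression coefficients; the function names, the gene labels, and the `top_k` cut-off are hypothetical and are not taken from the patent.

```python
import numpy as np

def relative_risk(X, beta):
    """Relative hazard exp(x_i . beta) for each patient (each row of X)."""
    return np.exp(X @ beta)

def screen_genes(beta, gene_names, top_k):
    """Return the top_k gene names ordered by decreasing |beta_j|."""
    order = np.argsort(-np.abs(beta), kind="stable")[:top_k]
    return [gene_names[j] for j in order]

# Hypothetical fitted coefficients and gene labels, for illustration only.
beta = np.array([0.0, -1.5, 0.2, 0.9])
genes = ["G1", "G2", "G3", "G4"]
selected = screen_genes(beta, genes, top_k=2)
```

How many genes to keep (or which threshold on the absolute coefficient to use) is a design choice that the text does not specify.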
Further, in step S2, solving the Cox regression model to obtain a regression coefficient, as shown in fig. 2, specifically includes the following steps:
S21. Combine the existing survival data into a third matrix, sort it by survival time, construct the Cox regression model from the sorted data, and initialize the prior parameters and the message passing parameters.
S22. According to the determinant vector factor graph of the Cox regression model, use the expectation propagation algorithm to project the high-dimensional messages onto independent Gaussian distributions through the moment matching rule, solve the model by iterating in a loop, and output the regression coefficients and the approximate posterior probability.
S23. Input the regression coefficients and the approximate posterior probability into the expectation maximization algorithm, and update the prior parameters.
S24. Judge whether the regression coefficients satisfy a preset iteration end condition; if the condition is satisfied, output the regression coefficients obtained in the current iteration; otherwise, return to step S22 for the next iteration.
The third matrix is [X, y, c], where X is the covariate matrix, y is the survival time, and c is the censoring indicator.
The regression coefficient estimation problem is solved by a fully Bayesian analysis: the penalized maximum likelihood estimate is converted into a minimum mean square error estimate from the Bayesian perspective. Using the factor graph as a tool, the messages passed between nodes are computed by a message passing method based on expectation propagation, and the approximate posterior probability of the regression coefficients is obtained; this approximate posterior is, in essence, the approximately inferred probability distribution obeyed by the regression coefficients.
Further, the prior parameters include a mean, a variance, and a sparsity ratio; the message passing parameters include the mean and variance of the forward messages. Step S21 is specifically: normalize the covariate matrix X, sort the third matrix [X, y, c] in descending order of the survival time y, substitute the sorted third matrix [X, y, c] into the Cox partial likelihood function, and initialize the prior parameters and the message passing function.
In a specific embodiment, the covariate matrix can be a gene expression matrix, in which each row represents a different patient, each column represents a different gene, and each element represents the expression level of one gene in one patient.
The prior on the regression coefficients is a Gaussian-Bernoulli distribution, so the regression coefficients are sparse.
The projection operation at the likelihood function node is approximated using the Laplace method and the moment generating function, which simplifies the otherwise complex computation and yields a more accurate regression coefficient with little loss.
Further, the normalization of the covariate matrix X is specifically:
where mean(X) is the mean of all elements of the matrix X, and var(X) is the variance of all elements of the matrix X.
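The normalization formula itself did not survive extraction. The sketch below assumes the common form in which the global mean is subtracted and the result is divided by the global standard deviation, which is consistent with the definitions of mean(X) and var(X) given here. It also shows the descending sort of [X, y, c] by survival time described in step S21. The function names are illustrative.

```python
import numpy as np

def normalize_covariates(X):
    """Assumed form: (X - mean(X)) / sqrt(var(X)), using all elements of X."""
    return (X - X.mean()) / np.sqrt(X.var())

def sort_by_survival_desc(X, y, c):
    """Reorder patients so that survival time y is in descending order."""
    order = np.argsort(-y, kind="stable")
    return X[order], y[order], c[order]

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([2.0, 5.0, 3.0])
c = np.array([1, 0, 1])
X_sorted, y_sorted, c_sorted = sort_by_survival_desc(normalize_covariates(X), y, c)
```

With the patients in descending order of survival time, the risk set of the i-th patient is simply the first i rows, which makes the partial likelihood easy to evaluate with cumulative sums.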
The Cox partial likelihood function is specifically:
where the left-hand side expresses the function as a transition probability from the variable to the observation, which implies that it is normalized with respect to the observation; the Cox partial likelihood itself is not normalized, so the relation is one of proportionality rather than equality; the function takes the vector as its variable, and the i-th element referred to is the i-th element of that vector.
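Because the formula image is missing, the following sketch uses the textbook Cox partial log-likelihood rather than the patent's exact expression. It assumes that patients are already sorted in descending order of survival time, that there are no tied times, and that c = 1 marks an observed event and c = 0 a censored one. The name `eta` for the linear predictor is an assumption.

```python
import numpy as np

def cox_partial_loglik(eta, c):
    """Cox partial log-likelihood for a linear predictor eta = X @ beta.

    Patients are assumed sorted by descending survival time, so the risk
    set of patient i (everyone still at risk at time y_i) is patients 0..i.
    """
    # log of the running sum of exp(eta) over each risk set, computed stably
    log_risk_sum = np.logaddexp.accumulate(eta)
    return float(np.sum(c * (eta - log_risk_sum)))

value = cox_partial_loglik(np.zeros(3), np.ones(3))
```

Censored patients contribute no term of their own but still appear in the risk sets of patients with shorter survival times.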
The initialization of the prior parameters is specifically: the regression coefficients obey a Gaussian-Bernoulli distribution, whose mathematical expression is:
where the point mass at zero is a Dirac delta function and the continuous component is a Gaussian distribution with the prior mean and the prior variance; the function takes the regression coefficient as its variable. The three prior parameters (mean, variance, and sparsity ratio) are set to their initial values.
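To make the Gaussian-Bernoulli prior concrete, the sketch below draws coefficients that are exactly zero with probability 1 - rho and Gaussian with mean mu and variance sigma2 otherwise. The parameter names `rho`, `mu`, and `sigma2` and the example values are assumptions; the patent's initial values were lost in extraction.

```python
import numpy as np

def sample_gauss_bernoulli(p, rho, mu, sigma2, rng):
    """Draw p coefficients: nonzero with probability rho, N(mu, sigma2) if nonzero."""
    is_active = rng.random(p) < rho
    values = rng.normal(mu, np.sqrt(sigma2), size=p)
    return np.where(is_active, values, 0.0)

rng = np.random.default_rng(0)
beta_sparse = sample_gauss_bernoulli(p=1000, rho=0.1, mu=0.0, sigma2=1.0, rng=rng)
```

A small sparsity ratio expresses the belief that only a few genes affect survival, which is what allows gene screening by the magnitude of the coefficients.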
The initialization of the message passing function is specifically: initialize the message passing function of the forward message, whose mathematical expression is:
where the message variable is a random vector obeying a multidimensional Gaussian distribution with independent components of equal variance; the all-zeros vector and the all-ones vector appearing in the expression are n-dimensional column vectors; and the message mean and variance are set to their initial values.
In a specific embodiment, the determinant vector factor graph of the Cox regression model is shown in fig. 4.
In the determinant vector factor graph of the Cox regression model, as shown in fig. 5, four multidimensional random variables are used to represent the messages passed on the factor graph; that is, each message is regarded as a multidimensional Gaussian probability density function, and the moment matching process requires the messages to obey the following distributions:
where each message variable is a random vector obeying a multidimensional Gaussian distribution with independent components of equal variance; the all-ones vectors are an n-dimensional column vector and a p-dimensional column vector, with the subscript denoting the dimension of the vector. When the elements of a multidimensional Gaussian random variable are mutually independent, i.e., the off-diagonal elements of the covariance matrix are 0, the diagonal covariance matrix can be represented by a vector.
In a specific embodiment, the prior parameters, i.e., the sparsity parameter, the mean parameter, and the variance parameter of the prior distribution, are given initial values and are then updated automatically by the expectation maximization algorithm.
Further, step S22 specifically performs message passing on the determinant vector factor graph of the Cox regression model based on the moment matching rule, and includes the following steps:
S221. According to the moment matching rule of the determinant vector factor graph of the Cox regression model, the message is updated, specifically:
at the node, the incoming message is multiplied by the node's factor, the product is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message.
Here the projection operation finds the mean vector and the variance vector of the product with respect to its variable; because the target is a multidimensional Gaussian with independent components of equal variance, the elements of the variance vector are all equal and the off-diagonal elements of the covariance are 0; the operation outputs this mean and variance.
S222. According to the moment matching rule of the determinant vector factor graph of the Cox regression model, the message is updated, specifically:
at the node, the incoming message is multiplied by the node's factor, the variable is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message; the node's factor is a Dirac delta function.
S223. According to the moment matching rule of the determinant vector factor graph of the Cox regression model, the message is updated, specifically:
at the node, the incoming message is multiplied by the node's factor, the product is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message; the mean obtained by this projection operation is the Cox regression coefficient that is output as the result.
S224. According to the moment matching rule of the determinant vector factor graph of the Cox regression model, the message is updated, specifically:
at the node, the incoming message is multiplied by the node's factor, the variable is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message.
Because the factor has an extremely complex form, a cumulant generating function and the Laplace method are used instead to carry out the projection operation.
Further, in step S223, the projection operation is specifically:
where the projected distribution represents the approximate posterior probability of the regression coefficients, and the mean obtained by the projection is the Cox regression coefficient output by the model.
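The projection formula is missing from the extracted text. As an illustration of what moment matching at a Gaussian-Bernoulli prior node involves, the sketch below computes the mean and variance of the product of that prior with a Gaussian incoming message N(beta; r, tau), using the closed form known from the message passing literature. The symbols `r`, `tau`, `rho`, `mu`, and `sigma2` are assumptions, and this is not claimed to be the patent's exact expression.

```python
import numpy as np

def gauss_pdf(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def gb_posterior_moments(r, tau, rho, mu, sigma2):
    """Moments of [(1-rho)*delta(b) + rho*N(b; mu, sigma2)] * N(b; r, tau)."""
    z_zero = (1.0 - rho) * gauss_pdf(0.0, r, tau)       # evidence for b = 0
    z_slab = rho * gauss_pdf(r, mu, sigma2 + tau)       # evidence for b != 0
    pi = z_slab / (z_zero + z_slab)                     # posterior P(b != 0)
    v = 1.0 / (1.0 / sigma2 + 1.0 / tau)                # slab posterior variance
    m = v * (mu / sigma2 + r / tau)                     # slab posterior mean
    mean = pi * m
    var = pi * (v + m ** 2) - mean ** 2
    return mean, var, pi

mean, var, pi = gb_posterior_moments(np.array([2.0]), 1.0, 1.0, 0.0, 1.0)
```

In an expectation propagation scheme, the outgoing message would then be obtained by dividing the Gaussian with these moments by the incoming Gaussian message; that division step is omitted here.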
Further, step S23 is specifically: the regression coefficients and the approximate posterior probability output from step S22 are used with the expectation maximization algorithm to update the prior parameters automatically; the update expressions are specifically:
The prior parameters are self-learned: they are updated automatically as the whole algorithm iterates, without manual adjustment, which avoids the uncertainty introduced by cross validation.
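The update expressions themselves are missing from the extracted text. The sketch below shows the usual expectation maximization updates for a Gaussian-Bernoulli prior, given per-coefficient posterior quantities. This is an assumed standard form, not necessarily the patent's, and the names `pi`, `m`, and `v` (posterior activity probability, and posterior mean and variance of the active component) are illustrative.

```python
import numpy as np

def em_update_prior(pi, m, v):
    """One EM update of (rho, mu, sigma2) from per-coefficient posteriors.

    pi: posterior probability that each coefficient is nonzero
    m, v: posterior mean and variance of each coefficient given it is nonzero
    """
    rho = float(np.mean(pi))
    weight = np.sum(pi)
    mu = float(np.sum(pi * m) / weight)
    sigma2 = float(np.sum(pi * ((m - mu) ** 2 + v)) / weight)
    return rho, mu, sigma2

rho, mu, sigma2 = em_update_prior(
    pi=np.array([1.0, 1.0, 0.0, 0.0]),
    m=np.array([1.0, 3.0, 0.0, 0.0]),
    v=np.array([0.5, 0.5, 0.0, 0.0]),
)
```

Learning the sparsity ratio from the data in this way plays the role that choosing a penalty strength by cross validation plays in penalized Cox regression.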
Further, the preset iteration end condition in step S24 is specifically:
whether to end the iteration is determined by judging whether the Crit value starts to rise: if the Crit value starts to rise, the iterative process is stopped and the regression coefficient of the final iteration is output; if the Crit value does not start to rise, the iteration continues; here ‖·‖ denotes a norm.
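The definition of Crit did not survive extraction. The sketch below only illustrates the stop-when-it-rises control flow, using a hypothetical criterion (the relative change between successive iterates) in place of the lost formula. `step` stands for one full pass of steps S22 and S23 and is also hypothetical.

```python
import numpy as np

def crit_value(beta_new, beta_old):
    # Hypothetical criterion: relative change between successive iterates.
    return np.linalg.norm(beta_new - beta_old) / max(np.linalg.norm(beta_new), 1e-12)

def run_until_crit_rises(step, beta0, max_iter=100):
    """Iterate beta <- step(beta) and stop as soon as the criterion rises."""
    beta, prev_crit = beta0, np.inf
    for _ in range(max_iter):
        beta_new = step(beta)
        crit = crit_value(beta_new, beta)
        if crit > prev_crit:
            break  # criterion started to rise: keep the last accepted iterate
        beta, prev_crit = beta_new, crit
    return beta

# Toy contraction with fixed point 2.0, standing in for one EP + EM pass.
result = run_until_crit_rises(lambda b: 0.5 * b + 1.0, np.array([0.0]))
```

Returning the iterate from before the rise, rather than the one that triggered it, is a choice made for this sketch; the text only says that the regression coefficient of the final iteration is output.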
In a specific embodiment, the performance of regression on simulated data in a single experiment is shown in FIG. 6, where the black line is the true value and the asterisk is the estimated value.
The simulation data are generated as follows:
The censoring indicator c is sampled independently from the binomial distribution B(1, 0.8), so that the censoring rate is 0.2.
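Only the sampling of the censoring indicator survives in the text, so the sketch below covers that part alone. It assumes that c = 1 marks an observed event, which is what makes B(1, 0.8) correspond to a censoring rate of 0.2. The function name and sample size are illustrative.

```python
import numpy as np

def simulate_censoring_indicator(n, rng, event_prob=0.8):
    """Draw c_i ~ B(1, event_prob) independently; c_i = 0 means censored."""
    return rng.binomial(1, event_prob, size=n)

rng = np.random.default_rng(0)
c_sim = simulate_censoring_indicator(10_000, rng)
censoring_rate = 1.0 - c_sim.mean()
```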
Example 2
Based on Example 1 above, and with reference to fig. 3, this example describes in detail the specific process of solving the Cox model in the present invention.
In a specific embodiment, as shown in fig. 3, the known data are X, y, and c, and the quantity to be estimated is the regression coefficient vector.
Step 1:
S1.1: Initialize X:
where mean(X) is the mean of all elements of the matrix X, and var(X) is the variance of all elements of the matrix X.
S1.2: Merge the existing survival data (covariate matrix X, survival time y, censoring indicator c) into a matrix [X, y, c] and sort it in descending order of y;
S1.3: Substitute the sorted [X, y, c] into the Cox partial likelihood function:
where the left-hand side expresses the function as a transition probability from the variable to the observation, which implies that it is normalized with respect to the observation (a property of a probability density function); the Cox partial likelihood itself is not normalized, so the relation is one of proportionality rather than equality; the function takes the vector as its variable, and the i-th element referred to is the i-th element of that vector.
S1.4: Assume that the prior obeys a Gaussian-Bernoulli distribution:
S1.5: Initialize the forward message:
where the message mean and variance are set to their initial values; the all-zeros vector and the all-ones vector are n-dimensional column vectors, with the subscript denoting the dimension of the vector.
Step 2: message passing on factor graph based on moment matching rule-expectation propagation algorithm (expectation propagation)
S2.1: updating: in thatOn a node, willMessage ofMultiplying and projecting onto a multidimensional Gaussian distribution of independent covariance, and then removingThe message of (2):
wherein,is a projection operation, i.e. findingAboutMean vector ofSum variance vector(diagonal of covariance matrix), the vector is a multidimensional Gaussian of independent covarianceIs equal and the off-diagonal element is 0, and outputs。
whereinNamely, it isThe variance of (a) is determined,,is composed ofBlack plug matrix of (To pairSecond order gradient of).
The meanings are as follows: when in useTaking out the diagonal of the matrix when it isWhen the vector is a vector, the vector is stretched into a diagonal matrix.
Is to calculate the average value of the vector,the vector points are divided by the vector points,is a vector dot product.
Wherein,adopt a pairAnd (3) solving by using a coordinate ascending algorithm after quadratic approximation:
wherein,is composed ofIn thatThe gradient of (a) is measured,is composed ofIn thatA black plug matrix of (a). After rewriting, the following are obtained:
S2.1.3: updatingIn thatBlack plug matrix ofTo aTo (1) akLine ofkColumn element(for accelerated calculations, only diagonal elements are kept to approximate the entire matrix):
If the change is still large, the iteration is continued by returning to S2.1.2.
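The gradient and Hessian expressions of step S2.1 are missing from the extracted text. The sketch below shows one way a Laplace approximation with a diagonal Hessian can be carried out for the Cox partial likelihood combined with a Gaussian incoming message N(eta; m, v). It assumes patients sorted in descending order of survival time with no ties and c = 1 for observed events. The function names, the plain diagonal Newton update, and the fixed iteration count are assumptions rather than the patent's exact procedure.

```python
import numpy as np

def cox_loglik(eta, c):
    """Cox partial log-likelihood; risk set of patient i is patients 0..i."""
    return float(np.sum(c * (eta - np.logaddexp.accumulate(eta))))

def cox_grad_hess_diag(eta, c):
    """Gradient and diagonal of the Hessian of cox_loglik with respect to eta."""
    w = np.exp(eta - eta.max())              # rescaled for numerical stability
    s = np.cumsum(w)                         # risk-set sums S_i = sum_{j<=i} w_j
    a = np.cumsum((c / s)[::-1])[::-1]       # A_k = sum_{i>=k} c_i / S_i
    b = np.cumsum((c / s ** 2)[::-1])[::-1]  # B_k = sum_{i>=k} c_i / S_i^2
    grad = c - w * a
    hess_diag = -(w * a - w ** 2 * b)
    return grad, hess_diag

def laplace_mode(m, v, c, n_iter=50):
    """Mode and diagonal Laplace variance of exp(cox_loglik(eta)) * N(eta; m, v)."""
    eta = m.copy()
    for _ in range(n_iter):
        g, h = cox_grad_hess_diag(eta, c)
        g_total = g - (eta - m) / v          # add gradient of the Gaussian term
        h_total = h - 1.0 / v                # strictly negative diagonal curvature
        eta = eta - g_total / h_total        # Newton step using the diagonal only
    _, h = cox_grad_hess_diag(eta, c)
    return eta, -1.0 / (h - 1.0 / v)

c_obs = np.array([1.0, 0.0, 1.0, 1.0])
mode, variance = laplace_mode(np.zeros(4), 0.1 * np.ones(4), c_obs)
```

Keeping only the diagonal of the Hessian reduces each update to element-wise operations, which mirrors the remark in S2.1.3 that the diagonal elements are used to approximate the entire matrix.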
S2.2: Update the message: at the node, multiply the incoming message by the node's factor, integrate out the variable, project the result onto a multidimensional Gaussian distribution with independent components of equal variance, and then remove the incoming message:
where the all-ones vector is an n-dimensional column vector whose subscript denotes its dimension; the diagonal operator extracts the diagonal when its argument is a matrix and expands a vector into a diagonal matrix when its argument is a vector; the averaging operator takes the mean of a vector; the projection operation finds the mean vector and the variance vector with respect to its variable and outputs them; and the remaining two symbols denote matrix inversion and matrix transposition.
S2.3: Update the message: at the node, multiply the incoming message by the node's factor, project the product onto a multidimensional Gaussian distribution with independent components of equal variance, and then remove the incoming message:
The approximate posterior of the regression coefficients is as follows:
and the mean obtained by the projection operation is the Cox regression coefficient to be output.
S2.4: Update the message: at the node, multiply the incoming message by the node's factor, integrate out the variable, project the result onto a multidimensional Gaussian distribution with independent components of equal variance, and then remove the incoming message:
Step 3: output of approximate posterior probability according to S2.3Matching with expectation maximization algorithm (expectation maximization), the prior parameter is matchedAnd performing automatic updating.
Step 4: judging whether a preset iteration end condition is reached:
the end conditions are as follows:
determine whether it starts to rise, if soStopping the iterative process when the rising starts, and outputting the regression system of the final resultNumber of(in S2.3). WhereinIs a norm.
Example 3
Based on Example 1 and Example 2 above, and with reference to fig. 7, this example describes the cancer gene prognosis screening system based on an improved Cox model according to the second aspect of the present invention.
In a specific embodiment, as shown in fig. 7, the present invention further provides a cancer gene prognosis screening system based on an improved Cox model, which includes a memory and a processor, wherein the memory stores a cancer gene prognosis screening program based on the improved Cox model, and the program, when executed by the processor, implements the following steps:
S1. Collect the expression levels of different genes in the cancer cells of cancer patients, collect the patients' survival data, organize the gene expression levels and the patient information into a first matrix, and preprocess the first matrix to obtain a second matrix X.
S2. Input the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solve the model to obtain the regression coefficients.
S3. Evaluate, according to the patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screen out the prognostic gene set corresponding to high patient risk.
S4. Use the screened prognostic gene set, in combination with biological theory, to provide guidance for predicting prognosis, recurrence, and metastasis.
The drawings depicting the positional relationship of structures are for illustrative purposes only and are not to be construed as limiting the present patent.
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the present invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.
Claims (10)
1. A cancer gene prognosis screening method based on an improved Cox model is characterized by comprising the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, organizing the gene expression levels and the patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix;
S2, inputting the survival data obtained in step S1 and the second matrix into a preset Cox regression model, and solving the model to obtain regression coefficients;
S3, evaluating, according to the patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out a prognostic gene set corresponding to high patient risk;
and S4, using the screened prognostic gene set, in combination with biological theory, to provide guidance for predicting prognosis, recurrence, and metastasis.
2. The method of claim 1, wherein in step S2, the survival data and the second matrix are combined to form a third matrix, and the third matrix is input into the preset Cox regression model; wherein the third matrix is denoted [X, y, c], X represents the covariate matrix, i.e., the second matrix, y represents the survival time, and c represents the censoring indicator; and wherein the survival data of the i-th patient are $(x_i, y_i, c_i)$.
3. The cancer gene prognosis screening method based on the improved Cox model of claim 2, wherein the risk function of the i-th patient is specifically:
4. The method of claim 3, wherein the step S2 of solving the Cox regression model to obtain regression coefficients comprises the following steps:
S21, combining the existing survival data into a third matrix, sorting it by survival time, constructing the Cox regression model from the sorted data, and initializing the prior parameters and the message passing parameters;
S22, according to the determinant vector factor graph of the Cox regression model, using the expectation propagation algorithm to project the high-dimensional messages onto independent Gaussian distributions through the moment matching rule, solving the model by iterating in a loop, and outputting the regression coefficients and the approximate posterior probability;
S23, inputting the regression coefficients and the approximate posterior probability into the expectation maximization algorithm, and updating the prior parameters;
S24, judging whether the regression coefficients satisfy a preset iteration end condition; if the condition is satisfied, outputting the regression coefficients obtained in the current iteration; and if the condition is not satisfied, returning to step S22 for the next iteration.
5. The method of claim 4, wherein the prior parameters include a mean, a variance, and a sparsity ratio; the message passing parameters include the mean and variance of the forward messages; and step S21 is specifically: normalizing the covariate matrix X, sorting the third matrix [X, y, c] in descending order of the survival time y, substituting the sorted third matrix [X, y, c] into the Cox partial likelihood function, and initializing the prior parameters and the message passing function.
6. The method of claim 4, wherein the normalization of the covariate matrix X is specifically:
where mean(X) is the mean of all elements of the matrix X, and var(X) is the variance of all elements of the matrix X;
the Cox partial likelihood function is specifically:
where the left-hand side expresses the function as a transition probability from the variable to the observation, which implies that it is normalized with respect to the observation; the Cox partial likelihood itself is not normalized, so the relation is one of proportionality rather than equality; the function takes the vector as its variable, and the i-th element referred to is the i-th element of that vector;
the initialization of the prior parameters is specifically: the regression coefficients obey a Gaussian-Bernoulli distribution, whose mathematical expression is:
where the point mass at zero is a Dirac delta function and the continuous component is a Gaussian distribution with the prior mean and the prior variance; the function takes the regression coefficient as its variable; and the three prior parameters (mean, variance, and sparsity ratio) are set to their initial values;
the initialization of the message passing function is specifically: initializing the message passing function of the forward message, whose mathematical expression is:
7. The cancer gene prognosis screening method based on the improved Cox model as claimed in claim 6, wherein step S22 specifically performs message passing on the determinant vector factor graph of the Cox regression model based on the moment matching rule, comprising the following steps:
S221, according to the moment matching rule of the determinant vector factor graph of the Cox regression model, updating the message, specifically:
at the node, the incoming message is multiplied by the node's factor, the product is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message;
S222, according to the moment matching rule of the determinant vector factor graph of the Cox regression model, updating the message, specifically:
at the node, the incoming message is multiplied by the node's factor, the variable is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message; wherein the node's factor is a Dirac delta function;
S223, according to the moment matching rule of the determinant vector factor graph of the Cox regression model, updating the message, specifically:
at the node, the incoming message is multiplied by the node's factor, the product is projected onto a multidimensional Gaussian distribution with independent components of equal variance, and the projection result is divided by the incoming message to obtain the outgoing message; wherein the mean obtained by the projection operation is the Cox regression coefficient that is output as the result;
S224, according to the moment matching rule of the determinant vector factor graph of the Cox regression model, updating the message, specifically:
8. The cancer gene prognosis screening method based on the improved Cox model according to claim 4, wherein step S23 is specifically: the regression coefficients and the approximate posterior probability output from step S22 are used with the expectation maximization algorithm to update the prior parameters automatically; the update expressions are specifically:
9. The cancer gene prognosis screening method based on the improved Cox model according to any one of claims 4-8, wherein the iteration end condition preset in step S24 is specifically:
whether to end the iteration is determined by judging whether the Crit value starts to rise: if the Crit value starts to rise, the iterative process is stopped and the regression coefficient of the final iteration is output; if the Crit value does not start to rise, the iteration continues; wherein ‖·‖ denotes a norm.
10. A cancer gene prognosis screening system based on an improved Cox model comprises a memory and a processor, wherein the memory comprises a cancer gene prognosis screening program based on the improved Cox model, and the cancer gene prognosis screening program based on the improved Cox model realizes the following steps when being executed by the processor:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, organizing the gene expression levels and the patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix;
S2, inputting the survival data obtained in step S1 and the second matrix into a preset Cox regression model, and solving the model to obtain regression coefficients;
S3, evaluating, according to the patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out a prognostic gene set corresponding to high patient risk;
and S4, using the screened prognostic gene set, in combination with biological theory, to provide guidance for predicting prognosis, recurrence, and metastasis.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211631423.4A CN115620808B (en) | 2022-12-19 | 2022-12-19 | Cancer gene prognosis screening method and system based on improved Cox model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115620808A true CN115620808A (en) | 2023-01-17 |
CN115620808B CN115620808B (en) | 2023-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||