CN115620808A - Cancer gene prognosis screening method and system based on improved Cox model - Google Patents

Cancer gene prognosis screening method and system based on improved Cox model

Info

Publication number
CN115620808A
Authority
CN
China
Prior art keywords
matrix
cox
message
patient
regression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211631423.4A
Other languages
Chinese (zh)
Other versions
CN115620808B (en)
Inventor
张善书
张浩川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202211631423.4A priority Critical patent/CN115620808B/en
Publication of CN115620808A publication Critical patent/CN115620808A/en
Application granted granted Critical
Publication of CN115620808B publication Critical patent/CN115620808B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20Screening of libraries
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20Supervised data analysis
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Optimization (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Library & Information Science (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Genetics & Genomics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Analytical Chemistry (AREA)
  • Algebra (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a cancer gene prognosis screening method and system based on an improved Cox model, comprising the following steps: S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, arranging the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix; S2, inputting the survival data and the second matrix into a preset Cox regression model and solving it to obtain the regression coefficients; S3, evaluating, according to the patient risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk; and S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis. Compared with conventional techniques, the method improves the accuracy of the regression step by adding a prior and updating its parameters automatically, and provides guidance for predicting prognosis, recurrence and metastasis.

Description

Cancer gene prognosis screening method and system based on improved Cox model
Technical Field
The invention relates to the technical field of survival analysis Cox model regression, in particular to a cancer gene prognosis screening method and system based on an improved Cox model.
Background
With the advent and development of DNA microarray technology, the expression levels of thousands of genes can be monitored simultaneously to study the effect of particular treatments, diseases and developmental stages on gene expression. A common scenario is as follows: the gene expression of cancer cells from many cancer patients is measured, the patients' survival data are obtained through follow-up, the collected data are analyzed statistically with survival analysis methods, and genes related to prognosis are finally screened out. Studying the relationship between prognostic genes and tumors can provide information for predicting prognosis, recurrence and metastasis, and even for guiding treatment; the ultimate aim is to support individualized treatment of patients and to open up breakthroughs in cancer therapy.
The collected survival data and gene expression levels must undergo systematic survival analysis, in which more than ten key prognostic genes are screened from tens of thousands of genes. This step is an indispensable link in the overall prognostic analysis: the risk of a cancer patient can be evaluated from the gene set formed by these genes, which provides additional information for treatment.
The Cox regression model is widely used in medical follow-up studies and is, to date, the most frequently used multivariate method in survival analysis. The model takes survival outcome and survival time as dependent variables, can analyze the influence of several factors on survival time simultaneously, can handle data with censored survival times, and does not require the type of the survival distribution to be estimated. These properties give it an important place in cancer prognostic gene screening.
According to the open literature, the most commonly used solver for the Cox regression model is the coordinate descent method proposed by Noah Simon et al., which fits the model along a regularization path using warm starts (with the $\ell_1$ norm and the $\ell_2$ norm as penalty terms). However, the coefficient of the penalty term is determined by cross-validation, so it cannot be solved accurately and automatically. Moreover, because the fit is computed by an optimization method and is a point estimate, no posterior distribution is available, and the prior parameters (i.e. the penalty coefficients) cannot be solved automatically in combination with the expectation-maximization algorithm. As a result, the prognostic genes finally screened by this algorithm may not be well associated with the cancer.
Cox regression is a survival analysis method that forms one link in prognostic gene screening and plays an important role in it. The regression coefficients solved from the Cox regression model weight the risk contributed by each corresponding gene, and the subsequent risk calculation for each patient is accurate only if the regression coefficients are accurate. A more accurate method for solving the Cox regression model is therefore needed.
To this end, in combination with the above needs and deficiencies of the prior art, the present application proposes a method and system for cancer gene prognosis screening based on an improved Cox model.
Disclosure of Invention
The invention provides a cancer gene prognosis screening method and system based on an improved Cox model. By adding a prior and updating its parameters automatically, the regression step achieves higher accuracy; the genes whose regression coefficients have large absolute values are screened out as prognostic genes, providing information for subsequent prediction of prognosis, recurrence and metastasis, and even for guiding treatment.
The primary objective of the present invention is to solve the above technical problems. The technical solution of the present invention is as follows:
In a first aspect, the invention provides a cancer gene prognosis screening method based on an improved Cox model, comprising the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, arranging the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix X.
S2, inputting the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solving it to obtain the regression coefficients.
S3, evaluating, according to the patient risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk.
S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis.
In the first matrix, the rows represent patients and the columns represent gene fragments of the cancer cells; each element of the first matrix indicates the expression level of the gene of the corresponding column in the patient of the corresponding row.
The survival data comprise the covariates (i.e. the second matrix X), the survival time y and the censoring indicator c.
The genes corresponding to the components of the regression coefficients with larger absolute values have a greater influence on patient survival time, so the prognostic gene set associated with high patient risk can be screened out by evaluating the regression coefficients.
The preprocessing in step S1 is specifically: removing irrelevant genes by bioinformatics statistical methods to obtain a second matrix X with fewer columns.
Further, in step S2, a third matrix formed by combining the raw data and the second matrix is first input into the preset Cox regression model. The third matrix is denoted [X, y, c], where X is the covariate matrix (i.e. the second matrix), y is the survival time and c is the censoring indicator; the survival data of the i-th patient are $(x_i, y_i, c_i)$.
Further, the risk (hazard) function of the i-th patient is specifically:
$$h_i(t) = h_0(t)\,\exp\left(x_i^{\top}\beta\right)$$
where $h_0(t)$ is the shared baseline risk function; $\beta$ is the vector of regression coefficients obtained by solving the Cox regression model; and $x_i$ denotes the gene expression levels of the i-th patient.
Once the regression coefficients $\beta$ have been fitted with the Cox regression model, patient risk can be assessed from the patient's gene expression levels $x_i$. The components of $\beta$ with larger absolute values have a greater influence on patient survival time, and the genes corresponding to these components form the prognostic gene set to be screened out.
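The screening rule described above can be illustrated with a minimal Python sketch. It assumes the standard Cox form in which a patient's risk relative to the shared baseline is exp(x · β); the function names `relative_risk` and `screen_genes`, the gene labels and the cut-off `k` are hypothetical and do not come from the patent.

```python
import math

def relative_risk(x, beta):
    """Risk of one patient relative to the shared baseline: exp(x . beta)."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

def screen_genes(beta, gene_names, k):
    """Return the k gene names whose coefficients have the largest absolute value."""
    ranked = sorted(zip(gene_names, beta), key=lambda pair: abs(pair[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Toy values, purely illustrative.
beta = [0.0, 1.2, -0.8, 0.05]
genes = ["G1", "G2", "G3", "G4"]

top = screen_genes(beta, genes, k=2)                 # ["G2", "G3"]
risk = relative_risk([0.5, 1.0, -1.0, 0.0], beta)    # exp(1.2 + 0.8)
```

How many genes to keep (here `k`) is a design choice; the text only states that components with larger absolute values are selected.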
Further, solving the Cox regression model in step S2 to obtain the regression coefficients specifically comprises the following steps:
S21, combining the existing survival data into a third matrix, sorting it by survival time, constructing the Cox regression model from the sorted data, and initializing the prior parameters and the message passing parameters.
S22, according to the determinant vector factor graph of the Cox regression model, using the expectation propagation algorithm to project the high-dimensional messages onto independent Gaussian distributions through the moment matching rule, solving the model by cyclic iteration, and outputting the regression coefficients and the approximate posterior probability.
S23, inputting the regression coefficients and the approximate posterior probability into the expectation-maximization algorithm, and updating the prior parameters.
S24, judging whether the regression coefficients satisfy a preset iteration end condition; if so, outputting the regression coefficients obtained in the current iteration; if not, returning to step S22 for the next iteration.
The third matrix is [X, y, c], where X is the covariate matrix, y is the survival time and c is the censoring indicator.
The regression coefficient estimation problem is solved with a fully Bayesian approach: the penalized maximum likelihood estimate is converted into a minimum mean square error estimate from the Bayesian viewpoint. Using the factor graph as a tool, the messages passed between nodes are computed by a message passing method based on expectation propagation, yielding the approximate posterior probability of the regression coefficients, which is essentially the probability distribution that approximate inference assigns to the regression coefficients.
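The data preparation in step S21 (sorting the combined data by survival time) can be sketched as follows. This is an illustrative sketch only: the function name `sort_by_survival_desc` is hypothetical, descending order is taken from the description of step S21 given elsewhere in the text, and the convention that c = 1 marks an observed event is an assumption.

```python
def sort_by_survival_desc(X, y, c):
    """Reorder the rows of X, y and c so that survival time y is descending."""
    order = sorted(range(len(y)), key=lambda i: y[i], reverse=True)
    X_sorted = [X[i] for i in order]
    y_sorted = [y[i] for i in order]
    c_sorted = [c[i] for i in order]
    return X_sorted, y_sorted, c_sorted

# Three patients, one gene each (toy data).
X = [[1.0], [2.0], [3.0]]
y = [5.0, 9.0, 2.0]       # survival times
c = [1, 0, 1]             # assumed: 1 = event observed, 0 = censored

Xs, ys, cs = sort_by_survival_desc(X, y, c)
```

After this reordering, the patients still at risk at the i-th survival time are exactly the first i rows, which keeps the partial likelihood computation simple.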
Further, the prior parameters include the mean, the variance and the sparsity ratio; the message passing parameters include the mean and variance of the forward messages. Step S21 specifically comprises: normalizing the covariate matrix X; sorting the third matrix [X, y, c] in descending order of survival time y and taking the sorted matrix as the new [X, y, c]; substituting it into the Cox partial likelihood function; and initializing the prior parameters and the message passing function.
The regression coefficients follow a Gaussian-Bernoulli prior distribution and are therefore sparse.
The projection at the likelihood function node is approximated with the Laplace method and the moment generating function, which simplifies the complex computation and allows more accurate regression coefficients to be solved with little loss.
Further, normalizing the covariate matrix X is specifically:
$$X \leftarrow \frac{X - \operatorname{mean}(X)}{\sqrt{\operatorname{var}(X)}}$$
where mean(X) is the mean of all elements of the matrix X and var(X) is the variance of all elements of the matrix X.
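A minimal sketch of this normalization is given below. The original formula is not legible in this text, so the sketch assumes the usual standardization: subtract the mean of all elements and divide by the square root of their variance (population variance). The function name `normalize_matrix` is hypothetical.

```python
import math

def normalize_matrix(X):
    """Standardize X using the mean and variance taken over ALL of its elements."""
    flat = [v for row in X for v in row]
    mean = sum(flat) / len(flat)
    # Assumption: population variance (divide by N), not the sample variance.
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = math.sqrt(var)
    return [[(v - mean) / std for v in row] for row in X]

X = [[1.0, 2.0], [3.0, 4.0]]
Xn = normalize_matrix(X)
```

Note that this is a global normalization over the whole matrix, as the text describes, rather than the per-column (per-gene) standardization that is also common in practice.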
The Cox partial likelihood function is specifically:
[formula image]
where [symbol image] denotes the function describing the transition from [symbol image] to [symbol image]; it represents a transition probability and is normalized with respect to [symbol image]. [symbol image] is the Cox partial likelihood function, which is not normalized and is therefore expressed as a proportionality. The function takes [symbol image] as its variable, and its i-th element [symbol image] is the i-th element of [symbol image].
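For orientation, the standard Cox partial log-likelihood (without tied survival times) can be computed as in the sketch below. This is the textbook form, not necessarily the exact expression used in the patent. It assumes the data are already sorted by descending survival time, that `z[i]` is the linear predictor of the i-th patient, and that c = 1 marks an observed event; the function name is hypothetical.

```python
import math

def cox_partial_log_likelihood(z, c):
    """Textbook Cox partial log-likelihood.

    z: linear predictors, ordered by DESCENDING survival time.
    c: censoring indicators in the same order (assumed 1 = event, 0 = censored).
    """
    total = 0.0
    risk_sum = 0.0  # sum of exp(z_j) over patients with survival time >= current one
    for z_i, c_i in zip(z, c):
        risk_sum += math.exp(z_i)
        if c_i == 1:
            total += z_i - math.log(risk_sum)
    return total

# With all predictors equal to zero the events contribute -log(1) and -log(3).
ll = cox_partial_log_likelihood([0.0, 0.0, 0.0], [1, 0, 1])
```

For large predictors a log-sum-exp formulation would be numerically safer; the plain sum is kept here for readability.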
The initialization of the prior parameters is specifically as follows. The regression coefficients follow a Gaussian-Bernoulli distribution, with the mathematical expression:
$$p(\beta) = \prod_{j=1}^{p}\left[(1-\rho)\,\delta(\beta_j) + \rho\,\mathcal{N}\left(\beta_j;\mu,\sigma^2\right)\right]$$
where $\delta(\cdot)$ denotes the Dirac delta function; $\mathcal{N}(\beta_j;\mu,\sigma^2)$ denotes a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, taking $\beta_j$ as its variable; and $\rho$ is the sparsity ratio. The prior parameters are initialized as:
[formula image]
The initialization of the message passing function is specifically: the message passing function of the forward message is initialized, with the mathematical expression:
[formula image]
where $\mathbf{0}_n$ is an n-dimensional column vector whose elements are all 0; $\mathbf{1}_n$ is an n-dimensional column vector whose elements are all 1, the subscript denoting the dimension of the vector; and [symbol image] is a random variable following a multidimensional Gaussian distribution with independent, equal variances. The initial values are:
[formula image]
In the determinant vector factor graph of the Cox regression model, four multidimensional random variables represent the messages passed on the factor graph; that is, each message is regarded as a multidimensional Gaussian probability density function, and the moment matching process requires the messages to follow the distributions:
[formula image]
where [symbol image] is a random variable following a multidimensional Gaussian distribution with independent, equal variances; $\mathbf{1}_n$ is an n-dimensional column vector whose elements are all 1 and $\mathbf{1}_p$ is a p-dimensional column vector whose elements are all 1, the subscripts denoting the vector dimensions. When the elements of a multidimensional Gaussian random variable are mutually independent, i.e. the off-diagonal elements of the covariance matrix are 0, the diagonal covariance matrix can be represented by a vector.
Further, step S22 specifically performs message passing on the determinant vector factor graph of the Cox regression model based on the moment matching rule, and comprises the following steps:
S221, updating [symbol image] according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
[formula image]
At node [symbol image], the messages [symbol image] of [symbol image] are multiplied, the result is projected onto a multidimensional Gaussian distribution with independent covariance, and the projection is divided by the message of [symbol image] to obtain the message of [symbol image].
Here [symbol image] is the projection operation, i.e. determining the mean vector [symbol image] and the variance vector [symbol image] of [symbol image] with respect to [symbol image]. Because the target is a multidimensional Gaussian with independent covariance, the elements of the variance vector [symbol image] are equal and the off-diagonal elements of the covariance are 0; the output is [symbol image].
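The multiply-then-divide pattern in these updates relies on a standard property of Gaussian densities: products and quotients of Gaussians are again Gaussian, with precisions (inverse variances) adding or subtracting. The sketch below shows this for messages with a mean vector and a single shared variance, which is one reading of "independent, equal variances"; the function names are hypothetical and the projection step itself is not shown.

```python
def gaussian_multiply(m1, v1, m2, v2):
    """Product of N(m1, v1*I) and N(m2, v2*I), up to a normalizing constant."""
    precision = 1.0 / v1 + 1.0 / v2
    v = 1.0 / precision
    m = [v * (a / v1 + b / v2) for a, b in zip(m1, m2)]
    return m, v

def gaussian_divide(m1, v1, m2, v2):
    """Quotient N(m1, v1*I) / N(m2, v2*I), up to a normalizing constant."""
    precision = 1.0 / v1 - 1.0 / v2
    if precision <= 0.0:
        # The quotient is not a proper Gaussian; real implementations
        # typically clip or damp the update in this case.
        raise ValueError("division yields a non-positive precision")
    v = 1.0 / precision
    m = [v * (a / v1 - b / v2) for a, b in zip(m1, m2)]
    return m, v

prod_mean, prod_var = gaussian_multiply([1.0, 2.0], 1.0, [3.0, 0.0], 1.0)
back_mean, back_var = gaussian_divide(prod_mean, prod_var, [3.0, 0.0], 1.0)
```

Dividing the product by one of its factors recovers the other factor, which is exactly how an outgoing message is extracted from a projected belief in expectation propagation.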
S222, updating [symbol image] according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
[formula image]
At node [symbol image], the message of [symbol image] is multiplied by [symbol image], the variable [symbol image] is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent covariance, and the projection is divided by the message of [symbol image] to obtain the message of [symbol image]; here [symbol image] is a Dirac delta function.
S223, updating [symbol image] according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
[formula image]
At node [symbol image], the message of [symbol image] is multiplied by [symbol image], the result is projected onto a multidimensional Gaussian distribution with independent covariance, and the projection is divided by the message of [symbol image] to obtain the message of [symbol image]. The mean [symbol image] obtained by the projection operation is the vector of Cox regression coefficients that is output as the result.
S224, updating [symbol image] according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
[formula image]
At node [symbol image], the message of [symbol image] is multiplied by [symbol image], the variable [symbol image] is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent covariance, and the projection is divided by the message of [symbol image] to obtain the message of [symbol image].
Because [symbol image] has an extremely complex form, the cumulant generating function and the Laplace method are used to approximate [symbol image] when carrying out the projection operation.
Further, in step S223, the projection operation is specifically:
[formula image]
where [symbol image] denotes the approximate posterior probability of the regression coefficients; the mean [symbol image] obtained by the projection is the vector of Cox regression coefficients output by the model.
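For a Gaussian-Bernoulli prior combined with a Gaussian incoming message, the moment-matching projection has a well-known closed form for each coefficient. The sketch below computes the posterior mean and variance under that standard form; it is not taken from the patent's own formula. The symbols `r` and `v` (mean and variance of the incoming message) and `rho`, `mu`, `sigma2` (prior parameters) are assumed names.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bg_posterior_moments(r, v, rho, mu, sigma2):
    """Moments of p(b) proportional to [(1-rho)*delta(b) + rho*N(b; mu, sigma2)] * N(b; r, v).

    Returns (mean, var, pi, m, s): posterior mean and variance, the posterior
    probability pi that b is non-zero, and the mean m and variance s of the
    non-zero (Gaussian) component.
    """
    slab = rho * normal_pdf(r, mu, sigma2 + v)      # evidence of the Gaussian part
    spike = (1.0 - rho) * normal_pdf(r, 0.0, v)     # evidence of the point mass at 0
    pi = slab / (slab + spike)
    s = 1.0 / (1.0 / sigma2 + 1.0 / v)
    m = s * (mu / sigma2 + r / v)
    mean = pi * m
    var = pi * (m * m + s) - mean * mean
    return mean, var, pi, m, s

mean, var, pi, m, s = bg_posterior_moments(r=2.0, v=1.0, rho=0.5, mu=0.0, sigma2=1.0)
```

The posterior mean is the minimum mean square error estimate of the coefficient, which matches the estimation criterion named in the text.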
Further, step S23 specifically comprises: using the regression coefficients [symbol image] and the approximate posterior probability [symbol image] output by step S22, the prior parameters [symbol image] are updated automatically with the expectation-maximization algorithm. The update expressions are specifically:
[formula image]
[formula image]
[formula image]
where [symbol image] and [symbol image] are both functions of [symbol image], expressed as follows:
[formula image]
where [symbol image] denotes element-wise vector division and [symbol image] denotes element-wise vector multiplication.
The prior parameters are self-learned: they are updated automatically as the whole algorithm iterates, without manual tuning, which avoids the uncertainty introduced by cross-validation.
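One common form of expectation-maximization update for a Gaussian-Bernoulli prior is sketched below. The patent's own update expressions are not legible in this text and may differ, so this should be read as an illustration of the idea only. The inputs are assumed per-coefficient posterior quantities: `pi` (probability that the coefficient is non-zero) and `m`, `s` (mean and variance of its non-zero component); all names are hypothetical.

```python
def em_update(pi, m, s):
    """Re-estimate (rho, mu, sigma2) from per-coefficient posterior quantities."""
    total = sum(pi)
    if total == 0.0:
        raise ValueError("all coefficients are inferred to be exactly zero")
    rho = total / len(pi)                                    # sparsity ratio
    mu = sum(p * mj for p, mj in zip(pi, m)) / total         # mean of non-zero part
    sigma2 = sum(p * ((mj - mu) ** 2 + sj)
                 for p, mj, sj in zip(pi, m, s)) / total     # variance of non-zero part
    return rho, mu, sigma2

# Toy posterior: coefficients 0 and 2 are certainly active, the others certainly zero.
rho, mu, sigma2 = em_update(pi=[1.0, 0.0, 1.0, 0.0],
                            m=[1.0, 0.0, 3.0, 0.0],
                            s=[0.5, 0.5, 0.5, 0.5])
```

Because every quantity on the right-hand side comes from the current approximate posterior, the update needs no grid of candidate penalty values, which is the contrast with cross-validation drawn in the text.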
Further, the preset iteration end condition in step S24 is specifically:
[formula image]
Whether the iteration has finished is determined by whether the Crit value starts to rise: if the Crit value starts to rise, the iteration is stopped and the regression coefficients [symbol image] of the final iteration are output; if the Crit value has not started to rise, the iteration continues. Here [symbol image] denotes a norm.
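The stopping rule can be sketched as a small driver loop. The exact definition of Crit is not legible in this text; the sketch assumes it is the relative change between successive coefficient vectors measured in the Euclidean norm, and it returns the most recent iterate when Crit first rises, which is one reading of "the regression coefficients of the final iteration". The names `relative_change`, `iterate_until_crit_rises`, `step` and `max_iter` are hypothetical.

```python
import math

def relative_change(new, old):
    """Assumed criterion: ||new - old|| / ||new|| with the Euclidean norm."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(new, old)))
    den = math.sqrt(sum(a * a for a in new))
    return num / den if den > 0.0 else float("inf")

def iterate_until_crit_rises(step, beta0, max_iter=100):
    """Apply `step` repeatedly; stop as soon as the criterion starts to rise."""
    beta = list(beta0)
    prev_crit = float("inf")
    for _ in range(max_iter):
        new_beta = step(beta)
        crit = relative_change(new_beta, beta)
        beta = new_beta
        if crit > prev_crit:
            break                      # criterion started to rise: stop here
        prev_crit = crit
    return beta

# Toy update that contracts towards 2.0; the criterion keeps falling,
# so the loop runs until max_iter.
result = iterate_until_crit_rises(lambda b: [0.5 * x + 1.0 for x in b], [0.0], max_iter=20)
```

In the method described here, `step` would stand for one pass of steps S22 and S23; the `max_iter` cap is an added safeguard that the text does not mention.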
In a second aspect, the present invention provides a cancer gene prognosis screening system based on an improved Cox model, comprising a memory and a processor, wherein the memory stores a cancer gene prognosis screening program based on the improved Cox model which, when executed by the processor, implements the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, arranging the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix X.
S2, inputting the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solving it to obtain the regression coefficients.
S3, evaluating, according to the patient risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set associated with high patient risk.
S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence and metastasis.
Compared with the prior art, the technical solution of the present invention has the following beneficial effects:
The invention provides a cancer gene prognosis screening method and system based on an improved Cox model. Using the factor graph as a tool, the approximate posterior probability of the Cox regression coefficients is inferred through a moment matching message passing method based on expectation propagation. Minimum mean square error estimation is used to estimate the regression coefficients accurately. The prior parameters are solved automatically with the expectation-maximization algorithm, which removes the need for cross-validation and gives a more accurate estimate of the regression coefficients. In the implementation, the Laplace method and the cumulant generating function simplify the complex form of [symbol image] so that, after multiplication by a Gaussian, it can be projected successfully and the iteration can proceed. This addresses the problem of regression accuracy: the genes whose regression coefficients have large absolute values are screened out as prognostic genes, providing information for subsequent prediction of prognosis, recurrence and metastasis, and even for guiding treatment.
Drawings
FIG. 1 is a flow chart of the method for screening cancer gene prognosis based on the improved Cox model of the present invention.
FIG. 2 is a flow chart of solving a Cox model in the cancer gene prognosis screening method based on the improved Cox model of the present invention.
FIG. 3 is a flow chart of an embodiment of the invention for solving the Cox model.
FIG. 4 is a diagram of a determinant vector factor graph in an embodiment of the present invention.
FIG. 5 is a diagram illustrating the moment matching message passing method based on expectation propagation in an embodiment of the present invention.
FIG. 6 is a graph illustrating performance of regression performed on simulated data in an embodiment of the present invention.
FIG. 7 is a schematic structural diagram of a cancer gene prognosis screening system based on an improved Cox model according to the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein and, therefore, the scope of the present invention is not limited by the specific embodiments disclosed below.
Example 1
As shown in FIG. 1, the present invention provides a method for screening cancer gene prognosis based on an improved Cox model, which comprises the following steps:
s1, collecting the expression quantity of different genes of cancer cells of a cancer patient, collecting survival data of the patient, and sorting the expression quantity of the different genes of the cancer cells and patient information into a first matrix
Figure 796508DEST_PATH_IMAGE097
For the first matrix
Figure 501159DEST_PATH_IMAGE097
Preprocessing is carried out to obtain a second matrix
Figure 42998DEST_PATH_IMAGE005
S2: Input the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solve the model to obtain the regression coefficients.
S3: According to the patient's risk function, evaluate the patient risk associated with the gene corresponding to each regression coefficient, and screen out the prognostic gene set corresponding to high patient risk.
S4: Use the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence, and metastasis.
In the first matrix
Figure 558425DEST_PATH_IMAGE006
the rows represent patients and the columns represent gene fragments of the cancer cells; each element of the first matrix
Figure 185715DEST_PATH_IMAGE006
indicates the expression level of the gene in the corresponding column for the patient in the corresponding row.
The survival data comprise the covariate matrix (i.e., the second matrix) X, the survival time y, and the censoring indicator c.
The genes corresponding to the components with larger absolute values in the regression coefficients have larger influence on the survival time of the patient, and a prognostic gene set corresponding to high patient risk can be screened out by evaluating the regression coefficients.
The preprocessing in step S1 is specifically as follows: irrelevant genes are removed by bioinformatic statistical methods, yielding a second matrix with fewer columns
Figure 162898DEST_PATH_IMAGE005
Further, in step S2, the survival data and the second matrix are first combined into a third matrix, which is input into the preset Cox regression model. The third matrix is denoted [X, y, c], where X is the covariate matrix (i.e., the second matrix), y is the survival time, and c is the censoring indicator. The survival data of the i-th patient are
Figure 824824DEST_PATH_IMAGE007
Further, the risk function of the i-th patient is specifically:
$$h_i(t) = h_0(t)\,\exp\left(x_i^{\top}\beta\right)$$
where $h_0(t)$ is the shared baseline risk function, $\beta$ is the vector of regression coefficients obtained by solving the Cox regression model, and $x_i$ denotes the gene expression levels of the i-th patient.
Once the regression coefficients
Figure 701644DEST_PATH_IMAGE102
have been fitted with the Cox regression model, patient risk can be assessed from the patient's gene expression levels
Figure 54259DEST_PATH_IMAGE103
The components of the regression coefficients
Figure 638824DEST_PATH_IMAGE102
with larger absolute values have a greater influence on patient survival time, and the genes corresponding to these components form the prognostic gene set to be screened out.
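The screening rule described above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function names `relative_risk` and `top_prognostic_genes`, the gene labels, and the cutoff `k` are hypothetical, and the relative risk assumes the standard Cox form in which each patient's risk is the shared baseline scaled by the exponential of the inner product of the expression vector and the coefficient vector.

```python
import numpy as np

def relative_risk(X, beta):
    """Risk of each patient relative to the shared baseline: exp(x_i . beta)."""
    return np.exp(X @ beta)

def top_prognostic_genes(beta, gene_names, k):
    """Return the k genes whose coefficients have the largest absolute value."""
    order = np.argsort(-np.abs(beta), kind="stable")[:k]
    return [gene_names[j] for j in order]

# Toy example with made-up coefficients for four hypothetical genes.
beta = np.array([0.1, -2.0, 0.0, 1.5])
genes = ["g0", "g1", "g2", "g3"]
selected = top_prognostic_genes(beta, genes, k=2)
risk = relative_risk(np.zeros((1, 4)), beta)
```

A patient whose expression vector is all zeros sits exactly at the baseline, so the relative risk is 1; how many genes to keep (here `k`) is a choice the text leaves open.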
Further, in step S2, solving the Cox regression model to obtain the regression coefficients, as shown in FIG. 2, specifically includes the following steps:
S21: Combine the existing survival data into a third matrix, sort it by survival time, construct the Cox regression model from the sorted data, and initialize the prior parameters and the message-passing parameters.
S22: On the determinant vector factor graph of the Cox regression model, use the expectation propagation algorithm to project the high-dimensional messages onto independent Gaussian distributions via the moment-matching rule, solve the model iteratively, and output the regression coefficients and the approximate posterior probability.
S23: Input the regression coefficients and the approximate posterior probability into the expectation-maximization algorithm to update the prior parameters.
S24: Determine whether the regression coefficients satisfy a preset iteration end condition. If the condition is satisfied, output the regression coefficients obtained in the current iteration; otherwise, return to step S22 for the next iteration.
Here the third matrix is [X, y, c], where X is the covariate matrix, y is the survival time, and c is the censoring indicator.
The regression coefficient estimation problem is solved by a fully Bayesian method: the penalized maximum likelihood estimate is recast as a minimum mean square error estimate from the Bayesian perspective. Using the factor graph as a tool, the messages passed between nodes are computed by a message passing method based on expectation propagation, which yields the approximate posterior probability of the regression coefficients, i.e., the probability distribution that the regression coefficients are inferred to follow approximately.
Further, the prior parameters include the mean
Figure 275342DEST_PATH_IMAGE104
the variance
Figure 287160DEST_PATH_IMAGE105
and the sparsity ratio
Figure 858563DEST_PATH_IMAGE106
The message-passing parameters include the mean and variance of the forward messages. Step S21 is specifically: normalize the covariate matrix X, sort the third matrix [X, y, c] in descending order of survival time y, substitute the sorted third matrix [X, y, c] into the Cox partial likelihood function, and initialize the prior parameters and the message passing function.
In a specific embodiment, the covariate matrix can be a gene expression matrix, in which each row represents a different patient, each column represents a different gene, and each element represents the expression level of one gene in one patient.
The prior on the regression coefficients is a Gaussian-Bernoulli distribution, which makes the coefficients sparse.
The projection at the likelihood-function node is approximated using the Laplace method and the moment generating function, which simplifies the otherwise complex computation and yields more accurate regression coefficients with little loss.
Further, the covariate matrix X is normalized as follows:
Figure 246819DEST_PATH_IMAGE015
where mean(X) is the mean of all elements of the matrix X, and var(X) is the variance of all elements of the matrix X.
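As a hedged illustration of this normalization, the sketch below centers X by the mean of all its elements and scales it by the square root of the variance of all its elements. Dividing by the standard deviation rather than by var(X) itself is an assumption made here, and the function name `normalize_global` is hypothetical.

```python
import numpy as np

def normalize_global(X):
    """Center and scale X using the mean and variance of ALL of its elements."""
    X = np.asarray(X, dtype=float)
    # Assumption: scale by the standard deviation, i.e. sqrt(var(X)).
    return (X - X.mean()) / np.sqrt(X.var())

X = np.array([[1.0, 2.0], [3.0, 6.0]])
Xn = normalize_global(X)
```

Note that this is a single global normalization, not a per-gene (per-column) standardization: every element is shifted and scaled by the same two numbers.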
The Cox partial likelihood function is specifically:
Figure 472264DEST_PATH_IMAGE016
where
Figure 654983DEST_PATH_IMAGE017
denotes the transition probability from
Figure 982190DEST_PATH_IMAGE107
to
Figure 908558DEST_PATH_IMAGE108
which implies that
Figure 722930DEST_PATH_IMAGE017
is normalized with respect to
Figure 342131DEST_PATH_IMAGE108
whereas
Figure 891055DEST_PATH_IMAGE020
is the Cox partial likelihood function, which is not normalized; hence the relation is one of proportionality. The function takes
Figure 621113DEST_PATH_IMAGE107
as its variable, whose i-th element is
Figure 821150DEST_PATH_IMAGE021
Figure 611252DEST_PATH_IMAGE022
is the i-th element of
Figure 381893DEST_PATH_IMAGE023
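For orientation, the standard Cox log partial likelihood on data sorted by descending survival time can be sketched as below. This is the textbook form without tie handling, not necessarily the exact expression used here. It assumes that c_i = 1 marks an observed (uncensored) event; the function name `cox_log_partial_likelihood` is hypothetical.

```python
import numpy as np

def cox_log_partial_likelihood(X, y, c, beta):
    """Textbook Cox log partial likelihood (no tie correction)."""
    order = np.argsort(-y, kind="stable")      # sort by descending survival time
    z = X[order] @ beta                        # linear predictors z_i = x_i . beta
    # After the descending sort, the risk set of patient i is patients 0..i,
    # so a running log-sum-exp gives log(sum_{j: y_j >= y_i} exp(z_j)).
    log_risk = np.logaddexp.accumulate(z)
    return float(np.sum(c[order] * (z - log_risk)))

X = np.zeros((2, 1))
y = np.array([2.0, 1.0])
c = np.array([1, 1])
value = cox_log_partial_likelihood(X, y, c, np.zeros(1))
```

Sorting in descending order of y is what turns every risk-set sum into a prefix sum, which is presumably why step S21 sorts the third matrix before substituting it into the partial likelihood.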
The prior parameters are initialized as follows. The regression coefficients are assumed to follow a Gaussian-Bernoulli distribution, expressed as:
Figure 915642DEST_PATH_IMAGE109
where
Figure 970186DEST_PATH_IMAGE025
denotes the Dirac delta function, and
Figure 665610DEST_PATH_IMAGE026
denotes a Gaussian distribution with mean
Figure 641656DEST_PATH_IMAGE027
and variance
Figure 455461DEST_PATH_IMAGE110
The function takes
Figure 364511DEST_PATH_IMAGE111
as its variable. The prior parameters are initialized as
Figure 496415DEST_PATH_IMAGE030
Figure 490916DEST_PATH_IMAGE031
Figure 117200DEST_PATH_IMAGE032
The message passing function is initialized as follows. The message passing function of the forward message is initialized, with the mathematical expression:
Figure 146336DEST_PATH_IMAGE033
where
Figure 183562DEST_PATH_IMAGE034
is an n-dimensional column vector whose elements are all 0;
Figure 930938DEST_PATH_IMAGE035
is an n-dimensional column vector whose elements are all 1; and
Figure 360914DEST_PATH_IMAGE036
is a random variable following a multidimensional Gaussian distribution with independent components of equal variance. The initial values are
Figure 452684DEST_PATH_IMAGE112
Figure 421777DEST_PATH_IMAGE113
Figure 389864DEST_PATH_IMAGE114
In a specific embodiment, the determinant vector factor graph of the Cox regression model is shown in FIG. 4.
In the determinant vector factor graph of the Cox regression model, as shown in FIG. 5, four multidimensional random variables represent the messages passed on the factor graph; that is, each message is treated as a multidimensional Gaussian probability density function, and the moment-matching process requires the messages to follow the distribution below:
Figure 128013DEST_PATH_IMAGE033
where
Figure 772621DEST_PATH_IMAGE041
is a random variable following a multidimensional Gaussian distribution with independent components of equal variance;
Figure 229010DEST_PATH_IMAGE115
is an n-dimensional column vector whose elements are all 1, the subscript denoting the vector dimension; and
Figure 732279DEST_PATH_IMAGE043
is a p-dimensional column vector whose elements are all 1, the subscript denoting the vector dimension. When the elements of a multidimensional Gaussian random variable are mutually independent, i.e., the off-diagonal elements of the covariance matrix are 0, the diagonal covariance matrix can be represented by a vector.
In a specific embodiment, the prior parameters, i.e., the parameters of the prior distribution
Figure 590513DEST_PATH_IMAGE116
in which
Figure 406023DEST_PATH_IMAGE117
is the sparsity parameter,
Figure 287391DEST_PATH_IMAGE118
is the mean parameter, and
Figure 846548DEST_PATH_IMAGE119
is the variance parameter, are given the initial values
Figure 310022DEST_PATH_IMAGE030
Figure 296432DEST_PATH_IMAGE031
Figure 727414DEST_PATH_IMAGE032
respectively, and are thereafter updated automatically by the expectation-maximization algorithm.
Further, step S22 specifically performs message passing on the determinant vector factor graph of the Cox regression model based on the moment-matching rule, and includes the following steps:
S221: According to the moment-matching rule of the determinant vector factor graph of the Cox regression model, update
Figure 824683DEST_PATH_IMAGE120
specifically:
Figure 142663DEST_PATH_IMAGE121
At node
Figure 565554DEST_PATH_IMAGE122
the message
Figure 218252DEST_PATH_IMAGE123
is multiplied by
Figure 119212DEST_PATH_IMAGE124
the product is projected onto a multidimensional Gaussian distribution with independent components, and the projection result is divided by the message
Figure 557278DEST_PATH_IMAGE049
to obtain the message
Figure 885491DEST_PATH_IMAGE120
Here,
Figure 25485DEST_PATH_IMAGE125
is the projection operation: for
Figure 464557DEST_PATH_IMAGE052
with respect to
Figure 494479DEST_PATH_IMAGE053
it finds the mean vector
Figure 993594DEST_PATH_IMAGE126
and the variance vector
Figure 620884DEST_PATH_IMAGE055
Because the target is a multidimensional Gaussian with independent components of equal variance, the elements of the vector
Figure 863647DEST_PATH_IMAGE055
are all equal and the off-diagonal elements of the covariance are 0. The operation outputs
Figure 276305DEST_PATH_IMAGE056
S222: According to the moment-matching rule of the determinant vector factor graph of the Cox regression model, update
Figure 946320DEST_PATH_IMAGE057
specifically:
Figure 795328DEST_PATH_IMAGE127
At node
Figure 841781DEST_PATH_IMAGE128
the message
Figure 108946DEST_PATH_IMAGE129
is multiplied by
Figure 949863DEST_PATH_IMAGE061
the variable
Figure 082904DEST_PATH_IMAGE062
is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components, and the projection result is divided by the message
Figure 418201DEST_PATH_IMAGE063
to obtain the message
Figure 789140DEST_PATH_IMAGE130
where
Figure 800958DEST_PATH_IMAGE065
is the Dirac delta function.
S223: According to the moment-matching rule of the determinant vector factor graph of the Cox regression model, update
Figure 890137DEST_PATH_IMAGE063
specifically:
Figure 760616DEST_PATH_IMAGE131
At the
Figure 251640DEST_PATH_IMAGE067
node, the message
Figure 434360DEST_PATH_IMAGE132
is multiplied by
Figure 745256DEST_PATH_IMAGE069
the product is projected onto a multidimensional Gaussian distribution with independent components, and the projection result is divided by the message
Figure 422356DEST_PATH_IMAGE133
to obtain the message
Figure 767887DEST_PATH_IMAGE063
The mean obtained by the projection operation,
Figure 121508DEST_PATH_IMAGE071
gives the Cox regression coefficients that are output as the result.
S224: According to the moment-matching rule of the determinant vector factor graph of the Cox regression model, update
Figure 919699DEST_PATH_IMAGE072
specifically:
Figure 400490DEST_PATH_IMAGE134
At the
Figure 600527DEST_PATH_IMAGE074
node, the message
Figure 328312DEST_PATH_IMAGE075
is multiplied by
Figure 410538DEST_PATH_IMAGE074
the variable
Figure 695019DEST_PATH_IMAGE076
is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components, and the projection result is divided by the message
Figure 749563DEST_PATH_IMAGE135
to obtain the message
Figure 648249DEST_PATH_IMAGE072
Because
Figure 421033DEST_PATH_IMAGE136
has an extremely complex form, the cumulant generating function and the Laplace method are used to carry out the projection operation on
Figure 492894DEST_PATH_IMAGE136
approximately.
Further, in step S223, the projection operation is specifically:
Figure 979825DEST_PATH_IMAGE079
where
Figure 111729DEST_PATH_IMAGE080
denotes the approximate posterior probability of the regression coefficients. The mean obtained by the projection,
Figure 902968DEST_PATH_IMAGE137
gives the Cox regression coefficients output by the model.
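For a Gauss-Bernoulli prior, the moment matching in this projection has a well-known closed form, sketched below under stated assumptions: the incoming message is Gaussian with mean `r` and variance `s`, and the prior is (1 - lam) times a point mass at zero plus lam times a Gaussian with mean `mu` and variance `sigma2`. All names are hypothetical, and the patent's own expressions may be arranged differently.

```python
import numpy as np

def gauss_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def gauss_bernoulli_moments(r, s, lam, mu, sigma2):
    """Posterior mean/variance of x ~ Gauss-Bernoulli prior times N(x; r, s)."""
    slab = lam * gauss_pdf(r, mu, s + sigma2)      # evidence of the Gaussian part
    spike = (1.0 - lam) * gauss_pdf(r, 0.0, s)     # evidence of the point mass at 0
    pi = slab / (slab + spike)                     # posterior P(coefficient != 0)
    v = 1.0 / (1.0 / s + 1.0 / sigma2)             # posterior variance given nonzero
    m = v * (r / s + mu / sigma2)                  # posterior mean given nonzero
    mean = pi * m
    var = pi * (m ** 2 + v) - mean ** 2
    return mean, var, pi, m, v

mean, var, pi, m, v = gauss_bernoulli_moments(np.array([1.0]), 1.0, 1.0, 0.0, 1.0)
sparse_mean, _, sparse_pi, sparse_m, _ = gauss_bernoulli_moments(np.array([0.1]), 1.0, 0.2, 0.0, 1.0)
```

With `lam = 1` the prior is purely Gaussian and the result reduces to the usual product of two Gaussians; with a small `lam` the posterior mean is shrunk toward zero, which is how the prior encodes sparsity.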
Further, step S23 is specifically as follows. Using the regression coefficients
Figure 981782DEST_PATH_IMAGE138
and the approximate posterior probability
Figure 496071DEST_PATH_IMAGE139
output by step S22, the expectation-maximization algorithm automatically updates the prior parameters
Figure 798877DEST_PATH_IMAGE084
The update expressions are specifically:
Figure 280674DEST_PATH_IMAGE085
Figure 694337DEST_PATH_IMAGE086
Figure 328712DEST_PATH_IMAGE087
where
Figure 802419DEST_PATH_IMAGE140
and
Figure 37091DEST_PATH_IMAGE141
are both functions of
Figure 988867DEST_PATH_IMAGE142
expressed as follows:
Figure 743327DEST_PATH_IMAGE143
where
Figure 387935DEST_PATH_IMAGE144
denotes element-wise vector division and
Figure 844324DEST_PATH_IMAGE093
denotes element-wise vector multiplication.
The prior parameters are self-learned, and are automatically updated along with iteration of the whole algorithm without manual adjustment, so that the uncertainty of cross validation can be further avoided.
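The sketch below shows the standard expectation-maximization updates for a Gauss-Bernoulli prior. It assumes three per-coefficient quantities are available from the approximate posterior: `pi`, the posterior probability that a coefficient is nonzero, and `m` and `v`, the posterior mean and variance of the coefficient given that it is nonzero. The names are hypothetical and the patent's update expressions may differ in detail.

```python
import numpy as np

def em_update_prior(pi, m, v):
    """One EM update of (sparsity ratio, mean, variance) of the prior."""
    lam = float(np.mean(pi))                            # expected fraction of nonzeros
    mu = float(np.sum(pi * m) / np.sum(pi))             # weighted mean of nonzero part
    # The variance update uses the freshly updated mean (a design choice here).
    sigma2 = float(np.sum(pi * ((m - mu) ** 2 + v)) / np.sum(pi))
    return lam, mu, sigma2

pi = np.array([1.0, 1.0, 0.0, 0.0])
m = np.array([1.0, 3.0, 9.0, 9.0])
v = np.full(4, 0.5)
lam, mu, sigma2 = em_update_prior(pi, m, v)
```

Coefficients with `pi` near zero contribute almost nothing to the mean and variance updates, so only the coefficients believed to be active shape the Gaussian part of the prior.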
Further, the preset iteration ending condition in step S24 is specifically:
Figure 599791DEST_PATH_IMAGE145
Whether to end the iteration is decided by checking whether the Crit value has started to rise. If the Crit value has started to rise, the iterative process stops and the regression coefficients of the final iteration
Figure 205828DEST_PATH_IMAGE146
are output; if the Crit value has not started to rise, the iteration continues. Here
Figure 755758DEST_PATH_IMAGE096
denotes a norm.
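The stopping rule itself, independent of how Crit is computed, can be sketched as follows. The helper names are hypothetical, and the Crit values in the example are made up purely to exercise the rule.

```python
def crit_started_rising(crit_history):
    """True once the newest Crit value exceeds the previous one."""
    return len(crit_history) >= 2 and crit_history[-1] > crit_history[-2]

def iterations_run(crit_values):
    """Count how many iterations execute before the rule stops the loop."""
    history = []
    for value in crit_values:        # each value stands for one outer iteration
        history.append(value)
        if crit_started_rising(history):
            break
    return len(history)

n_iter = iterations_run([3.0, 2.0, 1.5, 1.6, 0.1])
```

In this toy run the loop stops at the fourth iteration, the first one at which Crit increases, even though a later value would have been smaller.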
In a specific embodiment, the performance of regression on simulated data in a single experiment is shown in FIG. 6, where the black line is the true value and the asterisk is the estimated value.
The simulated data are generated as follows.
Independent standard normal samples are drawn to generate
Figure 699443DEST_PATH_IMAGE147
Each
Figure 258600DEST_PATH_IMAGE148
is sampled independently from the binomial distribution B(1, 0.8),
Figure 722074DEST_PATH_IMAGE149
so the censoring rate is 0.2.
Laplace-Bernoulli samples are drawn to generate
Figure 911747DEST_PATH_IMAGE150
with a sparsity ratio of 0.2.
When
Figure 873887DEST_PATH_IMAGE151
and the i-th sample is not censored:
Figure 971156DEST_PATH_IMAGE152
where
Figure 554715DEST_PATH_IMAGE153
is sampled independently from U(0, 1). When
Figure 915289DEST_PATH_IMAGE154
and the i-th sample is censored:
Figure 567987DEST_PATH_IMAGE155
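The parts of this simulation that are stated in words can be sketched as below. The sketch covers only the covariates, the censoring indicators, and the sparse coefficients; it does not attempt the survival times, whose formulas are not restated here. The function name, the sample sizes, the unit Laplace scale, and the reading of "sparsity ratio 0.2" as the fraction of nonzero coefficients are all assumptions.

```python
import numpy as np

def simulate_inputs(n, p, seed=0):
    """Draw covariates, censoring indicators and sparse coefficients."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))                 # independent standard normal
    c = rng.binomial(1, 0.8, size=n)                # B(1, 0.8): about 20% censored
    support = rng.random(p) < 0.2                   # assumed: 20% of coefficients nonzero
    beta = np.where(support, rng.laplace(0.0, 1.0, size=p), 0.0)
    return X, c, beta

X, c, beta = simulate_inputs(n=200, p=1000)
```

Fixing the seed makes a single experiment reproducible, which is convenient when comparing estimated coefficients against the true ones as in FIG. 6.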
Example 2
Based on Example 1 above and with reference to FIG. 3, this example describes in detail the specific process of solving the Cox model in the present invention.
In one specific embodiment, as shown in FIG. 3, the known data are
Figure 468947DEST_PATH_IMAGE156
Figure 156280DEST_PATH_IMAGE157
Figure 235226DEST_PATH_IMAGE158
and the regression coefficients to be estimated are
Figure 375220DEST_PATH_IMAGE159
Step 1:
S1.1: Normalize X:
Figure 79871DEST_PATH_IMAGE015
where mean(X) is the mean of all elements of the matrix X, and var(X) is the variance of all elements of the matrix X.
S1.2: Merge the existing survival data (covariate matrix X, survival time y, censoring indicator c) into a matrix [X, y, c] and sort it in descending order of y.
S1.3: Substitute the sorted [X, y, c] into the Cox partial likelihood function:
Figure 621711DEST_PATH_IMAGE016
Figure 874488DEST_PATH_IMAGE017
denotes the transition probability from
Figure 501778DEST_PATH_IMAGE107
to
Figure 744540DEST_PATH_IMAGE108
which implies that
Figure 140887DEST_PATH_IMAGE017
is normalized with respect to
Figure 561635DEST_PATH_IMAGE108
(a property of probability density functions), whereas
Figure 879484DEST_PATH_IMAGE020
is the Cox partial likelihood function, which is not normalized; hence the relation is one of proportionality. The function takes
Figure 925937DEST_PATH_IMAGE107
as its variable, whose i-th element is
Figure 176790DEST_PATH_IMAGE021
Figure 17707DEST_PATH_IMAGE022
is the i-th element of
Figure 370322DEST_PATH_IMAGE023
S1.4: Assume that the prior follows a Gaussian-Bernoulli distribution:
Figure 954887DEST_PATH_IMAGE160
The function takes
Figure 591405DEST_PATH_IMAGE161
as its variable. Initialize the prior parameters:
Figure 337644DEST_PATH_IMAGE030
Figure 443134DEST_PATH_IMAGE031
Figure 565811DEST_PATH_IMAGE162
S1.5: Initialize the forward message:
Figure 791256DEST_PATH_IMAGE033
with the initial values
Figure 239555DEST_PATH_IMAGE163
Figure 753713DEST_PATH_IMAGE164
Figure 427883DEST_PATH_IMAGE165
Figure 773414DEST_PATH_IMAGE034
is an n-dimensional column vector whose elements are all 0;
Figure 127035DEST_PATH_IMAGE035
is an n-dimensional column vector whose elements are all 1, the subscript denoting the vector dimension.
Step 2: Message passing on the factor graph based on the moment-matching rule, i.e., the expectation propagation algorithm.
S2.1: Update
Figure 925227DEST_PATH_IMAGE166
At the
Figure 406018DEST_PATH_IMAGE167
node, the message
Figure 871634DEST_PATH_IMAGE166
is multiplied by
Figure 396156DEST_PATH_IMAGE122
the product is projected onto a multidimensional Gaussian distribution with independent components, and the message
Figure 681644DEST_PATH_IMAGE168
is then divided out:
Figure 700547DEST_PATH_IMAGE121
where
Figure 755091DEST_PATH_IMAGE169
is the projection operation: for
Figure 716093DEST_PATH_IMAGE052
with respect to
Figure 488877DEST_PATH_IMAGE053
it finds the mean vector
Figure 498422DEST_PATH_IMAGE126
and the variance vector
Figure 423783DEST_PATH_IMAGE055
(the diagonal of the covariance matrix). Because the target is a multidimensional Gaussian with independent components of equal variance, the elements of the vector
Figure 290108DEST_PATH_IMAGE055
are all equal and the off-diagonal elements of the covariance are 0. The operation outputs
Figure 550188DEST_PATH_IMAGE056
Using the Laplace method and the moment generating function to simplify
Figure 160161DEST_PATH_IMAGE170
finally gives:
Figure 931240DEST_PATH_IMAGE171
where
Figure 234046DEST_PATH_IMAGE172
is the variance of
Figure 715843DEST_PATH_IMAGE173
Figure 395086DEST_PATH_IMAGE174
Figure 29461DEST_PATH_IMAGE175
is the Hessian matrix of
Figure 503167DEST_PATH_IMAGE176
(the second-order gradient of
Figure 472260DEST_PATH_IMAGE176
with respect to
Figure 424036DEST_PATH_IMAGE177
).
Figure 178496DEST_PATH_IMAGE178
has the following meaning: when
Figure 823104DEST_PATH_IMAGE179
is a matrix, its diagonal is extracted; when
Figure 279493DEST_PATH_IMAGE179
is a vector, it is expanded into a diagonal matrix.
Figure 34960DEST_PATH_IMAGE180
computes the mean of a vector,
Figure 830877DEST_PATH_IMAGE181
is element-wise vector division, and
Figure 397119DEST_PATH_IMAGE182
is element-wise vector multiplication.
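The helper operators just described map directly onto NumPy, as the short sketch below illustrates. The name `D` for the diagonal operator is a placeholder chosen here, and reading the division and multiplication operators as element-wise operations is an assumption.

```python
import numpy as np

def D(a):
    """Matrix -> its diagonal as a vector; vector -> diagonal matrix."""
    a = np.asarray(a, dtype=float)
    if a.ndim not in (1, 2):
        raise ValueError("expected a vector or a matrix")
    return np.diag(a)    # np.diag performs both conversions

M = np.array([[1.0, 2.0], [3.0, 4.0]])
d = D(M)                              # diagonal of the matrix as a vector
back = D(d)                           # vector expanded into a diagonal matrix
avg = d.mean()                        # mean of a vector
ratio = d / np.array([2.0, 8.0])      # element-wise division
prod = d * np.array([2.0, 8.0])       # element-wise multiplication
```

Working with diagonals as vectors keeps the per-iteration cost of these updates linear in the vector length instead of quadratic.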
Here,
Figure 340804DEST_PATH_IMAGE183
is solved by applying a quadratic approximation to
Figure 899962DEST_PATH_IMAGE184
and then using the coordinate ascent algorithm.
First,
Figure 612703DEST_PATH_IMAGE185
is expanded in a Taylor series:
Figure 081337DEST_PATH_IMAGE186
where
Figure 512318DEST_PATH_IMAGE187
is the gradient of
Figure 875166DEST_PATH_IMAGE188
at
Figure 707993DEST_PATH_IMAGE189
and
Figure 350458DEST_PATH_IMAGE190
is the Hessian matrix of
Figure 471998DEST_PATH_IMAGE191
at
Figure 169696DEST_PATH_IMAGE189
Rewriting gives:
Figure 794712DEST_PATH_IMAGE192
where
Figure 670395DEST_PATH_IMAGE193
Finally,
Figure 810390DEST_PATH_IMAGE194
is simplified to:
Figure 515040DEST_PATH_IMAGE195
where
Figure 56880DEST_PATH_IMAGE196
is the i-th element of
Figure 759257DEST_PATH_IMAGE197
The coordinate ascent algorithm is then applied:
S2.1.1: Initialize
Figure 137280DEST_PATH_IMAGE198
S2.1.2: Update the gradient of
Figure 380042DEST_PATH_IMAGE199
at
Figure 776389DEST_PATH_IMAGE200
denoted
Figure 446404DEST_PATH_IMAGE201
For the k-th element of
Figure 314653DEST_PATH_IMAGE201
namely
Figure 95527DEST_PATH_IMAGE202
Figure 877538DEST_PATH_IMAGE203
S2.1.3: Update the Hessian matrix of
Figure 718455DEST_PATH_IMAGE204
at
Figure 71070DEST_PATH_IMAGE189
denoted
Figure 655636DEST_PATH_IMAGE190
For the element in the k-th row and k-th column of
Figure 26574DEST_PATH_IMAGE190
namely
Figure 38392DEST_PATH_IMAGE205
(to accelerate the calculation, only the diagonal elements are kept to approximate the entire matrix):
Figure 612724DEST_PATH_IMAGE206
S2.1.4: Update
Figure 204243DEST_PATH_IMAGE207
Figure 226425DEST_PATH_IMAGE208
S2.1.5: Update
Figure 409145DEST_PATH_IMAGE209
Figure 736352DEST_PATH_IMAGE210
S2.1.6: If the change in
Figure 662720DEST_PATH_IMAGE209
is sufficiently small, output
Figure 945934DEST_PATH_IMAGE209
Figure 565134DEST_PATH_IMAGE211
If the change is still large, return to S2.1.2 and continue iterating.
Finally, the division is carried out and the following are output:
Figure 363326DEST_PATH_IMAGE212
Figure 841187DEST_PATH_IMAGE213
Figure 775645DEST_PATH_IMAGE214
S2.2: Update
Figure 565746DEST_PATH_IMAGE215
At the
Figure 116813DEST_PATH_IMAGE216
node,
Figure 135716DEST_PATH_IMAGE217
is multiplied by
Figure 190260DEST_PATH_IMAGE216
the variable
Figure 151263DEST_PATH_IMAGE218
is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components, and the message
Figure 658467DEST_PATH_IMAGE219
is then divided out:
Figure 481061DEST_PATH_IMAGE127
where
Figure 858953DEST_PATH_IMAGE220
is calculated as:
Figure 725278DEST_PATH_IMAGE221
Figure 985358DEST_PATH_IMAGE222
Figure 595330DEST_PATH_IMAGE223
where
Figure 375199DEST_PATH_IMAGE224
is an n-dimensional column vector whose elements are all 1, the subscript denoting the vector dimension;
Figure 412425DEST_PATH_IMAGE225
has the following meaning: when
Figure 425380DEST_PATH_IMAGE226
is a matrix, its diagonal is extracted, and when
Figure 839044DEST_PATH_IMAGE227
is a vector, it is expanded into a diagonal matrix;
Figure 488067DEST_PATH_IMAGE228
computes the mean of a vector;
Figure 961774DEST_PATH_IMAGE229
means finding, for
Figure 930867DEST_PATH_IMAGE230
with respect to
Figure 882643DEST_PATH_IMAGE227
the mean vector
Figure 89633DEST_PATH_IMAGE231
and the variance vector
Figure 219394DEST_PATH_IMAGE232
and outputting
Figure 675783DEST_PATH_IMAGE233
Figure 431250DEST_PATH_IMAGE234
denotes matrix inversion, and
Figure 289484DEST_PATH_IMAGE235
denotes matrix transposition.
Finally, the division is carried out and the following are output:
Figure 855726DEST_PATH_IMAGE236
Figure 64990DEST_PATH_IMAGE237
Figure 358568DEST_PATH_IMAGE238
S2.3: Update
Figure 71310DEST_PATH_IMAGE239
At the
Figure 808452DEST_PATH_IMAGE240
node,
Figure 239434DEST_PATH_IMAGE241
is multiplied by
Figure 336703DEST_PATH_IMAGE240
the product is projected onto a multidimensional Gaussian distribution with independent components, and the message
Figure 107213DEST_PATH_IMAGE242
is then divided out:
Figure 264525DEST_PATH_IMAGE243
where
Figure 930605DEST_PATH_IMAGE244
is calculated as:
Figure 565985DEST_PATH_IMAGE245
Figure 253319DEST_PATH_IMAGE246
where
Figure 581532DEST_PATH_IMAGE247
and
Figure 737838DEST_PATH_IMAGE248
are both functions of
Figure 442489DEST_PATH_IMAGE249
expressed as follows:
Figure 984328DEST_PATH_IMAGE250
Finally, the division is carried out and the following are output:
Figure 483443DEST_PATH_IMAGE251
Figure 861466DEST_PATH_IMAGE252
Figure 838649DEST_PATH_IMAGE253
The approximate posterior of the regression coefficients is:
Figure 500575DEST_PATH_IMAGE254
and the mean obtained by the projection operation,
Figure 170590DEST_PATH_IMAGE255
gives the Cox regression coefficients to be output.
S2.4: Update
Figure 35909DEST_PATH_IMAGE256
At the
Figure 20046DEST_PATH_IMAGE257
node,
Figure 536478DEST_PATH_IMAGE258
is multiplied by
Figure 174132DEST_PATH_IMAGE257
the variable
Figure 529677DEST_PATH_IMAGE259
is integrated out, the result is projected onto a multidimensional Gaussian distribution with independent components, and the message
Figure 114242DEST_PATH_IMAGE260
is then divided out:
Figure 688443DEST_PATH_IMAGE261
where
Figure 496999DEST_PATH_IMAGE262
is calculated as:
Figure 71331DEST_PATH_IMAGE263
Figure 459587DEST_PATH_IMAGE264
Figure 685032DEST_PATH_IMAGE265
Finally, the division is carried out and the following are output:
Figure 867752DEST_PATH_IMAGE266
Figure 194959DEST_PATH_IMAGE267
Figure 121327DEST_PATH_IMAGE268
Step 3: Based on the approximate posterior probability output by S2.3,
Figure 201278DEST_PATH_IMAGE269
the expectation-maximization algorithm is used to automatically update the prior parameters
Figure 820478DEST_PATH_IMAGE270
S3.1: updating
Figure 369402DEST_PATH_IMAGE271
Figure 833882DEST_PATH_IMAGE272
S3.2: updating
Figure 237181DEST_PATH_IMAGE273
Figure 27283DEST_PATH_IMAGE274
S3.3: updating
Figure 47191DEST_PATH_IMAGE275
Figure 328744DEST_PATH_IMAGE276
Step 4: Determine whether the preset iteration end condition is reached.
The end condition is as follows:
Figure 383287DEST_PATH_IMAGE277
Determine whether the criterion has started to rise. If
Figure 078711DEST_PATH_IMAGE278
has started to rise, stop the iterative process and output the final regression coefficients
Figure 851495DEST_PATH_IMAGE279
(from S2.3), where
Figure 674088DEST_PATH_IMAGE280
denotes a norm.
Example 3
Based on Example 1 and Example 2 above, and with reference to FIG. 7, this example describes the cancer gene prognosis screening system based on an improved Cox model according to the second aspect of the present invention.
In a specific embodiment, as shown in FIG. 7, the present invention further provides a cancer gene prognosis screening system based on an improved Cox model. The system includes a memory and a processor; the memory stores a cancer gene prognosis screening program based on the improved Cox model, and the program, when executed by the processor, implements the following steps:
S1: Collect the expression levels of different genes in the cancer cells of cancer patients, collect the patients' survival data, and arrange the gene expression levels and patient information into a first matrix
Figure 583139DEST_PATH_IMAGE281
The first matrix
Figure 715043DEST_PATH_IMAGE281
is then preprocessed to obtain a second matrix
Figure 709544DEST_PATH_IMAGE282
S2: Input the survival data obtained in step S1 and the second matrix X into a preset Cox regression model, and solve the model to obtain the regression coefficients.
S3: According to the patient's risk function, evaluate the patient risk associated with the gene corresponding to each regression coefficient, and screen out the prognostic gene set corresponding to high patient risk.
S4: Use the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence, and metastasis.
The drawings depicting the positional relationship of structures are for illustrative purposes only and are not to be construed as limiting the present patent.
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the present invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A cancer gene prognosis screening method based on an improved Cox model is characterized by comprising the following steps:
S1, collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the patients' survival data, arranging the gene expression levels and patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix;
S2, inputting the survival data obtained in step S1 and the second matrix into a preset Cox regression model, and solving the model to obtain regression coefficients;
S3, evaluating, according to the patient's risk function, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic gene set corresponding to high patient risk;
S4, using the screened prognostic gene set, together with biological theory, to provide guidance for predicting prognosis, recurrence, and metastasis.
2. The method of claim 1, wherein in step S2, the survival data and the second matrix are combined to form a third matrix, and the third matrix is input into the preset Cox regression model; wherein the third matrix is denoted as [X, y, c], X represents the covariate matrix, i.e., the second matrix, y represents the survival time, and c represents the censoring indicator; and wherein the survival data of the i-th patient are
Figure 39106DEST_PATH_IMAGE001
3. The cancer gene prognosis screening method based on an improved Cox model of claim 2, wherein the risk function of the i-th patient is specifically:
Figure 68242DEST_PATH_IMAGE002
wherein
Figure 652938DEST_PATH_IMAGE003
is the shared baseline risk function;
Figure 400314DEST_PATH_IMAGE004
is the regression coefficient obtained by solving the Cox regression model; and
Figure 813978DEST_PATH_IMAGE005
denotes the gene expression levels of the i-th patient.
4. The method of claim 3, wherein the step S2 of solving the Cox regression model to obtain regression coefficients comprises the following steps:
S21: combining the existing survival data into the third matrix, sorting the third matrix by survival time, constructing the Cox regression model from the sorted data, and initializing the prior parameters and the message passing parameters;
S22: according to the determinant vector factor graph of the Cox regression model, using the expectation propagation algorithm to project high-dimensional messages onto independent Gaussian distributions through the moment matching rule, iterating in a loop to solve the model, and outputting the regression coefficients and the approximate posterior probability;
S23: inputting the regression coefficients and the approximate posterior probability into the expectation-maximization algorithm, and updating the prior parameters;
S24: judging whether the regression coefficients satisfy a preset iteration end condition; if the condition is satisfied, outputting the regression coefficients obtained in the current iteration; if it is not satisfied, returning to step S22 for the next iteration.
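The control flow of steps S21 to S24 can be sketched as a generic loop. The sketch below shows only the order of the steps: the callables `init`, `ep_step`, `em_step` and `should_stop`, their signatures and the `max_iter` cap are hypothetical placeholders supplied by the caller, and the toy lambdas are not the expectation propagation or expectation-maximization computations themselves.

```python
def solve_cox(data, init, ep_step, em_step, should_stop, max_iter=100):
    """Run the S21-S24 loop with caller-supplied steps and return the last coefficients."""
    prior, messages = init(data)                                     # S21: initialize
    beta = None
    for _ in range(max_iter):
        beta, posterior, messages = ep_step(data, prior, messages)   # S22: message passing
        prior = em_step(beta, posterior, prior)                      # S23: update prior
        if should_stop(beta):                                        # S24: end condition
            break
    return beta

# Toy stand-ins that only exercise the loop structure.
beta = solve_cox(
    data=None,
    init=lambda d: (0, 0),
    ep_step=lambda d, prior, msgs: (prior + 1, None, msgs),
    em_step=lambda b, posterior, prior: b,
    should_stop=lambda b: b >= 3,
)
```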
5. The method of claim 4, wherein the prior parameters include: the mean
Figure 697620DEST_PATH_IMAGE006
the variance
Figure 656480DEST_PATH_IMAGE007
and the sparsity ratio
Figure 359994DEST_PATH_IMAGE008
; the message passing parameters include the mean and variance of the forward messages; and step S21 is specifically: normalizing the covariate matrix X, sorting the third matrix [X, y, c] in descending order of the survival time y, taking the sorted matrix as the third matrix [X, y, c], substituting it into the Cox partial likelihood function, and initializing the prior parameters and the message passing function.
6. The method of claim 4, wherein the covariate matrix X is normalized as follows:
Figure 577348DEST_PATH_IMAGE009
wherein mean(X) is the mean of all elements of the matrix X, and var(X) is the variance of all elements of the matrix X;
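A small sketch of a global normalization using mean(X) and var(X) is given below. The exact expression is in the figure and is not reproduced here; the sketch assumes the common form of subtracting the global mean and dividing by the square root of the global variance, and the function name is hypothetical.

```python
import numpy as np

def normalize_global(X):
    """Center and scale X using the mean and variance of all its elements (assumed form)."""
    X = np.asarray(X, dtype=float)
    # mean() and var() without an axis argument run over every element of the matrix.
    return (X - X.mean()) / np.sqrt(X.var())

Xn = normalize_global([[1.0, 2.0], [3.0, 4.0]])
```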
the Cox partial likelihood function is specifically:
Figure 315497DEST_PATH_IMAGE010
wherein
Figure 315497DEST_PATH_IMAGE011
denotes the transition probability function from
Figure 963965DEST_PATH_IMAGE012
to
Figure 719431DEST_PATH_IMAGE013
; as a transition probability,
Figure 65748DEST_PATH_IMAGE011
is normalized with respect to
Figure 881258DEST_PATH_IMAGE013
;
Figure 356101DEST_PATH_IMAGE014
indicates that the Cox partial likelihood function is not normalized and denotes proportionality; the function takes
Figure 665991DEST_PATH_IMAGE012
as its variable, whose i-th element is
Figure 378732DEST_PATH_IMAGE015
;
Figure 365143DEST_PATH_IMAGE016
is the i-th element of
Figure 530545DEST_PATH_IMAGE017
;
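To make the role of the partial likelihood concrete, the sketch below evaluates the standard Cox log partial likelihood (no tie handling) for rows already sorted by descending survival time. It is the textbook form and is only assumed to correspond to the expression in the figure; the names `cox_partial_loglik` and `z`, and the convention c = 1 for an observed event, are assumptions.

```python
import numpy as np

def cox_partial_loglik(X, c, beta):
    """Log partial likelihood for rows sorted by descending survival time (no ties)."""
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    z = X @ np.asarray(beta, dtype=float)      # linear predictor z_i = x_i . beta
    # With a descending sort, the risk set of row i is rows 0..i, so the
    # log of the risk-set sum is a running log-sum-exp over z.
    log_risk = np.logaddexp.accumulate(z)
    # Only rows with an observed event (c == 1) contribute a term.
    return float(np.sum(c * (z - log_risk)))

X = [[0.5], [1.0], [-0.2]]
ll_all_events = cox_partial_loglik(X, [1, 1, 1], [0.0])
ll_one_censored = cox_partial_loglik(X, [1, 0, 1], [0.0])
```

With all coefficients at zero every patient in a risk set is equally likely to fail, so each event term reduces to minus the log of the risk-set size.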
the prior parameters are initialized as follows: the regression coefficients are assumed to follow a Gaussian-Bernoulli distribution, expressed mathematically as:
Figure 644125DEST_PATH_IMAGE018
wherein
Figure 945794DEST_PATH_IMAGE019
denotes the Dirac delta function;
Figure 103106DEST_PATH_IMAGE020
denotes a Gaussian distribution with mean
Figure 755804DEST_PATH_IMAGE021
and variance
Figure 407496DEST_PATH_IMAGE022
; the function takes
Figure 94829DEST_PATH_IMAGE023
as its variable; the prior parameters are initialized as
Figure 219780DEST_PATH_IMAGE024
Figure 107577DEST_PATH_IMAGE025
Figure 281070DEST_PATH_IMAGE026
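For readers without access to the figures, a common way to write a Gaussian-Bernoulli (spike-and-slab) prior with a mean, a variance and a sparsity ratio is shown below. The symbols ρ, μ, σ² and β_j are assumed notation, and it is also an assumption that the sparsity ratio weights the Gaussian branch rather than the point mass at zero.

```latex
p(\beta_j) = (1-\rho)\,\delta(\beta_j) + \rho\,\mathcal{N}\!\left(\beta_j;\,\mu,\,\sigma^{2}\right),
\qquad j = 1,\dots,p
```

Under this reading a small ρ expresses the belief that most genes have a coefficient of exactly zero, which is what makes the prior suitable for gene screening.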
the message passing function is initialized as follows: the message passing function of the forward message is initialized, expressed mathematically as:
Figure 822910DEST_PATH_IMAGE027
wherein
Figure 322024DEST_PATH_IMAGE028
is an n-dimensional column vector whose elements are all 0;
Figure 700047DEST_PATH_IMAGE029
is an n-dimensional column vector whose elements are all 1;
Figure 942809DEST_PATH_IMAGE030
is a random variable following a multidimensional Gaussian distribution with independent components of equal variance; the initial values are
Figure 822221DEST_PATH_IMAGE031
Figure 671228DEST_PATH_IMAGE032
Figure 717681DEST_PATH_IMAGE033
7. The cancer gene prognosis screening method based on the improved Cox model according to claim 6, wherein step S22 specifically performs message passing on the determinant vector factor graph of the Cox regression model based on the moment matching rule, and comprises the following steps:
S221: updating
Figure 234113DEST_PATH_IMAGE034
according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
Figure 560184DEST_PATH_IMAGE035
at the node
Figure 896487DEST_PATH_IMAGE036
the message of
Figure 481052DEST_PATH_IMAGE037
is multiplied by
Figure 851991DEST_PATH_IMAGE038
and the product is projected onto a multidimensional Gaussian distribution with independent covariance; the projection result is then divided by the message of
Figure 605752DEST_PATH_IMAGE037
to obtain the message of
Figure 694931DEST_PATH_IMAGE034
;
S222: updating
Figure 817608DEST_PATH_IMAGE039
according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
Figure 308632DEST_PATH_IMAGE040
at the node
Figure 242084DEST_PATH_IMAGE041
the message of
Figure 552980DEST_PATH_IMAGE042
is multiplied by
Figure 276085DEST_PATH_IMAGE043
and the product is integrated over the variable
Figure 824878DEST_PATH_IMAGE044
and projected onto a multidimensional Gaussian distribution with independent covariance; the projection result is then divided by the message of
Figure 929232DEST_PATH_IMAGE045
to obtain the message of
Figure 727423DEST_PATH_IMAGE039
; wherein
Figure 254220DEST_PATH_IMAGE046
is the Dirac delta function;
S223: updating
Figure 204989DEST_PATH_IMAGE045
according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
Figure 729511DEST_PATH_IMAGE047
at the node
Figure 218262DEST_PATH_IMAGE048
the message of
Figure 752011DEST_PATH_IMAGE049
is multiplied by
Figure 806555DEST_PATH_IMAGE048
and the product is projected onto a multidimensional Gaussian distribution with independent covariance; the projection result is then divided by the message of
Figure 249781DEST_PATH_IMAGE050
to obtain the message of
Figure 022565DEST_PATH_IMAGE045
; wherein the mean
Figure 94426DEST_PATH_IMAGE051
obtained by the projection operation is the Cox regression coefficient that is output as the result;
S224: updating
Figure 3476DEST_PATH_IMAGE052
according to the moment matching rule of the determinant vector factor graph of the Cox regression model, specifically:
Figure 151692DEST_PATH_IMAGE053
at the node
Figure 146193DEST_PATH_IMAGE054
the message of
Figure 21745DEST_PATH_IMAGE055
is multiplied by
Figure 785302DEST_PATH_IMAGE054
and the product is integrated over the variable
Figure 838839DEST_PATH_IMAGE056
and projected onto a multidimensional Gaussian distribution with independent covariance; the projection result is then divided by the message of
Figure 320636DEST_PATH_IMAGE057
to obtain the message of
Figure 734300DEST_PATH_IMAGE052
.
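Each of steps S221 to S224 ends by dividing a projected Gaussian by an incoming Gaussian message. The sketch below shows that division for independent (element-wise) Gaussians, which is the usual expectation propagation rule: precisions subtract and precision-weighted means subtract. The function and argument names are hypothetical, and the projection step itself is not shown.

```python
import numpy as np

def gaussian_divide(m_post, v_post, m_in, v_in):
    """Divide N(m_post, v_post) by N(m_in, v_in) element-wise; return mean and variance."""
    m_post, v_post = np.asarray(m_post, float), np.asarray(v_post, float)
    m_in, v_in = np.asarray(m_in, float), np.asarray(v_in, float)
    # Dividing Gaussian densities subtracts their precisions.
    prec = 1.0 / v_post - 1.0 / v_in
    if np.any(prec <= 0):
        raise ValueError("division does not yield a proper Gaussian")
    v_out = 1.0 / prec
    # The precision-weighted means subtract in the same way.
    m_out = v_out * (m_post / v_post - m_in / v_in)
    return m_out, v_out

m_out, v_out = gaussian_divide([1.0], [0.5], [0.0], [1.0])
```

As a check on the rule, multiplying the returned message N(2, 1) by the incoming message N(0, 1) gives back the projected result N(1, 0.5).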
8. The cancer gene prognosis screening method based on the improved Cox model according to claim 4, wherein step S23 is specifically: taking the regression coefficient output by step S22
Figure 617943DEST_PATH_IMAGE058
and the approximate posterior probability
Figure 842382DEST_PATH_IMAGE059
and using the expectation-maximization algorithm to update the prior parameters
Figure 811475DEST_PATH_IMAGE060
automatically; the update expressions are specifically:
Figure 28829DEST_PATH_IMAGE061
Figure 766978DEST_PATH_IMAGE062
Figure 899669DEST_PATH_IMAGE063
wherein
Figure 621637DEST_PATH_IMAGE064
and
Figure 377104DEST_PATH_IMAGE065
are both functions of
Figure 235338DEST_PATH_IMAGE066
and are expressed as follows:
Figure 536001DEST_PATH_IMAGE067
wherein
Figure 479686DEST_PATH_IMAGE068
denotes element-wise vector division and
Figure 38843DEST_PATH_IMAGE069
denotes element-wise vector multiplication.
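A sketch of one expectation-maximization update of the three prior parameters is given below. It uses the textbook update for a prior of the form (1 - rho) * delta + rho * N(mu, sigma2) when each coefficient receives a Gaussian message with mean `r` and variance `tau`. The update expressions in the figures are not reproduced here, so the prior form, the inputs and all names are assumptions.

```python
import numpy as np

def em_update_bg(r, tau, rho, mu, sigma2):
    """One EM update of (rho, mu, sigma2) for the prior (1-rho)*delta + rho*N(mu, sigma2).

    r, tau: means and variances of the Gaussian messages arriving at each coefficient.
    """
    r = np.asarray(r, dtype=float)
    tau = np.broadcast_to(np.asarray(tau, dtype=float), r.shape)
    # Posterior variance and mean of the Gaussian ("active") branch, element-wise.
    nu = 1.0 / (1.0 / tau + 1.0 / sigma2)
    gamma = nu * (r / tau + mu / sigma2)
    # Posterior probability that each coefficient is nonzero.
    log_active = -0.5 * np.log(2 * np.pi * (tau + sigma2)) - 0.5 * (r - mu) ** 2 / (tau + sigma2)
    log_zero = -0.5 * np.log(2 * np.pi * tau) - 0.5 * r ** 2 / tau
    pi = 1.0 / (1.0 + (1.0 - rho) / rho * np.exp(log_zero - log_active))
    # M-step: weighted averages under the posterior support probabilities.
    rho_new = float(pi.mean())
    mu_new = float(np.sum(pi * gamma) / np.sum(pi))
    sigma2_new = float(np.sum(pi * ((gamma - mu_new) ** 2 + nu)) / np.sum(pi))
    return rho_new, mu_new, sigma2_new

rho_new, mu_new, sigma2_new = em_update_bg(
    [0.0, 0.0, 3.0, -3.0], tau=0.1, rho=0.5, mu=0.0, sigma2=1.0
)
```

The element-wise divisions and multiplications in this sketch play the same role as the vector point division and point multiplication operators named in the claim.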
9. The cancer gene prognosis screening method based on the improved Cox model according to any one of claims 4-8, wherein the iteration end condition preset in step S24 is specifically:
Figure 751584DEST_PATH_IMAGE070
whether to end the iteration is determined by checking whether the Crit value starts to rise: if the Crit value starts to rise, the iteration is stopped and the regression coefficient of the final iteration
Figure 488727DEST_PATH_IMAGE071
is output; if the Crit value has not started to rise, the iteration continues; wherein
Figure 654130DEST_PATH_IMAGE072
denotes a norm.
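The stopping rule can be sketched as below. The rule "stop once Crit starts to rise" is taken from the claim, while the definition of Crit is in the figure; the sketch assumes, purely for illustration, that Crit is the Euclidean norm of the change between successive coefficient vectors. Both function names are hypothetical.

```python
import numpy as np

def crit_value(beta_new, beta_old):
    """Assumed criterion: Euclidean norm of the change between successive iterates."""
    diff = np.asarray(beta_new, dtype=float) - np.asarray(beta_old, dtype=float)
    return float(np.linalg.norm(diff))

def crit_started_rising(crit_history):
    """True when the latest Crit value exceeds the previous one."""
    return len(crit_history) >= 2 and crit_history[-1] > crit_history[-2]

# Three successive Crit values: the change shrinks once, then grows again.
history = [
    crit_value([1.0, 0.0], [0.0, 0.0]),
    crit_value([1.2, 0.0], [1.0, 0.0]),
    crit_value([1.7, 0.0], [1.2, 0.0]),
]
stop = crit_started_rising(history)
```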
10. A cancer gene prognosis screening system based on an improved Cox model, comprising a memory and a processor, wherein the memory stores a cancer gene prognosis screening program based on the improved Cox model, and the program, when executed by the processor, implements the following steps:
S1: collecting the expression levels of different genes in the cancer cells of cancer patients, collecting the survival data of the patients, collating the gene expression levels and the patient information into a first matrix, and preprocessing the first matrix to obtain a second matrix;
S2: inputting the survival data obtained in step S1 and the second matrix into a preset Cox regression model, and solving the model to obtain regression coefficients;
S3: evaluating, according to the risk function of the patient, the patient risk associated with the gene corresponding to each regression coefficient, and screening out the prognostic genome corresponding to high patient risk;
S4: using the screened prognostic genome, in combination with biological theory, to provide guidance information for predicting prognosis, recurrence and metastasis.
CN202211631423.4A 2022-12-19 2022-12-19 Cancer gene prognosis screening method and system based on improved Cox model Active CN115620808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211631423.4A CN115620808B (en) 2022-12-19 2022-12-19 Cancer gene prognosis screening method and system based on improved Cox model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211631423.4A CN115620808B (en) 2022-12-19 2022-12-19 Cancer gene prognosis screening method and system based on improved Cox model

Publications (2)

Publication Number Publication Date
CN115620808A true CN115620808A (en) 2023-01-17
CN115620808B CN115620808B (en) 2023-03-31

Family

ID=84879866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211631423.4A Active CN115620808B (en) 2022-12-19 2022-12-19 Cancer gene prognosis screening method and system based on improved Cox model

Country Status (1)

Country Link
CN (1) CN115620808B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110320390A1 (en) * 2009-03-10 2011-12-29 Kuznetsov Vladimir A Method for identification, prediction and prognosis of cancer aggressiveness
US20170024529A1 (en) * 2015-07-26 2017-01-26 Macau University Of Science And Technology Semi-Supervised Learning Framework based on Cox and AFT Models with L1/2 Regularization for Patient's Survival Prediction
CN106407689A (en) * 2016-09-27 2017-02-15 牟合(上海)生物科技有限公司 Stomach cancer prognostic marker screening and classifying method based on gene expression profile
CN112117003A (en) * 2020-09-03 2020-12-22 中国科学院深圳先进技术研究院 Tumor risk grading method, system, terminal and storage medium
WO2022048071A1 (en) * 2020-09-03 2022-03-10 中国科学院深圳先进技术研究院 Tumor risk grading method and system, terminal, and storage medium
CN113409946A (en) * 2021-07-02 2021-09-17 中山大学 System and method for predicting cancer prognosis risk under high-dimensional deletion data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116321620A (en) * 2023-05-11 2023-06-23 杭州行至云起科技有限公司 Intelligent lighting switch control system and method thereof
CN116321620B (en) * 2023-05-11 2023-08-11 杭州行至云起科技有限公司 Intelligent lighting switch control system and method thereof

Also Published As

Publication number Publication date
CN115620808B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
CN109994200B (en) Multi-group cancer data integration analysis method based on similarity fusion
Meeds et al. GPS-ABC: Gaussian process surrogate approximate Bayesian computation
US20040167721A1 (en) Optimal fitting parameter determining method and device, and optimal fitting parameter determining program
Moss et al. Gibbon: General-purpose information-based bayesian optimisation
CN110993113B (en) LncRNA-disease relation prediction method and system based on MF-SDAE
Rischard et al. Unbiased estimation of log normalizing constants with applications to Bayesian cross-validation
Sesia et al. Gene hunting with knockoffs for hidden markov models
CN115620808B (en) Cancer gene prognosis screening method and system based on improved Cox model
CN111223528B (en) Multi-group data clustering method and device
CN116629352A (en) Hundred million-level parameter optimizing platform
Rad et al. GP-RVM: Genetic programing-based symbolic regression using relevance vector machine
CN116401555A (en) Method, system and storage medium for constructing double-cell recognition model
Gu et al. RPnet: a reverse-projection-based neural network for coarse-graining metastable conformational states for protein dynamics
Miao et al. Fisher-Pitman permutation tests based on nonparametric poisson mixtures with application to single cell genomics
Dhulipala et al. Efficient Bayesian inference with latent Hamiltonian neural networks in No-U-Turn Sampling
Du et al. Incorporating grouping information into bayesian decision tree ensembles
Cai et al. Surrogate-assisted operator-repeated evolutionary algorithm for computationally expensive multi-objective problems
Evangelou et al. Estimation and prediction for spatial generalized linear mixed models with parametric links via reparameterized importance sampling
CN115565610A (en) Method and system for establishing recurrence transfer analysis model based on multiple sets of mathematical data
Roy et al. A hidden-state Markov model for cell population deconvolution
CN104462817A (en) Gene selection and cancer classification method based on Monte Carlo and non-negative matrix factorization
McLain et al. Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm
Zhai et al. Two‐sample test with g‐modeling and its applications
Iba et al. GP-RVM: Genetic programing-based symbolic regression using relevance vector machine
Park et al. Stepwise feature selection using generalized logistic loss

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant