CN111261243B - Method for detecting phase change critical point of complex biological system based on relative entropy index - Google Patents

Info

Publication number
CN111261243B
Authority
CN
China
Prior art keywords
sample
subject
relative entropy
representing
normal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010025627.8A
Other languages
Chinese (zh)
Other versions
CN111261243A (en)
Inventor
刘锐
王俊霞
陈培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202010025627.8A priority Critical patent/CN111261243B/en
Publication of CN111261243A publication Critical patent/CN111261243A/en
Application granted granted Critical
Publication of CN111261243B publication Critical patent/CN111261243B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for detecting phase transition critical points of a complex biological system based on a relative entropy index. Using the rich dynamic information provided by high-throughput data and the differing characteristics of the normal state and the pre-disease state, the method learns the distinct features of the networks in the two states in order to determine early warning signals of the pre-disease state or phase transition. To verify its validity, the detection method is applied to two real datasets: lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD).

Description

Method for detecting phase change critical point of complex biological system based on relative entropy index
Technical Field
The invention relates to the technical field of biological system phase transition critical point detection, in particular to a method for detecting complex biological system phase transition critical points based on a relative entropy index (Relative Entropy Score, recorded as RES).
Background
The progression of complex diseases such as diabetes and cancer is generally a nonlinear process with three phases, a normal state, a pre-disease state, and a disease state, where the pre-disease state is a critical state or point prior to the disease state. Traditional biomarkers aim to identify disease states by using observed differential expression information of molecules, but pre-disease states may not be detected since there is typically no significant difference between normal and pre-disease states. Thus, signaling pre-disease states is a challenge, which in effect means disease prediction.
The theoretical derivation of the calculation method is presented below:
different dynamics before and around critical phase transitions:
the dynamics of complex disease progression can be represented by the following nonlinear discrete-time dynamic system:
Z(t)=f(Z(t-1);P), (1)
Here Z(t) = (z_1(t), z_2(t), …, z_n(t)) is an n-dimensional state vector or variable at time t = 1, 2, …, and P = (P_1, …, P_s) is a parameter vector or driving factor representing slowly varying factors, such as genetic factors (SNP, CNV, etc.), epigenetic factors (methylation, acetylation, etc.), or environmental factors. f: R^n × R^s → R^n is a nonlinear function. For such a nonlinear system, the system at
Figure BDA0002362344510000021
will undergo a phase change, or a bifurcation from a stable equilibrium, when the parameter P reaches the threshold P_c (Gilmore, 1993). The supplementary information A1 gives a detailed description.
For system (1) near the equilibrium point, before P reaches P_c the system should remain at a stable equilibrium
Figure BDA0002362344510000022
so all eigenvalues have modulus within (0, 1). The parameter value P_c at which the system state shifts is referred to as the bifurcation parameter value or critical value, and the state prior to this bifurcation is referred to as the pre-disease state. In general, a real system is often disturbed by noise and thus has stochastic dynamics. It has been demonstrated from dynamical and statistical properties that, as the system approaches the pre-disease state from the normal state, a dominant group of variables, the Dynamic Network Biomarkers (DNBs), appears among the observed variables and meets the following three conditions (Chen et al., 2012; Liu et al., 2012, 2013a, 2014b):
the correlation between variables z_i(t) within this group increases;
the correlation between variables z_i(t) of this group and variables z_j(t) of other groups decreases;
the standard deviation of the variables z_i(t) in this group increases.
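As an illustration only, the three conditions above can be checked numerically for a candidate group of variables. The following minimal Python sketch is not part of the claimed method: the function names `pearson`, `sample_std` and `dnb_indicators`, the dictionary layout of the expression data, and the use of mean absolute correlations as summaries are all assumptions made for this example.

```python
import math
from itertools import combinations, product


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # convention chosen here for constant series
    return cov / math.sqrt(var_a * var_b)


def sample_std(a):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean_a = sum(a) / len(a)
    return math.sqrt(sum((x - mean_a) ** 2 for x in a) / (len(a) - 1))


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def dnb_indicators(expr, group):
    """expr maps a variable name to its values across samples at one time point.

    Returns (mean |PCC| inside the group,
             mean |PCC| between the group and all other variables,
             mean standard deviation inside the group).
    """
    inside = [g for g in expr if g in group]
    outside = [g for g in expr if g not in group]
    pcc_in = [abs(pearson(expr[a], expr[b])) for a, b in combinations(inside, 2)]
    pcc_out = [abs(pearson(expr[a], expr[b])) for a, b in product(inside, outside)]
    sd_in = [sample_std(expr[g]) for g in inside]
    return _mean(pcc_in), _mean(pcc_out), _mean(sd_in)
```

Comparing the three returned values between two sampling time points indicates whether the intra-group correlation and standard deviation rise while the correlation with other variables falls.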
Thus, the dynamics of the normal state and the pre-disease state differ significantly. The normal state is a steady state with high resilience that is insensitive to parameter perturbations, and it can therefore be modeled as a stationary Markov process. When the system is in the normal state, the distributions of Z(t) and Z(t-1) do not differ significantly, i.e., the probability distribution remains almost unchanged over time. In contrast, the pre-disease state has low resilience and is sensitive to parameter changes, so its dynamics or probability distribution changes over time. The pre-disease state is therefore modeled as a time-varying Markov process. When the system is in the pre-disease state, the distribution of Z(t) differs significantly from the distribution of Z(t-1). Based on these dynamics, the switching time from the normal state to the pre-disease state can be identified.
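A hedged numerical illustration of this distinction: if the values of one variable at two consecutive time points are each summarized by a fitted Gaussian, the relative entropy (Kullback-Leibler divergence) between the two fits stays near zero for a stationary process and grows when the distribution drifts. The closed form for two univariate Gaussians is standard; fitting one Gaussian per time point and the function names `gaussian_kl` and `drift_between` are assumptions of this sketch, not the formulas of the invention.

```python
import math
from statistics import mean, stdev


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2)) in nats."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def drift_between(prev_values, curr_values):
    """Fit a Gaussian to each time point and return KL(current || previous)."""
    return gaussian_kl(mean(curr_values), stdev(curr_values),
                       mean(prev_values), stdev(prev_values))
```

For example, `drift_between([1, 2, 3], [1, 2, 3])` is zero because the two fits coincide, while shifting the second set of values by a constant makes the result strictly positive.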
Most biomolecules perform their function through interactions with other biomolecules, either within a functional module or between modules. This inter- and intra-module interconnectivity suggests that the effects of a particular genetic abnormality are not confined to the activity of the gene product carrying it, but can spread along the links of a network of biomolecules and alter the activity of other gene products. Thus, understanding the interaction network environment of a biomolecule is critical to determining the phenotypic effects of defects affecting that biomolecule.
Disclosure of Invention
The invention aims to provide a method for detecting the phase transition critical point of a complex biological system based on a relative entropy index (Relative Entropy Score, RES) by exploiting the different characteristics of the normal state and the disease state, so that, in the biological process of a complex disease, the pre-disease state is identified before the critical point is reached. In particular, identifying the pre-disease state corresponds to detecting the switching point at which two networks differ.
To study the evolution of the network system, the invention uses a difference network that integrates the difference edges; that is, it quantifies the statistical importance (the relative entropy index, RES) of each difference edge in the difference network.
The aim of the invention can be achieved by adopting the following technical scheme:
a method for detecting phase transition critical points of a complex biological system based on relative entropy indexes, the method comprising the following steps:
S1, converting a continuous-time observation data sequence O_t = {o_1, o_2, o_3, …, o_t} into a sequence of time-varying networks {DN_2, DN_3, …, DN_{t-1}, DN_t};
A correlation network is built first and mapped onto an existing functional network, namely the STRING network. At each sampling time point, a correlation network sequence {N_1, …, N_t} is constructed for the observation sequence {o_1, o_2, o_3, …, o_t}, where N_t represents the correlation network at time t; each edge connecting two nodes represents a correlation between two biomolecules, while each edge connecting only one node represents the self-regulation or variation of that biomolecule. Subsequently, a parameter α is selected such that the Pearson correlation coefficient PCC satisfies |PCC| ≥ α, where α is a parameter to be determined from the specific real data; edges whose PCC satisfies this condition are retained and edges that do not satisfy it are removed, yielding the correlation network;
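Step S1 can be sketched as follows under stated assumptions. The names `pearson`, `correlation_network` and `difference_edges` are hypothetical; `prior_edges` stands for a set of biomolecule pairs exported from a background network such as STRING (how that export is obtained is outside this sketch); and treating a difference edge as an edge present in exactly one of two consecutive correlation networks is an assumed definition, since the text does not spell it out. Self-loop edges are omitted for brevity.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def correlation_network(expr, alpha, prior_edges=None):
    """Return the set of edges (frozenset pairs) with |PCC| >= alpha.

    expr maps a biomolecule name to its values across samples at one time point.
    If prior_edges is given, only pairs present in that background network
    are considered (the mapping step onto a functional network).
    """
    edges = set()
    for a, b in combinations(sorted(expr), 2):
        edge = frozenset((a, b))
        if prior_edges is not None and edge not in prior_edges:
            continue
        if abs(pearson(expr[a], expr[b])) >= alpha:
            edges.add(edge)
    return edges


def difference_edges(net_prev, net_curr):
    """Edges present in exactly one of two consecutive networks (assumed definition)."""
    return net_prev ^ net_curr
```

Applying `correlation_network` at each time point and `difference_edges` to each consecutive pair yields one difference network per time point from the second onward, matching the indexing DN_2, …, DN_t.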
S2, preparing reference samples: samples taken during the normal period are used as reference samples. For a real dataset, samples from normal tissue are usually chosen as reference samples;
S3, fitting the distribution of each biomolecule according to the reference samples, specifically:
for a biomolecule g_i, a Gaussian distribution is fitted to its expression levels in the reference samples {s_1, s_2, …, s_k}; then, a k-dimensional vector (area(D_gi(s_1)), area(D_gi(s_2)), …, area(D_gi(s_k))) is obtained, where area(D_gi(s_k)) represents the cumulative area of biomolecule g_i in the k-th sample as determined by the Gaussian distribution;
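A minimal sketch of step S3, assuming that "cumulative area" means the value of the fitted Gaussian's cumulative distribution function at each reference sample; that reading, and the function name `cumulative_areas`, are assumptions of this example rather than statements of the patent.

```python
from statistics import NormalDist


def cumulative_areas(reference_levels):
    """Fit a Gaussian to one biomolecule's reference expression levels and
    return the CDF value at each of the k reference samples."""
    # from_samples estimates the mean and the sample standard deviation,
    # so at least two reference values are required.
    dist = NormalDist.from_samples(reference_levels)
    return [dist.cdf(x) for x in reference_levels]
```

The returned list plays the role of the k-dimensional vector described above, one entry per reference sample.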
S4, constructing a reference distribution P according to the following formula:
Figure BDA0002362344510000041
where
Figure BDA0002362344510000045
represents the cumulative area of biomolecule g_i in the k-th sample as determined by the corresponding Gaussian distribution; the distribution P satisfies
Figure BDA0002362344510000042
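The formula images are not reproduced in this text, so the following is only one plausible reading of step S4: the cumulative areas are divided by their sum so that the resulting reference distribution P is a discrete probability distribution whose entries sum to 1. The function name `reference_distribution` and the normalization itself are assumptions.

```python
def reference_distribution(areas):
    """Normalize a vector of cumulative areas into a discrete distribution."""
    total = sum(areas)
    if total <= 0:
        raise ValueError("cumulative areas must have a positive sum")
    return [a / total for a in areas]
```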
S5, calculating the relative entropy index (Relative Entropy Score), denoted RES:
Figure BDA0002362344510000043
where RES_N represents the relative entropy index obtained from the normal samples, and
Figure BDA0002362344510000044
H<u,v>(x_v, x_ul) represents the edge feature between subject v and the l-th normal sample of subject u; H<v,u>(x_u, x_vp) represents the edge feature between subject u and the p-th normal sample of subject v; x_v denotes all normal samples of subject v; x_u denotes all normal samples of subject u; x_ul denotes the l-th normal sample of subject u; and x_vs denotes the s-th normal sample of subject v;
Figure BDA0002362344510000051
where p(x_v1) represents the distribution of the 1st normal sample of subject v, p(x_v2) represents the distribution of the 2nd normal sample of subject v, …, p(x_vm) represents the distribution of the m-th normal sample of subject v, and p(x_ul) represents the distribution of the l-th normal sample of subject u;
Similarly, the following is obtained:
Figure BDA0002362344510000052
where RES_D represents the relative entropy index obtained from the disease samples; H<u,v>(y_v, y_ul) represents the edge feature between subject v and the l-th disease sample of subject u; H<v,u>(y_u, y_vs) represents the edge feature between subject u and the s-th disease sample of subject v; y_v denotes all disease samples of subject v; y_u denotes all disease samples of subject u; y_ul denotes the l-th disease sample of subject u; y_vs denotes the s-th disease sample of subject v; and P denotes a discrete probability distribution satisfying
Figure BDA0002362344510000053
where P(x) is the probability value of the x-th reference sample, u and v denote subjects, l denotes the l-th sample, and s denotes the s-th sample.
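Because the RES formulas are given only as images, the sketch below falls back on the standard discrete relative entropy (Kullback-Leibler divergence) between two probability vectors. Summing both directions to mirror the paired terms H<u,v> and H<v,u> is an assumption, as are the function names `relative_entropy` and `symmetric_edge_score` and the small floor `eps` used to avoid division by zero.

```python
import math


def relative_entropy(p, q, eps=1e-12):
    """Discrete KL divergence sum_x p(x) * log(p(x) / q(x)), in nats."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:  # terms with p(x) = 0 contribute nothing
            total += px * math.log(px / max(qx, eps))
    return total


def symmetric_edge_score(p_u, p_v):
    """Sum of both directions, used here as a stand-in for an undirected edge score."""
    return relative_entropy(p_u, p_v) + relative_entropy(p_v, p_u)
```

A score of zero means the two distributions coincide; larger values indicate a larger difference between the distributions attached to the two ends of an edge.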
Further, the method for detecting the phase transition critical point of the complex biological system requires at least 3 samples.
Further, the relative entropy index (RES) has different characteristics in different states, and the value of the relative entropy index (RES) in a disease state is smaller than that in a normal state.
Further, the parameter α is selected according to a principle that the difference network in the normal state has as few difference edges as possible, so as to highlight the pre-disease state with a certain number of difference edges.
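One hedged way to apply this selection principle is a simple scan over candidate thresholds. In the sketch below, `choose_alpha`, the `max_edges` budget, and the callable `count_normal_difference_edges` (which is expected to return the number of difference edges found among normal-state samples for a given α) are all hypothetical; picking the smallest qualifying α, so that as many edges as possible remain available to reveal the pre-disease state, is a design choice of this example rather than a rule stated in the text.

```python
def choose_alpha(candidates, count_normal_difference_edges, max_edges=0):
    """Return the smallest candidate alpha whose normal-state difference
    network has at most max_edges edges, or None if no candidate qualifies."""
    for alpha in sorted(candidates):
        if count_normal_difference_edges(alpha) <= max_edges:
            return alpha
    return None
```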
Compared with the prior art, the invention has the following advantages and effects:
The invention provides a calculation method based on a relative entropy index (RES) for identifying an upcoming critical transition, and its validity is demonstrated on real datasets. It is noted that the object of the present invention is to detect early warning signals generated from normal conditions (or pre-disease conditions) rather than to find signs of a disease condition (or pre-disease condition) where a qualitative change occurs. The innovations of the invention are as follows:
1. Traditional methods can only judge whether an individual is in a healthy state or a disease state and cannot effectively perceive the critical period near the limit of the healthy state; the invention adopts a calculation method based on a time difference network, which can accurately reflect the pre-disease period in the development of a complex disease or predict the onset of its deterioration;
2. In the prior art, a single variable or a small number of variables is strongly affected by noise and the critical point signal is not obvious; the method of the invention overcomes this limitation;
3. The method adopts unsupervised learning and a forward algorithm to make the processing of high-throughput data practical;
4. The model design of the method converts continuous gene expression data into observation data, which is both a difficulty and a key point.
Drawings
FIG. 1 is a schematic flow chart of a method for detecting phase transition critical points of a complex biological system based on relative entropy indexes;
FIG. 2 (a) shows, for the lung squamous cell carcinoma (LUSC) dataset in the first case, in which stage IA samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 2 (b) shows, for the lung squamous cell carcinoma (LUSC) dataset in the second case, in which stage IA and IB samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 2 (c) shows, for the lung squamous cell carcinoma (LUSC) dataset in the third case, in which stage IA, IB and IIA samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 2 (d) shows, for the lung squamous cell carcinoma (LUSC) dataset in the fourth case, in which stage IA, IB, IIA and IIB samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 2 (e) shows, for the lung squamous cell carcinoma (LUSC) dataset in the fifth case, in which stage IA, IB, IIA, IIB and IIIA samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 2 (f) is a graphical representation of a comparison of gene expression and relative entropy index (RES) results on a lung squamous cell carcinoma (LUSC) dataset;
FIG. 2 (g) is a schematic representation of the dynamic evolution of a network of relative entropy indices (RES) on a lung squamous cell carcinoma (LUSC) dataset;
FIG. 3 (a) shows, for the lung adenocarcinoma (LUAD) dataset in the first case, in which stage I samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (b) shows, for the lung adenocarcinoma (LUAD) dataset in the second case, in which stage I and IA samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (c) shows, for the lung adenocarcinoma (LUAD) dataset in the third case, in which stage I, IA and IB samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (d) shows, for the lung adenocarcinoma (LUAD) dataset in the fourth case, in which stage I, IA, IB and IIA samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (e) shows, for the lung adenocarcinoma (LUAD) dataset in the fifth case, in which stage I, IA, IB, IIA and IIB samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (f) shows, for the lung adenocarcinoma (LUAD) dataset in the sixth case, in which stage I, IA, IB, IIA, IIB and IIIA samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (g) shows, for the lung adenocarcinoma (LUAD) dataset in the seventh case, in which stage I, IA, IB, IIB, IIIA and IIIB samples are used as control samples and the remaining samples as the experimental group, a comparison of the survival analysis results of gene expression and of the relative entropy index (RES); the left side shows the survival analysis result of gene expression on the dataset, and the right side shows the survival analysis result of the relative entropy index (RES) on the dataset;
FIG. 3 (h) is a graphical representation of a comparison of gene expression and relative entropy index (RES) results on a lung adenocarcinoma (LUAD) dataset;
FIG. 3 (i) is a schematic representation of the dynamic evolution of a network of relative entropy indices (RES) on a lung adenocarcinoma (LUAD) dataset.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Examples
As shown in fig. 1, the present invention discloses a method for detecting critical state before phase transition of complex biological system based on relative entropy index.
A data matrix illustration of node features and edge features for calculating the relative entropy index (RES) is given below.
First, for the node features, x and y are used to distinguish normal samples from disease samples. The m-th normal sample of subject u is denoted x_um, and the n-th disease sample of subject u is denoted y_un. Likewise, the m-th normal sample of subject v is denoted x_vm, and the n-th disease sample of subject v is denoted y_vn.
The edge features are as follows.
H<u,v>(x_v, x_ul) represents the edge feature between the normal samples of subject v and the l-th normal sample of subject u;
H<v,u>(x_u, x_vp) represents the edge feature between the normal samples of subject u and the p-th normal sample of subject v;
H<u,v>(y_v, y_ul) represents the edge feature between the disease samples of subject v and the l-th disease sample of subject u;
H<v,u>(y_u, y_vp) represents the edge feature between the disease samples of subject u and the p-th disease sample of subject v;
The relative entropy index RES_N obtained from the normal samples is:
Figure BDA0002362344510000101
wherein ,
Figure BDA0002362344510000102
Figure BDA0002362344510000103
x_v denotes all normal samples of subject v; x_u denotes all normal samples of subject u; x_ul denotes the l-th normal sample of subject u; x_vs denotes the s-th normal sample of subject v; p(x_v1) represents the distribution of the 1st normal sample of subject v, p(x_v2) represents the distribution of the 2nd normal sample of subject v, …, and p(x_vm) represents the distribution of the m-th normal sample of subject v. p(x_ul) represents the distribution of the l-th normal sample of subject u.
The relative entropy index RES_D obtained from the disease samples is obtained in the same way:
Figure BDA0002362344510000104
where H<u,v>(y_v, y_ul) represents the edge feature between subject v and the l-th disease sample of subject u; H<v,u>(y_u, y_vp) represents the edge feature between subject u and the p-th disease sample of subject v; y_v denotes all disease samples of subject v; y_u denotes all disease samples of subject u; y_ul denotes the l-th disease sample of subject u; and y_vs denotes the s-th disease sample of subject v.
The calculation follows the flow diagram shown in FIG. 1.
The results obtained in this example are as follows:
1. predicting critical points of a real dataset
The present example applies the method based on the relative entropy index to two real experimental datasets, namely lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD).
2. Application of relative entropy index in 2 tumor data sets
To further demonstrate the effectiveness of the method, it was applied to 2 tumor datasets, lung squamous cell carcinoma and lung adenocarcinoma, both from The Cancer Genome Atlas (TCGA) and consisting of tumor and tumor-adjacent samples. According to the corresponding TCGA clinical data, the tumors are divided into different stages; lung squamous cell carcinoma and lung adenocarcinoma can each be divided into 7 stages. In both datasets, the relative entropy index (RES) of each pair of genes was calculated according to the RES algorithm. Finally, the critical stage of the tumor was determined by observing the changes in the RES values of each pair of genes.
The relative entropy index (RES) successfully identified the critical stage before deterioration in both cancers. To verify the identified critical stage, Kaplan-Meier (log-rank) survival analysis was performed on samples before and after the critical transition (FIGS. 2 (a)-2 (e), 3 (a)-3 (g)). The survival time of samples before the critical transition is generally longer than that of samples after it. In particular, for lung squamous cell carcinoma, FIG. 2 (c) shows that the survival time of samples before the critical stage (stages IA-IIA) is much longer than that of samples after it (stages IIB-IV), and the survival curves of the two groups differ significantly (p = 0.042). For lung adenocarcinoma, the survival curves of samples before and after stage IIB differ significantly (p = 0.015, FIG. 3 (g)), and the survival time of samples before the critical stage (stages IA-IIB) is much longer than that of samples after it (stages IIIA-IV). These results indicate that the identified critical stage is accurate and closely related to prognosis.
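For readers unfamiliar with the survival analysis used here, the Kaplan-Meier product-limit estimator can be sketched in a few lines. This is a generic textbook estimator, not code from the invention; the function name `kaplan_meier` and the input layout (follow-up times plus a 1/0 flag for observed death versus censoring) are assumptions, and the log-rank test used to compare two groups is not included.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times:  follow-up duration of each sample
    events: 1 if the event (death) was observed, 0 if the sample was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Computing one curve for the samples before the candidate critical stage and one for the samples after it gives the pair of curves that the figures compare.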
In summary, by utilizing the differential correlation information observed between molecules in the normal and pre-disease states, the invention provides a calculation method based on a time difference network that can accurately reflect the pre-disease state or predict the onset of serious disease. This difference network differs from existing methods in that it examines the differential association (or correlation) of genes or proteins rather than their differential expression. The theoretical basis of this work is the quantification of critical states using dynamic network biomarkers.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them; any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included within the protection scope of the present invention.

Claims (4)

1. The method for detecting the phase change critical point of the complex biological system based on the relative entropy index is characterized by comprising the following steps:
S1, converting a continuous-time observation data sequence O_t = {o_1, o_2, o_3, …, o_t} into a sequence of time-varying networks {DN_2, DN_3, …, DN_{t-1}, DN_t};
A correlation network is established first and mapped onto an existing functional network, namely the STRING network; at each sampling time point, a correlation network sequence {N_1, …, N_t} is constructed for the observation sequence {o_1, o_2, o_3, …, o_t}, wherein N_t represents the correlation network at time t, each edge connecting two nodes represents a correlation between two biomolecules, and each edge connecting only one node represents the self-regulation or variation of that biomolecule; subsequently, a parameter α is selected such that the Pearson correlation coefficient PCC satisfies |PCC| ≥ α, wherein α is a parameter to be determined from the specific real data; edges whose PCC satisfies this condition are retained and edges that do not satisfy it are removed, yielding the correlation network;
S2, preparing reference samples: samples taken during the normal period are used as reference samples; for a real dataset, samples from normal tissue are usually chosen as reference samples;
S3, fitting the distribution of each biomolecule according to the reference samples, specifically:
for a biomolecule g_i, a Gaussian distribution is fitted to its expression levels in the reference samples {s_1, s_2, …, s_k}; then, a k-dimensional vector is obtained
Figure FDA0002362344500000011
Figure FDA0002362344500000012
wherein ,
Figure FDA0002362344500000013
represents the cumulative area of biomolecule g_i in the k-th sample as determined by the Gaussian distribution;
S4, constructing a reference distribution P according to the following formula:
Figure FDA0002362344500000021
wherein ,
Figure FDA0002362344500000022
represents the cumulative area of biomolecule g_i in the k-th sample as determined by the corresponding Gaussian distribution; the distribution P satisfies
Figure FDA0002362344500000023
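One way to turn the cumulative areas into a discrete distribution P is sketched below. The formula for P in S4 is an image that is not reproduced here, so this sketch assumes the simplest construction that makes P a valid probability distribution, namely dividing each cumulative area by their total; the function name `reference_distribution` and the input values are hypothetical.

```python
def reference_distribution(areas):
    """Normalize non-negative cumulative areas so that they sum to 1."""
    if any(a < 0 for a in areas):
        raise ValueError("cumulative areas must be non-negative")
    total = sum(areas)
    if total <= 0:
        raise ValueError("at least one cumulative area must be positive")
    return [a / total for a in areas]

# Hypothetical cumulative areas of one biomolecule in three reference samples.
P = reference_distribution([0.1, 0.5, 0.9])
```

Whatever the exact construction, the result has to satisfy the normalization condition that the entries of P sum to 1, which the sketch enforces by design.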
S5, calculating a relative entropy index, denoted RES:
Figure FDA0002362344500000024
wherein RES_N represents the relative entropy index obtained from the normal samples, and
Figure FDA0002362344500000025
H<u,v>(x_v, x_ul) represents the edge feature between subject v and the l-th normal sample of subject u, H<v,u>(x_u, x_vp) represents the edge feature between subject u and the p-th normal sample of subject v, x_v represents all normal samples of subject v, x_u represents all normal samples of subject u, x_ul represents the l-th normal sample of subject u, and x_vs represents the s-th normal sample of subject v;
Figure FDA0002362344500000026
wherein p(x_v1) represents the distribution of the 1st normal sample of subject v, p(x_v2) represents the distribution of the 2nd normal sample of subject v, p(x_vm) represents the distribution of the m-th normal sample of subject v, and p(x_ul) represents the distribution of the l-th normal sample of subject u;
Similarly, one obtains
Figure FDA0002362344500000027
wherein RES_D represents the relative entropy index obtained from the disease samples, H<u,v>(y_v, y_ul) represents the edge feature between subject v and the l-th disease sample of subject u, H<v,u>(y_u, y_vs) represents the edge feature between subject u and the s-th disease sample of subject v, y_v represents all disease samples of subject v, y_u represents all disease samples of subject u, y_ul represents the l-th disease sample of subject u, y_vs represents the s-th disease sample of subject v, and P represents a discrete probability distribution that satisfies
Figure FDA0002362344500000031
where P(x) is the probability value for the x-th reference sample, u and v denote subjects, l denotes the l-th sample, and s denotes the s-th sample.
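The quantity underlying S5 is relative entropy between discrete distributions. The exact RES, RES_N and RES_D formulas are images that are not reproduced in this text, so the sketch below shows only the standard definition of relative entropy (Kullback-Leibler divergence), D(P||Q) = Σ P(x) log(P(x)/Q(x)), and should not be read as the patent's index. The function name `relative_entropy`, the `eps` floor, and the example distributions are assumptions.

```python
import math

def relative_entropy(p, q, eps=1e-12):
    """Standard relative entropy D(p || q) of two discrete distributions.

    Terms with p[i] == 0 contribute nothing; q[i] is floored at eps to
    avoid division by zero (a numerical convenience, not part of the
    mathematical definition).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            total += pi * math.log(pi / max(qi, eps))
    return total

# Hypothetical reference distribution and a perturbed one.
reference = [0.5, 0.5]
perturbed = [0.9, 0.1]
same = relative_entropy(reference, reference)
shifted = relative_entropy(reference, perturbed)
```

Relative entropy is zero when the two distributions coincide and positive otherwise, which is the property that lets an index built on it register a shift away from the reference distribution.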
2. The method for detecting phase transition critical points of a complex biological system based on relative entropy indexes of claim 1, wherein the method for detecting phase transition critical points of a complex biological system requires at least 3 samples.
3. The method for detecting phase transition critical points of complex biological systems based on relative entropy indexes according to claim 2, wherein the relative entropy indexes have different characteristics in different states, and the relative entropy indexes in a disease state have smaller values than those in a normal state.
4. The method for detecting phase transition critical points of complex biological systems based on relative entropy indexes according to claim 1, wherein the parameter α is selected in such a way that the difference network in the normal state has as few difference edges as possible, so as to highlight the pre-disease state with a certain number of difference edges.
CN202010025627.8A 2020-01-10 2020-01-10 Method for detecting phase change critical point of complex biological system based on relative entropy index Active CN111261243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010025627.8A CN111261243B (en) 2020-01-10 2020-01-10 Method for detecting phase change critical point of complex biological system based on relative entropy index

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010025627.8A CN111261243B (en) 2020-01-10 2020-01-10 Method for detecting phase change critical point of complex biological system based on relative entropy index

Publications (2)

Publication Number Publication Date
CN111261243A CN111261243A (en) 2020-06-09
CN111261243B true CN111261243B (en) 2023-04-21

Family

ID=70945044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010025627.8A Active CN111261243B (en) 2020-01-10 2020-01-10 Method for detecting phase change critical point of complex biological system based on relative entropy index

Country Status (1)

Country Link
CN (1) CN111261243B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113889180B (en) * 2021-09-30 2024-05-24 山东大学 Biomarker identification method and system based on dynamic network entropy
CN115083524A (en) * 2022-06-06 2022-09-20 华南理工大学 Method for detecting phase change critical point of complex biological system based on single cell diagram entropy

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126893A (en) * 2016-06-17 2016-11-16 浙江大学 A kind of based on gene function related network discovery chronic disease mechanism and the method for preventive intervention procedure strategy thereof
CN106709278A (en) * 2017-01-10 2017-05-24 河南省医药科学研究院 Method for carrying out screening and functional analysis on driver genes of NSCLC (Non-Small Cell Lung Cancer)
CN107748901A (en) * 2017-11-24 2018-03-02 东北大学 The industrial process method for diagnosing faults returned based on similitude local spline
CN110379459A (en) * 2019-08-13 2019-10-25 杭州新范式生物医药科技有限公司 A kind of method and system being associated with discovery molecular marker with gene function based on transcript profile dynamic change of temporal series
CN110519637A (en) * 2019-08-27 2019-11-29 西北工业大学 The method for monitoring abnormality combined based on audio frequency and video monitoring

Also Published As

Publication number Publication date
CN111261243A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
Tripathi et al. DeepLNC, a long non-coding RNA prediction tool using deep neural network
Hanczar et al. Small-sample precision of ROC-related estimates
Simon et al. Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data
Lebre et al. Statistical inference of the time-varying structure of gene-regulation networks
CN111261243B (en) Method for detecting phase change critical point of complex biological system based on relative entropy index
CN109063416B (en) Gene expression prediction method based on LSTM recurrent neural network
RU2517286C2 (en) Classification of samples data
JP2018532214A (en) Integrated method and system for identifying functional patient-specific somatic abnormalities using multi-omic cancer profiles
Noviello et al. Deep learning predicts short non-coding RNA functions from only raw sequence data
CN116741397B (en) Cancer typing method, system and storage medium based on multi-group data fusion
WO2024027032A1 (en) Method and system for evaluating tumor formation risk and tumor tissue source
Zerzucha et al. Dissimilarity partial least squares applied to non-linear modeling problems
Liu et al. Identifying critical state of complex diseases by single-sample-based hidden markov model
Melnyk et al. GraphKKE: graph Kernel Koopman embedding for human microbiome analysis
CN111009292B (en) Method for detecting phase transition critical point of complex biological system based on single sample sKLD index
WO2023196928A2 (en) True variant identification via multianalyte and multisample correlation
Lovino et al. Multi-omics classification on kidney samples exploiting uncertainty-aware models
Jardillier et al. Benchmark of lasso-like penalties in the Cox model for TCGA datasets reveal improved performance with pre-filtering and wide differences between cancers
Qiu et al. Unsupervised learning framework with multidimensional scaling in predicting epithelial-mesenchymal transitions
Prathik et al. Prediction of carcinoma cancer type using deep reinforcement learning technique from gene expression data
KR102376212B1 (en) Gene expression marker screening method using neural network based on gene selection algorithm
Ginanjar et al. The best architecture selection with deep neural network (DNN) method for breast cancer classification using MicroRNA data
Zhao et al. SEBGLMA: Semantic Embedded Bipartite Graph Network for Predicting lncRNA‐miRNA Associations
KR102659915B1 (en) Method of gene selection for predicting medical information of patients and uses thereof
Spirko Variable selection and supervised dimension reduction for large-scale genomic data with censored survival outcomes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant