CN117854723A - Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma - Google Patents
- Publication number
- CN117854723A (application CN202311747476.7A)
- Authority
- CN
- China
- Prior art keywords
- risk stratification
- gene
- cell lymphoma
- diffuse large
- risk
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a method, device, equipment and medium for risk stratification of diffuse large B-cell lymphoma. The method comprises the following steps: determining the influence values of a genetic variation feature set through a preset diffuse large B-cell lymphoma prognosis model; screening a high-influence gene feature set according to the influence values; quantifying the high-influence gene feature set of a patient through the influence values to determine a risk stratification index; and determining a prognostic risk level for the patient based on the risk stratification index and a preset threshold. Because the high-influence gene feature set is screened out using the preset prognosis model, and the risk stratification index is determined from the variation status of the patient's high-influence gene feature set, the invention, compared with the traditional International Prognostic Index, which stratifies risk based on patient phenotype information, realizes risk stratification based on the genetic information of diffuse large B-cell lymphoma patients, refines the means of risk stratification and improves its accuracy.
Description
Technical Field
The invention relates to the technical field of cancer risk stratification, and in particular to a method, device, equipment and medium for risk stratification of diffuse large B-cell lymphoma.
Background
Diffuse large B-cell lymphoma (DLBCL) is one of the most common subtypes of non-Hodgkin lymphoma (NHL), accounting for about 30%-50% of all NHL cases. However, DLBCL is highly heterogeneous: patients differ greatly in clinical characteristics, treatment response and survival risk, which poses great challenges for clinical diagnosis and treatment.
To improve treatment outcomes, doctors can formulate a personalized treatment plan according to the patient's risk stratification and molecular prognostic indicators during clinical diagnosis and treatment. The most widely used clinical prognostic risk assessment system at present is the International Prognostic Index (IPI), which classifies patients into 4 risk classes based on 5 factors: age, clinical stage, lactate dehydrogenase (LDH), number of extranodal sites involved, and performance status. Over more than 20 years of development, the prognostic evaluation system for DLBCL has been improved and refined; representative newer methods include the age-adjusted IPI (aa-IPI), the revised IPI (R-IPI), which was improved with regard to treatment response, and the NCCN-IPI, which refines the age and LDH indicators. However, these systems generally divide patients into only 3-4 risk levels and do not adequately account for the clinical heterogeneity of patients. More importantly, they rely on only a few items of phenotypic information. With the development of molecular pathology, molecular prognostic indicators of DLBCL are gaining increasing attention. Studies to date indicate that molecules with independent prognostic significance include the nuclear antigen Ki-67, the transmembrane glycoprotein CD5, the proto-oncogene C-MYC, the tumor suppressor gene P53 and the apoptosis inhibitor gene BCL-2. However, these single-molecule prognostic indicators still explain the heterogeneity of DLBCL only to a quite limited extent.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the invention is to provide a method, device, equipment and medium for risk stratification of diffuse large B-cell lymphoma, so as to solve the technical problem that the traditional International Prognostic Index, which stratifies risk based on patient phenotype information, explains the heterogeneity of DLBCL only to a limited extent.
To achieve the above object, the present invention provides a diffuse large B-cell lymphoma risk stratification method comprising the steps of:
determining the influence value of a genetic variation feature set through a preset diffuse large B-cell lymphoma prognosis model;
screening the high-influence gene feature set of the gene variation feature set according to the influence value;
quantifying the high-influence gene feature set of the patient through the influence value, and determining a risk stratification index;
and determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold.
Optionally, the determining the influence value of the genetic variation feature set through a preset diffuse large B-cell lymphoma prognosis model includes:
determining SHAP values of the genetic variation feature set through the preset diffuse large B-cell lymphoma prognosis model;
determining a variant case set and a number of variant cases based on a clinical case set of diffuse large B-cell lymphoma patients;
and calculating the influence value of the genetic variation feature set from the SHAP values, the variant case set and the number of variant cases through a preset feature influence formula.
Optionally, the preset feature influence formula is:
$$\mathrm{Impact}_j = \frac{1}{N_j} \sum_{i \in C_j} s_{ij}$$
wherein $\mathrm{Impact}_j$ denotes the influence value of the j-th genetic variation feature, $N_j$ denotes the number of variant cases whose j-th feature value is 1, $C_j$ denotes the set of variant cases whose j-th feature value is 1, and $s_{ij}$ denotes the SHAP value of the j-th feature of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
Optionally, said quantifying said high-impact gene feature set of the patient by said impact value, determining a risk stratification index, comprises:
obtaining a genetic variation characteristic value of a clinical case set of the diffuse large B cell lymphoma patient;
and calculating the influence value, the high influence gene characteristic set and the gene variation characteristic value through a preset risk stratification index formula to determine the risk stratification index of the patient.
Optionally, the preset risk stratification index formula is:
$$\text{index-rs}_i = \sum_{j \in F_{top}} \mathrm{Impact}_j \cdot f_{ij}$$
wherein $\text{index-rs}_i$ denotes the risk stratification index of the i-th case, $F_{top}$ denotes the high-influence gene feature set, $\mathrm{Impact}_j$ denotes the influence value of the j-th genetic variation feature, and $f_{ij}$ denotes the j-th feature value (0 or 1) of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
Optionally, the determining a prognostic risk level of the patient based on the risk stratification index and a preset threshold value comprises:
determining two preset thresholds of risk stratification;
determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, the prognostic risk level being:
wherein $level_i$ is the prognostic risk level of the i-th case, and t1 and t2 are the two preset thresholds for risk stratification.
Optionally, the high-impact gene feature set further comprises a high-risk gene variation set and a low-risk gene variation set.
In addition, in order to achieve the above object, the present invention also proposes a diffuse large B-cell lymphoma risk stratification device comprising:
the influence module is used for determining the influence value of the genetic variation feature set through a preset diffuse large B-cell lymphoma prognosis model;
the gene screening module is used for screening the high-influence gene feature set of the gene variation feature set according to the influence value;
the risk index module is used for quantifying the high-influence gene feature set of the patient through the influence value and determining a risk stratification index;
and the risk stratification module is used for determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold value.
In addition, to achieve the above object, the present invention also proposes diffuse large B-cell lymphoma risk stratification equipment comprising: a memory, a processor, and a diffuse large B-cell lymphoma risk stratification program stored on the memory and executable on the processor, the diffuse large B-cell lymphoma risk stratification program being configured to implement the steps of the diffuse large B-cell lymphoma risk stratification method as described above.
Furthermore, to achieve the above object, the present invention also proposes a medium having stored thereon a diffuse large B-cell lymphoma risk stratification program which, when executed by a processor, implements the steps of the diffuse large B-cell lymphoma risk stratification method as described above.
In the present invention, the influence value of a genetic variation feature set is first determined through a preset diffuse large B-cell lymphoma prognosis model; the high-influence gene feature set of the genetic variation feature set is then screened according to the influence value; the high-influence gene feature set of the patient is quantified through the influence value to determine a risk stratification index; and finally, a prognostic risk level of the patient is determined based on the risk stratification index and a preset threshold. Because the invention screens out the high-influence gene feature set using the preset diffuse large B-cell lymphoma prognosis model, and then determines the risk stratification index from the variation status of the patient's high-influence gene feature set, the invention, compared with the traditional International Prognostic Index (IPI), which stratifies risk based on patient phenotype information, realizes risk stratification based on the genetic information of diffuse large B-cell lymphoma patients, refines the means of risk stratification and improves its accuracy.
Drawings
FIG. 1 is a schematic diagram of the architecture of a diffuse large B-cell lymphoma risk stratification device of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of a diffuse large B-cell lymphoma risk stratification method according to the present invention;
FIG. 3 is a schematic flow chart of a second embodiment of a diffuse large B-cell lymphoma risk stratification method according to the present invention;
FIG. 4 is a block diagram of a first embodiment of a diffuse large B-cell lymphoma risk stratification device according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a diffuse large B-cell lymphoma risk stratification device of a hardware operating environment according to an embodiment of the present invention.
As shown in FIG. 1, the diffuse large B-cell lymphoma risk stratification device may comprise: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the structure shown in FIG. 1 does not constitute a limitation of the diffuse large B-cell lymphoma risk stratification device; the device may comprise more or fewer components than shown, may combine certain components, or may arrange the components differently.
As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module and a diffuse large B-cell lymphoma risk stratification program.
In the diffuse large B-cell lymphoma risk stratification device shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 are disposed in the diffuse large B-cell lymphoma risk stratification device, which invokes, through the processor 1001, the diffuse large B-cell lymphoma risk stratification program stored in the memory 1005 and executes the diffuse large B-cell lymphoma risk stratification method provided by the embodiments of the present invention.
The embodiment of the invention provides a diffuse large B cell lymphoma risk stratification method, and referring to fig. 2, fig. 2 is a flow chart of a first embodiment of the diffuse large B cell lymphoma risk stratification method.
In this embodiment, the diffuse large B-cell lymphoma risk stratification method comprises the following steps:
Step S10: determining the influence value of the genetic variation feature set through a preset diffuse large B-cell lymphoma prognosis model.
It should be noted that the execution subject of the method of this embodiment may be a computing service device with gene screening, numerical quantification and risk stratification functions, such as a personal computer or a server, or may be another electronic device capable of implementing the same or similar functions, such as the diffuse large B-cell lymphoma risk stratification device described above, which is not limited in this embodiment. This embodiment and each of the following embodiments are described taking the above diffuse large B-cell lymphoma risk stratification device (abbreviated as the risk stratification device) as an example.
It can be understood that the preset diffuse large B-cell lymphoma prognosis model is a model constructed and trained based on a visible neural network (VNN) and a fully connected neural network, using genetic variation information and basic clinical information of DLBCL patients. In the visible structure, the input neurons are mapped to genetic variation features and the hidden-layer neurons are mapped to biological pathways; basing the model on a VNN improves interpretability and thereby assists the mining of genetic drivers.
It can be understood that the genetic variation feature set is a pre-selected or pre-defined set of genetic variation features associated with the prognosis of diffuse large B-cell lymphoma patients. These features may cover different types of genetic variation, such as single nucleotide variation (SNV), insertion/deletion (indel), chromosomal structural variation, gene rearrangement and copy number variation (CNV).
It can be understood that SHAP, a post hoc model interpretation method, is used to calculate the SHAP values of the genetic variation feature set; from the SHAP values, the influence value of each genetic variation in the feature set on the patient's prognostic risk is then calculated. The influence value reflects the importance of the genetic variation feature, and the high-influence genes in the genetic variation feature set of diffuse large B-cell lymphoma can be screened out based on the absolute value of the influence value.
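As an illustration of the visible structure described above, the following minimal sketch shows a single gene-to-pathway layer in which a 0/1 mask removes every connection between a gene and a pathway it does not belong to. The mask, the weights, the tanh activation and the names `masked_layer`, `mask` and `weights` are all hypothetical; the text does not specify the actual pathway annotation or the layer design of the model.

```python
import math
from typing import List


def masked_layer(x: List[int], weights: List[List[float]],
                 mask: List[List[int]], bias: List[float]) -> List[float]:
    """Forward pass of one gene-to-pathway layer.

    mask[p][g] is 1 only if gene g is annotated to pathway p, so each hidden
    neuron sees only the genes of its own pathway (hypothetical annotation).
    """
    outputs = []
    for p in range(len(weights)):
        z = bias[p] + sum(weights[p][g] * mask[p][g] * x[g] for g in range(len(x)))
        outputs.append(math.tanh(z))  # activation choice is an assumption
    return outputs


# Three genes, two pathways: genes 0 and 1 belong to pathway 0, gene 2 to pathway 1.
mask = [[1, 1, 0], [0, 0, 1]]
weights = [[0.5, -0.3, 0.9], [0.7, 0.2, 0.4]]
bias = [0.0, 0.0]

hidden = masked_layer([1, 0, 1], weights, mask, bias)
hidden_without_gene2 = masked_layer([1, 0, 0], weights, mask, bias)
```

Because gene 2 is masked out of pathway 0, toggling it changes only the second hidden neuron, which is what makes each hidden activation attributable to a specific pathway.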
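The feature values used later in the method are binary (1 if a gene is mutated, 0 otherwise). A minimal sketch of that encoding is given below, assuming each case is available as a list of mutated gene names regardless of variant type; the gene names, case identifiers and the helper `encode_variants` are placeholders introduced for illustration only.

```python
from typing import Dict, Iterable, List


def encode_variants(cases: Dict[str, Iterable[str]],
                    genes: List[str]) -> Dict[str, List[int]]:
    """Return one 0/1 vector per case: 1 if the gene carries any variant, else 0."""
    encoded = {}
    for case_id, mutated in cases.items():
        mutated_set = set(mutated)
        encoded[case_id] = [1 if gene in mutated_set else 0 for gene in genes]
    return encoded


# Hypothetical gene panel and cases.
genes = ["GENE_A", "GENE_B", "GENE_C"]
cases = {
    "case_1": ["GENE_A", "GENE_C"],
    "case_2": [],
    "case_3": ["GENE_B"],
}
features = encode_variants(cases, genes)
```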
Step S20: screening the high-influence gene feature set of the genetic variation feature set according to the influence value.
Further, the high-influence gene feature set in this embodiment comprises a high-risk gene variation set and a low-risk gene variation set. For ease of understanding, suppose that 30 high-influence genes are selected according to the influence values shown in Table 1, of which 17 are high-risk variant genes that increase patient risk (influence value greater than 0) and 13 are low-risk variant genes that decrease patient risk (influence value less than 0).
Table 1: High-influence gene features and their corresponding influence values.
Step S30: quantifying the high-influence gene feature set of the patient through the influence value, and determining a risk stratification index.
It should be noted that, the risk stratification index is a calculated index for risk stratification of patients with diffuse large B-cell lymphoma, and since the risk differences of some patients are difficult to distinguish by phenotypic characteristics, the risk levels of different patients within the same IPI rating can be further refined by the risk stratification index.
Step S40: determining a prognostic risk level of the patient based on the risk stratification index and a preset threshold.
It should be noted that the preset threshold is a preset threshold for patient stratification, for example, -0.015, 0.025, etc., and may be set according to the actual stratification condition, which is not limited in this embodiment.
In a specific implementation, the risk stratification device determines the influence value of the genetic variation feature set through the preset diffuse large B-cell lymphoma prognosis model, and screens out the high-influence gene feature set from the genetic variation feature set of diffuse large B-cell lymphoma based on the absolute value of the influence value. The high-influence gene feature set of the patient is then quantified through the influence value to determine a risk stratification index. Finally, a prognostic risk level of the patient is determined based on the risk stratification index and a preset threshold, which further refines the risk levels of different patients within the same IPI grade.
In this embodiment, the risk stratification device determines the influence value of the genetic variation feature set through the preset diffuse large B-cell lymphoma prognosis model, screens out the high-influence gene feature set based on the absolute value of the influence value, quantifies the high-influence gene feature set of the patient through the influence value to determine a risk stratification index, and finally determines a prognostic risk level of the patient based on the risk stratification index and a preset threshold. Because the high-influence gene feature set is screened out using the preset prognosis model, and the risk stratification index is determined from the variation status of the patient's high-influence gene feature set, the invention, compared with the traditional International Prognostic Index (IPI), which stratifies risk based on patient phenotype information, realizes risk stratification based on the genetic information of diffuse large B-cell lymphoma patients, refines the means of risk stratification and improves its accuracy.
Referring to fig. 3, fig. 3 is a flow chart illustrating a second embodiment of the diffuse large B-cell lymphoma risk stratification method of the present invention.
Based on the first embodiment, in this embodiment, the step S10 includes:
Step S11: determining the SHAP values of the genetic variation feature set through the preset diffuse large B-cell lymphoma prognosis model.
The SHAP value is a value that explains the prediction result of the preset diffuse large B-cell lymphoma prognosis model. Thanks to the interpretability of SHAP, the degree and direction of the influence of a given gene in the genetic variation feature set on patient prognosis can be clearly determined, which improves the efficiency of mining genetic drivers. Using the Python package shap (v0.41.0), importance scores can be calculated for each input gene feature and for the gene pathways that act as hidden-layer neurons; these scores reflect the impact of the feature on patient prognosis. Thus, the key genes that contribute most to the model prediction can be determined from the SHAP values, and the risk trend of each gene can be distinguished by the sign of its SHAP value.
Step S12: determining a variant case set and a number of variant cases based on the clinical case set of diffuse large B-cell lymphoma patients.
Step S13: calculating the influence value of the genetic variation feature set from the SHAP values, the variant case set and the number of variant cases through a preset feature influence formula.
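To make the sign and magnitude of a SHAP value concrete, the sketch below computes exact Shapley values for a tiny hand-written risk function by enumerating feature subsets. This is not the shap package and not the patent's model: `toy_risk` is a hypothetical stand-in for the prognosis model, and features outside a coalition are replaced by a single all-zero baseline, whereas the shap package typically averages over a background data set.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def exact_shapley(model: Callable[[Sequence[int]], float],
                  x: Sequence[int], baseline: Sequence[int]) -> List[float]:
    """Exact Shapley value of every feature of x relative to a baseline input."""
    n = len(x)
    values = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        phi = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_j = [x[k] if (k in subset or k == j) else baseline[k] for k in range(n)]
                without_j = [x[k] if k in subset else baseline[k] for k in range(n)]
                phi += weight * (model(with_j) - model(without_j))
        values.append(phi)
    return values


def toy_risk(f: Sequence[int]) -> float:
    # Hypothetical risk score: gene 0 raises risk, gene 1 lowers it,
    # and genes 0 and 2 interact.
    return 0.3 * f[0] - 0.2 * f[1] + 0.1 * f[0] * f[2]


phi = exact_shapley(toy_risk, x=[1, 1, 1], baseline=[0, 0, 0])
```

In this toy case the value for gene 0 is positive (it pushes the prediction toward higher risk), the value for gene 1 is negative, and the three values sum to the difference between the prediction for the case and the prediction for the baseline.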
It should be noted that the clinical case set is clinical treatment information of patients with diffuse large B-cell lymphoma. The data set of 928 cases disclosed in the DLBCL-related study by Stuart et al. may be used, which includes the mutation profiles (mutated or not mutated) of 117 genes; other similar data may also be used, which is not limited in this embodiment. The preset feature influence formula is the formula for calculating the influence value of the genetic variation feature set.
In practical implementation, the preset diffuse large B-cell lymphoma prognosis model can be constructed using the genetic variation features of DLBCL patients and the clinical case set. Using SHAP, a post hoc model interpretation method, the SHAP value of each input feature (genetic variation feature) is calculated with the Python package shap (v0.41.0); the influence of each genetic variation on the patient's prognostic risk can then be calculated, and 30 genes with high influence on the prognostic risk of DLBCL patients can be screened accordingly.
The most common way of selecting important features from SHAP values is to sum the absolute SHAP values over all samples, but this can overlook low-frequency genetic variations because few cases carry them. To better identify the genetic features that significantly affect a patient's prognostic risk, including genes with a lower variation frequency, the influence of a gene can instead be quantified over the subset of cases in which that particular genetic variation is present.
Further, the preset feature influence formula in this embodiment is:
$$\mathrm{Impact}_j = \frac{1}{N_j} \sum_{i \in C_j} s_{ij}$$
wherein $\mathrm{Impact}_j$ denotes the influence value of the j-th genetic variation feature, $N_j$ denotes the number of variant cases whose j-th feature value is 1, $C_j$ denotes the set of variant cases whose j-th feature value is 1, and $s_{ij}$ denotes the SHAP value of the j-th feature of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
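A minimal sketch of the influence calculation follows, assuming the influence value of a feature is the mean of its signed SHAP values over the cases that carry the variant, which is how the symbol definitions above are read here. For contrast it also computes the global mean absolute SHAP value. The SHAP values and features are made up, and returning 0.0 for a feature with no variant case is a choice made for this sketch only.

```python
from typing import List


def impact_values(shap_values: List[List[float]],
                  features: List[List[int]]) -> List[float]:
    """Mean signed SHAP value of each feature over the cases where it equals 1."""
    impacts = []
    for j in range(len(features[0])):
        variant_shap = [shap_values[i][j]
                        for i in range(len(features)) if features[i][j] == 1]
        # No variant case: the mean is undefined; 0.0 is an assumption of this sketch.
        impacts.append(sum(variant_shap) / len(variant_shap) if variant_shap else 0.0)
    return impacts


def mean_abs_shap(shap_values: List[List[float]]) -> List[float]:
    """Conventional global importance: mean |SHAP| over all cases."""
    n_cases = len(shap_values)
    return [sum(abs(row[j]) for row in shap_values) / n_cases
            for j in range(len(shap_values[0]))]


# Feature 0 is mutated in 4 of 6 cases, feature 1 in only 1 of 6 (hypothetical data).
features = [[1, 0], [1, 0], [1, 0], [1, 0], [0, 1], [0, 0]]
shap_values = [[0.04, -0.002], [0.04, -0.002], [0.04, -0.002],
               [0.04, -0.002], [-0.01, 0.09], [-0.01, -0.002]]

impacts = impact_values(shap_values, features)
global_importance = mean_abs_shap(shap_values)
```

In this example the rare variant (feature 1) ranks below the common one under the global measure but above it under the variant-case mean, which is the effect the text attributes to restricting the calculation to variant cases.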
In a specific implementation, the risk stratification device determines the SHAP values of the genetic variation feature set through the preset diffuse large B-cell lymphoma prognosis model, and then determines the variant case set and the number of variant cases based on the clinical case set of diffuse large B-cell lymphoma patients. Finally, the influence value of the genetic variation feature set is calculated from the SHAP values, the variant case set and the number of variant cases through the preset feature influence formula. In this way, the influence value of each genetic variation on the patient's prognostic risk can be calculated using the preset prognosis model and the SHAP values, and 30 high-influence genes for the prognostic risk of DLBCL patients can be screened accordingly.
Further, in the present embodiment, step S30 includes: obtaining a genetic variation characteristic value of a clinical case set of the diffuse large B cell lymphoma patient; and calculating the influence value, the high influence gene characteristic set and the gene variation characteristic value through a preset risk stratification index formula to determine the risk stratification index of the patient.
It should be noted that the preset risk stratification index formula is a formula for calculating the prognostic risk stratification index of patients suffering from diffuse large B-cell lymphoma.
In practical implementation, the existence or non-existence of key gene variation in the gene variation characteristic set and the influence value of the key gene variation on the prognosis risk of the patient can be used as the basis for calculating the risk stratification index of the diffuse large B cell lymphoma patient.
Further, in this embodiment, the preset risk stratification index formula is:
$$\text{index-rs}_i = \sum_{j \in F_{top}} \mathrm{Impact}_j \cdot f_{ij}$$
wherein $\text{index-rs}_i$ denotes the risk stratification index of the i-th case, $F_{top}$ denotes the high-influence gene feature set, $\mathrm{Impact}_j$ denotes the influence value of the j-th genetic variation feature, and $f_{ij}$ denotes the j-th feature value (0 or 1) of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
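The index calculation can be sketched as below, assuming the index is the sum of the influence values of those high-influence genes that are mutated in the case, which is how the symbol definitions above are read here. The gene names and influence values are invented, and `risk_stratification_index` is a hypothetical helper.

```python
from typing import Dict


def risk_stratification_index(case_features: Dict[str, int],
                              high_impact: Dict[str, float]) -> float:
    """Sum the influence values of the high-influence genes mutated in this case."""
    return sum(impact * case_features.get(gene, 0)
               for gene, impact in high_impact.items())


# Hypothetical high-influence set and one patient's 0/1 feature values.
high_impact = {"GENE_A": 0.031, "GENE_B": -0.027, "GENE_E": 0.019}
patient = {"GENE_A": 1, "GENE_B": 1, "GENE_C": 1, "GENE_E": 0}

index_rs = risk_stratification_index(patient, high_impact)
```

Here GENE_C is mutated but does not contribute because it is not in the high-influence set, and the high-risk and low-risk variants of GENE_A and GENE_B partly cancel.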
After determining the risk stratification index of patients with diffuse large B-cell lymphomas according to the preset risk stratification index formula described above, the risk stratification of different patients within the same IPI rating may be further refined to distinguish risk differences between patients that are difficult to distinguish by phenotypic characteristics.
Further, in the present embodiment, step S40 includes: determining two preset thresholds of risk stratification; determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, the prognostic risk level being:
wherein $level_i$ is the prognostic risk level of the i-th case, and t1 and t2 are the two preset thresholds for risk stratification.
It should be noted that t1 and t2 are the two thresholds for risk stratification, and their values may be determined based on the risk stratification effect for the patients; for example, t1 and t2 may be -0.015 and 0.025, respectively.
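A minimal sketch of the thresholding step is shown below, using the example thresholds t1 = -0.015 and t2 = 0.025. The three level names and the handling of values exactly equal to a threshold are assumptions of this sketch; the level formula itself is not reproduced in the text above.

```python
def prognostic_risk_level(index_rs: float, t1: float = -0.015, t2: float = 0.025) -> str:
    """Map a risk stratification index to one of three levels (names assumed)."""
    if t1 >= t2:
        raise ValueError("t1 must be smaller than t2")
    if index_rs < t1:
        return "low"
    if index_rs <= t2:  # boundary handling is an assumption
        return "intermediate"
    return "high"


levels = [prognostic_risk_level(value) for value in (-0.020, 0.004, 0.030)]
```

A higher index maps to a higher level because positive influence values mark variants that increase patient risk.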
In a specific implementation, the risk stratification device may obtain the gene variation characteristic values of the clinical case set of diffuse large B-cell lymphoma patients. The presence or absence of each key gene variation in the gene variation characteristic set, together with the influence value of that variation on the patient's prognostic risk, then serves as the basis for calculating the patient's risk stratification index: the preset risk stratification index formula is applied to the influence values, the high-influence gene feature set, and the gene variation characteristic values. In this way, the risk stratification index of a DLBCL patient is calculated from the high-influence gene feature set, which realizes risk stratification based on genetic information and supplements phenotype-based IPI risk stratification.
The risk stratification device of this embodiment can determine the SHAP values of the gene variation characteristic set through the preset diffuse large B-cell lymphoma prognosis model. It then determines the variant case set and the number of variant cases based on the clinical case set of diffuse large B-cell lymphoma patients. Finally, it applies the preset characteristic influence formula to the SHAP values, the variant case set, and the number of variant cases to determine the influence values of the gene variation characteristic set, and screens out 30 high-influence genes for the prognostic risk of DLBCL patients according to the influence values. The risk stratification index is then calculated from these high-influence genes as described above, realizing risk stratification based on genetic information and supplementing phenotype-based IPI risk stratification.
In addition, an embodiment of the present invention further provides a medium. The medium is a storage medium storing a diffuse large B-cell lymphoma risk stratification program which, when executed by a processor, implements the steps of the diffuse large B-cell lymphoma risk stratification method described above.
Referring to Fig. 4, Fig. 4 is a block diagram of the structure of the first embodiment of the diffuse large B-cell lymphoma risk stratification device of the present invention.
As shown in Fig. 4, the diffuse large B-cell lymphoma risk stratification device according to the embodiment of the present invention comprises:
an influence module 401, configured to determine the influence value of the gene variation characteristic set through a preset diffuse large B-cell lymphoma prognosis model;
a gene screening module 402, configured to screen the high-influence gene feature set out of the gene variation characteristic set according to the influence value;
a risk index module 403, configured to quantify the high-influence gene feature set of the patient by means of the influence value and determine a risk stratification index; and
a risk stratification module 404, configured to determine the prognostic risk level of the patient based on the risk stratification index and a preset threshold.
The risk stratification device of this embodiment determines the influence value of the gene variation characteristic set through a preset diffuse large B-cell lymphoma prognosis model, and screens out the high-influence gene feature set from the gene variation characteristic set of diffuse large B-cell lymphoma based on the absolute value of the influence value. The high-influence gene feature set of the patient is then quantified by means of the influence value to determine a risk stratification index. Finally, the prognostic risk level of the patient is determined based on the risk stratification index and a preset threshold. Because the high-influence gene feature set is screened out using the preset prognosis model, and the risk stratification index is determined from the variation status of the patient's high-influence gene features, the invention stratifies risk based on the genetic information of diffuse large B-cell lymphoma patients. Compared with the traditional International Prognostic Index (IPI), which stratifies risk based on the patient's phenotypic information, this refines the means of risk stratification and improves its accuracy.
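A minimal sketch of the screening step, assuming the influence values are already available as an array. Ranking by absolute value follows the description, and the default of 30 genes is the count given earlier in the description; the function name and the sample values are hypothetical.

```python
import numpy as np

def screen_high_influence(impact, k=30):
    """Return the indices of the k genes with the largest |influence value|."""
    impact = np.asarray(impact, dtype=float)
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort by descending absolute value; a stable sort keeps ties in input order.
    order = np.argsort(-np.abs(impact), kind="stable")
    return order[:k]

# Hypothetical influence values for four genes; keep the two strongest.
top_idx = screen_high_influence([0.01, -0.5, 0.2, 0.0], k=2)
print(top_idx)
```

Using the absolute value keeps strongly negative influence values in the set alongside strongly positive ones.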
Based on the first embodiment of the diffuse large B-cell lymphoma risk stratification device of the present invention described above, a second embodiment of the diffuse large B-cell lymphoma risk stratification device of the present invention is presented.
In this embodiment, the influence module 401 is further configured to: determine the SHAP values of the gene variation characteristic set through a preset diffuse large B-cell lymphoma prognosis model; determine a variant case set and a number of variant cases based on the clinical case set of diffuse large B-cell lymphoma patients; and apply a preset characteristic influence formula to the SHAP values, the variant case set, and the number of variant cases to determine the influence value of the gene variation characteristic set.
Further, the preset characteristic influence formula is:
wherein $\mathrm{impact}_j$ denotes the influence value of the j-th gene variation characteristic, $N_j$ denotes the number of variant cases whose j-th characteristic value is 1, $C_j$ denotes the set of variant cases whose j-th characteristic value is 1, and $s_{ij}$ denotes the SHAP value of the j-th characteristic of the i-th clinical case; the characteristic value indicates whether the gene is mutated, with 1 indicating that the gene is mutated and 0 indicating that it is not.
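A minimal sketch, assuming the formula averages the SHAP values of the j-th characteristic over the cases in which that gene is mutated, i.e. $\mathrm{impact}_j = \frac{1}{N_j} \sum_{i \in C_j} s_{ij}$, as the symbol definitions suggest. The SHAP values are taken as a precomputed matrix and no particular explainer library is implied. The function name and the data are hypothetical, and assigning 0 to a gene with no variant case is an added assumption.

```python
import numpy as np

def feature_influence(shap_values, features):
    """Assumed form: impact[j] = mean of s[i, j] over cases i with f[i, j] == 1.

    shap_values: (n_cases, n_genes) matrix of SHAP values s_ij
    features:    (n_cases, n_genes) matrix of 0/1 characteristic values f_ij
    """
    shap_values = np.asarray(shap_values, dtype=float)
    mutated = np.asarray(features) == 1          # membership of case i in C_j
    if shap_values.shape != mutated.shape:
        raise ValueError("shap_values and features must have the same shape")
    n_variant = mutated.sum(axis=0)              # N_j for every gene j
    shap_sum = np.where(mutated, shap_values, 0.0).sum(axis=0)
    impact = np.zeros(shap_values.shape[1])
    # Divide only where N_j > 0; genes never mutated keep impact 0 (assumption).
    np.divide(shap_sum, n_variant, out=impact, where=n_variant > 0)
    return impact

# Hypothetical data: 3 cases, 3 genes; gene 2 is never mutated.
s = [[0.2, -0.1, 0.3],
     [0.4,  0.5, 0.1],
     [-0.3, -0.3, 0.2]]
f = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 0]]
impact = feature_influence(s, f)
print(impact)
```

Averaging over variant cases only means the influence value reflects how the model responds when the mutation is present, rather than being diluted by the unmutated cases.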
Further, the risk index module 403 is further configured to: obtain the gene variation characteristic values of the clinical case set of diffuse large B-cell lymphoma patients; and apply a preset risk stratification index formula to the influence values, the high-influence gene feature set, and the gene variation characteristic values to determine the patient's risk stratification index.
Further, the preset risk stratification index formula is:
wherein $\text{index-rs}_i$ denotes the risk stratification index of the i-th case, $F_{top}$ denotes the high-influence gene feature set, and $f_{ij}$ denotes the j-th characteristic value (0 or 1) of the i-th clinical case; the characteristic value indicates whether the gene is mutated, with 1 indicating that the gene is mutated and 0 indicating that it is not.
Further, the risk stratification module 404 is further configured to determine two preset thresholds of risk stratification; determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, the prognostic risk level being:
wherein $\mathrm{level}_i$ is the prognostic risk level of the i-th case, and t1 and t2 are the two preset thresholds for risk stratification.
Further, the high-influence gene feature set further comprises a high-risk gene variation set and a low-risk gene variation set.
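One way to form the two sets is sketched below. It assumes, which the text does not state, that a positive influence value marks a variation associated with higher prognostic risk and a negative value marks one associated with lower risk. The gene names are placeholders and the function name is hypothetical.

```python
def split_by_risk_direction(impact_by_gene):
    """Split high-influence genes into high-risk and low-risk sets by sign.

    impact_by_gene: dict mapping gene name -> influence value.
    Genes with an influence value of exactly 0 fall into neither set.
    """
    high_risk = {g for g, v in impact_by_gene.items() if v > 0}
    low_risk = {g for g, v in impact_by_gene.items() if v < 0}
    return high_risk, low_risk

# Placeholder gene names with hypothetical influence values.
high_risk, low_risk = split_by_risk_direction(
    {"GENE_A": 0.03, "GENE_B": -0.02, "GENE_C": 0.01}
)
print(sorted(high_risk), sorted(low_risk))
```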
For other embodiments or specific implementations of the diffuse large B-cell lymphoma risk stratification device of the present invention, reference may be made to the method embodiments above; they are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that comprises the element.
The embodiment numbers above are for description only and do not indicate the relative merits of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium (e.g., read-only memory/random-access memory, magnetic disk, or optical disk) and comprises instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods of the embodiments of the present invention.
The foregoing describes only preferred embodiments of the present invention and does not limit the scope of the invention. Any equivalent structure or equivalent process derived from the contents of this description and the drawings, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.
Claims (10)
1. A diffuse large B-cell lymphoma risk stratification method, characterized in that the diffuse large B-cell lymphoma risk stratification method comprises:
determining the influence value of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
screening the high-influence gene feature set of the gene variation feature set according to the influence value;
quantifying the high-influence gene feature set of the patient through the influence value, and determining a risk stratification index;
and determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold.
2. The diffuse large B-cell lymphoma risk stratification method of claim 1, wherein said determining the influence value of a gene variation characteristic set through a preset diffuse large B-cell lymphoma prognosis model comprises:
determining SHAP values of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
determining a variant case set and a variant case number based on a clinical case set of a diffuse large B-cell lymphoma patient;
and calculating the SHAP values, the variant case set and the variant case number through a preset characteristic influence formula, and determining the influence value of the gene variation characteristic set.
3. The diffuse large B-cell lymphoma risk stratification method of claim 2, wherein said preset characteristic influence formula is:
wherein $\mathrm{impact}_j$ denotes the influence value of the j-th gene variation characteristic, $N_j$ denotes the number of variant cases whose j-th characteristic value is 1, $C_j$ denotes the set of variant cases whose j-th characteristic value is 1, and $s_{ij}$ denotes the SHAP value of the j-th characteristic of the i-th clinical case; the characteristic value indicates whether the gene is mutated, with 1 indicating that the gene is mutated and 0 indicating that it is not.
4. The diffuse large B-cell lymphoma risk stratification method of claim 3, wherein said quantifying the high-influence gene feature set of the patient through the influence value comprises:
obtaining a genetic variation characteristic value of a clinical case set of the diffuse large B cell lymphoma patient;
and calculating the influence value, the high influence gene characteristic set and the gene variation characteristic value through a preset risk stratification index formula to determine the risk stratification index of the patient.
5. The diffuse large B-cell lymphoma risk stratification method of claim 4, wherein said preset risk stratification index formula is:
wherein $\text{index-rs}_i$ denotes the risk stratification index of the i-th case, $F_{top}$ denotes the high-influence gene feature set, and $f_{ij}$ denotes the j-th characteristic value (0 or 1) of the i-th clinical case; the characteristic value indicates whether the gene is mutated, with 1 indicating that the gene is mutated and 0 indicating that it is not.
6. The diffuse large B-cell lymphoma risk stratification method of claim 5, wherein said determining a prognostic risk level for a patient based on said risk stratification index and a preset threshold comprises:
determining two preset thresholds of risk stratification;
determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, the prognostic risk level being:
wherein $\mathrm{level}_i$ is the prognostic risk level of the i-th case, and t1 and t2 are the two preset thresholds for risk stratification.
7. The diffuse large B-cell lymphoma risk stratification method of any one of claims 1-6, wherein said high-influence gene feature set further comprises a high-risk gene variation set and a low-risk gene variation set.
8. A diffuse large B-cell lymphoma risk stratification device, comprising:
the influence module is used for determining the influence value of the gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
the gene screening module is used for screening the high-influence gene feature set of the gene variation feature set according to the influence value;
the risk index module is used for quantifying the high-influence gene feature set of the patient through the influence value and determining a risk stratification index;
and the risk stratification module is used for determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold value.
9. A diffuse large B-cell lymphoma risk stratification device, characterized in that said device comprises: a memory, a processor, and a diffuse large B-cell lymphoma risk stratification program stored on the memory and executable on the processor, the diffuse large B-cell lymphoma risk stratification program configured to implement the steps of the diffuse large B-cell lymphoma risk stratification method according to any one of claims 1-7.
10. A medium having stored thereon a diffuse large B-cell lymphoma risk stratification program which, when executed by a processor, implements the steps of the diffuse large B-cell lymphoma risk stratification method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311747476.7A CN117854723A (en) | 2023-12-18 | 2023-12-18 | Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117854723A (en) | 2024-04-09 |
Family
ID=90544210
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311747476.7A Pending CN117854723A (en) | 2023-12-18 | 2023-12-18 | Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117854723A (en) |
- 2023-12-18: CN application CN202311747476.7A (publication CN117854723A), active, pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |