CN117854723A - Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma - Google Patents

Publication number
CN117854723A
Authority
CN
China
Prior art keywords
risk stratification
gene
cell lymphoma
diffuse large
risk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311747476.7A
Other languages
Chinese (zh)
Inventor
谭洁
朱敏
林垂旭
李映华
梁小丹
梁耀铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Original Assignee
Guangzhou Kingmed Diagnostics Group Co ltd
Guangzhou Kingmed Diagnostics Central Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kingmed Diagnostics Group Co ltd and Guangzhou Kingmed Diagnostics Central Co Ltd
Priority to CN202311747476.7A
Publication of CN117854723A
Legal status: Pending

Abstract

The invention discloses a diffuse large B cell lymphoma risk stratification method, a device, equipment and a medium, wherein the method comprises the following steps: determining the influence value of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model; screening a high-influence gene feature set according to the influence value; quantifying a high-influence gene feature set of a patient through an influence value, and determining a risk stratification index; a prognostic risk level for the patient is determined based on the risk stratification index and a preset threshold. Because the high-influence gene feature set is screened out by using the preset diffuse large B cell lymphoma prognosis model, and the risk stratification index is determined according to the variation condition of the high-influence gene feature set of the patient, compared with the traditional international prognosis risk index for carrying out risk stratification based on the phenotype information of the patient, the invention can realize risk stratification based on the genetic information of the diffuse large B cell lymphoma patient, refines the means of risk stratification and improves the accuracy of risk stratification.

Description

Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma
Technical Field
The invention relates to the technical field of cancer risk stratification, in particular to a diffuse large B cell lymphoma risk stratification method, a diffuse large B cell lymphoma risk stratification device, diffuse large B cell lymphoma risk stratification equipment and a diffuse large B cell lymphoma risk stratification medium.
Background
Diffuse large B-cell lymphoma (DLBCL) is one of the most common subtypes of non-Hodgkin's lymphoma (NHL), accounting for about 30%-50% of all NHL cases. However, DLBCL is highly heterogeneous: patients differ greatly in clinical characteristics, treatment response, and survival risk, which poses a major challenge for clinical diagnosis and treatment.
To improve treatment outcomes, doctors formulate personalized treatment plans according to the patient's risk stratification and molecular prognostic indicators. The most widely used clinical prognostic risk assessment system at present is the International Prognostic Index (IPI), which classifies patients into 4 risk classes based on 5 factors: age, clinical stage, lactate dehydrogenase (LDH), number of extranodal sites involved, and performance status. Over more than 20 years of development, the prognostic evaluation system for DLBCL has been improved and refined; representative newer methods include the age-adjusted IPI (aa-IPI), the revised IPI (R-IPI), which was improved with respect to treatment response, and the NCCN-IPI, which refines the age and LDH indicators. However, these assessment systems generally divide patients into 3-4 risk levels and do not adequately account for the clinical heterogeneity of patients. More importantly, they rely on only a small amount of phenotypic information about the patient. With the development of molecular pathology, molecular prognostic indicators of DLBCL are gaining increasing attention. Studies to date indicate that molecules with independent prognostic significance include the nuclear antigen Ki-67, the transmembrane glycoprotein CD5, the proto-oncogene C-MYC, the tumor suppressor gene P53, and the apoptosis inhibitor gene BCL-2. However, these single-molecule prognostic indicators still explain the heterogeneity of DLBCL only to a limited extent.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the invention is to provide a diffuse large B-cell lymphoma risk stratification method, device, equipment and medium, in order to solve the technical problem that the traditional International Prognostic Index, which stratifies risk based on patient phenotype information, explains the heterogeneity of DLBCL only to a limited extent.
To achieve the above object, the present invention provides a diffuse large B-cell lymphoma risk stratification method comprising the steps of:
determining the influence value of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
screening the high-influence gene feature set of the gene variation feature set according to the influence value;
quantifying the high-influence gene feature set of the patient through the influence value, and determining a risk stratification index;
and determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold.
Optionally, determining the influence value of the gene variation feature set through a preset diffuse large B-cell lymphoma prognosis model includes:
determining SHAP values of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
determining a variant case set and a variant case number based on a clinical case set of a diffuse large B-cell lymphoma patient;
and calculating the influence value of the gene variation feature set from the SHAP values, the variant case set and the variant case number through a preset feature influence formula.
Optionally, the preset feature influence formula is expressed in terms of the following quantities:
$\mathrm{Impact}_j$ denotes the influence value of the j-th gene variation feature, $N_j$ denotes the number of variant cases whose j-th feature value is 1, $C_j$ denotes the set of variant cases whose j-th feature value is 1, and $s_{ij}$ denotes the SHAP value of the j-th feature of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
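The formula itself is not reproduced in this text. One form that is consistent with the symbol definitions above, offered as a reconstruction under that assumption rather than as the published expression, is the signed mean SHAP value over the cases that carry the variant. Here $f_{ij}$ is the binary feature value of feature j in case i.

```latex
\mathrm{Impact}_j \;=\; \frac{1}{N_j} \sum_{i \in C_j} s_{ij},
\qquad
C_j = \{\, i : f_{ij} = 1 \,\},
\qquad
N_j = |C_j|
```

Keeping the sign of $s_{ij}$ (rather than taking absolute values) matches the later distinction between variants with influence greater than 0 and variants with influence less than 0.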
Optionally, said quantifying said high-impact gene feature set of the patient by said impact value, determining a risk stratification index, comprises:
obtaining a genetic variation characteristic value of a clinical case set of the diffuse large B cell lymphoma patient;
and calculating the influence value, the high influence gene characteristic set and the gene variation characteristic value through a preset risk stratification index formula to determine the risk stratification index of the patient.
Optionally, the preset risk stratification index formula is expressed in terms of the following quantities:
$\text{index-rs}_i$ denotes the risk stratification index of the i-th case, $F_{\mathrm{top}}$ denotes the high-impact gene feature set, and $f_{ij}$ denotes the j-th feature value (0 or 1) of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
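The formula is likewise missing from this text. A plausible reading of the definitions, assuming an unnormalized weighted sum and writing $\mathrm{Impact}_j$ for the influence value of feature j, is shown here; whether the original applies any additional scaling cannot be determined from the surviving text.

```latex
\text{index-rs}_i \;=\; \sum_{j \in F_{\mathrm{top}}} \mathrm{Impact}_j \, f_{ij}
```

Under this reading, only the high-impact variants actually present in a case contribute, each with its signed influence value.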
Optionally, the determining a prognostic risk level of the patient based on the risk stratification index and a preset threshold value comprises:
determining two preset thresholds of risk stratification;
determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, where $\mathrm{level}_i$ is the prognostic risk level of the i-th case and $t_1$ and $t_2$ are the two preset thresholds for risk stratification.
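The piecewise definition is not reproduced in this text. A sketch of what two thresholds imply is given here; the three level names and the treatment of values exactly equal to a threshold are assumptions, and $t_1 < t_2$ is assumed.

```latex
\mathrm{level}_i =
\begin{cases}
\text{low},    & \text{index-rs}_i < t_1 \\
\text{medium}, & t_1 \le \text{index-rs}_i \le t_2 \\
\text{high},   & \text{index-rs}_i > t_2
\end{cases}
```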
Optionally, the high-impact gene feature set further comprises a high-risk gene variation set and a low-risk gene variation set.
In addition, in order to achieve the above object, the present invention also proposes a diffuse large B-cell lymphoma risk stratification device comprising:
the influence module is used for determining the influence value of the gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
the gene screening module is used for screening the high-influence gene feature set of the gene variation feature set according to the influence value;
the risk index module is used for quantifying the high-influence gene feature set of the patient through the influence value and determining a risk stratification index;
and the risk stratification module is used for determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold value.
In addition, to achieve the above object, the present invention also proposes a diffuse large B-cell lymphoma risk stratification device comprising: a memory, a processor, and a diffuse large B-cell lymphoma risk stratification program stored on the memory and executable on the processor, the diffuse large B-cell lymphoma risk stratification program configured to implement the steps of the diffuse large B-cell lymphoma risk stratification method as described above.
Furthermore, to achieve the above object, the present invention also proposes a medium storing a diffuse large B-cell lymphoma risk stratification program which, when executed by a processor, implements the steps of the diffuse large B-cell lymphoma risk stratification method described above.
The invention first determines the influence value of a gene variation feature set through a preset diffuse large B-cell lymphoma prognosis model; then screens the high-influence gene feature set from the gene variation feature set according to the influence value; then quantifies the high-influence gene feature set of the patient through the influence value to determine a risk stratification index; and finally determines a prognostic risk level of the patient based on the risk stratification index and a preset threshold. Because the invention uses the preset diffuse large B-cell lymphoma prognosis model to screen out the high-influence gene feature set and then determines the risk stratification index from the variation status of the patient's high-influence gene features, it achieves risk stratification based on the genetic information of diffuse large B-cell lymphoma patients, in contrast to the traditional International Prognostic Index (IPI), which stratifies risk based on patient phenotype information. This refines the means of risk stratification and improves its accuracy.
Drawings
FIG. 1 is a schematic diagram of the architecture of a diffuse large B-cell lymphoma risk stratification device of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a first embodiment of a diffuse large B-cell lymphoma risk stratification method according to the present invention;
FIG. 3 is a schematic flow chart of a second embodiment of a diffuse large B-cell lymphoma risk stratification method according to the present invention;
FIG. 4 is a block diagram of a first embodiment of a diffuse large B-cell lymphoma risk stratification device according to the invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a diffuse large B-cell lymphoma risk stratification device of a hardware operating environment according to an embodiment of the present invention.
As shown in FIG. 1, the diffuse large B-cell lymphoma risk stratification device may comprise: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. Optionally, the memory 1005 may also be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the structure shown in fig. 1 does not constitute a limitation of the diffuse large B-cell lymphoma risk stratification device, and may comprise more or fewer components than shown, or may combine certain components, or may be a different arrangement of components.
As shown in FIG. 1, the memory 1005, as a storage medium, may contain an operating system, a network communication module, a user interface module, and a diffuse large B-cell lymphoma risk stratification program.
In the diffuse large B-cell lymphoma risk stratification device shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 may be disposed in the diffuse large B-cell lymphoma risk stratification device, which invokes the diffuse large B-cell lymphoma risk stratification program stored in the memory 1005 through the processor 1001 and executes the diffuse large B-cell lymphoma risk stratification method provided by the embodiment of the present invention.
The embodiment of the invention provides a diffuse large B cell lymphoma risk stratification method, and referring to fig. 2, fig. 2 is a flow chart of a first embodiment of the diffuse large B cell lymphoma risk stratification method.
In this embodiment, the diffuse large B-cell lymphoma risk stratification method comprises the following steps:
Step S10: determining the influence value of the gene variation feature set through a preset diffuse large B-cell lymphoma prognosis model.
It should be noted that, the execution subject of the method of the present embodiment may be a computing service device with functions of gene screening, numerical quantization and risk stratification, such as a personal computer, a server, etc., or may be other electronic devices capable of implementing the same or similar functions, such as the diffuse large B-cell lymphoma risk stratification device described above, which is not limited in this embodiment. Here, the present embodiment and each of the following embodiments will be specifically described with the above diffuse large B-cell lymphoma risk stratification device (abbreviated as risk stratification device).
It can be understood that the preset diffuse large B-cell lymphoma prognosis model is a model constructed and trained on the gene variation information and basic clinical information of DLBCL patients, based on a visible neural network (VNN) and a fully connected neural network. In the visible structure, input neurons are mapped to gene variation features and hidden-layer neurons are mapped to biological pathways; basing the model on a VNN improves interpretability, which assists in mining genetic drivers.
It can be understood that the gene variation feature set is a pre-selected or pre-defined set of gene variation features associated with the prognosis of diffuse large B-cell lymphoma patients. These features may cover different types of gene variation, such as single nucleotide variation (SNV), insertion/deletion (Indel), chromosomal structural variation, gene rearrangement, and copy number variation (CNV).
It can be understood that SHAP, a post hoc model interpretation method, is used to calculate SHAP values for the gene variation feature set, and the influence value of each gene variation in the set on the patient's prognostic risk is then calculated from those SHAP values. The influence value reflects the importance of a gene variation feature; based on its absolute value, the high-influence genes in the gene variation feature set of diffuse large B-cell lymphoma can be screened out.
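The mapping of input neurons to gene variation features and hidden neurons to pathways can be illustrated with a masked layer. This is a minimal sketch only: the gene names, pathway memberships, weights, and the tanh activation are all hypothetical placeholders, and the patent's actual network architecture is not described at this level of detail.

```python
import math

# Hypothetical gene-to-pathway membership; placeholder names, not from the patent.
GENES = ["GENE_A", "GENE_B", "GENE_C"]
PATHWAYS = {"PATHWAY_1": ["GENE_A", "GENE_B"], "PATHWAY_2": ["GENE_C"]}


def build_mask(genes, pathways):
    """Return mask[p][g] = 1 when gene g belongs to pathway p, else 0."""
    return [[1 if g in members else 0 for g in genes]
            for members in pathways.values()]


def masked_layer(x, weights, mask, bias):
    """One sparse hidden layer: each pathway neuron sees only its member genes."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        out.append(math.tanh(z))
    return out


mask = build_mask(GENES, PATHWAYS)
weights = [[0.5, -0.3, 0.9], [0.2, 0.4, 0.7]]  # arbitrary illustrative weights
bias = [0.0, 0.0]

# Binary mutation vector for one case: GENE_A and GENE_C mutated.
hidden = masked_layer([1, 0, 1], weights, mask, bias)
```

Because the mask zeroes every weight outside a pathway's membership, each hidden activation can be read as a pathway-level signal, which is what makes such a layer easier to interpret than a dense one.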
Step S20: screening the high-influence gene feature set from the gene variation feature set according to the influence value.
Further, the high-impact gene feature set in this embodiment comprises a high-risk gene variation set and a low-risk gene variation set. For ease of understanding, assume that 30 high-impact genes are selected according to the influence values shown in Table 1, of which 17 are high-risk variant genes that increase patient risk (influence value greater than 0) and 13 are low-risk variant genes that decrease patient risk (influence value less than 0).
Table 1: High-influence gene feature set and corresponding influence values.
Step S30: quantifying the high-influence gene feature set of the patient through the influence value, and determining a risk stratification index.
It should be noted that, the risk stratification index is a calculated index for risk stratification of patients with diffuse large B-cell lymphoma, and since the risk differences of some patients are difficult to distinguish by phenotypic characteristics, the risk levels of different patients within the same IPI rating can be further refined by the risk stratification index.
Step S40: determining a prognostic risk level of the patient based on the risk stratification index and a preset threshold.
It should be noted that the preset threshold is a preset threshold for patient stratification, for example, -0.015, 0.025, etc., and may be set according to the actual stratification condition, which is not limited in this embodiment.
In a specific implementation, the risk stratification device determines the influence value of the gene variation feature set through the preset diffuse large B-cell lymphoma prognosis model, and screens out the high-influence gene feature set from the gene variation feature set of diffuse large B-cell lymphoma based on the absolute value of the influence value. The high-influence gene feature set of the patient is then quantified through the influence value to determine a risk stratification index. Finally, a prognostic risk level of the patient is determined based on the risk stratification index and a preset threshold, which further refines the risk levels of different patients within the same IPI grade.
The risk stratification device of this embodiment determines the influence value of the gene variation feature set through the preset diffuse large B-cell lymphoma prognosis model, and screens out the high-influence gene feature set from the gene variation feature set of diffuse large B-cell lymphoma based on the absolute value of the influence value. The high-influence gene feature set of the patient is then quantified through the influence value to determine a risk stratification index. Finally, a prognostic risk level of the patient is determined based on the risk stratification index and a preset threshold. Because the high-influence gene feature set is screened out using the preset diffuse large B-cell lymphoma prognosis model, and the risk stratification index is determined from the variation status of the patient's high-influence gene features, the invention achieves risk stratification based on the genetic information of diffuse large B-cell lymphoma patients, in contrast to the traditional International Prognostic Index (IPI), which stratifies risk based on patient phenotype information. This refines the means of risk stratification and improves its accuracy.
Referring to fig. 3, fig. 3 is a flow chart illustrating a second embodiment of the diffuse large B-cell lymphoma risk stratification method of the present invention.
Based on the first embodiment, in this embodiment, the step S10 includes:
Step S11: determining the SHAP values of the gene variation feature set through the preset diffuse large B-cell lymphoma prognosis model.
The SHAP value is a value used to explain the predictions of the preset diffuse large B-cell lymphoma prognosis model. Owing to the interpretability of SHAP, the degree and direction of the influence of a given gene in the gene variation feature set on patient prognosis can be determined clearly, which improves the efficiency of mining genetic drivers. Using the Python package shap (v0.41.0), importance scores can be calculated for each input gene feature and for the gene pathways represented by hidden-layer neurons; these scores can accurately reflect the influence of each feature on patient prognosis. The key genes that contribute most to the model's predictions can thus be determined from the SHAP values, and the risk tendency of each gene can be distinguished by the sign of its SHAP value.
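To make the "degree and direction" reading of SHAP values concrete, the following toy example computes exact Shapley values by enumerating feature coalitions. It is a sketch under stated assumptions: it does not use the shap package (which relies on approximations and background data), and `toy_model` is a hypothetical three-feature function, not the patent's VNN.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating feature coalitions.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[k] if k in coalition else baseline[k] for k in range(n)]
        return predict(z)

    phi = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


def toy_model(f):
    """Hypothetical risk score over three binary mutation features."""
    return 0.3 * f[0] - 0.2 * f[1] + 0.1 * f[0] * f[2]


# All three features mutated, compared against an all-unmutated baseline.
phi = exact_shapley(toy_model, x=[1, 1, 1], baseline=[0, 0, 0])
```

The first feature receives a positive attribution (it raises the toy risk score) and the second a negative one, and the attributions sum to the difference between the prediction and the baseline prediction.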
Step S12: determining a variant case set and a variant case number based on the clinical case set of diffuse large B-cell lymphoma patients.
Step S13: calculating the influence value of the gene variation feature set from the SHAP values, the variant case set and the variant case number through a preset feature influence formula.
It should be noted that the clinical case set is the clinical treatment information of patients with diffuse large B-cell lymphoma. For example, the data set of 928 cases published in the DLBCL-related study by Stuart et al. may be used, which includes the mutation profiles (mutated or not mutated) of 117 genes; other similar data may also be used, which is not limited in this embodiment. The preset feature influence formula is the formula used to calculate the influence values of the gene variation feature set.
In practical implementation, the preset diffuse large B-cell lymphoma prognosis model can be constructed from the gene variation features of DLBCL patients and the clinical case set. Using SHAP, a post hoc model interpretation method, the SHAP value of each input feature (gene variation feature) is calculated with the Python package shap (v0.41.0); the influence of each gene variation on the patient's prognostic risk can then be calculated, and 30 genes with high influence on the prognostic risk of DLBCL patients can be screened accordingly.
The most common way to select important features from SHAP values is to sum the absolute SHAP values over all samples, but this can overlook low-frequency gene variations because few cases carry them. To better identify the gene features that significantly affect patient prognostic risk, including genes with a low variation frequency, the quantification of gene influence can be restricted to the subset of cases in which the particular gene variation is present.
Further, the preset feature influence formula in this embodiment is expressed in terms of the following quantities:
$\mathrm{Impact}_j$ denotes the influence value of the j-th gene variation feature, $N_j$ denotes the number of variant cases whose j-th feature value is 1, $C_j$ denotes the set of variant cases whose j-th feature value is 1, and $s_{ij}$ denotes the SHAP value of the j-th feature of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
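The following sketch contrasts the conventional summed-absolute-SHAP score with an influence value computed only over variant cases. It assumes the influence value is the signed mean SHAP value over the cases in which the feature equals 1, which is the reading suggested by the symbol definitions; the function names, the zero default for features with no variant cases, and the data are hypothetical.

```python
def impact_values(shap_values, features):
    """Mean SHAP value of each feature over the cases where it is mutated.

    shap_values[i][j] and features[i][j] refer to case i and feature j.
    """
    n_features = len(features[0])
    impacts = []
    for j in range(n_features):
        mutated = [i for i, row in enumerate(features) if row[j] == 1]  # C_j
        if not mutated:
            # N_j = 0: assumption made here, assign zero influence.
            impacts.append(0.0)
            continue
        impacts.append(sum(shap_values[i][j] for i in mutated) / len(mutated))
    return impacts


def summed_abs_shap(shap_values):
    """Conventional ranking score: sum of |SHAP| over all cases."""
    n_features = len(shap_values[0])
    return [sum(abs(row[j]) for row in shap_values) for j in range(n_features)]


# Feature 0 is common with a small effect; feature 1 is rare with a large effect.
features = [[1, 0], [1, 0], [1, 0], [1, 1]]
shap_values = [[0.02, 0.0], [0.02, 0.0], [0.02, 0.0], [0.02, 0.06]]

impacts = impact_values(shap_values, features)
totals = summed_abs_shap(shap_values)
```

In this toy data the summed score ranks the common feature first, while the per-variant mean ranks the rare, strong feature first, which is the behaviour the preceding paragraph motivates.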
In a specific implementation, the risk stratification device determines the SHAP values of the gene variation feature set through the preset diffuse large B-cell lymphoma prognosis model, and then determines a variant case set and a variant case number based on the clinical case set of diffuse large B-cell lymphoma patients. Finally, the influence value of the gene variation feature set is calculated from the SHAP values, the variant case set and the variant case number through the preset feature influence formula. In this way, the influence of each gene variation on patient prognostic risk can be calculated using the preset diffuse large B-cell lymphoma prognosis model and the SHAP values, and 30 high-influence genes for the prognostic risk of DLBCL patients can be screened accordingly.
Further, in the present embodiment, step S30 includes: obtaining a genetic variation characteristic value of a clinical case set of the diffuse large B cell lymphoma patient; and calculating the influence value, the high influence gene characteristic set and the gene variation characteristic value through a preset risk stratification index formula to determine the risk stratification index of the patient.
It should be noted that the preset risk stratification index formula is a formula for calculating the prognostic risk stratification index of patients suffering from diffuse large B-cell lymphoma.
In practical implementation, the existence or non-existence of key gene variation in the gene variation characteristic set and the influence value of the key gene variation on the prognosis risk of the patient can be used as the basis for calculating the risk stratification index of the diffuse large B cell lymphoma patient.
Further, in this embodiment, the preset risk stratification index formula is expressed in terms of the following quantities:
$\text{index-rs}_i$ denotes the risk stratification index of the i-th case, $F_{\mathrm{top}}$ denotes the high-impact gene feature set, and $f_{ij}$ denotes the j-th feature value (0 or 1) of the i-th clinical case. The feature value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
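As an illustration of how the index combines presence of a variant with its influence value, the sketch below sums the influence values of the high-impact features that are mutated in a case. The unnormalized sum is an assumption (the original formula is not reproduced in this text), and the function name, gene names, and values are hypothetical.

```python
def risk_index(case_features, impact, top_features):
    """Sum the influence values of the high-impact features mutated in one case.

    case_features maps gene name to 0/1; genes missing from the map count as 0.
    """
    return sum(impact[g] * case_features.get(g, 0) for g in top_features)


# Hypothetical influence values; GENE_C was not selected as high-impact.
impact = {"GENE_A": 0.030, "GENE_B": -0.020, "GENE_C": 0.010}
top_features = ["GENE_A", "GENE_B"]

# One case in which all three genes are mutated.
patient = {"GENE_A": 1, "GENE_B": 1, "GENE_C": 1}
score = risk_index(patient, impact, top_features)
```

Only GENE_A and GENE_B contribute here, so a protective variant partly offsets a high-risk one, and variants outside the high-impact set are ignored.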
After determining the risk stratification index of patients with diffuse large B-cell lymphomas according to the preset risk stratification index formula described above, the risk stratification of different patients within the same IPI rating may be further refined to distinguish risk differences between patients that are difficult to distinguish by phenotypic characteristics.
Further, in the present embodiment, step S40 includes: determining two preset thresholds for risk stratification; and determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, where $\mathrm{level}_i$ is the prognostic risk level of the i-th case and $t_1$ and $t_2$ are the two preset thresholds for risk stratification.
It should be noted that t1 and t2 are two thresholds for risk stratification, and the threshold value may be determined based on the risk stratification effect of the patient, e.g. t1 and t2 may be-0.015 and 0.025, respectively.
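A minimal sketch of the thresholding step, using the example thresholds given above as defaults. The three level names and the assignment of values exactly equal to a threshold to the middle level are assumptions, since the piecewise definition is not reproduced in this text.

```python
def risk_level(index, t1=-0.015, t2=0.025):
    """Map a risk stratification index to one of three levels.

    Level names and boundary handling are illustrative assumptions.
    """
    if t1 > t2:
        raise ValueError("t1 must not exceed t2")
    if index < t1:
        return "low"
    if index <= t2:
        return "medium"
    return "high"


levels = [risk_level(v) for v in (-0.020, 0.000, 0.030)]
```

In practice the thresholds would be chosen from the observed stratification effect on a patient cohort, as the text notes, rather than fixed in code.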
In a specific implementation, the risk stratification device may obtain a characteristic value of genetic variation of a clinical case set of the diffuse large B-cell lymphoma patient. And then using the existence or non-existence of key gene variation in the gene variation characteristic set and the influence value of the key gene variation on the prognosis risk of the patient as the basis for calculating the risk stratification index of the patient suffering from diffuse large B cell lymphoma. And calculating the influence value, the high influence gene characteristic set and the gene variation characteristic value through the preset risk stratification index formula to determine the risk stratification index of the patient. Therefore, the risk stratification index of the DLBCL patient can be calculated by utilizing the high-influence gene feature set, the risk stratification based on genetic information is realized, and the phenotype-based IPI risk stratification is supplemented.
The risk stratification device of this embodiment determines the SHAP values of the gene variation feature set through the preset diffuse large B-cell lymphoma prognosis model, and then determines a variant case set and a variant case number based on the clinical case set of diffuse large B-cell lymphoma patients. Finally, the influence value of the gene variation feature set is calculated from the SHAP values, the variant case set and the variant case number through the preset feature influence formula, and 30 high-influence genes for the prognostic risk of DLBCL patients are screened according to the influence value. Further, the risk stratification device obtains the gene variation feature values of the clinical case set of diffuse large B-cell lymphoma patients. The presence or absence of key gene variations in the gene variation feature set, together with their influence values on patient prognostic risk, then serves as the basis for calculating the risk stratification index of a diffuse large B-cell lymphoma patient. The risk stratification index of the patient is calculated from the influence value, the high-influence gene feature set and the gene variation feature values through the preset risk stratification index formula. In this way, the risk stratification index of a DLBCL patient can be calculated from the high-influence gene feature set, which achieves risk stratification based on genetic information and supplements the phenotype-based IPI risk stratification.
In addition, an embodiment of the invention also provides a medium. The medium is a storage medium on which a diffuse large B-cell lymphoma risk stratification program is stored, and the program, when executed by a processor, implements the steps of the diffuse large B-cell lymphoma risk stratification method described above.
Referring to fig. 4, fig. 4 is a block diagram showing the construction of a first embodiment of the diffuse large B-cell lymphoma risk stratification device of the present invention.
As shown in fig. 4, the diffuse large B-cell lymphoma risk stratification device according to the embodiment of the present invention comprises:
the influence module 401 is used for determining an influence value of the gene variation characteristic set through a preset diffuse large B-cell lymphoma prognosis model;
a gene screening module 402, configured to screen the high-impact gene feature set of the gene variation feature set according to the impact value;
a risk index module 403, configured to quantify the high-impact gene feature set of the patient according to the impact value, and determine a risk stratification index;
a risk stratification module 404 for determining a prognostic risk level for the patient based on the risk stratification index and a preset threshold.
In this embodiment, the risk stratification device determines the influence value of the gene variation characteristic set by means of the preset diffuse large B-cell lymphoma prognosis model. It then screens the high-influence gene feature set out of the gene variation characteristic set of diffuse large B-cell lymphoma based on the absolute value of the influence value. The high-influence gene feature set of the patient is then quantified by the influence value to determine a risk stratification index. Finally, the prognostic risk level of the patient is determined based on the risk stratification index and a preset threshold. Because the high-influence gene feature set is screened out with the preset diffuse large B-cell lymphoma prognosis model, and the risk stratification index is determined from the variation status of the patient's high-influence gene feature set, the invention realizes risk stratification based on the genetic information of diffuse large B-cell lymphoma patients. Compared with the traditional International Prognostic Index (IPI), which stratifies risk based on the patient's phenotype information, this refines the means of risk stratification and improves its accuracy.
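The screening step described above can be sketched as follows. This is a minimal illustration only, assuming the influence values are already available as a mapping from gene name to a signed number; the function name `select_high_impact`, the gene names and the numeric values are hypothetical and do not come from the source.

```python
from typing import Dict, List


def select_high_impact(impact: Dict[str, float], k: int = 30) -> List[str]:
    """Return the k features with the largest absolute influence value."""
    # Rank by |influence| in descending order; ties are broken by name
    # so that the result is deterministic.
    ranked = sorted(impact, key=lambda gene: (-abs(impact[gene]), gene))
    return ranked[:k]


# Hypothetical influence values, for illustration only.
example_impact = {
    "GENE_A": 0.42,
    "GENE_B": -0.37,
    "GENE_C": 0.05,
    "GENE_D": -0.01,
}
top2 = select_high_impact(example_impact, k=2)
```

Ranking on the absolute value keeps features with a strongly negative influence value in the selected set alongside those with a strongly positive one.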
Based on the first embodiment of the diffuse large B-cell lymphoma risk stratification device of the present invention described above, a second embodiment of the diffuse large B-cell lymphoma risk stratification device of the present invention is presented.
In this embodiment, the influence module 401 is further configured to: determine SHAP values of the gene variation characteristic set by means of the preset diffuse large B-cell lymphoma prognosis model; determine a variant case set and a number of variant cases based on a clinical case set of diffuse large B-cell lymphoma patients; and apply a preset characteristic influence formula to the SHAP values, the variant case set and the number of variant cases to determine the influence value of the gene variation characteristic set.
Further, the preset characteristic influence formula is:
where $\mathrm{impact}_j$ denotes the influence value of the $j$-th gene variation feature, $N_j$ denotes the number of variant cases whose $j$-th characteristic value is 1, $C_j$ denotes the set of variant cases whose $j$-th characteristic value is 1, and $s_{ij}$ denotes the SHAP value of the $j$-th feature of the $i$-th clinical case. The characteristic value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
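The formula itself does not appear in this text, so the following is only a sketch of one reading that is consistent with the symbol definitions: the influence value of feature j is taken to be the mean SHAP value over the N_j cases in C_j, that is, over the cases in which gene j is mutated. That reading, the function name `feature_impact`, the convention of returning 0.0 when no case carries the variation, and the sample numbers are all assumptions.

```python
from typing import List, Sequence


def feature_impact(
    features: Sequence[Sequence[int]],
    shap_values: Sequence[Sequence[float]],
) -> List[float]:
    """Mean SHAP value of each feature over the cases where it equals 1.

    features[i][j]    -- characteristic value (0 or 1) of feature j in case i
    shap_values[i][j] -- SHAP value s_ij of feature j in case i
    """
    if not features:
        return []
    impacts = []
    for j in range(len(features[0])):
        # C_j: indices of the cases whose j-th characteristic value is 1.
        c_j = [i for i, row in enumerate(features) if row[j] == 1]
        n_j = len(c_j)  # N_j: number of variant cases for feature j
        if n_j == 0:
            # Assumption: a feature with no variant case gets influence 0.
            impacts.append(0.0)
            continue
        impacts.append(sum(shap_values[i][j] for i in c_j) / n_j)
    return impacts


# Hypothetical data: three cases, two gene variation features.
example_features = [[1, 0], [1, 1], [0, 1]]
example_shap = [[0.25, 0.125], [0.75, -0.5], [0.5, -0.25]]
example_impacts = feature_impact(example_features, example_shap)
```

Under this reading the sign of the influence value is preserved, which fits the later distinction between high-risk and low-risk gene variations.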
Further, the risk index module 403 is further configured to: obtain the gene variation characteristic values of the clinical case set of diffuse large B-cell lymphoma patients; and apply a preset risk stratification index formula to the influence value, the high-influence gene feature set and the gene variation characteristic values to determine the risk stratification index of the patient.
Further, the preset risk stratification index formula is:
where $\text{index-rs}_i$ denotes the risk stratification index of the $i$-th case, $F_{top}$ denotes the high-influence gene feature set, and $f_{ij}$ denotes the $j$-th characteristic value (0 or 1) of the $i$-th clinical case. The characteristic value indicates whether the gene is mutated: 1 indicates that the gene is mutated, and 0 indicates that it is not.
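The index formula is likewise absent from this text. One reading consistent with the definitions is a sum, over the features j in F_top, of the influence value of feature j multiplied by the characteristic value f_ij. The sketch below implements that assumed reading; the function name `risk_stratification_index`, the gene names and the numbers are hypothetical.

```python
from typing import Dict, Iterable


def risk_stratification_index(
    case_features: Dict[str, int],
    impact: Dict[str, float],
    f_top: Iterable[str],
) -> float:
    """Sum the influence values of the high-influence genes mutated in one case."""
    # f_ij is 1 when the gene is mutated and 0 otherwise, so only mutated
    # genes in F_top contribute. A gene missing from the case record is
    # treated as not mutated (an assumption).
    return sum(impact[gene] * case_features.get(gene, 0) for gene in f_top)


# Hypothetical influence values and one hypothetical case.
example_impact = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 0.125}
example_f_top = ["GENE_A", "GENE_B"]
example_case = {"GENE_A": 1, "GENE_B": 1, "GENE_C": 1}
example_index = risk_stratification_index(example_case, example_impact, example_f_top)
```

GENE_C is mutated in the example case but lies outside the assumed F_top, so it does not contribute to the index.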
Further, the risk stratification module 404 is further configured to: determine two preset thresholds of risk stratification; and determine a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, the prognostic risk level being:
where $\mathrm{level}_i$ is the prognostic risk level of the $i$-th case, and $t_1$ and $t_2$ are the two preset thresholds for risk stratification.
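The piecewise definition of the level is not reproduced in this text. A minimal sketch of a three-level rule with two thresholds is given below. It assumes that t1 is smaller than t2, that a larger index means a higher risk, that the levels are named "low", "intermediate" and "high", and that each threshold belongs to the upper interval; none of these details is confirmed by the source.

```python
def prognostic_risk_level(index_rs: float, t1: float, t2: float) -> str:
    """Map a risk stratification index to one of three levels."""
    if t1 >= t2:
        raise ValueError("t1 must be smaller than t2")
    if index_rs < t1:
        return "low"
    if index_rs < t2:
        return "intermediate"
    return "high"


# Hypothetical thresholds, for illustration only.
example_levels = [prognostic_risk_level(x, t1=0.0, t2=0.5) for x in (-1.0, 0.0, 0.5)]
```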
Further, the high-impact gene feature set further comprises a high-risk gene variation set and a low-risk gene variation set.
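One way to obtain the two subsets is to split the high-influence gene feature set by the sign of the influence value. The sketch below assumes that a positive influence value marks a high-risk variation and a negative one a low-risk variation; this sign convention, the function name `split_by_risk_direction` and the sample values are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def split_by_risk_direction(
    impact: Dict[str, float],
    f_top: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Split the high-influence genes into high-risk and low-risk lists."""
    genes = list(f_top)
    # Assumed convention: positive influence raises the predicted risk.
    high_risk = [gene for gene in genes if impact[gene] > 0]
    low_risk = [gene for gene in genes if impact[gene] < 0]
    return high_risk, low_risk


# Hypothetical influence values.
example_impact = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 0.125}
example_high, example_low = split_by_risk_direction(example_impact, example_impact)
```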
For other embodiments or specific implementations of the diffuse large B-cell lymphoma risk stratification device of the present invention, reference may be made to the method embodiments above; they are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments may be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is preferred. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied as a software product stored in a storage medium (e.g. read-only memory/random-access memory, magnetic disk, optical disk) and comprising instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods of the embodiments of the present invention.
The foregoing describes only preferred embodiments of the present invention and does not limit the scope of the invention. Any equivalent structure or equivalent process derived from the disclosure herein, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims (10)

1. A diffuse large B-cell lymphoma risk stratification method, characterized in that the diffuse large B-cell lymphoma risk stratification method comprises:
determining the influence value of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
screening the high-influence gene feature set of the gene variation feature set according to the influence value;
quantifying the high-influence gene feature set of the patient through the influence value, and determining a risk stratification index;
and determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold.
2. The diffuse large B-cell lymphoma risk stratification method of claim 1, wherein said determining the influence value of a gene variation characteristic set through a preset diffuse large B-cell lymphoma prognosis model comprises:
determining SHAP values of a gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
determining a variant case set and a variant case number based on a clinical case set of a diffuse large B-cell lymphoma patient;
and applying a preset characteristic influence formula to the SHAP values, the variant case set and the number of variant cases to determine the influence value of the gene variation characteristic set.
3. The diffuse large B-cell lymphoma risk stratification method of claim 2 wherein said predetermined characteristic influence formula is:
wherein $\mathrm{impact}_j$ denotes the influence value of the $j$-th gene variation feature, $N_j$ denotes the number of variant cases whose $j$-th characteristic value is 1, $C_j$ denotes the set of variant cases whose $j$-th characteristic value is 1, and $s_{ij}$ denotes the SHAP value of the $j$-th feature of the $i$-th clinical case, the characteristic value indicating whether the gene is mutated, 1 indicating that the gene is mutated and 0 indicating that it is not.
4. The diffuse large B-cell lymphoma risk stratification method of claim 3 wherein said quantifying said high-impact gene signature set of a patient by said impact value comprises:
obtaining a genetic variation characteristic value of a clinical case set of the diffuse large B cell lymphoma patient;
and applying a preset risk stratification index formula to the influence value, the high-influence gene feature set and the gene variation characteristic value to determine the risk stratification index of the patient.
5. The diffuse large B-cell lymphoma risk stratification method of claim 4 wherein said predetermined risk stratification index formula is:
wherein $\text{index-rs}_i$ denotes the risk stratification index of the $i$-th case, $F_{top}$ denotes the high-influence gene feature set, and $f_{ij}$ denotes the $j$-th characteristic value (0 or 1) of the $i$-th clinical case, the characteristic value indicating whether the gene is mutated, 1 indicating that the gene is mutated and 0 indicating that it is not.
6. The diffuse large B-cell lymphoma risk stratification method of claim 5, wherein said determining a prognostic risk level for a patient based on said risk stratification index and a preset threshold comprises:
determining two preset thresholds of risk stratification;
determining a prognostic risk level for the patient based on the risk stratification index and the two preset thresholds, the prognostic risk level being:
wherein $\mathrm{level}_i$ is the prognostic risk level of the $i$-th case, and $t_1$ and $t_2$ are the two preset thresholds for risk stratification.
7. The diffuse large B-cell lymphoma risk stratification method of any one of claims 1-6 wherein said high-impact gene signature set further comprises a high-risk gene variation set and a low-risk gene variation set.
8. A diffuse large B-cell lymphoma risk stratification device, comprising:
the influence module is used for determining the influence value of the gene variation characteristic set through a preset diffuse large B cell lymphoma prognosis model;
the gene screening module is used for screening the high-influence gene feature set of the gene variation feature set according to the influence value;
the risk index module is used for quantifying the high-influence gene feature set of the patient through the influence value and determining a risk stratification index;
and the risk stratification module is used for determining a prognosis risk level of the patient based on the risk stratification index and a preset threshold value.
9. A diffuse large B-cell lymphoma risk stratification device, characterized in that said device comprises: a memory, a processor, and a diffuse large B-cell lymphoma risk stratification program stored on the memory and executable on the processor, the diffuse large B-cell lymphoma risk stratification program configured to implement the steps of the diffuse large B-cell lymphoma risk stratification method according to any one of claims 1-7.
10. A medium having stored thereon a diffuse large B-cell lymphoma risk stratification program which, when executed by a processor, implements the steps of the diffuse large B-cell lymphoma risk stratification method according to any one of claims 1 to 7.
CN202311747476.7A 2023-12-18 2023-12-18 Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma Pending CN117854723A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311747476.7A CN117854723A (en) 2023-12-18 2023-12-18 Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma

Publications (1)

Publication Number Publication Date
CN117854723A true CN117854723A (en) 2024-04-09

Family

ID=90544210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311747476.7A Pending CN117854723A (en) 2023-12-18 2023-12-18 Method, device, equipment and medium for risk stratification of diffuse large B cell lymphoma

Country Status (1)

Country Link
CN (1) CN117854723A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination