CN118016160A - HLA haploid matched hematopoietic stem cell transplantation survival rate prediction method, system, equipment and storage medium - Google Patents
- Publication number
- CN118016160A (CN202410411917.4A)
- Authority
- CN
- China
- Prior art keywords
- hla
- donor
- genes
- gene
- haploid
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Abstract
The present disclosure relates to a method, system, apparatus, and storage medium for predicting the survival rate of HLA haploid matched hematopoietic stem cell transplantation. The prediction method comprises: S100, obtaining the HLA genotyping of a recipient and of a donor; S200, obtaining, according to the HLA genotyping of the recipient and the donor, the two amino acid sequences corresponding to the mismatch genes between the recipient and donor HLA genes; S300, converting the two amino acid sequences corresponding to the mismatch genes between the recipient and donor HLA genes into two-dimensional vectors; S400, calculating the Euclidean distance of the mismatch genes between the recipient and donor HLA genes; S500, predicting the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation based on the Euclidean distance. Predicting the survival rate of a recipient receiving HLA haploid matched hematopoietic stem cell transplantation from the Euclidean distance effectively reflects the rich functional polymorphism of HLA, so the research conclusions have application value.
Description
Technical Field
The disclosure relates to the technical field of medical care informatics, in particular to a method, a system, equipment and a storage medium for predicting survival rate of HLA haploid matched hematopoietic stem cell transplantation.
Background
Human leukocyte antigen (HLA) genotyping is one of the important assays in organ and hematopoietic stem cell transplantation. It is generally believed that the higher the HLA compatibility between donor and recipient, the lower the rejection rate and the higher the graft success rate and organ survival rate. In previous research, HLA matching has relied mainly on traditional HLA matching algorithms, such as the number of HLA mismatches and high-risk amino acid substitutions (AAS). These approaches struggle to reflect the rich functional polymorphism of HLA, so the resulting research conclusions lack application value.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a method, system, apparatus, and storage medium for predicting the survival rate of HLA haploid matched hematopoietic stem cell transplantation. The Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene shows a statistically significant association with the survival rate after HLA haploid matched hematopoietic stem cell transplantation. The survival rate of a recipient receiving HLA haploid matched hematopoietic stem cell transplantation is therefore predicted from this Euclidean distance, which effectively reflects the functional polymorphism of HLA and gives the research conclusions application value.
In a first aspect of the present disclosure, the present disclosure provides a method of predicting the survival rate of HLA haploid matched hematopoietic stem cell transplantation. In HLA haploid matched hematopoietic stem cell transplantation, one nucleotide sequence of the recipient HLA gene is identical to one nucleotide sequence of the donor HLA gene, and the other nucleotide sequence of the recipient HLA gene differs from the other nucleotide sequence of the donor HLA gene; the two nucleotide sequences that differ between the recipient HLA gene and the donor HLA gene are referred to as mismatch genes. The method comprises the following steps:
S100, obtaining the HLA genotyping of a recipient and the HLA genotyping of a donor;
S200, obtaining the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene according to the HLA genotyping of the recipient and the HLA genotyping of the donor;
S300, converting the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene into two-dimensional vectors, respectively;
S400, calculating the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene; the Euclidean distance is the coordinate distance between the two two-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene;
S500, predicting the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation based on the Euclidean distance; wherein the greater the Euclidean distance, the lower the predicted survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation; conversely, the smaller the Euclidean distance, the higher the predicted survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation.
In some alternative embodiments, in step S200, the mismatch genes between the recipient HLA gene and the donor HLA gene are converted into the corresponding two amino acid sequences using the IMGT/HLA database, according to the recipient HLA genotyping and the donor HLA genotyping.
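Step S200 can be illustrated with a minimal sketch. It assumes that each genotype is given as a pair of allele names at a single locus and that the amino acid sequences have already been exported from the IMGT/HLA database into a local dictionary. The function names, the dictionary, and the placeholder strings are hypothetical and are not part of the disclosure.

```python
from collections import Counter

def find_mismatch_alleles(recipient, donor):
    """Return (recipient_allele, donor_allele) for the non-shared allele at one locus.

    Both arguments are 2-tuples of allele names, e.g. ("A*03:01", "A*24:02").
    Returns None when the two genotypes are identical at this locus.
    """
    rec, don = Counter(recipient), Counter(donor)
    rec_only = list((rec - don).elements())  # alleles only the recipient carries
    don_only = list((don - rec).elements())  # alleles only the donor carries
    if not rec_only and not don_only:
        return None
    if len(rec_only) != 1 or len(don_only) != 1:
        raise ValueError("expected exactly one shared and one mismatched allele")
    return rec_only[0], don_only[0]

def mismatch_sequences(recipient, donor, allele_to_protein):
    """Look up the amino acid sequences of the mismatched allele pair."""
    pair = find_mismatch_alleles(recipient, donor)
    if pair is None:
        return None
    return tuple(allele_to_protein[allele] for allele in pair)

# Placeholder strings only: these are NOT real HLA protein sequences.
ALLELE_TO_PROTEIN = {
    "A*03:01": "AAAA",
    "A*02:01": "CCCC",
    "A*24:02": "DDDD",
}

sequences = mismatch_sequences(("A*03:01", "A*24:02"),
                               ("A*02:01", "A*24:02"),
                               ALLELE_TO_PROTEIN)
```

Counting alleles instead of using plain sets keeps the sketch usable when one party is homozygous at the locus.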
In some alternative embodiments, in step S300, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are first converted into high-dimensional vectors, respectively, and the two high-dimensional vectors are then each reduced to a two-dimensional vector.
In some alternative embodiments, in step S300, the high-dimensional vector is selected from 100-4096-dimensional vectors, such as 100-dimensional vector, 200-dimensional vector, 300-dimensional vector, 400-dimensional vector, 512-dimensional vector, 1024-dimensional vector, 2048-dimensional vector, 4096-dimensional vector, or the like. In some embodiments, in step S300, the high-dimensional vector is selected from 1024-dimensional vectors.
In some alternative embodiments, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are each converted into a high-dimensional vector using a general-purpose protein language model. In some alternative embodiments, the general-purpose protein language model is selected from ProtT5 models. In some embodiments, the general-purpose protein language model is the ProtT5-XL-U50 model.
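The conversion of an amino acid sequence into a 1024-dimensional vector might be sketched as follows. The sketch assumes the publicly released `Rostlab/prot_t5_xl_uniref50` checkpoint loaded through the Hugging Face `transformers` library, and it assumes mean pooling over residue positions to obtain one vector per sequence; the disclosure specifies neither the checkpoint source nor the pooling strategy, and the function names are illustrative.

```python
import re

# Assumed public checkpoint for ProtT5-XL-U50; not named in the disclosure.
MODEL_NAME = "Rostlab/prot_t5_xl_uniref50"

def prepare_sequence(seq):
    """Format a protein sequence for ProtT5: rare residues become X, residues are space-separated."""
    seq = re.sub(r"[UZOB]", "X", seq.upper())
    return " ".join(seq)

def embed_sequences(sequences):
    """Return one mean-pooled 1024-dimensional embedding per sequence as a NumPy array."""
    # Heavy dependencies are imported lazily so prepare_sequence works without them.
    import torch
    from transformers import T5EncoderModel, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME, do_lower_case=False)
    model = T5EncoderModel.from_pretrained(MODEL_NAME).eval()
    batch = tokenizer([prepare_sequence(s) for s in sequences],
                      padding="longest", return_tensors="pt")
    with torch.no_grad():
        hidden = model(input_ids=batch["input_ids"],
                       attention_mask=batch["attention_mask"]).last_hidden_state
    # Average over residue positions only, skipping the end-of-sequence token and padding.
    vectors = [hidden[i, :len(s)].mean(dim=0) for i, s in enumerate(sequences)]
    return torch.stack(vectors).numpy()
```

Calling `embed_sequences` downloads a large model on first use, so in practice the embeddings of all alleles of interest would likely be computed once and cached.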
In some alternative embodiments, UMAP (Uniform Manifold Approximation and Projection) or t-SNE (t-Distributed Stochastic Neighbor Embedding) is used to reduce the two high-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene to two-dimensional vectors, respectively.
In some alternative embodiments, t-SNE is used to reduce the two high-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene to two-dimensional vectors, respectively.
In some embodiments, UMAP is used to reduce the two high-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene to two-dimensional vectors, respectively.
In some alternative embodiments, the parameter n_neighbors that reduces the high-dimensional vector to a two-dimensional vector using UMAP is selected from any integer from 5 to 15; for example 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15. In some embodiments, n_neighbors is selected from 10.
In some alternative embodiments, the parameter min_dist that reduces the high-dimensional vector to a two-dimensional vector using UMAP is selected from any number from 0.01 to 0.20; for example, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, or 0.20, etc. In some embodiments, min_dist is selected from 0.1.
In some alternative embodiments, the parameter random_state that reduces the high-dimensional vector to a two-dimensional vector using UMAP is selected from any integer from 4 to 12; for example 4, 5, 6, 7, 8, 9, 10, 11 or 12. In some embodiments, the random_state is selected from 8.
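The UMAP settings described above can be sketched with the `umap-learn` package. The range checks mirror the ranges stated in this disclosure; the helper names are hypothetical, and the choice of which set of vectors the reducer is fitted on is an assumption, since the disclosure does not state it.

```python
def check_umap_params(n_neighbors=10, min_dist=0.1, random_state=8):
    """Validate UMAP parameters against the ranges given in the disclosure."""
    if not (isinstance(n_neighbors, int) and 5 <= n_neighbors <= 15):
        raise ValueError("n_neighbors must be an integer from 5 to 15")
    if not 0.01 <= min_dist <= 0.20:
        raise ValueError("min_dist must be between 0.01 and 0.20")
    if not (isinstance(random_state, int) and 4 <= random_state <= 12):
        raise ValueError("random_state must be an integer from 4 to 12")
    return {
        "n_neighbors": n_neighbors,
        "min_dist": min_dist,
        "random_state": random_state,
        "n_components": 2,  # target dimensionality: two-dimensional vectors
    }

def reduce_to_2d(embeddings, **params):
    """Reduce an (n_samples, n_features) array to (n_samples, 2) with UMAP."""
    import umap  # provided by the umap-learn package; imported lazily

    reducer = umap.UMAP(**check_umap_params(**params))
    return reducer.fit_transform(embeddings)
```

UMAP learns its projection from a whole set of points, so `embeddings` would presumably hold the vectors of many alleles (more than `n_neighbors` of them) rather than only the two mismatched ones.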
In some alternative embodiments, in step S500, the Euclidean distance is compared to a preset threshold to predict the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation:
if the Euclidean distance is greater than or equal to the preset threshold, the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted to be a low survival rate;
if the Euclidean distance is less than the preset threshold, the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted to be a high survival rate.
In some alternative embodiments, the preset threshold is selected from any number from 12 to 23; such as 12, 12.5, 13, 13.5, 14, 14.5, 15, 15.5, 16, 16.5, 17, 17.5, 17.55, 17.60, 17.65, 17.70, 17.71, 17.72, 17.73, 17.74, 17.75, 17.76, 17.77, 17.78, 17.79, 17.80, 17.81, 17.82, 17.83, 17.84, 17.85, 17.86, 17.87, 17.88, 17.89, 17.90, 17.95, 18, 18.5, 19, 19.5, 20, 20.5, 21, 21.5, 22, 22.5 or 23, etc.
In some embodiments, the preset threshold is selected from 17.78.
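The threshold comparison of step S500 reduces to a single conditional, sketched below. The function name and the returned labels are illustrative; the default of 17.78 and the permitted range of 12 to 23 come from the embodiments above.

```python
DEFAULT_THRESHOLD = 17.78  # preset threshold used in some embodiments

def predict_survival_group(distance, threshold=DEFAULT_THRESHOLD):
    """Classify a recipient by the Euclidean distance of the mismatch genes."""
    if not 12 <= threshold <= 23:
        raise ValueError("threshold is expected to lie between 12 and 23")
    # Distance >= threshold predicts low survival; distance < threshold predicts high survival.
    return "low survival rate" if distance >= threshold else "high survival rate"
```

Note that a distance exactly equal to the threshold falls into the low survival rate group, matching the "greater than or equal to" wording of the embodiment.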
In a second aspect of the disclosure, the disclosure provides a prediction system of survival of HLA haploid matched hematopoietic stem cell transplants, the prediction system being based on the prediction method of the first aspect of the disclosure; the prediction system comprises a data acquisition module, a first conversion module, a second conversion module, a dimension reduction module, a calculation module and a prediction module;
The data acquisition module is used for acquiring the HLA genotyping of the recipient and the HLA genotyping of the donor;
The first conversion module is used for acquiring the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene according to the HLA genotyping of the recipient and the HLA genotyping of the donor;
The second conversion module is used for converting the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene into two-dimensional vectors, respectively;
The calculation module is used for calculating the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene;
The prediction module is used for predicting the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation based on the Euclidean distance.
In a third aspect of the present disclosure, the present disclosure provides a prediction apparatus of survival of HLA haploid matched hematopoietic stem cell transplantation, the prediction apparatus comprising a memory and a processor; wherein the memory is used for storing program instructions; the processor is configured to invoke program instructions, which when executed implement the steps of the prediction method of the first aspect of the present disclosure.
In a fourth aspect of the disclosure, the disclosure provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the prediction method of the first aspect of the disclosure.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow chart illustrating a method of predicting survival of HLA haploid matched hematopoietic stem cell transplants according to an exemplary embodiment.
Fig. 2 is a graph illustrating two-dimensional vectors corresponding to a recipient HLA genotyping and a donor HLA genotyping, according to an example embodiment.
Fig. 3 is a graph showing the survival curves of the groups predicted to have a high survival rate and a low survival rate after HLA haploid matched hematopoietic stem cell transplantation, at a preset threshold of 17.78.
Fig. 4 is a diagram illustrating a system for predicting survival of HLA haploid matched hematopoietic stem cell transplants according to an exemplary embodiment.
Detailed Description
The present disclosure discloses a method, system, apparatus, and storage medium for predicting survival rate of transplantation of HLA haploid matched hematopoietic stem cells, and those skilled in the art can refer to the disclosure herein to appropriately improve the implementation of process parameters. It is expressly noted that all such similar substitutions and modifications will be apparent to those skilled in the art, and are deemed to be included in the present disclosure. While the methods and applications of the present disclosure have been described in terms of preferred embodiments, it will be apparent to those skilled in the relevant art that variations and appropriate changes and combinations of the methods and applications described herein can be made to practice and use the disclosed technology without departing from the spirit and scope of the disclosure.
Interpretation of the terms
As used in this disclosure, the term "transplantation" can be divided, by donor source, into autologous transplantation, allogeneic transplantation, and syngeneic transplantation. Allogeneic transplantation is in turn divided into related-donor transplantation and unrelated-donor transplantation. Syngeneic transplantation is transplantation in which the recipient and donor genes are identical; in humans it refers only to transplantation between identical twins.
As used in this disclosure, the term "allogeneic transplantation" is divided into fully matched and half matched transplantation according to the degree of matching at ten loci of the human leukocyte antigen (HLA) genes: fully matched means that all ten loci of the recipient's HLA genes are identical to the donor's, and half matched means that five to nine loci (for example five, six, seven, eight, or nine) are identical.
As used in this disclosure, the term "human leukocyte antigen (HLA)" is a gene encoding the human major histocompatibility complex (MHC). HLA molecules present on the surface of cell membranes can bind peptides from either the cell interior or the cell exterior, forming HLA-peptide complexes that antigen-presenting cells present to T cells to elicit a series of immune responses. HLA genes are located in the 6p21.31 region on the short arm of chromosome 6 and are 3600 kb in length. According to their structure, distribution and function, they can be classified into HLA class I, HLA class II and HLA class III genes.
As used in this disclosure, the term "haploid matched hematopoietic stem cell transplantation (haploidentical hematopoietic stem cell transplantation, haplo-HSCT)" refers to transplantation in which the donor and recipient share one identical HLA haplotype, that is, half of the chromosomal HLA antigens are identical, inherited either from the father or from the mother. In general, the probability of an HLA haploid match between parents and children is 100%, the probability between siblings is 50%, and the probability between other relatives is 25% or less.
As used in this disclosure, the term "total death (OS)" refers to the total death caused by various causes over a period of time.
As used in this disclosure, the term "total mortality" refers to the ratio of the total number of deaths from all causes in a population over a period of time to the average size of that population over the same period. It is calculated as: total mortality = (total number of deaths in a group per year / average population of the group per year) × 100%. Total mortality is an index for measuring the risk of death from disease and injury in a population over a given period.
As used in this disclosure, the term "threshold" is used to assess the sensitivity of a factor to an effector response produced by an organism. In predicting the survival probability after allogeneic HLA transplantation (such as hematopoietic stem cell transplantation), a reasonably set threshold provides important guidance for the prognosis evaluation of the target subject (recipient) receiving the transplant and for the subject's selection of a donor.
HLA haploid matched hematopoietic stem cell transplantation survival rate prediction method
In one embodiment, a method of predicting survival of HLA haploid matched hematopoietic stem cell transplants is provided. Here, HLA haploid matched hematopoietic stem cell transplantation means that one nucleotide sequence of the recipient HLA gene is identical to one nucleotide sequence of the donor HLA gene, while the other nucleotide sequence of the recipient HLA gene differs from the other nucleotide sequence of the donor HLA gene. The two nucleotide sequences that differ between the recipient HLA gene and the donor HLA gene are called mismatch genes.
Fig. 1 is a flowchart showing a method for predicting survival of HLA haploid matched hematopoietic stem cell transplants, according to an exemplary embodiment, the method comprising the steps of:
In step S100, the HLA genotyping of the recipient and the HLA genotyping of the donor are acquired.
In step S200, the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are obtained according to the recipient HLA genotyping and the donor HLA genotyping.
In one embodiment, the IMGT/HLA database is used to convert the mismatch genes between the recipient HLA gene and the donor HLA gene into the corresponding two amino acid sequences according to the recipient HLA genotyping and the donor HLA genotyping.
In step S300, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are converted into two-dimensional vectors, respectively, using a general-purpose protein language model.
In some embodiments, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are converted into high-dimensional vectors using a general-purpose protein language model; UMAP or t-SNE is then used to reduce the two high-dimensional vectors to two-dimensional vectors, respectively.
In some embodiments, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are each converted into 1024-dimensional vectors using a ProtT5 model. Illustratively, the ProtT5 model is the ProtT5-XL-U50 model.
In some embodiments, the two 1024-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are reduced to two-dimensional vectors, respectively, using t-SNE.
In some embodiments, the two 1024-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are reduced to two-dimensional vectors, respectively, using UMAP.
In one embodiment, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are converted into 1024-dimensional vectors using the ProtT5-XL-U50 model; UMAP is then used to reduce the two 1024-dimensional vectors to two-dimensional vectors, respectively. The UMAP parameters for reducing a 1024-dimensional vector to a two-dimensional vector are as follows: n_neighbors is selected from any integer from 5 to 15; min_dist is selected from any number from 0.01 to 0.20; random_state is selected from any integer from 4 to 12. Illustratively, n_neighbors is 10, min_dist is 0.1, and random_state is 8.
It should be noted that the above values of the parameters n_neighbors, min_dist and random_state are a preferred implementation; in actual implementation, the values of n_neighbors, min_dist and random_state may fluctuate within a range.
In step S400, the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene is calculated; the Euclidean distance is the coordinate distance between the two two-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene.
In one embodiment, Fig. 2 shows a graph of the two-dimensional vectors corresponding to a recipient HLA genotyping and a donor HLA genotyping. The two-dimensional vectors corresponding to the recipient HLA genotyping have coordinates A*03:01 and A*24:02; the two-dimensional vectors corresponding to the donor HLA genotyping have coordinates A*02:01 and A*24:02. The Euclidean distance of the mismatch gene between the recipient HLA gene and the donor HLA gene is the distance between coordinate A*03:01 and coordinate A*02:01.
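For two points (x1, y1) and (x2, y2), the Euclidean distance of step S400 is sqrt((x1 - x2)^2 + (y1 - y2)^2). The sketch below applies it to the allele pair of the example; the coordinate values are invented for illustration and are not taken from Fig. 2.

```python
import math

def euclidean_distance_2d(p, q):
    """Distance between two 2-D points given as (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hypothetical 2-D coordinates; the real values come from the dimension reduction step.
coords = {
    "A*03:01": (1.0, 2.0),   # recipient's mismatched allele
    "A*02:01": (4.0, 6.0),   # donor's mismatched allele
    "A*24:02": (9.0, 9.0),   # shared allele, not used in the distance
}

distance = euclidean_distance_2d(coords["A*03:01"], coords["A*02:01"])
print(distance)  # 5.0 for these illustrative coordinates
```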
In step S500, survival of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted based on the euclidean distance.
In some embodiments, the greater the Euclidean distance, the lower the predicted survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation; conversely, the smaller the Euclidean distance, the higher the predicted survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation.
To reflect intuitively the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation, in some embodiments the Euclidean distance is compared to a preset threshold: if the Euclidean distance is greater than or equal to the preset threshold, the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted to be a low survival rate; if the Euclidean distance is less than the preset threshold, the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted to be a high survival rate.
In some embodiments, the preset threshold is selected from any number from 12 to 23. Illustratively, the predetermined threshold is selected from 17.78.
The Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene was analyzed against total death (OS) after HLA haploid matched hematopoietic stem cell transplantation, to determine whether the survival rates at a time point after transplantation differed with statistical significance. Fig. 3 shows the survival curves of the two groups predicted to have a high survival rate and a low survival rate at a preset threshold of 17.78. The survival rate was 66.86% in the group predicted to have a high survival rate after HLA haploid matched hematopoietic stem cell transplantation and 54.39% in the group predicted to have a low survival rate, with a p value of 0.0030 between the two groups (a p value <0.05 is generally regarded as a statistical difference, a p value <0.01 as a significant statistical difference, and a p value <0.001 as an extremely significant statistical difference). This shows that the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene is significantly associated with the survival rate of recipients after HLA haploid matched hematopoietic stem cell transplantation, and that this Euclidean distance can effectively reflect the rich functional polymorphism of HLA.
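The disclosure does not state how the survival curves of Fig. 3 or the p value were computed. As one common way to obtain such curves, the sketch below implements a plain Kaplan-Meier estimator and applies it to two groups split at the threshold. The estimator choice, the function name, and the toy cohort are all assumptions for illustration; none of the numbers are data from the disclosure.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct time with a death.

    times  : follow-up duration for each recipient
    events : 1 if death was observed at that time, 0 if the recipient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all recipients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored recipients both leave the risk set
    return curve

# Toy cohort (invented): (Euclidean distance, follow-up in months, death observed)
cohort = [(10.2, 12, 0), (15.0, 30, 1), (19.5, 6, 1), (21.0, 18, 1), (18.1, 24, 0)]
THRESHOLD = 17.78

high_group = [(t, e) for d, t, e in cohort if d < THRESHOLD]   # predicted high survival rate
low_group = [(t, e) for d, t, e in cohort if d >= THRESHOLD]   # predicted low survival rate
high_curve = kaplan_meier(*zip(*high_group))
low_curve = kaplan_meier(*zip(*low_group))
```

Comparing the two curves statistically would need an additional test, such as a log-rank test, which is likewise not specified in the source.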
The present disclosure thus predicts the survival rate of a recipient receiving HLA haploid matched hematopoietic stem cell transplantation from the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene, so as to guide donor selection and thereby effectively improve the survival rate of the recipient.
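Donor selection guided by the distance might look like the following sketch, which ranks candidate donors by ascending Euclidean distance so that the candidate with the smallest distance (highest predicted survival rate) comes first. The function name, donor labels, and coordinates are hypothetical.

```python
import math

def rank_donors(recipient_xy, donor_xy_by_id):
    """Sort candidate donors by ascending distance to the recipient's mismatched allele.

    recipient_xy   : 2-D vector of the recipient's mismatched allele
    donor_xy_by_id : mapping of donor label -> 2-D vector of that donor's mismatched allele
    """
    scored = [(math.dist(recipient_xy, xy), donor_id)
              for donor_id, xy in donor_xy_by_id.items()]
    return [(donor_id, d) for d, donor_id in sorted(scored)]

# Illustrative coordinates only.
ranking = rank_donors((0.0, 0.0), {
    "father": (3.0, 4.0),
    "sibling": (6.0, 8.0),
    "mother": (1.0, 0.0),
})
```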
HLA haploid matched hematopoietic stem cell transplantation survival probability prediction system
In one embodiment, a system for predicting survival of HLA haploid matched hematopoietic stem cell transplants is provided. Fig. 4 is a diagram illustrating a system for predicting survival of HLA haploid matched hematopoietic stem cell transplants according to an exemplary embodiment.
The prediction system is based on a method for predicting survival rate of HLA haploid matched hematopoietic stem cell transplantation according to an exemplary embodiment of the present disclosure. Specifically, the prediction system comprises a data acquisition module, a first conversion module, a second conversion module, a calculation module and a prediction module.
The data acquisition module is used for acquiring the HLA genotyping of the recipient and the HLA genotyping of the donor.
The first conversion module is used for acquiring the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene according to the HLA genotyping of the recipient and the HLA genotyping of the donor.
The second conversion module is used for respectively converting the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene into two-dimensional vectors by using a ubiquitin basic protein language model.
The calculation module is used for calculating the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene.
The prediction module is used for predicting the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation based on the Euclidean distance.
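One way the five modules could be wired together is sketched below. It is an assumption-laden outline rather than the disclosed implementation: `PredictionSystem`, `mismatch_sequences`, and `to_2d` are hypothetical names, the sequence lookup and the embedding are injected as callables because the text does not specify them, and the toy stand-ins in the usage lines are not a protein language model. The threshold comparison follows the rule stated in claim 6.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

Vector2D = Tuple[float, float]


@dataclass
class PredictionSystem:
    # First conversion module: typings -> (recipient sequence, donor sequence).
    mismatch_sequences: Callable[[str, str], Tuple[str, str]]
    # Second conversion module: amino acid sequence -> two-dimensional vector.
    to_2d: Callable[[str], Vector2D]
    # Preset threshold used by the prediction module (claim 6).
    threshold: float

    def predict(self, recipient_typing: str, donor_typing: str):
        # Data acquisition module: the typings arrive as arguments.
        seq_r, seq_d = self.mismatch_sequences(recipient_typing, donor_typing)
        vec_r, vec_d = self.to_2d(seq_r), self.to_2d(seq_d)
        # Calculation module: coordinate distance between the two vectors.
        distance = math.dist(vec_r, vec_d)
        # Prediction module: larger distance means lower predicted survival.
        group = "low" if distance >= self.threshold else "high"
        return distance, group


# Toy stand-ins, purely illustrative (made-up labels and sequences).
toy_sequences = {"allele_r": "AAAA", "allele_d": "AAAC"}


def toy_lookup(recipient_typing: str, donor_typing: str) -> Tuple[str, str]:
    return toy_sequences[recipient_typing], toy_sequences[donor_typing]


def toy_embed(sequence: str) -> Vector2D:
    # Not a language model: just (length, number of 'C' residues).
    return float(len(sequence)), float(sequence.count("C"))


system = PredictionSystem(toy_lookup, toy_embed, threshold=17.78)
distance, group = system.predict("allele_r", "allele_d")
```

Injecting the two conversion steps keeps the distance and prediction logic independent of whichever sequence database and embedding model are actually used.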
HLA haploid matched hematopoietic stem cell transplantation survival rate prediction device
In one embodiment, a prediction device for survival of HLA-haploid matched hematopoietic stem cell transplants is provided, the prediction device comprising a memory and a processor; wherein the memory is used for storing program instructions; the processor is configured to invoke the program instructions, which when executed, implement the steps of a method for predicting survival of transplantation of HLA-haploid matched hematopoietic stem cells according to an exemplary embodiment of the present disclosure.
Computer readable storage medium
In one embodiment, a computer readable storage medium is provided having stored thereon a computer program which, when executed by a processor, performs the steps of a method for predicting survival of transplantation of HLA haploid matched hematopoietic stem cells according to an exemplary embodiment of the present disclosure.
The above describes in detail the method, system, device and storage medium for predicting survival rate of transplantation of HLA haploid matched hematopoietic stem cells provided by the present disclosure. Specific examples have been set forth herein to illustrate the principles and embodiments of the present disclosure, and the description of the examples above is only intended to assist in understanding the methods of the present disclosure and the core ideas thereof. It should be noted that it would be obvious to those skilled in the art that various improvements and modifications can be made to the present disclosure without departing from the principles of the present disclosure, and such improvements and modifications fall within the scope of the claims of the present disclosure.
Claims (10)
1. A method for predicting the survival rate of HLA haploid matched hematopoietic stem cell transplantation, wherein, in HLA haploid matched hematopoietic stem cell transplantation, one nucleotide sequence of the recipient HLA gene is identical to one nucleotide sequence of the donor HLA gene and the other nucleotide sequence of the recipient HLA gene differs from the other nucleotide sequence of the donor HLA gene, the two differing nucleotide sequences between the recipient HLA gene and the donor HLA gene being called mismatch genes, characterized by comprising the following steps:
S100, obtaining the HLA genotyping of a recipient and the HLA genotyping of a donor;
S200, acquiring the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene according to the HLA genotyping of the recipient and the HLA genotyping of the donor;
S300, respectively converting the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene into two-dimensional vectors;
S400, calculating the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene; the Euclidean distance is the coordinate distance between the two two-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene;
S500, predicting the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation based on the Euclidean distance; wherein the greater the Euclidean distance, the lower the predicted survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation; conversely, the smaller the Euclidean distance, the higher the predicted survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation.
2. The prediction method according to claim 1, wherein in step S300, the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are respectively converted into high-dimensional vectors, and the two high-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are then respectively reduced in dimension to two-dimensional vectors; wherein the high-dimensional vector is selected from vectors of 100 to 4096 dimensions.
3. The prediction method according to claim 2, wherein the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are respectively converted into high-dimensional vectors using a ubiquitin basic protein language model.
4. The prediction method according to claim 2, wherein the two high-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are respectively reduced in dimension to two-dimensional vectors by UMAP or t-SNE.
5. The prediction method according to claim 4, wherein the two high-dimensional vectors corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene are respectively reduced in dimension to two-dimensional vectors by UMAP; wherein the UMAP parameters used to reduce the high-dimensional vectors to two-dimensional vectors are: n_neighbors is selected from any integer from 5 to 15, and min_dist is selected from any number from 0.01 to 0.20.
6. The prediction method according to claim 1, wherein in step S500, the Euclidean distance is compared with a preset threshold to predict the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation:
if the Euclidean distance is greater than or equal to the preset threshold, the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted to be low;
if the Euclidean distance is less than the preset threshold, the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation is predicted to be high.
7. The prediction method according to claim 6, wherein the preset threshold is selected from any number from 12 to 23.
8. A prediction system for the survival rate of HLA haploid matched hematopoietic stem cell transplantation, characterized in that the prediction system is based on the prediction method of any one of claims 1 to 7; the prediction system comprises a data acquisition module, a first conversion module, a second conversion module, a dimension reduction module, a calculation module and a prediction module;
the data acquisition module is used for acquiring the HLA genotyping of the recipient and the HLA genotyping of the donor;
the first conversion module is used for acquiring the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene according to the HLA genotyping of the recipient and the HLA genotyping of the donor;
the second conversion module is used for respectively converting the features of the two amino acid sequences corresponding to the mismatch genes between the recipient HLA gene and the donor HLA gene into two-dimensional vectors;
the calculation module is used for calculating the Euclidean distance of the mismatch genes between the recipient HLA gene and the donor HLA gene;
and the prediction module is used for predicting the survival rate of the recipient after receiving HLA haploid matched hematopoietic stem cell transplantation based on the Euclidean distance.
9. A prediction device for survival rate of transplantation of HLA haploid matched hematopoietic stem cells, wherein the prediction device comprises a memory and a processor; wherein the memory is used for storing program instructions; the processor is configured to invoke program instructions, which when executed, implement the steps of the prediction method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor realizes the steps of the prediction method according to any of claims 1 to 7.
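The numeric ranges recited in claims 2, 5 and 7 can be collected into a single checked configuration, as in the sketch below. The class name `PredictionParams` and its field names are hypothetical, and the example values 1024, 10 and 0.1 are arbitrary picks inside the claimed ranges, not values stated in the disclosure; only 17.78 appears in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionParams:
    embedding_dim: int  # claim 2: high-dimensional vector of 100 to 4096 dimensions
    n_neighbors: int    # claim 5: any integer from 5 to 15
    min_dist: float     # claim 5: any number from 0.01 to 0.20
    threshold: float    # claim 7: any number from 12 to 23

    def __post_init__(self):
        # Reject values outside the ranges recited in the claims.
        if not 100 <= self.embedding_dim <= 4096:
            raise ValueError("embedding_dim must be between 100 and 4096")
        if not isinstance(self.n_neighbors, int) or not 5 <= self.n_neighbors <= 15:
            raise ValueError("n_neighbors must be an integer from 5 to 15")
        if not 0.01 <= self.min_dist <= 0.20:
            raise ValueError("min_dist must be between 0.01 and 0.20")
        if not 12 <= self.threshold <= 23:
            raise ValueError("threshold must be between 12 and 23")


# Example values: only 17.78 comes from the description; the rest are arbitrary.
params = PredictionParams(embedding_dim=1024, n_neighbors=10,
                          min_dist=0.1, threshold=17.78)
```

A configuration object like this would typically be validated once and then passed to whichever UMAP implementation and thresholding step are used.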
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410411917.4A CN118016160B (en) | 2024-04-08 | 2024-04-08 | HLA haploid matched hematopoietic stem cell transplantation survival rate prediction method, system, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410411917.4A CN118016160B (en) | 2024-04-08 | 2024-04-08 | HLA haploid matched hematopoietic stem cell transplantation survival rate prediction method, system, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118016160A (en) | 2024-05-10 |
CN118016160B (en) | 2024-05-31 |
Family
ID=90954911
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410411917.4A Active CN118016160B (en) | 2024-04-08 | 2024-04-08 | HLA haploid matched hematopoietic stem cell transplantation survival rate prediction method, system, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118016160B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101575643A (en) * | 2009-05-31 | 2009-11-11 | 中国人民解放军第二军医大学 | Method for detecting the chimerism rate of recipients after allogeneic hematopoietic stem cell transplantation |
US20150125866A1 (en) * | 2012-05-04 | 2015-05-07 | Nhs Blood & Transplant | Method for selecting donors and recipients for transplantation |
US20150278434A1 (en) * | 2012-11-08 | 2015-10-01 | Umc Utrecht Holding B.V. | Method for prediction of an immune response against mismatched human leukocyte antigens |
WO2018079499A1 (en) * | 2016-10-25 | 2018-05-03 | 国立大学法人山口大学 | Method for assisting in prediction of prognosis of allogeneic hematopoietic stem cell transplantation |
US20180189443A1 (en) * | 2014-05-07 | 2018-07-05 | Pirche Ag | Methods and systems for predicting alloreactivity in transplantation |
Non-Patent Citations (1)
Title |
---|
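Chen Huiren (陈惠仁): "New progress in haploidentical hematopoietic stem cell transplantation and its clinical outcomes", Journal of Clinical Rehabilitative Tissue Engineering Research (中国组织工程研究与临床康复), no. 34, 19 August 2008 (2008-08-19) * |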
Also Published As
Publication number | Publication date |
---|---|
CN118016160B (en) | 2024-05-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||