CN114613423A - Biomarker for predicting curative effect of diffuse large B cell lymphoma chemotherapy - Google Patents

Publication number
CN114613423A
Authority
CN
China
Prior art keywords
chemotherapy
cell lymphoma
diffuse large
biomarker
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011448752.6A
Other languages
Chinese (zh)
Inventor
林坚
陈航宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xuanniao Feixun Technology Co.,Ltd.
Original Assignee
Shanghai Epican Biotechnology Co ltd
Shanghai Epican Genetech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Epican Biotechnology Co ltd, Shanghai Epican Genetech Co ltd filed Critical Shanghai Epican Biotechnology Co ltd
Priority to CN202011448752.6A priority Critical patent/CN114613423A/en
Publication of CN114613423A publication Critical patent/CN114613423A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20 Screening of libraries
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/106 Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Organic Chemistry (AREA)
  • Medical Informatics (AREA)
  • Wood Science & Technology (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Library & Information Science (AREA)
  • Pathology (AREA)
  • Microbiology (AREA)
  • Physiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a biomarker for predicting the curative effect of chemotherapy for diffuse large B-cell lymphoma, comprising the combination of ARHGEF12, THAP3, SMYD3, OR2G2, ALKBH3, RNASEH2C, ZNF280D, SLC5A11, CTDP1, GPR15, GOLGB1, FBXL4 and LMBR1. The biomarker can effectively predict the treatment effect of chemotherapy for diffuse large B-cell lymphoma: the sensitivity and specificity reach 0.82 and 0.75, respectively, and the AUC value is 0.78. Predicting the curative effect with this biomarker is safe, noninvasive, accurate and convenient, uses samples that are easy to obtain, and provides an accurate basis for evaluating the curative effect of chemotherapy for diffuse large B-cell lymphoma.

Description

Biomarker for predicting curative effect of diffuse large B cell lymphoma chemotherapy
Technical Field
The invention relates to the technical field of biology, in particular to a biomarker for predicting the curative effect of diffuse large B cell lymphoma chemotherapy.
Background
Diffuse large B-cell lymphoma (DLBCL) is the major type of aggressive lymphoid tissue tumor, accounting for approximately 30% of non-Hodgkin's lymphoma. Although most patients with DLBCL are elderly, the disease occurs at all ages. The overall survival of DLBCL patients has improved significantly since the chemotherapy regimen combining rituximab (R) with cyclophosphamide, doxorubicin, vincristine and prednisone (CHOP), known as R-CHOP, came into use 10 years ago.
However, 30% to 50% of patients are not sensitive to this standard of care, and current methods cannot accurately or efficiently predict the therapeutic effect of R-CHOP before treatment. Currently, positron emission tomography (PET)-CT is the gold standard for evaluating the efficacy of different treatment regimens for DLBCL. However, it is usually used after treatment, so the treatment effect cannot be predicted in advance. The International Prognostic Index (IPI) is the method currently adopted for assessing the major prognostic risk of DLBCL, especially in high-risk patients, and is useful in R-CHOP chemotherapy. However, the accuracy of the IPI in predicting the therapeutic effect of R-CHOP on DLBCL patients is low. Therefore, it is necessary to find an accurate method for predicting in advance the therapeutic effect of the R-CHOP regimen on DLBCL.
Disclosure of Invention
The biomarker provided by the invention can effectively predict the curative effect of chemotherapy for diffuse large B-cell lymphoma: the sensitivity and specificity reach 0.82 and 0.75, respectively, and the AUC value is 0.78. Predicting the curative effect with this biomarker is safe, noninvasive, accurate and convenient, uses samples that are easy to obtain, and provides an accurate basis for evaluating the curative effect of chemotherapy for diffuse large B-cell lymphoma.
Therefore, the invention adopts the following technical scheme.
A first aspect of the invention provides a biomarker comprising a combination of ARHGEF12, THAP3, SMYD3, OR2G2, ALKBH3, RNASEH2C, ZNF280D, SLC5A11, CTDP1, GPR15, GOLGB1, FBXL4 and LMBR1.
In a second aspect, the invention provides the use of the biomarker in the manufacture of a product for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma, said use comprising the use for constructing a model for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma.
Further, the chemotherapy regimen is: rituximab (R) is combined with a chemotherapeutic regimen of cyclophosphamide, doxorubicin, vincristine, and prednisone (CHOP). This chemotherapeutic regimen is commonly referred to in the art as R-CHOP.
In a third aspect, the present invention provides a model for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma, wherein the input variable of the model is the content of the biomarker according to the present invention.
Further, the content of the biomarker is measured by 5hmC high-throughput detection; in a specific embodiment, the assay is 5hmC-Seal.
In a fourth aspect, the present invention provides a method for constructing a model for predicting the therapeutic effect of diffuse large B-cell lymphoma chemotherapy, comprising the following steps:
(1) respectively detecting samples from a plurality of patients with the diffuse large B cell lymphoma subjected to chemotherapy to obtain 5hmC sequencing data of DNA;
(2) sequentially carrying out first filtration, screening and second filtration on the sequencing data obtained in the step (1) to obtain a biomarker for predicting the curative effect of the diffuse large B cell lymphoma chemotherapy;
the first filtering includes: removing peak information that appears only in 10 samples or fewer;
the screening comprises the following steps: comparing sequencing data from different samples by using DESeq2 software, retaining 5hmC peak regions with a reads number larger than 50, and finding differential biomarkers with up- or down-regulated 5hmC according to |log2FoldChange| > 0.5 and pvalue < 0.01;
the second filtering includes: for the differential biomarkers obtained by said screening, filtering with the recursive feature elimination algorithm (RFECV) in scikit-learn to obtain the biomarkers for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma: ARHGEF12, THAP3, SMYD3, OR2G2, ALKBH3, RNASEH2C, ZNF280D, SLC5A11, CTDP1, GPR15, GOLGB1, FBXL4 and LMBR1;
(3) inputting the data of the biomarkers obtained in the step (2) into a machine learning model, training the model, storing the trained model, and obtaining a model for predicting the curative effect of the diffuse large B cell lymphoma chemotherapy.
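The first filtering and the screening in step (2) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the pipeline used in the invention: the record layout, the field names (`n_samples_with_peak`, `reads`, `log2_fold_change`, `pvalue`), the region names and the toy values are all hypothetical. The differential statistics are assumed to have been computed beforehand (for example by DESeq2), and the prevalence rule follows the wording of Example 1, where a peak is kept if it appears in at least 10 samples.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    region: str                # hypothetical region label
    n_samples_with_peak: int   # number of samples in which the peak was called
    reads: int                 # read count in the peak region
    log2_fold_change: float    # assumed to be precomputed, e.g. by DESeq2
    pvalue: float              # assumed to be precomputed, e.g. by DESeq2


def first_filter(peaks, min_samples=10):
    """Keep peaks that appear in at least `min_samples` samples."""
    return [p for p in peaks if p.n_samples_with_peak >= min_samples]


def screen(peaks, min_reads=50, min_abs_lfc=0.5, max_pvalue=0.01):
    """Keep peaks with reads > 50, |log2FoldChange| > 0.5 and pvalue < 0.01."""
    return [
        p for p in peaks
        if p.reads > min_reads
        and abs(p.log2_fold_change) > min_abs_lfc
        and p.pvalue < max_pvalue
    ]


# Toy records, invented for illustration only.
toy_peaks = [
    Peak("region_a", 30, 120, 1.2, 0.001),   # passes every criterion
    Peak("region_b", 4, 300, 2.0, 0.0001),   # present in too few samples
    Peak("region_c", 25, 40, -1.5, 0.002),   # too few reads
    Peak("region_d", 25, 90, 0.3, 0.004),    # fold change too small
    Peak("region_e", 40, 75, -0.8, 0.009),   # passes (down-regulated)
    Peak("region_f", 40, 75, -0.8, 0.05),    # p-value too large
]

candidates = screen(first_filter(toy_peaks))
print([p.region for p in candidates])  # ['region_a', 'region_e']
```

The candidates that survive both steps are the input to the second filtering.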
Further, the sample is plasma; in a specific embodiment, the sample is peripheral blood.
Further, the patients with chemotherapy-induced diffuse large B-cell lymphoma include patients evaluated for Complete Remission (CR), Partial Remission (PR), Stable Disease (SD), and Progressive Disease (PD) by treatment effect.
Further, in the second filtering, the parameters used are: estimator=LogisticRegressionCV(class_weight='balanced', cv=2, max_iter=1000), scoring='accuracy'.
Further, in step (3), the machine learning model includes: a trained logistic regression CV model (LR).
Further, the chemotherapy regimen is: rituximab (R) is combined with a chemotherapeutic regimen of cyclophosphamide, doxorubicin, vincristine, and prednisone (CHOP).
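The second filtering with the parameters given above can be sketched with scikit-learn as follows. This is a minimal sketch, not the data or code of the invention: the matrix `X` and the labels `y` are synthetic stand-ins for the 5hmC levels of the screened differential biomarkers and for the response labels, and the sample and feature counts are arbitrary assumptions. All other `RFECV` settings are left at their scikit-learn defaults, which the text does not specify.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegressionCV

# Synthetic stand-in data (assumption): 56 samples, 20 candidate features.
rng = np.random.default_rng(0)
n_samples, n_features = 56, 20
y = np.array([1] * 36 + [0] * 20)          # 1 = responder, 0 = non-responder
X = rng.normal(size=(n_samples, n_features))
X[:, :3] += 1.5 * y[:, None]               # make the first three columns informative

# Estimator and scoring as stated in the text.
estimator = LogisticRegressionCV(class_weight="balanced", cv=2, max_iter=1000)
selector = RFECV(estimator=estimator, scoring="accuracy")
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)
print("number of features kept:", selector.n_features_)
print("indices of kept features:", selected.tolist())
```

`selector.support_` marks the retained features, and `selector.transform(X)` would return the reduced matrix used to train the final model.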
The invention uses logistic regression analysis for data mining: it explores the factors related to the curative effect of chemotherapy for diffuse large B-cell lymphoma and predicts that curative effect from those factors. Logistic regression analysis yields a weight for each independent variable, which shows which factors are closely related to the curative effect of chemotherapy for diffuse large B-cell lymphoma. The curative effect of chemotherapy for an individual patient can then be predicted from the weights and the factors. For the model provided by the invention, performance is evaluated by receiver operating characteristic (ROC) analysis, and the area under the curve (AUC) is calculated with sklearn; the sensitivity and specificity of the model reach 0.82 and 0.75, respectively, and the AUC value is 0.78.
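The AUC used for this evaluation can be illustrated without any library. The sketch below uses the rank-based formulation: the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. The function name `roc_auc`, the labels and the scores are invented for illustration; the invention itself computes the AUC with sklearn.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (assumption): 1 = responder, 0 = non-responder.
labels = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.55]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.867
```

Here 13 of the 15 responder/non-responder pairs are ordered correctly, giving 13/15.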
In a fifth aspect, the present invention provides a product for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma, said product predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma based on said model for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma.
DNA 5-methylcytosine (5mCs) is an important epigenetic feature that plays an important role in gene expression and tumorigenesis development. According to the invention, through a high-throughput sequencing technology, the 5-hydroxymethylcytosine biomarker for predicting the chemotherapy effect of diffuse large B-cell lymphoma is found, and the R-CHOP treatment effect of a patient with diffuse large B-cell lymphoma is effectively predicted in advance.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention provides a biomarker for predicting the chemotherapy curative effect of diffuse large B cell lymphoma, which can effectively predict the chemotherapy curative effect of a patient with diffuse large B cell lymphoma in advance.
(2) The invention also provides a model for predicting the curative effect of the diffuse large B cell lymphoma chemotherapy, the sensitivity and the specificity of the model respectively reach 0.82 and 0.75, the AUC value is 0.78, and the model has the advantages of strong specificity and high sensitivity. By applying the biomarker and/or the model, the safe, noninvasive and high-accuracy prediction of the curative effect of the diffuse large B cell lymphoma of a patient can be realized.
(3) According to the biomarker and the model provided by the invention, the sample for predicting the chemotherapeutic effect is peripheral blood, and is easy to obtain; the prediction method is based on high-throughput sequencing, and the detection efficiency is high; the prediction model is reliable, and the accuracy of the prediction result is high.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. In the drawings:
FIG. 1: prediction accuracy of each single marker among the 13 feature 5hmC markers in the training group;
FIG. 2: prediction accuracy of each single marker among the 13 feature 5hmC markers in the validation group. Of these, ARHGEF12 (AUC 0.76) and ZNF280D (AUC 0.76) showed the highest prediction accuracy.
FIG. 3: feature selection for the 5hmC differential markers. The recursive feature selection algorithm selects 13 features as the minimum number of features that gives the best cross-validation score. The x-axis is the number of features selected and the y-axis is the cross-validation score for model performance.
FIG. 4: receiver operating characteristic (ROC) curves for the prediction model constructed using the 13 5hmC feature markers in the training and validation cohorts. The true positive rate (sensitivity) is plotted against the false positive rate (1 - specificity). In the training set, the prediction model constructed from the 13 5hmC feature markers effectively distinguishes responders from non-responders to R-CHOP treatment (AUC 1.00); in the validation set, the model also effectively distinguishes responders from non-responders to R-CHOP treatment (AUC 0.78).
FIG. 5: confusion matrix for prediction model performance in the validation set (22 patients for whom treatment was effective and 8 patients for whom treatment was ineffective). The results show that, of the 22 patients for whom treatment was effective, 18 were predicted to respond by the prediction model based on the 13 5hmC feature markers, giving a sensitivity of 0.82; of the 8 patients for whom treatment was ineffective, 6 were predicted not to respond, giving a specificity of 0.75.
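The sensitivity and specificity quoted for FIG. 5 follow directly from the stated counts, as the short calculation below shows. The function name is illustrative, and the overall accuracy printed at the end is derived here from the same counts; it is not a figure reported in the description.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return sensitivity, specificity


# Counts stated for FIG. 5: 18 of 22 effective cases and 6 of 8 ineffective
# cases were predicted correctly.
tp, fn = 18, 22 - 18
tn, fp = 6, 8 - 6

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)   # derived, not reported in the text
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={accuracy:.2f}")
# sensitivity=0.82 specificity=0.75 accuracy=0.80
```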
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The invention provides a model for predicting the curative effect of chemotherapy for diffuse large B-cell lymphoma; the relevant parameters of the 13 5hmC feature biomarkers in the model are shown in Table 1:
TABLE 1
Markers                     GeneID     Coefficients  SE     z.value  p.value
Intercept                   -          -5.5704       0.867  2.652    <0.01
chr1_6721489_6721898        THAP3      0.7712        0.145  1.865    <0.05
chr1_246290825_246291238    SMYD3      0.39          0.149  1.955    <0.05
chr1_247755954_247756505    OR2G2      3.1779        0.108  1.344    <0.05
chr11_43905400_43905804     ALKBH3     3.3423        0.128  1.306    <0.05
chr11_65511519_65512429     RNASEH2C   1.5211        0.072  2.061    <0.05
chr11_120211662_120212234   ARHGEF12   -3.8797       0.115  -3.225   <0.001
chr15_56982146_56982638     ZNF280D    -1.2266       0.177  -3.250   <0.001
chr16_24916341_24916920     SLC5A11    0.6683        0.076  0.149    <0.05
chr18_77500908_77501376     CTDP1      -2.573        0.103  -3.182   <0.001
chr3_98270705_98271079      GPR15      0.1052        0.167  2.348    <0.05
chr3_121430838_121431239    GOLGB1     0.8526        0.178  2.982    <0.01
chr6_99461404_99461922      FBXL4      1.7188        0.101  2.165    <0.05
chr7_156700537_156701031    LMBR1      1.0942        0.078  0.579    <0.05
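How the coefficients of Table 1 would combine into a prediction can be sketched as follows. A logistic regression model computes a linear score from the intercept and the weighted marker levels and maps it to a probability with the logistic function. This is a sketch under assumptions: the description does not state how the 5hmC levels are normalized before entering the model or which outcome is coded as the positive class, so the input values below are placeholders and the function name `predict_probability` is hypothetical.

```python
import math

# Intercept and coefficients copied from Table 1.
INTERCEPT = -5.5704
COEFFICIENTS = {
    "THAP3": 0.7712, "SMYD3": 0.39, "OR2G2": 3.1779, "ALKBH3": 3.3423,
    "RNASEH2C": 1.5211, "ARHGEF12": -3.8797, "ZNF280D": -1.2266,
    "SLC5A11": 0.6683, "CTDP1": -2.573, "GPR15": 0.1052,
    "GOLGB1": 0.8526, "FBXL4": 1.7188, "LMBR1": 1.0942,
}


def predict_probability(levels):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    missing = set(COEFFICIENTS) - set(levels)
    if missing:
        raise KeyError(f"missing marker levels: {sorted(missing)}")
    z = INTERCEPT + sum(COEFFICIENTS[g] * levels[g] for g in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder inputs (assumption): every marker level set to 0 or to 1.
p_all_zero = predict_probability({gene: 0.0 for gene in COEFFICIENTS})
p_all_one = predict_probability({gene: 1.0 for gene in COEFFICIENTS})
print(p_all_zero, p_all_one)
```

A positive coefficient raises the predicted probability as the marker level increases, and a negative coefficient (ARHGEF12, ZNF280D, CTDP1) lowers it.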
In the examples of the present invention, samples were sequenced using 5hmC-Seal high-throughput sequencing, which is explained as follows:
5hmC-Seal is a high-throughput sequencing method for 5hmC. It is based on an improved chemical glycosylation labeling reaction combined with second-generation high-throughput sequencing, and yields the distribution of 5hmC across genomic DNA.
Owing to the high sensitivity of the chemical labeling method, the input DNA can be as low as 1-10 ng, and may be fragmented genomic DNA or small fragments such as cfDNA. As required for second-generation sequencing, both ends of each DNA fragment are end-repaired, an A tail is added at the 3' end, and Y-shaped sequencing adapters are ligated to both ends of each DNA fragment through specific A-T pairing; the adapters contain index information for distinguishing samples, amplification primer sequences and other sequences. The 5hmC labeling step follows: UDP-6-N3-Glc is added first and, under specific conditions, reacts with 5hmC on the DNA to give N3-5ghmC; DBCO-PEG4-Biotin is then added to attach biotin to N3-5ghmC; finally, all DNA fragments containing 5hmC sites are captured through the efficient, specific binding of biotin to magnetic beads, and construction of the 5hmC-based DNA library is completed after PCR amplification and purification. The DNA band size and distribution of each sample are checked by Fragment Analyzer quality control, the library is accurately quantified by qPCR, and high-throughput sequencing is performed on an Illumina NextSeq 500 sequencer to obtain the base sequences of all DNA fragments in the library.
Example 1
The study included 86 patients with diffuse large B-cell lymphoma treated with R-CHOP chemotherapy, comprising patients assessed by treatment efficacy as Complete Remission (CR), Partial Remission (PR), Stable Disease (SD) and Progressive Disease (PD). Peripheral blood samples (3-4 mL) were taken from the 86 patients, and cfDNA was extracted from the plasma and subjected to 5hmC-Seal high-throughput sequencing; the sequencing throughput of each sample was 1.5 Gb, and the sequencing band size was 38 bp. The specific steps are as follows:
(1) collecting samples:
At the time of confirmed diagnosis on admission (before standard R-CHOP chemotherapy), a total of 8 mL of blood was collected from 86 patients with diffuse large B-cell lymphoma, and plasma samples were collected within 24 hours after plasma separation. First, the blood was centrifuged at 1350 g and 4 ℃ for 12 min, and the pale yellow supernatant was carefully removed and transferred to a 2 mL DNase-free sterile EP tube, avoiding contamination from the buffy coat. Second, the supernatant was centrifuged again at 1350 g and 4 ℃ for 5 min, carefully removed, and transferred to a 2 mL DNase-free sterile EP tube to remove the remaining erythrocytes. Third, it was centrifuged at 13500 g and 4 ℃ for 5 min; the supernatant was carefully removed and transferred to a 2 mL DNase-free sterile centrifuge tube to obtain about 4 mL of plasma, which was labeled and then stored frozen at -80 ℃ until use.
(2) Extracting plasma DNA:
8-10 ng of plasma cfDNA was extracted from each of the plasma samples from the 86 patients with diffuse large B-cell lymphoma using the Quick-cfDNA Serum & Plasma Kit (ZYMO).
(3) End-filling and ligation of cfDNA to sequencing adapters:
A reaction mixture (total volume 24 uL) containing 20 uL of cfDNA, 2.8 uL of End Repair & A-Tailing Buffer and 1.2 uL of End Repair & A-Tailing Enzyme Mix was prepared according to the instructions of the KAPA HyperPlus Library Preparation Kit (96), and was incubated at 20 ℃ for 30 minutes and then at 65 ℃ for 30 minutes. Next, a ligation reaction mixture of 2 uL of nuclease-free water, 12 uL of Ligation Buffer and 4 uL of DNA Ligase was prepared in a 1.5 mL low-adsorption EP tube. 2 uL of sequencing adapter was added to the 18 uL of ligation reaction mixture and mixed; the mixture was added to the 24 uL reaction sample and incubated at 20 ℃ for 4 hours. The reaction product was purified using the DNA Clean & Concentrator 5 (ZYMO) purification kit and eluted with 20 uL of elution buffer to obtain the final DNA ligation sample.
(4) 5hmC labeling:
A labeling reaction mix with a total volume of 4 uL was prepared, containing 1 uL of 50 uM UDP-N3-Glc, 1 uL of βGT enzyme (NEB) and 2 uL of HEPES buffer (pH 8.0, 50 mM final concentration), and the 4 uL of labeling reaction mix was added to the 20 uL DNA ligation sample. The mixture was incubated in a water bath at 37 ℃ for 2 hours. The mixture was then taken out, and the reaction product was purified with the DNA Clean & Concentrator 5 (ZYMO) purification kit to obtain 30 uL of purified DNA. Then, 1 uL of DBCO-PEG4-biotin (Click Chemistry Tools) was added to the 30 uL of purified DNA; after 1 hour in a water bath at 37 ℃, the reaction product was purified with the DNA Clean & Concentrator 5 (ZYMO) purification kit to obtain 30 uL of purified labeled product.
(5) Enrichment of 5 hmC:
First, the binding beads were prepared as follows: 2.5 uL of Dynabeads (Invitrogen) was taken, 100 uL of wash buffer (5 mM Tris (pH 7.5), 1 M NaCl and 0.02% Tween 20) was added, and the beads were vortexed, placed on a 1.5 mL magnetic rack and washed 3 times with 100 uL of wash buffer; finally, 32 uL of binding buffer (10 mM Tris (pH 7.5), 2 M NaCl and 0.04% Tween 20) was added and the beads were vortexed. Then, the purified labeled product obtained in the previous step was added to the magnetic bead mixture and mixed on a rotary mixer for 30 min to allow sufficient binding. Finally, the beads were washed 5 times with 100 uL of wash buffer (5 mM Tris (pH 7.5), 1 M NaCl and 0.02% Tween 20), the supernatant was removed by centrifugation, and 23.8 uL of RNase-free water was added.
(6) PCR amplification:
25 uL of 2X PCR master mix and 1.25 uL of PCR primers were added to the final system from the previous step (total volume 50 uL), and amplification was performed according to the following PCR reaction cycle temperatures and conditions:
[Table of PCR reaction cycle temperatures and conditions: provided as images (BDA0002825887630000071, BDA0002825887630000081) in the original publication]
The amplification product was purified using AMPure XP beads to obtain the final 5hmC library, and the library concentration was determined using a Qubit 3.0.
(7) High-throughput sequencing after quality control of the 5hmC library:
The 5hmC library was quality-controlled on a Fragment Analyzer(TM) fully automated capillary electrophoresis system to determine the size of the DNA fragments in the library and whether the library contained impurities; the concentration was determined by qPCR. The sequencing libraries that passed quality control were pooled at equal concentration and sequenced on an Illumina NextSeq 500 using the High Output Kit v2 (75 cycles), with a sequencing throughput of 1.5 Gb for each sample and a sequencing band size of 75 bp.
(8) Raw sequencing data alignment:
Low-quality data were first trimmed from each raw sequencing FASTQ file with Trimmomatic software; the reads were then aligned to the human genome hg19 with Bowtie2 software, and peaks of reads containing 5hmC were identified with MACS software, using the following parameters: effective genome size = 2.72e+09; tag size = 38; band width = 100; model fold = 10; p-value cutoff = 1.00e-05.
The 86 samples were randomly split into a training group (training cohort) and a validation group (validation cohort) at a ratio of 2:1. The peak information of plasma cfDNA from the different patients in the training group (56 cases) was compared and filtered, with the condition that each peak appears in at least 10 samples. DNA sequencing data from different samples were compared with DESeq2 software, 5hmC peak regions with a reads number greater than 50 were retained, and differential biomarkers with up- or down-regulated 5hmC were identified according to |log2FoldChange| > 0.5 and pvalue < 0.01. The differential biomarkers were then further filtered using the recursive feature elimination algorithm (RFECV) in scikit-learn to screen the feature biomarkers, using the parameters estimator=LogisticRegressionCV(class_weight='balanced', cv=2, max_iter=1000) and scoring='accuracy'. The screened feature biomarkers were then used for model construction: a logistic regression CV model (LR) was trained, the performance of the model was evaluated by receiver operating characteristic (ROC) analysis, and the area under the curve (AUC), the sensitivity and the specificity were calculated with sklearn. The prediction accuracy of each single marker among the 13 feature 5hmC markers is shown in FIG. 1 for the training group and in FIG. 2 for the validation group. Finally, after 100 rounds of training, the prediction model constructed from the 13 feature 5hmC markers was found to reach the optimal cross-validation score; the result is shown in FIG. 3. The performance of the constructed LR prediction model in predicting the treatment outcome of patients is shown in FIG. 4, and the model was used in Example 2 to predict the treatment outcome of the patients in the validation group.
Example 2
Peripheral blood was taken from the patients in the validation group of Example 1 (30 cases: 14 CR, 8 PR, 3 SD and 5 PD), and sample collection, cfDNA extraction, library construction and library sequencing were carried out according to the experimental method of Example 1; the sequencing data were processed and analyzed, and the 5hmC enrichment sites were screened. Efficacy was then predicted with the prediction model obtained in Example 1 according to the selected 5hmC-enriched sites. The prediction results are shown in FIG. 5: of the 22 CR/PR patients, 18 were predicted to be CR/PR, giving a sensitivity of 0.82; of the 8 SD/PD patients, 6 were predicted to be SD/PD, giving a specificity of 0.75.
Therefore, the prediction model constructed based on the 13 characteristic biomarkers of 5hmC provided by the invention can effectively predict the treatment effect of R-CHOP chemotherapy of patients with diffuse large B-cell lymphoma in advance.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A biomarker for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma, wherein the biomarker comprises a combination of ARHGEF12, THAP3, SMYD3, OR2G2, ALKBH3, RNASEH2C, ZNF280D, SLC5A11, CTDP1, GPR15, GOLGB1, FBXL4 and LMBR1.
2. Use of a biomarker according to claim 1 in the manufacture of a product for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma, said use comprising constructing a model for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma.
3. Use of a biomarker according to claim 2, wherein the chemotherapy regimen is: rituximab in combination with a chemotherapeutic regimen of cyclophosphamide, doxorubicin, vincristine and prednisone.
4. A product based on a model for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma, wherein the input variables for said model are the levels of the biomarkers of claim 1.
5. The product of claim 4, wherein the biomarker level is determined by a 5hmC high-throughput assay, preferably 5hmC-Seal.
6. The product of claim 4, wherein the model is constructed by a method comprising the steps of:
(1) respectively detecting samples from a plurality of patients with the diffuse large B cell lymphoma subjected to chemotherapy to obtain 5hmC sequencing data of DNA;
(2) sequentially carrying out first filtration, screening and second filtration on the sequencing data obtained in the step (1) to obtain a biomarker for predicting the curative effect of the diffuse large B cell lymphoma chemotherapy;
the first filtering comprises: removing peaks that appear in 10 or fewer samples;
the screening comprises: comparing sequencing data from different samples using DESeq2 software, retaining 5hmC peak regions with a read count greater than 50, and obtaining differential biomarkers with up- or down-regulated 5hmC according to the conditions |log2FoldChange| > 0.5 and pvalue < 0.01;
the second filtering comprises: filtering the differential biomarkers obtained by the screening using the recursive feature elimination algorithm in Scikit-learn to obtain the biomarkers for predicting the efficacy of chemotherapy for diffuse large B-cell lymphoma: ARHGEF12, THAP3, SMYD3, OR2G2, ALKBH3, RNASEH2C, ZNF280D, SLC5A11, CTDP1, GPR15, GOLGB1, FBXL4 and LMBR1;
(3) inputting the data of the biomarkers obtained in the step (2) into a machine learning model, training the model, storing the trained model, and obtaining a model for predicting the curative effect of the diffuse large B cell lymphoma chemotherapy.
7. The product of claim 6, wherein the sample is plasma; peripheral blood is preferred.
8. The product of claim 6, wherein said chemotherapy-treated diffuse large B-cell lymphoma patients comprise patients evaluated for complete remission, partial remission, stable disease and disease progression.
9. The product according to claim 6, characterized in that, in the second filtering, the parameters used are: estimator = LogisticRegressionCV(class_weight='balanced', cv=2, max_iter=1000) and scoring = 'accuracy'.
10. The product of claim 6, wherein in step (3), the machine learning model comprises: a LogisticRegressionCV (cross-validated logistic regression) model obtained by training.
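
The filtering steps recited in claims 6 and 9 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The `Peak` record, its field names and the function names are hypothetical, and the p-value condition is read as pvalue < 0.01. The second filtering is assumed to use scikit-learn's `RFECV`, since that is the recursive feature elimination class that accepts a `scoring` argument; it requires scikit-learn to be installed and is imported only when called.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    """One 5hmC peak region (hypothetical summary record for illustration)."""
    name: str
    n_samples_present: int    # number of samples in which the peak appears
    reads: float              # read count compared against the "> 50" cut
    log2_fold_change: float   # DESeq2 log2FoldChange
    pvalue: float             # DESeq2 pvalue


def first_filter(peaks, max_rare_samples=10):
    """First filtering: drop peaks that appear in 10 or fewer samples."""
    return [p for p in peaks if p.n_samples_present > max_rare_samples]


def screen(peaks, min_reads=50, min_abs_lfc=0.5, max_pvalue=0.01):
    """Screening: keep reads > 50, |log2FoldChange| > 0.5, pvalue < 0.01."""
    return [
        p for p in peaks
        if p.reads > min_reads
        and abs(p.log2_fold_change) > min_abs_lfc
        and p.pvalue < max_pvalue
    ]


def second_filter(X, y):
    """Second filtering: recursive feature elimination with the claim 9 settings.

    Assumption: RFECV is the scikit-learn class meant by the claim.
    X is a samples-by-peaks matrix, y holds the response labels.
    Returns a boolean mask marking the selected columns of X.
    """
    from sklearn.feature_selection import RFECV
    from sklearn.linear_model import LogisticRegressionCV

    estimator = LogisticRegressionCV(
        class_weight="balanced", cv=2, max_iter=1000
    )
    selector = RFECV(estimator=estimator, scoring="accuracy")
    selector.fit(X, y)
    return selector.support_
```

In this sketch, `screen(first_filter(peaks))` yields the candidate regions whose per-sample values would form the columns of `X` passed to `second_filter`.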

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011448752.6A CN114613423A (en) 2020-12-09 2020-12-09 Biomarker for predicting curative effect of diffuse large B cell lymphoma chemotherapy

Publications (1)

Publication Number Publication Date
CN114613423A true CN114613423A (en) 2022-06-10

Family

ID=81856780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011448752.6A Pending CN114613423A (en) 2020-12-09 2020-12-09 Biomarker for predicting curative effect of diffuse large B cell lymphoma chemotherapy

Country Status (1)

Country Link
CN (1) CN114613423A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230629

Address after: 2104, 21st Floor, No. 9 North Fourth Ring West Road, Haidian District, Beijing, 100000

Applicant after: Beijing Xuanniao Feixun Technology Co.,Ltd.

Address before: Room 201-2, 2nd Floor, 328 Edison Road, Pudong New Area, Shanghai, March 2012

Applicant before: SHANGHAI EPICAN GENETECH Co.,Ltd.

Applicant before: SHANGHAI EPICAN BIOTECHNOLOGY Co.,Ltd.
