EP4025713A1 - Means and methods for diagnosing lung cancer - Google Patents
Means and methods for diagnosing lung cancer
Info
- Publication number
- EP4025713A1 (application number EP20764417.0A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- regions
- tumor
- methylation
- methylation markers
- dna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 95
- 206010058467 Lung neoplasm malignant Diseases 0.000 title claims abstract description 66
- 201000005202 lung cancer Diseases 0.000 title claims abstract description 59
- 208000020816 lung neoplasm Diseases 0.000 title claims abstract description 59
- 230000011987 methylation Effects 0.000 claims abstract description 164
- 238000007069 methylation reaction Methods 0.000 claims abstract description 164
- 238000004458 analytical method Methods 0.000 claims abstract description 40
- 238000011528 liquid biopsy Methods 0.000 claims abstract description 38
- 108091034117 Oligonucleotide Proteins 0.000 claims abstract description 29
- 238000001574 biopsy Methods 0.000 claims abstract description 20
- 210000004072 lung Anatomy 0.000 claims abstract description 19
- 206010028980 Neoplasm Diseases 0.000 claims description 141
- 210000002381 plasma Anatomy 0.000 claims description 74
- 238000004422 calculation algorithm Methods 0.000 claims description 50
- 238000012163 sequencing technique Methods 0.000 claims description 50
- 238000006243 chemical reaction Methods 0.000 claims description 38
- LSNNMFCWUKXFEE-UHFFFAOYSA-M Bisulfite Chemical compound OS([O-])=O LSNNMFCWUKXFEE-UHFFFAOYSA-M 0.000 claims description 27
- 206010041823 squamous cell carcinoma Diseases 0.000 claims description 26
- 208000009956 adenocarcinoma Diseases 0.000 claims description 24
- 238000003745 diagnosis Methods 0.000 claims description 22
- 208000002154 non-small cell lung carcinoma Diseases 0.000 claims description 20
- 238000004393 prognosis Methods 0.000 claims description 15
- 239000007787 solid Substances 0.000 claims description 15
- 210000004369 blood Anatomy 0.000 claims description 8
- 239000008280 blood Substances 0.000 claims description 8
- 210000002966 serum Anatomy 0.000 claims description 8
- 238000001369 bisulfite sequencing Methods 0.000 claims description 6
- 208000002151 Pleural effusion Diseases 0.000 claims description 4
- 206010041067 Small cell lung cancer Diseases 0.000 claims description 4
- 206010036790 Productive cough Diseases 0.000 claims description 2
- 239000012530 fluid Substances 0.000 claims description 2
- 210000003802 sputum Anatomy 0.000 claims description 2
- 208000024794 sputum Diseases 0.000 claims description 2
- 241000094396 Bolitoglossa carri Species 0.000 claims 1
- 208000037841 lung tumor Diseases 0.000 abstract description 18
- 238000012360 testing method Methods 0.000 abstract description 10
- 238000000018 DNA microarray Methods 0.000 abstract description 3
- 238000012512 characterization method Methods 0.000 abstract description 3
- 238000001356 surgical procedure Methods 0.000 abstract 1
- 239000000523 sample Substances 0.000 description 76
- 108020004414 DNA Proteins 0.000 description 72
- 230000007067 DNA methylation Effects 0.000 description 57
- 210000001519 tissue Anatomy 0.000 description 45
- 239000011324 bead Substances 0.000 description 36
- 108090000623 proteins and genes Proteins 0.000 description 19
- 102000053602 DNA Human genes 0.000 description 18
- 238000003908 quality control method Methods 0.000 description 18
- 201000011510 cancer Diseases 0.000 description 16
- 238000001514 detection method Methods 0.000 description 16
- 238000007481 next generation sequencing Methods 0.000 description 16
- 208000029729 tumor suppressor gene on chromosome 11 Diseases 0.000 description 16
- 239000012634 fragment Substances 0.000 description 14
- 201000005296 lung carcinoma Diseases 0.000 description 14
- 239000006228 supernatant Substances 0.000 description 13
- 238000009396 hybridization Methods 0.000 description 12
- 239000003795 chemical substances by application Substances 0.000 description 11
- 238000013507 mapping Methods 0.000 description 10
- 230000008569 process Effects 0.000 description 10
- 101100310856 Drosophila melanogaster spri gene Proteins 0.000 description 9
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 9
- 239000000872 buffer Substances 0.000 description 9
- 230000000875 corresponding effect Effects 0.000 description 9
- 238000011156 evaluation Methods 0.000 description 9
- 238000001847 surface plasmon resonance imaging Methods 0.000 description 9
- 239000011534 wash buffer Substances 0.000 description 9
- 210000004027 cell Anatomy 0.000 description 8
- 238000000605 extraction Methods 0.000 description 8
- 230000035772 mutation Effects 0.000 description 8
- 239000002773 nucleotide Substances 0.000 description 8
- 125000003729 nucleotide group Chemical group 0.000 description 8
- 230000035945 sensitivity Effects 0.000 description 8
- 238000010200 validation analysis Methods 0.000 description 8
- 210000000349 chromosome Anatomy 0.000 description 7
- 239000000203 mixture Substances 0.000 description 7
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- -1 MET Proteins 0.000 description 6
- 108020004682 Single-Stranded DNA Proteins 0.000 description 6
- 238000013459 approach Methods 0.000 description 6
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 6
- 238000011161 development Methods 0.000 description 6
- 230000018109 developmental process Effects 0.000 description 6
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 6
- 238000003384 imaging method Methods 0.000 description 6
- 238000004519 manufacturing process Methods 0.000 description 6
- 101001108436 Homo sapiens Neurexin-1 Proteins 0.000 description 5
- 101001108433 Homo sapiens Neurexin-1-beta Proteins 0.000 description 5
- 102100021582 Neurexin-1-beta Human genes 0.000 description 5
- 239000003153 chemical reaction reagent Substances 0.000 description 5
- 238000002591 computed tomography Methods 0.000 description 5
- 230000004083 survival effect Effects 0.000 description 5
- 238000002560 therapeutic procedure Methods 0.000 description 5
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical class CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 5
- 238000005406 washing Methods 0.000 description 5
- 102100033764 Acyl-coenzyme A oxidase-like protein Human genes 0.000 description 4
- 102100035553 Autism susceptibility gene 2 protein Human genes 0.000 description 4
- 102100029171 Calcipressin-2 Human genes 0.000 description 4
- 102100031602 Dedicator of cytokinesis protein 10 Human genes 0.000 description 4
- 102100030428 E3 ubiquitin-protein ligase E3D Human genes 0.000 description 4
- 102100023639 FYVE, RhoGEF and PH domain-containing protein 5 Human genes 0.000 description 4
- 101000801174 Homo sapiens Acyl-coenzyme A oxidase-like protein Proteins 0.000 description 4
- 101000874361 Homo sapiens Autism susceptibility gene 2 protein Proteins 0.000 description 4
- 101001062197 Homo sapiens Calcipressin-2 Proteins 0.000 description 4
- 101000866268 Homo sapiens Dedicator of cytokinesis protein 10 Proteins 0.000 description 4
- 101000772959 Homo sapiens E3 ubiquitin-protein ligase E3D Proteins 0.000 description 4
- 101000827825 Homo sapiens FYVE, RhoGEF and PH domain-containing protein 5 Proteins 0.000 description 4
- 101001082570 Homo sapiens Hypoxia-inducible factor 3-alpha Proteins 0.000 description 4
- 101001027854 Homo sapiens Protein FAM53A Proteins 0.000 description 4
- 102100030482 Hypoxia-inducible factor 3-alpha Human genes 0.000 description 4
- 102100037525 Protein FAM53A Human genes 0.000 description 4
- 230000003321 amplification Effects 0.000 description 4
- 239000000090 biomarker Substances 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 4
- 238000010219 correlation analysis Methods 0.000 description 4
- 238000007405 data analysis Methods 0.000 description 4
- 238000002405 diagnostic procedure Methods 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 230000002068 genetic effect Effects 0.000 description 4
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 4
- 230000006607 hypermethylation Effects 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 238000003199 nucleic acid amplification method Methods 0.000 description 4
- XEBWQGVWTUSTLN-UHFFFAOYSA-M phenylmercury acetate Chemical compound CC(=O)O[Hg]C1=CC=CC=C1 XEBWQGVWTUSTLN-UHFFFAOYSA-M 0.000 description 4
- 238000012545 processing Methods 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 238000011002 quantification Methods 0.000 description 4
- 239000002096 quantum dot Substances 0.000 description 4
- 239000011541 reaction mixture Substances 0.000 description 4
- 238000012706 support-vector machine Methods 0.000 description 4
- 208000024891 symptom Diseases 0.000 description 4
- 210000004881 tumor cell Anatomy 0.000 description 4
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical class CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 3
- CURLTUGMZLYLDI-UHFFFAOYSA-N Carbon dioxide Chemical compound O=C=O CURLTUGMZLYLDI-UHFFFAOYSA-N 0.000 description 3
- 108010077544 Chromatin Proteins 0.000 description 3
- 102100040205 Homeobox protein Hox-D12 Human genes 0.000 description 3
- 101001037169 Homo sapiens Homeobox protein Hox-D12 Proteins 0.000 description 3
- 101001131829 Homo sapiens P protein Proteins 0.000 description 3
- 101001117509 Homo sapiens Prostaglandin E2 receptor EP4 subtype Proteins 0.000 description 3
- 101001134801 Homo sapiens Protocadherin beta-2 Proteins 0.000 description 3
- 101000703741 Homo sapiens Short stature homeobox protein 2 Proteins 0.000 description 3
- 102100034574 P protein Human genes 0.000 description 3
- 102100033437 Protocadherin beta-2 Human genes 0.000 description 3
- 108010005173 SERPIN-B5 Proteins 0.000 description 3
- 108091006753 SLC22A20 Proteins 0.000 description 3
- 102100031976 Short stature homeobox protein 2 Human genes 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- 102100023099 Solute carrier family 22 member 20 Human genes 0.000 description 3
- 230000001594 aberrant effect Effects 0.000 description 3
- 235000011089 carbon dioxide Nutrition 0.000 description 3
- 238000005119 centrifugation Methods 0.000 description 3
- 210000000038 chest Anatomy 0.000 description 3
- 210000003483 chromatin Anatomy 0.000 description 3
- 108091092240 circulating cell-free DNA Proteins 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 229940104302 cytosine Drugs 0.000 description 3
- 230000004069 differentiation Effects 0.000 description 3
- 238000009826 distribution Methods 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 239000000975 dye Substances 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 238000001914 filtration Methods 0.000 description 3
- 230000014509 gene expression Effects 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 230000036210 malignancy Effects 0.000 description 3
- 230000003211 malignant effect Effects 0.000 description 3
- 239000008188 pellet Substances 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 230000037452 priming Effects 0.000 description 3
- 238000013515 script Methods 0.000 description 3
- 208000000587 small cell lung carcinoma Diseases 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- 229930024421 Adenine Natural products 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- 206010006187 Breast cancer Diseases 0.000 description 2
- 208000026310 Breast neoplasm Diseases 0.000 description 2
- 102100026891 Cystatin-B Human genes 0.000 description 2
- 238000007400 DNA extraction Methods 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 102100030708 GTPase KRas Human genes 0.000 description 2
- 101000912191 Homo sapiens Cystatin-B Proteins 0.000 description 2
- 101000584612 Homo sapiens GTPase KRas Proteins 0.000 description 2
- 101001012157 Homo sapiens Receptor tyrosine-protein kinase erbB-2 Proteins 0.000 description 2
- 101000984753 Homo sapiens Serine/threonine-protein kinase B-raf Proteins 0.000 description 2
- 206010027476 Metastases Diseases 0.000 description 2
- 102100030086 Receptor tyrosine-protein kinase erbB-2 Human genes 0.000 description 2
- 102100027103 Serine/threonine-protein kinase B-raf Human genes 0.000 description 2
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 2
- 108010090804 Streptavidin Proteins 0.000 description 2
- 108010083111 Ubiquitin-Protein Ligases Proteins 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 238000013276 bronchoscopy Methods 0.000 description 2
- 238000007635 classification algorithm Methods 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- 230000008878 coupling Effects 0.000 description 2
- 238000010168 coupling process Methods 0.000 description 2
- 238000005859 coupling reaction Methods 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 238000003748 differential diagnosis Methods 0.000 description 2
- 230000008030 elimination Effects 0.000 description 2
- 238000003379 elimination reaction Methods 0.000 description 2
- 102000052116 epidermal growth factor receptor activity proteins Human genes 0.000 description 2
- 108700015053 epidermal growth factor receptor activity proteins Proteins 0.000 description 2
- 230000001973 epigenetic effect Effects 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 238000007417 hierarchical cluster analysis Methods 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 201000001441 melanoma Diseases 0.000 description 2
- 230000004060 metabolic process Effects 0.000 description 2
- 230000009401 metastasis Effects 0.000 description 2
- YOHYSYJDKVYCJI-UHFFFAOYSA-N n-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide Chemical compound FC(F)(F)C1=CC=CC(NC=2N=CN=C(NC=3C=C(NC(=O)C4CC4)C=CC=3)C=2)=C1 YOHYSYJDKVYCJI-UHFFFAOYSA-N 0.000 description 2
- 230000002018 overexpression Effects 0.000 description 2
- 239000013610 patient sample Substances 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 238000003752 polymerase chain reaction Methods 0.000 description 2
- 238000010837 poor prognosis Methods 0.000 description 2
- 102000004169 proteins and genes Human genes 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- 238000009966 trimming Methods 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 210000002700 urine Anatomy 0.000 description 2
- GUAHPAJOXVYFON-ZETCQYMHSA-N (8S)-8-amino-7-oxononanoic acid zwitterion Chemical compound C[C@H](N)C(=O)CCCCCC(O)=O GUAHPAJOXVYFON-ZETCQYMHSA-N 0.000 description 1
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- 102100040685 14-3-3 protein zeta/delta Human genes 0.000 description 1
- 102100033793 ALK tyrosine kinase receptor Human genes 0.000 description 1
- 102100022900 Actin, cytoplasmic 1 Human genes 0.000 description 1
- 108010085238 Actins Proteins 0.000 description 1
- 102100034540 Adenomatous polyposis coli protein Human genes 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 235000017491 Bambusa tulda Nutrition 0.000 description 1
- 244000218454 Bambusa tulda Species 0.000 description 1
- 102100037674 Bis(5'-adenosyl)-triphosphatase Human genes 0.000 description 1
- 101150105979 Brms1 gene Proteins 0.000 description 1
- 101150023402 CST6 gene Proteins 0.000 description 1
- 102100024154 Cadherin-13 Human genes 0.000 description 1
- 208000005623 Carcinogenesis Diseases 0.000 description 1
- 108010078791 Carrier Proteins Proteins 0.000 description 1
- ZEOWTGPWHLSLOG-UHFFFAOYSA-N Cc1ccc(cc1-c1ccc2c(n[nH]c2c1)-c1cnn(c1)C1CC1)C(=O)Nc1cccc(c1)C(F)(F)F Chemical compound Cc1ccc(cc1-c1ccc2c(n[nH]c2c1)-c1cnn(c1)C1CC1)C(=O)Nc1cccc(c1)C(F)(F)F ZEOWTGPWHLSLOG-UHFFFAOYSA-N 0.000 description 1
- 108091007854 Cdh1/Fizzy-related Proteins 0.000 description 1
- 102000038594 Cdh1/Fizzy-related Human genes 0.000 description 1
- VYZAMTAEIAYCRO-UHFFFAOYSA-N Chromium Chemical compound [Cr] VYZAMTAEIAYCRO-UHFFFAOYSA-N 0.000 description 1
- 208000000668 Chronic Pancreatitis Diseases 0.000 description 1
- 244000260524 Chrysanthemum balsamita Species 0.000 description 1
- 235000005633 Chrysanthemum balsamita Nutrition 0.000 description 1
- 102100024458 Cyclin-dependent kinase inhibitor 2A Human genes 0.000 description 1
- 102100039498 Cytotoxic T-lymphocyte protein 4 Human genes 0.000 description 1
- 230000006429 DNA hypomethylation Effects 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102100038587 Death-associated protein kinase 1 Human genes 0.000 description 1
- 102100031480 Dual specificity mitogen-activated protein kinase kinase 1 Human genes 0.000 description 1
- 101710146526 Dual specificity mitogen-activated protein kinase kinase 1 Proteins 0.000 description 1
- 206010071975 EGFR gene mutation Diseases 0.000 description 1
- 102100036725 Epithelial discoidin domain-containing receptor 1 Human genes 0.000 description 1
- 101710131668 Epithelial discoidin domain-containing receptor 1 Proteins 0.000 description 1
- 102100023593 Fibroblast growth factor receptor 1 Human genes 0.000 description 1
- 101710182386 Fibroblast growth factor receptor 1 Proteins 0.000 description 1
- 208000031448 Genomic Instability Diseases 0.000 description 1
- 108091007417 HOX transcript antisense RNA Proteins 0.000 description 1
- 108010033040 Histones Proteins 0.000 description 1
- 108700005087 Homeobox Genes Proteins 0.000 description 1
- 101000964898 Homo sapiens 14-3-3 protein zeta/delta Proteins 0.000 description 1
- 101000779641 Homo sapiens ALK tyrosine kinase receptor Proteins 0.000 description 1
- 101000924577 Homo sapiens Adenomatous polyposis coli protein Proteins 0.000 description 1
- 101000762243 Homo sapiens Cadherin-13 Proteins 0.000 description 1
- 101000889276 Homo sapiens Cytotoxic T-lymphocyte protein 4 Proteins 0.000 description 1
- 101000956145 Homo sapiens Death-associated protein kinase 1 Proteins 0.000 description 1
- 101001005719 Homo sapiens Melanoma-associated antigen 3 Proteins 0.000 description 1
- 101000605639 Homo sapiens Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform Proteins 0.000 description 1
- 101000686031 Homo sapiens Proto-oncogene tyrosine-protein kinase ROS Proteins 0.000 description 1
- 101000579425 Homo sapiens Proto-oncogene tyrosine-protein kinase receptor Ret Proteins 0.000 description 1
- 101000742859 Homo sapiens Retinoblastoma-associated protein Proteins 0.000 description 1
- 101001132698 Homo sapiens Retinoic acid receptor beta Proteins 0.000 description 1
- 101000658138 Homo sapiens Thymosin beta-10 Proteins 0.000 description 1
- 101000652324 Homo sapiens Transcription factor SOX-17 Proteins 0.000 description 1
- 101000894428 Homo sapiens Transcriptional repressor CTCFL Proteins 0.000 description 1
- 101000800498 Homo sapiens Transketolase-like protein 1 Proteins 0.000 description 1
- 108091007767 MALAT1 Proteins 0.000 description 1
- 229940124647 MEK inhibitor Drugs 0.000 description 1
- 102100025082 Melanoma-associated antigen 3 Human genes 0.000 description 1
- 206010027457 Metastases to liver Diseases 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 206010029260 Neuroblastoma Diseases 0.000 description 1
- 206010057852 Nicotine dependence Diseases 0.000 description 1
- 108700020796 Oncogene Proteins 0.000 description 1
- 108010011536 PTEN Phosphohydrolase Proteins 0.000 description 1
- 102000014160 PTEN Phosphohydrolase Human genes 0.000 description 1
- 206010033649 Pancreatitis chronic Diseases 0.000 description 1
- 102100024450 Prostaglandin E2 receptor EP4 subtype Human genes 0.000 description 1
- 206010060862 Prostate cancer Diseases 0.000 description 1
- 102100023347 Proto-oncogene tyrosine-protein kinase ROS Human genes 0.000 description 1
- 102100028286 Proto-oncogene tyrosine-protein kinase receptor Ret Human genes 0.000 description 1
- 102100038042 Retinoblastoma-associated protein Human genes 0.000 description 1
- 102100033909 Retinoic acid receptor beta Human genes 0.000 description 1
- 108091006207 SLC-Transporter Proteins 0.000 description 1
- 102000037054 SLC-Transporter Human genes 0.000 description 1
- 102100026715 Serine/threonine-protein kinase STK11 Human genes 0.000 description 1
- 101710181599 Serine/threonine-protein kinase STK11 Proteins 0.000 description 1
- DWAQJAXMDSEUJJ-UHFFFAOYSA-M Sodium bisulfite Chemical compound [Na+].OS([O-])=O DWAQJAXMDSEUJJ-UHFFFAOYSA-M 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 241000906446 Theraps Species 0.000 description 1
- 208000007536 Thrombosis Diseases 0.000 description 1
- 102100034998 Thymosin beta-10 Human genes 0.000 description 1
- 208000025569 Tobacco Use disease Diseases 0.000 description 1
- 102100030243 Transcription factor SOX-17 Human genes 0.000 description 1
- 102100021393 Transcriptional repressor CTCFL Human genes 0.000 description 1
- 102100033108 Transketolase-like protein 1 Human genes 0.000 description 1
- 108700025716 Tumor Suppressor Genes Proteins 0.000 description 1
- 102000044209 Tumor Suppressor Genes Human genes 0.000 description 1
- 102000006275 Ubiquitin-Protein Ligases Human genes 0.000 description 1
- 102100020696 Ubiquitin-conjugating enzyme E2 K Human genes 0.000 description 1
- 108020004417 Untranslated RNA Proteins 0.000 description 1
- 102000039634 Untranslated RNA Human genes 0.000 description 1
- 230000005856 abnormality Effects 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 201000008395 adenosquamous carcinoma Diseases 0.000 description 1
- 230000033115 angiogenesis Effects 0.000 description 1
- 230000001640 apoptogenic effect Effects 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 239000012148 binding buffer Substances 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 108010005713 bis(5'-adenosyl)triphosphatase Proteins 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 230000036952 cancer formation Effects 0.000 description 1
- 238000001818 capillary gel electrophoresis Methods 0.000 description 1
- 231100000504 carcinogenesis Toxicity 0.000 description 1
- 239000003183 carcinogenic agent Substances 0.000 description 1
- 230000024245 cell differentiation Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 230000008482 dysregulation Effects 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 238000011067 equilibration Methods 0.000 description 1
- 208000021045 exocrine pancreatic carcinoma Diseases 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 150000004665 fatty acids Chemical class 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 206010017758 gastric cancer Diseases 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 238000010562 histological examination Methods 0.000 description 1
- 230000006195 histone acetylation Effects 0.000 description 1
- 230000002055 immunohistochemical effect Effects 0.000 description 1
- 238000000126 in silico method Methods 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 238000009434 installation Methods 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 201000005243 lung squamous cell carcinoma Diseases 0.000 description 1
- 238000007403 mPCR Methods 0.000 description 1
- 238000007885 magnetic separation Methods 0.000 description 1
- 108040008770 methylated-DNA-[protein]-cysteine S-methyltransferase activity proteins Proteins 0.000 description 1
- 108091054189 miR-196a stem-loop Proteins 0.000 description 1
- 108091089775 miR-200b stem-loop Proteins 0.000 description 1
- 238000013508 migration Methods 0.000 description 1
- 230000005012 migration Effects 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 238000007479 molecular analysis Methods 0.000 description 1
- JTSLALYXYSRPGW-UHFFFAOYSA-N n-[5-(4-cyanophenyl)-1h-pyrrolo[2,3-b]pyridin-3-yl]pyridine-3-carboxamide Chemical compound C=1C=CN=CC=1C(=O)NC(C1=C2)=CNC1=NC=C2C1=CC=C(C#N)C=C1 JTSLALYXYSRPGW-UHFFFAOYSA-N 0.000 description 1
- 230000001338 necrotic effect Effects 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 201000002120 neuroendocrine carcinoma Diseases 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 108700042657 p16 Genes Proteins 0.000 description 1
- 108700025694 p53 Genes Proteins 0.000 description 1
- 208000008443 pancreatic carcinoma Diseases 0.000 description 1
- 230000037438 passenger mutation Effects 0.000 description 1
- 238000005453 pelletization Methods 0.000 description 1
- 210000004910 pleural fluid Anatomy 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 230000003449 preventive effect Effects 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 201000001514 prostate carcinoma Diseases 0.000 description 1
- 239000012264 purified product Substances 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 238000003753 real-time PCR Methods 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000000754 repressing effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 230000000391 smoking effect Effects 0.000 description 1
- 235000010267 sodium hydrogen sulphite Nutrition 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 238000010972 statistical evaluation Methods 0.000 description 1
- 201000011549 stomach cancer Diseases 0.000 description 1
- 238000003239 susceptibility assay Methods 0.000 description 1
- 230000037426 transcriptional repression Effects 0.000 description 1
- 239000000439 tumor marker Substances 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/118—Prognosis of disease development
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/154—Methylation markers
Definitions
- the present invention relates to the diagnosis of lung tumors. It provides methods that are suitable for diagnosing lung tumors on the basis of surgical samples as well as lung biopsies (here e.g. with the help of DNA microarrays) and liquid biopsies.
- Cell-free DNA (zfDNA) is used for liquid biopsies. Both particularly suitable analytical methods and particularly suitable sets of methylation markers are described.
- the invention also relates to agents suitable for diagnosing lung cancer by examining the methylation of a set of methylation markers, for example in cell-free DNA (zfDNA) from liquid biopsy samples from patients, the agent comprising oligonucleotides which can hybridize with DNA which includes methylation markers, as well as the use of these methods and means for diagnosis, for example determination, subtyping and prognostic characterization of lung tumors.
- a set of methylation markers for example in cell-free DNA (zfDNA) from liquid biopsy samples from patients
- the agent comprising oligonucleotides which can hybridize with DNA which includes methylation markers
- Lung cancer is the second most common cancer in men and women worldwide. Around 52,500 new cases are registered in Germany every year. The mean age of onset is 70 for men and 69 for women. A distinction is made between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). NSCLC is much more common and occurs in 85% of affected patients. In addition, several sub-entities are distinguished in NSCLC, the most common of which are adenocarcinomas and squamous cell carcinomas. The fact that the symptoms of the disease usually appear very late is reflected in a poor prognosis. The 5-year survival rate is 15%.
- SCLC small cell lung cancer
- NSCLC non-small cell lung cancer
- Lung carcinomas, like most other tumors, have a high degree of genomic heterogeneity.
- mutations within KRAS, EGFR, BRAF, MEK1, MET, HER2, ALK, ROS1, RET, FGFR1, DDR2, PTEN, LKB1, RB1, CDKN2A or TP53 genes can induce the development of primary lung cancer.
- passenger mutations, which can lead to various subclones, accumulate in the course of tumor evolution. This fact makes the development of a reliable early detection test based only on molecular genetic mutation analyses very difficult, which is evident from the many examples in the literature. For example, Uchida et al. performed a lung cancer screening based on typical EGFR gene mutations.
- promoters within certain tumor suppressor genes are hypermethylated, which in turn results in their transcriptional repression. This phenomenon is accompanied by the overexpression of DNA methyltransferases. Promoter hypermethylation has been described particularly frequently in the literature within the P16INK4A, RASSF1A, APC, RARB, CDH1, CDH13, DAPK, FHIT and MGMT genes (Langevin et al. [2015] Transl. Res. 165: 74-90).
- H4K20me3 is also lower in NSCLC than in healthy lung tissue (Newman et al. [2014] Nat. Methods 20: 548-554).
- aberrant ncRNA expression can occur, such as MIR196A, MIR200B, MALAT1 and HOTAIR.
- the affected patients are first subjected to a comprehensive physical examination.
- the chest is then examined using imaging methods such as X-rays or computed tomography (CT).
- CT computed tomography
- If tumors are detected, bronchoscopies are recommended, in which the lungs are thoroughly examined endoscopically and biopsies of the tumors are taken.
- The biopsies are then subjected to histological, immunohistochemical and molecular genetic analyses. During the histological examinations, it is determined whether the tumors are malignant. If this is the case, their entity is determined.
- molecular genetic and imaging methods are also used. In particular, the imaging and endoscopic procedures can be stressful for the affected patients due to the radiation exposure and invasiveness.
- the detection limit of the radiological procedure is a tumor size of 7 to 10 mm, which corresponds to cell clusters consisting of around one billion tumor cells.
- An alternative, less invasive method is based on liquid biopsies, by means of which tumors can be detected much earlier, from a size of approx. 50 million cells.
- Circulating cell-free DNA can then be isolated from the blood plasma or blood serum.
- the zfDNA arises in the course of apoptotic and necrotic processes.
- Cellular genomic DNA (gDNA) is split into fragments approx. 167 bp in length by DNases and released into the bloodstream.
- the total amount of zfDNA also contains tumor DNA.
- the amount of zfDNA can vary greatly depending on the entity or stage of the disease. However, it does contain diagnostically, therapeutically and prognostically relevant information.
- DNA methylation is of particular interest in this context.
- the DNA methylation pattern is tissue-specific and changes in the early phases of tumor evolution.
- zfDNA methylation in the blood remains stable. It is neither modified nor falsified and is therefore suitable as a biomarker in clinical diagnostics (Puszyk et al. [2009] Clin. Chim. Acta 400: 107-110).
- the diagnostic potential of DNA methylation is also evident from the example of the “Epi proLung” assay (“Epigenomics AG”, Germany).
- the zfDNA methylation pattern of the SHOX2 and PTGER4 genes is analyzed. With a specificity of 90%, the sensitivity is 67% (Weiss et al. [2017] J. Thorac. Oncol. 12: 77-84).
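The two figures quoted above can be related to raw counts with a short calculation. The following is only an illustrative sketch: the cohort sizes (100 patients, 100 controls) are hypothetical and were chosen so that the result reproduces the quoted 67% sensitivity at 90% specificity; they are not taken from the cited study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of tumor patients that test positive
    specificity = tn / (tn + fp)  # fraction of controls that test negative
    return sensitivity, specificity


# Hypothetical cohort: 100 lung cancer patients and 100 tumor-free controls.
sens, spec = sensitivity_specificity(tp=67, fn=33, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this hypothetical cohort, 33 of 100 tumors would be missed, which illustrates why such a sensitivity is considered insufficient for screening.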
- the sensitivity of the “Epi proLung” test is therefore not sufficient for reliable lung cancer screening. So far there are no other methods based on liquid biopsies that enable reliable, preventive early detection of lung cancer.
- One object of the invention is a method of diagnosing lung cancer which comprises determining the methylation of a set of methylation markers in a sample from a patient, for example, by examining zfDNA from a liquid biopsy.
- the sample can also be a tissue sample, e.g. a solid tissue sample from a tumor or from a tissue in which a tumor may be present.
- the tissue sample can come from a biopsy or surgical material from lung tissue.
- Pleural fluid can also be examined.
- the method according to the invention is characterized by the fact that, owing to the selection of the markers, it is particularly well suited for examining tissue samples taken during an operation, lung biopsy tissue and zfDNA from a liquid biopsy.
- the invention provides a method for diagnosing lung cancer in which the methylation of a set of methylation markers, for example in zfDNA from a liquid biopsy sample of a patient, is determined, wherein optionally an alignment against a reference genome is carried out with the Segemehl algorithm.
- the invention also provides a method for diagnosing lung cancer in which the methylation of a set of methylation markers, for example in zfDNA from a liquid biopsy sample of a patient, is determined, the methylation of methylation markers in the genes SERPINB5, DOCK10, PCDHB2, HIF3A, FGD5, RCAN2, HOXD12, OCA2, SLC22A20, FADL-1, NRXN1, ACOXL, FAM53A, UBE3D and AUTS2 being determined.
- the circulating cell-free DNA (zfDNA) from liquid biopsies e.g. from plasma, blood, or serum, preferably from plasma
- the total amount of circulating DNA also contains the tumor DNA, which contains all therapeutically and prognostically relevant information about the genetic and epigenetic characteristics of the tumor.
- the invention provides both preferred methods for diagnosing lung cancer on this basis and also preferred sets of methylation markers.
- the present invention clearly shows (see Section 2.1.3) that the DNA methylation patterns of the zfDNA from the plasma and of the gDNA from a primary tumor correlate only to a limited extent. In fact, the total amount of zfDNA does not only contain DNA from the lungs or a tumor, but also DNA from other tissues and organs.
- the aim was therefore, in contrast to the approaches known in the prior art, the determination of universal methylation signatures, by means of which the most diverse (including complex) patient samples (including those with strongly varying tumor cell content) can be robustly and reliably examined.
- This has been achieved with the present invention. It is advantageous according to the invention that the identified markers deliver good results both with tissue samples, for example solid tissue samples from tumor tissue, and with liquid biopsies, and are thus suitable for diagnosing lung cancer from different types of samples.
- DNA methylation signatures in 40 malignant lung tumors and their corresponding controls were examined.
- An analysis of DNA methylation signatures in the blood plasma of nine patients was then carried out. Five of them suffered from adenocarcinoma and four from squamous cell carcinoma of the lung. The other patients, on the other hand, were free from malignancies and formed the control cohort.
- additional data sets from several studies made available were evaluated, which made it possible to identify further tumor-specific and prognostic CpG loci.
- the set of methylation markers synthesized on this basis, also known as the plasma panel (see Table 1), was then validated in a pilot study.
- This set of methylation markers comprises a large number of regions that are differentially methylated in zfDNA, for example, and surprisingly allow a specific statement about the presence of a tumor, the tumor entity, the tumor stage and / or the prognosis.
- the invention therefore relates to a method for diagnosing lung cancer in which the methylation of a set of methylation markers is determined in a sample from the patient, the set of methylation markers being selected from the group consisting of the regions listed in Tables 1a, 1b and 1c and comprising at least 60 regions, preferably at least 64 regions, more preferably at least 340 or at least 350 regions, most preferably at least 630 regions.
- methylation markers can be determined to determine the presence of a tumor.
- the invention also relates to a method for diagnosing lung cancer in which the methylation of a set of methylation markers is determined in a sample from the patient, where the set of methylation markers is selected from the group consisting of the regions listed in Tables 1a, 1b and 1c and comprises at least 134 regions, preferably 138 regions, more preferably at least 240 regions, most preferably at least 247 regions.
- methylation markers can be determined to determine the entity of a tumor.
- the set of methylation markers can comprise at least 194 regions, preferably at least 600 regions, optionally all 630 regions.
- at least 60, preferably at least 64, methylation markers can be determined in order to determine the presence of a tumor, e.g. methylation markers from Table 1a, and at least 134, preferably 138, methylation markers can be determined in order to determine the entity of the tumor, e.g. methylation markers from Table 1b.
- the more methylation markers are determined, the more accurate the analysis. Therefore, at least 150, preferably at least 340 or even 350 methylation markers can be determined to determine the presence of a tumor, e.g. methylation markers from Table 1a, and at least 240 or even 247 methylation markers can be determined to determine the tumor entity, e.g. methylation markers from Table 1b.
- at least 15, preferably at least 30 or even 33 methylation markers from Table 1c can also be determined in order to determine the prognosis.
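The minimum set sizes stated above can be checked mechanically for a candidate marker selection. The sketch below is illustrative only: the region identifiers and the function name `panel_supports` are hypothetical, and the assignment of Table 1a to presence, Table 1b to entity and Table 1c to prognosis follows the "e.g." wording of the text rather than a strict rule.

```python
# Minimum numbers of regions per diagnostic question, as stated in the text.
MIN_REGIONS = {"presence": 60, "entity": 134, "prognosis": 15}

# Assumed mapping of source table to diagnostic question.
QUESTION_OF_TABLE = {"1a": "presence", "1b": "entity", "1c": "prognosis"}


def panel_supports(selected, table_of_region):
    """Report which questions a selected marker set is large enough to address.

    selected: iterable of region identifiers (hypothetical IDs).
    table_of_region: dict mapping region ID -> "1a", "1b" or "1c".
    """
    counts = {question: 0 for question in MIN_REGIONS}
    for region in set(selected):  # count each region only once
        table = table_of_region.get(region)
        if table in QUESTION_OF_TABLE:
            counts[QUESTION_OF_TABLE[table]] += 1
    return {q: counts[q] >= MIN_REGIONS[q] for q in MIN_REGIONS}


# Made-up example: 64 regions from Table 1a, 100 from 1b, 15 from 1c.
tables = {f"a{i}": "1a" for i in range(64)}
tables.update({f"b{i}": "1b" for i in range(100)})
tables.update({f"c{i}": "1c" for i in range(15)})
print(panel_supports(tables.keys(), tables))
```

With these made-up counts, the selection meets the minimum for presence and prognosis but not for entity determination.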
- the invention therefore relates to a method for diagnosing lung cancer in which the methylation of a set of methylation markers in a sample of a patient, for example in zfDNA from a liquid biopsy sample of a patient, is determined, the set of methylation markers including at least 60 regions selected from the group consisting of:
- the aforementioned methylation markers are the markers listed in Table 1a, which were only identified in the case of zfDNA. In this analysis, the presence of a tumor is preferably checked, the set of methylation markers optionally including all regions of the group.
- the set of methylation markers can include at least 340 regions, selected from the group consisting of the regions listed in table 1a, the set of methylation markers preferably including all regions listed in table 1a.
- the set of methylation markers comprises at least 134 regions selected from the group consisting of
- the aforementioned methylation markers are the markers mentioned in Table 1b, which were only identified in the case of zfDNA.
- the entity of a tumor is preferably checked, it being possible in particular to differentiate between adenocarcinoma and squamous cell carcinoma.
- the set of methylation markers can include all regions of the group.
- the set of methylation markers can also include at least 240 regions selected from the group consisting of the regions listed in Table 1b.
- the set of methylation markers preferably comprises all regions of the group listed in Table 1b.
- the significance of the analysis is greatest when the set of methylation markers comprises at least 620 regions from a group which consists of all regions listed in Table 1, in particular when determining the prognosis, preferably when the set of methylation markers includes all regions of the group.
- differentially methylated regions, for example the regions defined in Table 1a, 1b and / or 1c, or differentially methylated positions can serve as methylation markers.
- the analysis of entire regions leads to more reliable results, since specific positions do not necessarily have the same informative value for individual patients. In return, an analysis of specific positions is possible with less effort, for example using an array, and is therefore beneficial if an inexpensive diagnosis is to be made. The selection is therefore based on a balance between the reliability required in each case and the possible effort.
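The difference between analyzing a region and analyzing a single position can be illustrated as follows. This is a minimal sketch under stated assumptions: a region value is taken here as the plain mean of the per-CpG methylation rates inside the region, which is one common summary and not necessarily the one used in the source; the coordinates and rates are made up.

```python
from statistics import mean


def region_methylation(cpg_rates, start, end):
    """Mean methylation rate of all CpG positions within [start, end].

    cpg_rates: dict mapping genomic position -> methylation rate (0.0 to 1.0).
    Returns None if the region contains no measured CpG.
    """
    values = [rate for pos, rate in cpg_rates.items() if start <= pos <= end]
    return mean(values) if values else None


# Made-up per-position rates; position 150 deviates from its neighbors.
cpg_rates = {100: 1.0, 120: 0.5, 150: 0.0, 400: 1.0}
print(region_methylation(cpg_rates, 90, 160))  # region-level value
print(cpg_rates[150])                          # single-position value
```

A single deviating position dominates a position-based readout but is averaged out at region level, which is the trade-off between reliability and effort described above.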
- both types of methylation markers can be used for diagnosis at the same time.
- the amount of sample present also plays a role, since primarily tissue samples from operations contain sufficient amounts of DNA to carry out an analysis of individual methylated positions via an array.
- methylation markers identified in this context are partly in the genes SERPINB5, DOCK10, PCDHB2, HIF3A, FGD5, RCAN2, HOXD12, OCA2, SLC22A20, FADL-1, NRXN1, ACOXL, FAM53A, UBE3D and AUTS2. These genes have never been specifically described in relation to lung cancer or certain NSCLC entities.
- SERPINB5 is, for example, a known oncogene (Lei et al. [2011] Oncol. Rep. 26: 1115-1120).
- HOX genes are expressed aberrantly in many types of cancer (Bhatlekar et al. [2014] J. Mol. Med. 92: 811-823).
- Dysregulation of RCAN2 leads to proliferation of tumor cells (Niitsu et al. [2016] Oncogenesis 5: e253).
- Altered expression of DOCK10 resulted in the migration of melanoma cells in some studies (Gadea et al. [2008] Curr. Biol. 18: 1456-1465).
- HIF3A and FGD5 are important angiogenesis regulators and thus play a decisive role during tumor evolution (Jackson et al. [2010] Expert Opin. Ther. Targets 14: 1047-1057; and Kurogane et al. [2012] Arterioscler. Thromb. Vasc. Biol. 32: 988-996).
- the DNA methylation of some PCDHB2-CpG loci is associated with a poor prognosis in neuroblastoma patients (Abe et al. [2005] Cancer Res. 65: 828-834).
- Altered metabolism, for example, is a characteristic of malignant tumors, in which the FADL-1 fatty acid transporter as well as some SLC transporters can play an important role (Lin et al. [2015] Nat. Rev. Drug Discov. 14: 543-560; and Black [1991] J. Bacteriol. 173: 435-442).
- UBE3D codes for a ubiquitin protein ligase.
- Several studies have shown that some ubiquitin protein ligases can play an important role during tumor evolution (including Lisztwan et al. [1999] Genes Dev. 13: 1822-1833).
- AUTS2 and NRXN1 are neuronal genes.
- AUTS2 overexpression was detected in liver metastases (Oksenberg & Ahituv [2013] Trends Genet. 29: 600-608).
- NRXN1 could be responsible for nicotine addiction (Ching et al. [2010] Am. J. Med. Genet. B. Neuropsychiatr. Genet. 153B: 937-947).
- Increased expression of ACOXL has already been described in prostate carcinomas (O'Hurley et al. [2015] PLoS One 10: e0133449).
- the invention thus provides for the first time a method for diagnosing lung cancer in which the methylation of a set of methylation markers, for example in zfDNA from a liquid biopsy sample of a patient, is determined, the methylation of methylation markers in the genes SERPINB5, DOCK10, PCDHB2, HIF3A, FGD5, RCAN2, HOXD12, OCA2, SLC22A20, FADL-1, NRXN1, ACOXL, FAM53A, UBE3D and AUTS2 being determined.
- methylation markers preferably include the methylation markers mentioned in Table 2, in particular if the presence of a lung carcinoma is to be determined.
- the methylation markers include the methylation markers listed in Table 3. Both the methylation markers mentioned in Table 2 and in Table 3 are preferably determined in order to answer both questions.
- the methylation markers mentioned in Table 4 can also be analyzed, which also allows conclusions to be drawn about the stage of the tumor.
- the invention thus also provides a method for diagnosing lung cancer in which the methylation of a set of methylation markers, for example in zfDNA from a liquid biopsy sample of a patient, is determined, the set of methylation markers including the following 10 positions (see also Table 2):
- the set of methylation markers can include the following 10 positions (see also Table 3):
- markers are particularly meaningful when the RT algorithm is used for analysis.
- the entity of a tumor can be identified with these markers.
- the set of methylation markers can also include all positions.
- the SVM algorithm can be used for analysis. For example, regions that could not be validated using samples from early stages of lung cancer could be metastasis-specific signatures. These regions were therefore used for the calculation of the staging parameter, i.e. for the calculation of the stage. So far, the staging parameter described in this work can differentiate the late stages of lung cancer from early stages with 80% accuracy. In general, the staging parameter should only be used as a reference.
- the lung cancer can be NSCLC or SCLC, preferably NSCLC.
- the NSCLC is preferably an adenocarcinoma or squamous cell carcinoma. It has been shown that markers according to the invention can differentiate between these entities and are thus suitable for differential diagnosis.
- the diagnosis according to the invention allows a statement to be made about the presence of a tumor, about the entity of a tumor (in particular the differentiation between adenocarcinoma and squamous cell carcinoma), about the tumor stage and / or about the prognosis. Most important is the statement about the presence and entity of the tumor. Further statements can optionally also be made by means of supplementary methods if the presence of a tumor has been determined according to the invention.
- the method according to the invention also allows a statement to be made about the presence of a tumor, about the entity of a tumor (in particular the distinction between adenocarcinoma and squamous cell carcinoma) and about the tumor stage, and preferably about the prognosis.
- diagnosis therefore includes a differential diagnosis.
- the method according to the invention is also suitable for early detection of lung cancer, that is to say also for diagnosis in stage I or II.
- this diagnosis is also possible on the basis of a liquid biopsy sample, e.g. a blood sample, so that different tissue does not necessarily have to be removed from the patient.
- a liquid biopsy sample from a patient is analyzed.
- the method according to the invention can advantageously also be carried out reliably on the basis of lung biopsy tissue.
- paired biopsy tissue from lung biopsies of the presumably diseased and the presumably healthy lung of a patient in parallel.
- usually only the tumor or the suspicious tissue is biopsied, whereby previously collected data records of healthy tissue can serve as a reference.
- the patient is a human.
- the word patient is generally used synonymously with subject. It may be a patient with symptoms that suggest the patient may have a lung tumor. However, it can also be a subject without symptoms.
- the subject or patient can be a risk patient for a lung tumor. This includes subjects who, due to certain risk factors and / or their lifestyle (e.g. smoking, use of e-cigarettes or other increased exposure to carcinogenic agents, symptoms) have an increased risk of lung cancer and / or have radiological abnormalities.
- the patient can also be a patient with a lung tumor that has already been treated, for example an operated one, it being possible to investigate the recurrence of a tumor and / or a metastasis.
- the zfDNA can be extracted from a variety of body fluids.
- the liquid biopsy sample can be blood, plasma, serum, sputum, bronchial fluid or pleural effusion. It is preferably derived from blood, for example serum or plasma, preferably plasma. Since pleural effusion only occurs in the course of the disease, this material is particularly suitable for the detection of later stages.
- the zfDNA extraction from plasma or serum is significantly faster and cheaper than from urine, which makes these materials more interesting for screening.
- the zfDNA stability is relevant, because zfDNA is more stable in plasma than in serum.
- the invention provides means which are suitable for diagnosing lung cancer with a method according to the invention by examining the methylation of a set of methylation markers, for example in zfDNA from a liquid biopsy sample of a patient.
- the agents are preferably also used for diagnosing lung cancer with a method according to the invention by examining the methylation of a set of methylation markers in another sample of a patient, in particular a solid tissue sample from a tumor or a tissue in which a tumor is suspected or from a lung biopsy.
- the agent comprises oligonucleotides which can hybridize with DNA (e.g. zfDNA or DNA derived therefrom, e.g. by bisulfite conversion) which comprises or consists of methylation markers according to the invention. Methylation markers from the subgroups mentioned in the claims are preferred here. “Can hybridize” is to be understood as a specific hybridization, in particular under stringent conditions, such as those described in the experimental section.
- Suitable oligonucleotides are, for example, oligonucleotides which can hybridize with the regions mentioned in Table 1a, 1b and / or 1c, preferably in Table 1a, because they are complementary to these regions or to a fragment thereof which contains at least 20 nucleotides and which, for example when the oligonucleotides are coupled to a solid support, preferably comprises 60-352, optionally 100-190 or 135-157 nucleotides.
- the length depends, among other things, on the base composition or sequence and the hybridization temperature as well as the technology selected. Since it is double-stranded DNA, the oligonucleotides can be complementary to the strand in the 5'-3 'direction or to the strand in the 3'-5' direction, or both.
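The statement that an oligonucleotide can be complementary to either strand can be made concrete with a reverse-complement calculation. The sketch below is illustrative only: the six-base target sequence is made up, and real capture probes for bisulfite-converted DNA would additionally have to account for the C-to-T conversion, which is not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Made-up target on the top strand, written 5'->3'.
top_strand = "ATGCCG"

# An oligonucleotide that hybridizes with the top strand is its reverse complement.
probe_for_top = reverse_complement(top_strand)
# An oligonucleotide that hybridizes with the bottom strand equals the top-strand sequence.
probe_for_bottom = reverse_complement(probe_for_top)

print(probe_for_top)     # CGGCAT
print(probe_for_bottom)  # ATGCCG
```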
- oligonucleotides cannot hybridize with regions other than those mentioned in the tables, which is also a prerequisite for specific hybridization.
- suitable oligonucleotides which can hybridize with the regions on chromosome 1 mentioned in Table 1a, 1b and 1c are listed in Table 5.
- the person skilled in the art is able to select suitable oligonucleotides for other markers on the basis of the information disclosed herein about the markers.
- Such oligonucleotides can optionally comprise further components, for example spacer or linker regions.
- the oligonucleotides according to the invention can, for example, be coupled to a solid support, or are oligonucleotides which are coupled to a solid support. Such a coupling is possible e.g. via adapters or tags. One option for this is coupling to biotin, which can bind (or is already bound) to streptavidin or avidin, which is coupled to the solid support.
- the solid support can be, for example, a gene chip, a bead, for example a magnetic bead, or a column matrix.
- the carrier thus allows simple separation of the hybridized DNA.
- magnetic beads are described which are coupled to oligonucleotides via a streptavidin-biotin bond, which specifically hybridize with the regions mentioned in Table 1 and can be used as capture probes.
- the agents according to the invention optionally comprise 638 oligonucleotides, e.g. capture probes, which can hybridize with all of the methylation markers mentioned in Table 1.
- the oligonucleotides according to the invention can also be a kit comprising PCR primers for amplifying regions which comprise the methylation markers or (in particular in the case of regions from Table 1) consist of them.
- PCR primers are preferably about 12-40, optionally 15-25, nucleotides in length, which can hybridize with the regions mentioned.
- Such a kit can also comprise blocking oligonucleotides or detection probes which, after bisulfite conversion, can specifically bind to previously methylated or unmethylated DNA.
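Why a probe can distinguish previously methylated from unmethylated DNA after bisulfite conversion can be shown with a small in-silico conversion. This sketch assumes the standard chemistry (unmethylated cytosine is deaminated to uracil and read as thymine after amplification, while methylated cytosine is protected); the sequence and the methylated position are made up, and only one strand is modeled.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion plus amplification on a single strand.

    Unmethylated C is read as T; methylated C (0-based index listed in
    methylated_positions) remains C. All other bases are unchanged.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


sequence = "ACGTCGAC"                       # made-up fragment with two CpG sites
print(bisulfite_convert(sequence, {1}))     # ACGTTGAT (CpG at index 1 methylated)
print(bisulfite_convert(sequence, set()))   # ATGTTGAT (fully unmethylated)
```

The two outputs differ at the methylated CpG, so an oligonucleotide matching one of them would not match the other perfectly.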
- Such oligonucleotides can be used, for example, in PCR-based methods according to the invention.
- An analysis by PCR is particularly useful if only a limited number of markers is to be analyzed, e.g. the markers in the genes mentioned above.
- the markers defined in Table 2 are preferably analyzed with this method, alternatively or additionally also the markers defined in Table 3, so that correspondingly suitable oligonucleotides can be selected.
- One or more primers suitable for multiplex PCR can optionally be selected.
- Probes for detection are preferably marked with suitable dyes.
- the invention also provides a method in which the agents according to the invention are used for a diagnosis of lung cancer in a sample from a patient, optionally zfDNA from a liquid biopsy sample from a patient (also: subject) being examined. Due to the selection of markers, however, other samples, e.g. from biopsies and bronchoscopies or from tissue samples taken during an operation, can also be examined with the agents according to the invention, in particular with those that include markers from Table 1a, 1b and / or 1c, preferably all markers from Table 1a and 1b, optionally also from Table 1c. Biopsies can also be taken from the outside, possibly with imaging.
- the bioinformatic evaluation pipeline poses a further problem.
- the conventional gDNA-WGBS libraries are usually aligned with the “Bismark” algorithm after processing. The results of the alignment can then be analyzed by numerous evaluation pipelines, with genome-wide DNA methylation signatures being extracted.
- the WGBS experiment of circulating DNA carried out in the exemplary embodiments was the first of its kind. It turned out that the zfDNA libraries have a different complexity and fragment distribution than conventional gDNA libraries (see Section 1.1.2.5). This could be the reason why the “Bismark” algorithm most frequently used in the prior art delivered an unsatisfactory mapping efficiency of only 70%. For this reason, further algorithms were tested. The best results, with a mapping efficiency of at least 98%, were provided by the “Segemehl” algorithm (see Section 1.1.2.5).
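Mapping efficiency is understood here as the fraction of sequenced reads that the aligner places on the reference genome. The sketch below only illustrates that ratio; the read counts are hypothetical and were chosen to reproduce the 70% and 98% figures quoted above, and the function name is not part of any aligner.

```python
def mapping_efficiency(mapped_reads, total_reads):
    """Fraction of reads that were aligned to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


# Hypothetical library of one million reads.
total = 1_000_000
print(f"{mapping_efficiency(700_000, total):.0%}")  # 70%
print(f"{mapping_efficiency(980_000, total):.0%}")  # 98%
```

For the same hypothetical library, the difference corresponds to 280,000 additional reads that become available for methylation calling.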
- the Segemehl algorithm is used in particular for aligning (that is, for arranging) the sequencing information of the zfDNA with respect to a reference genome.
- the Segemehl algorithm can be found at https://www.bioinf.uni-leipzig.de/Software/segemehl/ and is described in more detail, e.g., in Otto et al. (Otto et al. [2012] Bioinformatics 28: 1698-1704). As in the example below, version 0.2.0 can be used, but also another version, such as 0.3.4.
- the invention also relates to a method according to the invention for diagnosing a lung tumor, which comprises the following steps: a. Extraction of zfDNA from a liquid biopsy sample or genomic DNA from a lung biopsy tissue sample or a solid tissue sample, which is taken, for example, during an operation, optionally from zfDNA from a liquid biopsy sample, b. Carrying out a bisulfite conversion, c. Creation of a Whole Genome Bisulfite Sequencing Library, d. Enrichment of the DNA regions comprising the defined metabolism markers, these being preferably brought into contact with an agent according to the invention for diagnosis, e. Sequencing the enriched DNA regions, f. alignment of the sequencing data against a reference genome using the Segemehl algorithm, g. Calculation of the methylation rates.
- the converted DNA e.g. zfDNA
- Library preparation takes place in two steps.
- a WGBS library is created from each sample, which contains information about the entire methylome or the zfDNA methylome of the corresponding patient.
- these can be enriched from the entire methylome. This can be done as a second step based on the Whole Genome Bisulfite Sequencing Library.
- methylation markers can be used for the enrichment, e.g. the markers from Table 1a identified for the first time in the context of the present work in zfDNA, all markers from Table 1a, alternatively or additionally the markers from Table 1b and / or 1c .
- Capture probes for example, can be used for enrichment. These capture probes can cover the entire plasma panel or parts of it (see section 1.2.1).
- the enriched library can be subjected to a QC and quantified (see Section 1.1.2.2). It is preferably sequenced, e.g. on the "MiSeq" ("Illumina", USA) (see section 1.2.2).
- the sequencing data can, for example, be saved in the "FastQ” format and then analyzed (see e.g. Section 1.2.3). It is preferable not to analyze the entire methylome, but rather only defined methylation markers.
- Preferred methylation markers are, for example, the 638 regions determined in Table 1 (plasma panel).
- the Segemehl algorithm in particular is used for the analysis against a reference genome.
- the methylation patterns are then calculated.
- the format of the “Segemehl” output files is different from the typical “Bismark” format. Therefore, if necessary, a suitable analysis pipeline compatible with “Segemehl” can be used.
- the “Bisulfite Analysis Toolkit” can be mentioned as an example in this context. This modular software can be used on numerous computing clusters and expanded with additional software and custom scripts. To identify the differentially methylated markers suitable for lung cancer diagnosis, the analysis pipeline can be supplemented with custom bioinformatics scripts, for example those disclosed herein.
- As an alternative to the diagnostic method using sequencing, it is also possible to carry out an analysis using PCR on the basis of the results according to the invention. This is particularly relevant for smaller subgroups of the specific markers, e.g. if initially a sample from a patient is only to be examined for the presence of a tumor and / or the determination of the tumor entity.
- suitable primers can be used to amplify regions of the zfDNA, for example, and to detect the positions mentioned in Table 2 and / or 3. This can be done from purified, bisulfite-converted DNA, e.g. using real-time PCR.
- multiplex PCRs or parallel approaches can also be used.
- beta-actin can be analyzed to check whether the amount of total DNA in the sample is sufficient.
- e.g. zfDNA from a liquid biopsy, preferably from plasma can be purified, bisulfite-converted and then purified again.
- Blockers and detection probes can also be used for the PCR. The blockers specifically recognize the bisulfite-converted, unmethylated sequences within the regions and block their amplification, so that the methylated sequences are amplified preferentially. Methylation-specific probes then detect only the methylated sequences that were amplified during the PCR.
- the methylation patterns determined in the sample of a patient can be correlated with the patterns known herein for tumors, optionally for a specific entity and / or a specific stage as specified in the tables, for example. According to the invention, this allows statements to be made about the presence, the entity, the stage and / or the prognosis of a lung tumor and thus allows a reliable, extended diagnosis. According to the invention, this diagnosis can be used to select a therapy in the presence of a tumor or to decide on the initiation of a therapy.
- the invention thus also relates to a method for treating a lung tumor, which comprises a diagnostic method according to the invention, this tumor being treated if a tumor is present.
- the entity of the tumor can also be determined, whereby a therapy suitable, for example, for an adenocarcinoma or a squamous cell carcinoma can be selected.
- Suitable therapy may include, for example, administration of suitable drugs or combinations of drugs and / or radiation.
- the diagnostic method can be used to carry out further diagnostic steps if a tumor is detected, such as taking a solid biopsy and / or imaging methods.
- the invention also relates to a use of a method according to the invention or an agent according to the invention for diagnosing lung cancer, the diagnosis allowing a statement to be made about the presence of a tumor, about the entity of a tumor, about the tumor stage and / or about the prognosis, preferably about the presence and entity of the tumor, optionally about everything at the same time.
- an NGS panel that is based, among other things, on genome-wide zfDNA methylation signatures from plasma.
- the method according to the invention is explicitly characterized by the fact that, due to the selection of the markers, it is particularly suitable for examining zfDNA from a liquid biopsy, but also, for example, for examining tissue samples taken during an operation or lung biopsy tissue.
- the plasma panel differentiated malignant lung tumors from stage I with 100% accuracy, identified the most common NSCLC subtypes and provided further information regarding the determination of the stage of the lung tumors (staging).
- Fig. 1 The WGBS sequencing data were evaluated in several steps. A. First of all, the data was subjected to a QC (e.g. with FastQC) and then processed.
- Fig. 2 Processed sequencing data were aligned against the “HG19” reference genome using the “Bisulfite Analysis Toolkit” using the Segemehl algorithm. In addition, DNA methylation rates and differentially methylated regions were detected and overview graphics were created.
- Fig. 4 The functional principle of a classifier.
- an annotation file is generated from the data of the validation cohort (12 patients), which is also loaded into the "Qlucore Omics Explorer” software with the determined DNA methylation rates of the regions contained in the plasma panel (see Table 1).
- the DNA methylation data (variables) and the annotation file are used by implemented algorithms (“k-Nearest Neighbors Algorithm” (kNN), “Support Vector Machines” (SVM) and “Random Trees” (RT)) to create an optimal model. This process is known as predictive modeling.
- kNN k-Nearest Neighbors Algorithm
- SVM Support Vector Machines
- RT Random Trees
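The principle of a kNN classifier can be illustrated with a minimal sketch. It is a generic k-nearest-neighbors vote written for illustration only and does not reproduce the implementation in the "Qlucore Omics Explorer" software; the function name `knn_predict`, the two-marker feature vectors and the labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    training: list of (feature_vector, label) pairs, where each feature
    vector holds the methylation rates of the selected markers.
    """
    # Euclidean distance from the query to every training sample.
    distances = sorted(
        (math.dist(features, query), label) for features, label in training
    )
    # Majority vote among the k closest samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical methylation rates of two markers per sample.
training = [
    ((0.10, 0.20), "control"),
    ((0.15, 0.10), "control"),
    ((0.20, 0.25), "control"),
    ((0.80, 0.90), "tumor"),
    ((0.85, 0.70), "tumor"),
    ((0.90, 0.95), "tumor"),
]
prediction = knn_predict(training, (0.82, 0.80))
```

In practice the number of neighbors k and the marker set would be chosen during model building on the annotated cohort.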
- Fig. 6 The DNA methylation rates determined with the “BAT_calling” and “BAT_filter_vcf” modules were loaded into the “BAT_summarize” module of the “Bisulfite Analysis Toolkit”.
- A. The scatter plot clearly shows that the lung cancer group can be differentiated from the control group (tumor-free patient cohort) based on the DNA methylation pattern.
- B. The middle and C. the staggered displays of the DNA methylation rates per group illustrate the genome-wide hypermethylation of the lung carcinoma group compared to the control group.
- the zfDNA methylation patterns determined were normalized and subjected to a hierarchical cluster analysis. Thereby A. 18,000 for the lung carcinoma and B. 44,000 for the respective entity specific differentially methylated CpG loci were identified (adenocarcinoma (A.K.), squamous cell carcinoma (P.K.)).
- Fig. 8 “Pearson” correlation analysis of the DNA methylation values detected with both methods (HM 450K and WGBS) (adenocarcinoma (A.K.), squamous cell carcinoma (P.K.)).
- Fig. 9 The zfDNA methylation rates determined were loaded into the “Qlucore Omics Explorer” software and analyzed using the following classification algorithms: “k-Nearest Neighbors Algorithm” (kNN), “Support Vector Machines” (SVM) and “Random Trees” (RT). A high z value indicates strong methylation. A. By analyzing 10 differentially methylated positions (markers), the kNN algorithm was able to distinguish healthy (control) patients from malignant lung cancer patients. Both the early (I, II) and the late (III, IV) stages of lung cancer were classified with 100% accuracy (light bar at the top of the figure: malignant lung tumor; dark bar (3 columns on the left): control).
- the late tumor stages (III, IV) could be identified with 80% accuracy with the SVM algorithm; 523 positions were analyzed (light bars at the top of the figure (4 columns on the left): early stage (I, II); dark bars at the top of the figure (5 columns on the right): late stage (III, IV)).
- the evaluated positions are partly more methylated in the early and partly in the late stages.
- a suitable panel that is to say a set of methylation markers, for DNA methylation analysis in blood plasma was developed within the scope of the invention.
- the set of methylation markers is therefore also referred to as a plasma panel.
- the plasma panel was developed in three independent approaches. The first approach examined whether DNA methylation is generally suitable as a biomarker for lung cancer diagnosis (see Section 1.1.1). For this purpose, 40 lung carcinomas and their corresponding controls were analyzed using the "Illumina Infinium Human Methylation450K BeadChip" (HM 450K). The method identified clear, tumor-specific DNA methylation signatures. Next up were as in the section
- the method detected several thousand aberrantly methylated CpG loci that were not only tumor-specific but also entity-specific. From these, the most suitable regions for the differentiation for the plasma panel were selected (see Section 1.1.2.5.5). Since the diagnosis according to the invention should preferably be made on the basis of liquid biopsies, the methylation markers identified here are of particular importance. In the third approach, the plasma panel was supplemented with 59 tumor-specific and prognostically relevant CpG loci from further studies (see Section 1.1.3).
- the HM 450K data set contained information on the methylation status of 40 lung carcinomas (adenocarcinomas and squamous cell carcinomas) and their corresponding controls.
- the data set was evaluated with the "Qlucore Omics Explorer” software (version 3.2, “Qlucore”, Sweden) and resulted in:
- the circulating cell-free DNA is used according to the invention for the non-invasive diagnosis of solid tumors. If a patient suffers from a malignant tumor disease, the total amount of circulating DNA also contains the tumor DNA, which contains all therapeutically and prognostically relevant information about the genetic and epigenetic characteristics of the tumor. Therefore, the zfDNA has to be isolated from the blood or blood plasma. Since zfDNA can only be extracted from the blood plasma in a very small amount, a method was chosen that enriches the zfDNA very specifically and efficiently without isolating further components of the plasma.
- the “PME free-circulating DNA Extraction Kit” (“Analytik Jena”, Germany, see Section 1.1.2.1) can be used. It contains a polymer which complexes only very specific short-stranded dsDNA fragments. The polymer-zfDNA complex is then precipitated and purified. After the purification, the complex can be dissolved. The DNA released in the process is purified from the polymer and concentrated in further steps, for example by binding to a silica column. Other methods based, for example, on the same or similar active principles can also be used. The resulting product is very clean and can also be used for sensitive NGS-based analysis methods such as WGBS.
- Blood plasma was prepared and shipped on dry ice. For this purpose, the whole blood was centrifuged for 10 minutes at 1,500 g within 30 minutes after it was taken. After centrifugation, the plasma supernatant was carefully pipetted off, distributed to “CryoPure” vessels (“Sarstedt AG & Co”, Germany) and immediately frozen at -80 ° C.
- the frozen plasma samples were slowly thawed under lukewarm water and then centrifuged at 4,500 g for 10 minutes. The pellet was discarded, the clear supernatant transferred to a 10 ml tube and processed with the “PME free-circulating DNA extraction kit” according to the manufacturer's instructions.
- the zfDNA was quantified fluorometrically using the “Qubit dsDNA High Sensitivity Assay Kit” (“Thermo Fisher Scientific”, USA). For this purpose, 1 µL of the sample was mixed with 198 µL of “Qubit dsDNA HS Buffer” and 1 µL of “Qubit dsDNA HS Reagent”, incubated for 2 minutes and then measured.
- the “Qubit dsDNA HS Reagent” is a dye that generates a very weak fluorescence signal under normal conditions. In the presence of double-stranded DNA (dsDNA), however, it intercalates into the dsDNA, changes its structure and generates a strong fluorescence signal. Neither single-stranded DNA (ssDNA) nor RNA is bound. The signal intensity thus correlates exclusively with the amount of dsDNA present in the sample.
- the quality of the extracted zfDNA was analyzed using the “Agilent 2100 High Sensitivity DNA Kit” (“Agilent”, USA).
- the method was capillary gel electrophoresis.
- the “Gel-Dye Mix” had to be prepared. 300 µL of the gel matrix were mixed with 15 µL of the dye concentrate and placed on a “spin filter”. Centrifugation was carried out for 10 minutes at 2240 g.
- the DNA chip was placed in the “priming station” and equilibrated. For this purpose, 9 µL of the “Gel-Dye Mix” were pipetted into the well provided for the equilibration process.
- the plunger of the “priming station” was set to one milliliter. After the priming station was firmly closed, the plunger was pressed down for one minute. Finally, the remaining wells of the chip were loaded according to the manufacturer's instructions. The chip was incubated for 1 min and measured immediately afterwards. During the incubation period, a fluorescent dye contained in the “Gel-Dye Mix” intercalated between the bases of the dsDNA. The dsDNA fragments were then drawn through the microscopic capillaries of the “Agilent 2100 Bioanalyzer” (“Agilent”, USA), separated according to fragment size and detected.
- DNA is subjected to genome-wide PCR-based amplification.
- the DNA polymerases cannot differentiate between cytosines and 5-methylcytosines, so that all 5-methylcytosines are replaced by cytosines during the reaction. The newly synthesized strands are not re-methylated.
- the sample is subjected to a treatment with sodium bisulfite before the PCR.
- This process is known as bisulfite conversion, during which all unmethylated cytosines are converted into uracils.
- the methylated cytosines remain unchanged under the selected reaction conditions.
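The effect of the bisulfite conversion on the read-out sequence can be illustrated with a minimal in-silico sketch. It assumes a single strand, complete conversion and amplification by PCR, so that every uracil is read as thymine; the function name `bisulfite_convert` and the example sequence are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the sequence read after bisulfite treatment and PCR.

    Unmethylated cytosines are converted to uracil and read as T after
    amplification; methylated cytosines (0-based indices) remain C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")  # unmethylated C -> U -> read as T
        else:
            converted.append(base)  # methylated C and all other bases unchanged
    return "".join(converted)


# The cytosine at index 1 is methylated; those at indices 4 and 5 are not.
example = bisulfite_convert("ACGTCCGA", methylated_positions=[1])
```

Comparing the converted read with the reference sequence then reveals which cytosines were methylated: a retained C indicates methylation, a C-to-T change indicates an unmethylated cytosine.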
- the bisulfite conversion reaction is described in NEB, N.E.B. Bisulfite conversion (available at: http://www.neb-online.de/wp-content/uploads/2015/04/NEB epigenetik bisulfit3.jpg) and in Clark et al. (Clark et al. [1994] Nucl. Acids Res 22: 2990-2997).
- the bisulfite conversion of the zfDNA can be done e.g. with the “EZ DNA Methylation-Gold TM Kit” (“Zymo Research”, USA). For this, 10 ng of the previously extracted zfDNA was dissolved in 20 µL of water, mixed with 130 µL of “CT” conversion reagent and processed in the thermal cycler with the following program: 10 min 98 °C, 2.5 h 64 °C, up to 20 h at 4 °C. In the next step, the bisulfite-converted samples were desulfonated and purified.
- WGBS is an NGS-based method (next generation sequencing).
- the underlying sequencing reaction is based on fluorescence and takes place on a glass slide, also called a flow cell.
- special “Illumina” adapters (short oligonucleotides) are first ligated. The sample is then subjected to a denaturation reaction.
- the ssDNA fragment to be sequenced is "twisted".
- the DNA strands are replicated. This process is known as bridge amplification.
- the so-called sequencing clusters, which subsequently dissociate, arise from the progressive amplification at limited positions. After the cluster formation, the actual sequencing reaction takes place, in which DNA bases are incorporated which, depending on the incorporated base, generate fluorescence signals of different wavelengths. After each completed incorporation cycle, these fluorescence signals are detected and thus provide information about the base sequence within a read.
- the “Accel-NGS ® Methyl-Seq DNA Library Kit” (“Swift Biosciences”, USA) was established for the following experiments.
- the kit was specially developed for WGBS of the zfDNA.
- Complex WGBS libraries can be generated with zfDNA quantities of less than 10 ng.
- the central role is played by the enzyme “adaptase”, which adds a 10 nt long overhang to the 3' end of the bisulfite-converted ssDNA. This overhang enables better ligation of the sequencing adapters and thus more efficient library production. Therefore, according to the invention, a method for producing the WGBS libraries is preferably used which, by means of the enzyme adaptase, adds a 10 nt overhang at the 3' end of the bisulfite-converted ssDNA.
- 44 µL of “Extension Reaction Mix” was added to the sample, carefully mixed and incubated in the thermal cycler (program 2: 98 °C 1 min; 62 °C 2 min; 65 °C 5 min; 4 °C).
- the product has been purified.
- “SPRI Beads” (“Beckman Coulter”, USA) can be used for this.
- the finished product was purified using “SPRI Beads” (“Beckman Coulter”, USA).
- the PCR was carried out. 5 µL of the respective index and 25 µL of the “Indexing PCR Reaction Mix” were added to each sample.
- the finished PCR reaction was incubated in the thermal cycler (program 4: 98 ° C 30 s; PCR cycles: 98 ° C 10 s; 60 ° C 30 s; 68 ° C 1 min (7-9 cycles); 4 ° C ) and purified using the “SPRI Beads” (“Beckman Coulter”, USA) according to the manufacturer's instructions.
- the finished WGBS libraries were quantified as described in Section 1.1.2.2 and checked for quality.
- the samples were transferred to 1.5 mL Eppendorf reaction vessels and “SPRI Beads” (“Beckman Coulter”, USA) were added in the prescribed ratio (Tab. A). The samples were then mixed and incubated for 5 minutes at room temperature. Since the beads were magnetic, the principle of magnetic separation could be used for pelletizing. For this purpose, the reaction vessels were placed on a magnetic stand and then incubated for 2 min at room temperature. After the incubation, the supernatant was removed, the beads were washed twice with 500 µL of 80% ethanol each time (“Merck Millipore”, USA) and then air-dried. Once the ethanol had evaporated, the samples were removed from the magnetic stand.
- the “SPRI Beads” were resuspended in the prescribed amount of “Low EDTA TE” buffer (Tab. A) and incubated for 2 min at room temperature. Finally, the samples were placed back on the magnetic stand. After approx. 2 min, the supernatant and the “SPRI beads” were completely separated. The supernatant contained the purified product, was pipetted off and used for the next step.
- the WGBS libraries were sequenced on the “NextSeq 500” platform (“Illumina”, USA) in the “TATAA Biocenter” (Gothenburg, Sweden). Four 76 bp paired-end (PE) runs were carried out in high throughput mode.
- the WGBS libraries could not be created with conventional protocols due to the high level of fragmentation and small amounts of zfDNA.
- the zfDNA libraries produced with the “Accel-NGS ® Methyl-Seq DNA Library Kit” (“Swift Biosciences”, USA) thus had a different complexity and fragment distribution than the conventional WGBS libraries. Therefore, a suitable bioinformatic evaluation pipeline had to be established in order to be able to analyze the data optimally.
- In general, several steps have to be established in order to be able to evaluate WGBS data (Fig. 1). First, the quality of the raw data is checked. The “FastQC” software (version 0.11.15, “Babraham Bioinformatics”, England) is most frequently used for this purpose (see Section 1.1.2.5.1). The software visualizes the quality of the sequencing, length distribution and composition of the reads. In addition, information is provided about possible adapter contamination and the number of k-mers and PCR duplicates. Sequences with a minimum length of two nucleotides that are repeated over and over again in the raw data are called k-mers.
- the reads can be arranged against a reference genome of your choice; this process is also known as alignment (see Section 1.1.2.5.3). Many algorithms are available for the alignment. Depending on the nature of the WGBS library, the appropriate one must be selected and optimized. The mapping efficiency can be analyzed for this. The percentage of analyzed reads that can be assigned to the reference genome is calculated.
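The mapping efficiency described above is a simple ratio, which can be written out as a minimal sketch. The function name `mapping_efficiency` and the read counts are hypothetical and chosen only to reproduce the percentages quoted in the text.

```python
def mapping_efficiency(mapped_reads, total_reads):
    """Percentage of analyzed reads that could be assigned to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must lie between 0 and total_reads")
    return 100.0 * mapped_reads / total_reads


# Hypothetical counts for two aligners run on the same one million reads.
efficiency_a = mapping_efficiency(700_000, 1_000_000)
efficiency_b = mapping_efficiency(980_000, 1_000_000)
```

Comparing this value for several aligners on the same library is one way to select the algorithm best suited to a given library type.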
- the “Bismark” algorithm (Krueger & Andrews [2011] Bioinformatics 27: 1571-1572; “Bismark” version 0.15.0, “Babraham Institute”, England) did not deliver satisfactory results (mapping efficiency of approx. 70%). Therefore, further algorithms were tested.
- the data are filtered according to the CpG context and the desired coverage (at least fourfold), e.g. with the "Bisulfite Analysis Toolkit” (Version 0.1, “Interdisciplinary Center for Bioinformatics, Leipzig University”, Germany) and only then used for peak calling (see section 1.1.2.5.3).
- the coverage, also known as the sequencing depth, indicates how often a position was read during sequencing. For example, an average coverage of 100-fold means that each sequenced base was read an average of 100 times. Peak calling is the actual step in which the methylation status of the respective CpG is calculated.
- the conventional libraries have an average coverage of 30 to 40 times, which is what the conventional methods for peak calling are designed for.
- the zfDNA libraries had an average coverage of 8 to 10 times due to their lower complexity. Accordingly, the filtering and peak calling, e.g. with the "Bisulfite Analysis Toolkit", had to be optimized.
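The relationship between coverage, the coverage filter and the methylation rate of a CpG can be sketched as follows. The sketch assumes that the methylation rate is the fraction of reads reporting a retained cytosine at the position, which is the usual definition but is not spelled out in the text; it does not reproduce the “Bisulfite Analysis Toolkit”. The function names and the example counts are hypothetical.

```python
def methylation_rate(methylated_reads, unmethylated_reads):
    """Fraction of reads that report a methylated cytosine at one position."""
    coverage = methylated_reads + unmethylated_reads
    if coverage == 0:
        return None  # position not covered, rate undefined
    return methylated_reads / coverage


def filter_by_coverage(sites, min_coverage=4):
    """Keep positions covered at least `min_coverage` times.

    sites: iterable of (chrom, position, methylated_reads, unmethylated_reads).
    Returns a list of (chrom, position, rate, coverage).
    """
    kept = []
    for chrom, position, methylated, unmethylated in sites:
        coverage = methylated + unmethylated
        if coverage >= min_coverage:
            rate = methylation_rate(methylated, unmethylated)
            kept.append((chrom, position, rate, coverage))
    return kept


# Hypothetical read counts at two CpG positions.
sites = [("chr1", 100, 3, 1), ("chr1", 200, 1, 1)]
filtered = filter_by_coverage(sites, min_coverage=4)
```

With low-coverage libraries, the choice of `min_coverage` trades the number of retained positions against the precision of each estimated rate.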
- the raw data was delivered in the “FastQ” format. This is a text-based format.
- the libraries generated with the “Accel-NGS ® Methyl-Seq DNA Library Kit” contained DNA fragments of different lengths. This means that if a DNA fragment was shorter than 152 bp, the “Illumina adapter” or the flow cell were also sequenced. This resulted in the presence of “NNNNNNNNN” sequences. As the alignment of the associated and otherwise good quality reads would be prevented in the further course of the data analysis, the overrepresented sequences had to be removed.
- the command used for this was: cutadapt -q 20 -O 5 --minimum-length 30 -a GATCGGAAGAG -A AGATCGGAAGAG -o <Name_Read_1>.clipped.fastq.gz -p <Name_Read_2>.clipped.fastq.gz <Name_Read_1>.fastq.gz <Name_Read_2>.fastq.gz &> <Name>.clipping.stats
- the enzyme “adaptase” was used, which produced an overhang of low complexity at the 3' end of the second read. This area, like the overrepresented sequences, would interfere with the later alignment and therefore had to be removed.
- the command was: cutadapt --minimum-length 25 -u 11 -o <Name_Read_2>.clipped.trimmed.fastq.gz -p <Name_Read_1>.clipped.trimmed.fastq.gz <Name_Read_2>.clipped.fastq.gz <Name_Read_1>.clipped.fastq.gz
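What the second clipping step does to a read pair can be sketched in a few lines. In cutadapt, a positive `-u` value removes bases from the start of the first input read, which in the call above is read 2, and `--minimum-length` discards pairs that become too short. The sketch below is a simplified illustration of that behavior, not a replacement for cutadapt: it ignores quality strings, and the function name `clip_pair` is hypothetical.

```python
def clip_pair(first_read, second_read, clip=11, min_length=25):
    """Clip `clip` bases from the start of the first read of a pair.

    Returns the (clipped_first_read, second_read) pair, or None if either
    read is shorter than `min_length` afterwards, in which case the whole
    pair is discarded.
    """
    clipped = first_read[clip:]
    if len(clipped) < min_length or len(second_read) < min_length:
        return None
    return clipped, second_read


# Hypothetical 40 nt read with an 11 nt low-complexity start, paired with a 30 nt read.
kept = clip_pair("C" * 11 + "ACGT" * 7 + "A", "T" * 30)
dropped = clip_pair("C" * 11 + "ACGT" * 4, "T" * 30)
```

Passing read 2 as the first input is what directs the clipping to the read that carries the low-complexity region.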
- the alignment was carried out against the “HG19” reference genome.
- Several algorithms were tested, and surprisingly the “Segemehl” algorithm delivered the best results (see Section 1.1.2.5).
- the algorithm is based on the search for an optimal hit in the reference genome (Hoffmann et al. [2009] PLoS Comput. Biol. 5: e1000502).
- the maximum proportion of inaccuracies allowed per read was 10%. All hits that fell below this threshold were admitted to the semi-global alignment.
- only reads with an accuracy of at least 90% were listed in the final file and used for further analyses.
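The 90% accuracy criterion can be illustrated with a minimal sketch. It assumes that accuracy is defined as one minus the number of differences between read and reference divided by the read length; how “Segemehl” computes this value internally is not stated in the text, so the definition and the function names are assumptions.

```python
def read_accuracy(read_length, differences):
    """Share of read positions that agree with the reference (assumed definition)."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return 1.0 - differences / read_length


def passes_accuracy_filter(read_length, differences, min_accuracy=0.90):
    """True if the aligned read meets the minimum accuracy."""
    return read_accuracy(read_length, differences) >= min_accuracy


# Hypothetical 76 nt reads with 7 and 8 differences to the reference.
kept = passes_accuracy_filter(76, 7)
rejected = passes_accuracy_filter(76, 8)
```

For a 76 nt read, this threshold corresponds to at most seven differences to the reference.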
- the preferred “BAM” format is a compressed version of the “SAM” file, a text-based format that is generated by the algorithm to save the results of the alignment.
- the statistical evaluation of the mapping efficiency was done e.g. with the "BAT_mapping_stat" module (Kretzmer et al. [2017] F1000Res. 6: 1490).
- the DNA methylation was detected with the help of "BAT_calling".
- the module creates a "VCF” file. This is a text file that only contains information about the detected DNA methylation rates, coverage, number of covered nucleotides and the sequence context. In the further course of the analyzes, this file was filtered according to the CpG context and a coverage of at least eight times. Images were generated and additional “VCF” and “BedGraph” files were created.
- the “BAT_summarize” module was used, which determined the mean values of the detected DNA methylation rates in two groups.
- the calculated DNA methylation rates as well as the genomic coordinates of the cytosines were written into a text-based “BedGraph” file, which was then used to identify differentially methylated regions.
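How group means and a “BedGraph” record fit together can be shown with a minimal sketch. It does not reproduce the “BAT_summarize” module; it only assumes the standard BedGraph layout (chromosome, 0-based start, end, value, separated by tabs). The function names and example values are hypothetical.

```python
def mean_rate(rates):
    """Arithmetic mean of the methylation rates of one group at one cytosine."""
    if not rates:
        raise ValueError("at least one rate is required")
    return sum(rates) / len(rates)


def bedgraph_line(chrom, position, value):
    """Format one BedGraph record for a cytosine given by its 1-based position.

    BedGraph intervals are 0-based and half-open, so the cytosine at
    1-based position p becomes the interval [p - 1, p).
    """
    return f"{chrom}\t{position - 1}\t{position}\t{value:.4f}"


# Hypothetical methylation rates of one CpG in two groups.
control_mean = mean_rate([0.10, 0.20, 0.15])
tumor_mean = mean_rate([0.60, 0.70])
records = [
    bedgraph_line("chr1", 1000, control_mean),
    bedgraph_line("chr1", 1000, tumor_mean),
]
```

One such file per group, or one file with one value column per group, is then sufficient input for comparing the groups position by position.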
- the visualization of the DNA methylation per group was carried out using the “BAT_overview” module. The commands were:
- the “Bedtools” software was used for the correlation analysis.
- the “Bedtools Intersect” module reads in both the WGBS and HM 450K results, checks them for overlapping and writes the overlapping CpG loci to a new “BED” file.
- the "BED” format is a text file. Each line of the file contains genomic coordinates of a CpG. The columns are separated by a tab.
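The idea behind the overlap step can be sketched without “Bedtools”. The sketch below is a simplification: it pairs loci only when their coordinates are identical, whereas “Bedtools Intersect” reports overlapping intervals in general. The function name `intersect_loci` and the example values are hypothetical.

```python
def intersect_loci(wgbs, array):
    """Return loci present in both data sets together with both values.

    Both arguments map (chrom, start, end) tuples to methylation values.
    """
    shared = sorted(set(wgbs) & set(array))
    return [(locus, wgbs[locus], array[locus]) for locus in shared]


# Hypothetical methylation values from WGBS and from the HM 450K array.
wgbs = {("chr1", 99, 100): 0.80, ("chr1", 199, 200): 0.10, ("chr2", 49, 50): 0.55}
array = {("chr1", 99, 100): 0.75, ("chr2", 49, 50): 0.60, ("chr3", 9, 10): 0.20}
paired = intersect_loci(wgbs, array)
```

The paired values are the input needed for a correlation analysis between the two methods.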
- the “BED” file was then loaded directly into “R” and subjected to the “Pearson” correlation analysis (p-value < 0.01). The results were also visualized in R.
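The “Pearson” correlation coefficient itself can be computed with a short sketch. The analysis described here was carried out in R; the Python function below only illustrates the formula, does not compute the p-value, and uses hypothetical example values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical methylation values of the same CpG loci from two methods.
wgbs_values = [0.10, 0.40, 0.55, 0.90]
array_values = [0.15, 0.35, 0.60, 0.85]
r = pearson(wgbs_values, array_values)
```

A coefficient close to 1 indicates that both methods rank and scale the methylation of the shared loci similarly.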
- the WGBS data were evaluated as described.
- the “BedGraph” file generated with the “BAT_summarize” module contained three groups (control, adenocarcinoma, squamous cell carcinoma) with 11,289,424 items per group.
- the "BedGraph” file has been divided into two lists. The first list contained 29,877 loci that showed differences in DNA methylation between the tumor and control groups. The second list contained 76,374 CpG loci, each methylated differently in adeno and squamous cell carcinoma groups. The regions which showed a DNA methylation difference of at least 15% were designated as differentially methylated.
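The 15% criterion can be illustrated with a minimal sketch. It assumes that the difference is an absolute difference in methylation rate (percentage points) between the group means, which the text does not state explicitly; the function name and the example loci are hypothetical.

```python
def differentially_methylated(rates_a, rates_b, min_difference=0.15):
    """Return loci whose group means differ by at least `min_difference`.

    rates_a, rates_b: dicts mapping a locus to the mean methylation rate
    (0 to 1) of the respective group. The result maps each selected locus
    to the signed difference (group a minus group b).
    """
    selected = {}
    for locus in rates_a.keys() & rates_b.keys():
        difference = rates_a[locus] - rates_b[locus]
        if abs(difference) >= min_difference:
            selected[locus] = difference
    return selected


# Hypothetical group means at three CpG loci.
tumor = {"chr1:100": 0.80, "chr1:200": 0.50, "chr2:50": 0.10}
control = {"chr1:100": 0.50, "chr1:200": 0.45, "chr2:50": 0.40}
hits = differentially_methylated(tumor, control)
```

A positive difference marks hypermethylation in the first group and a negative difference marks hypomethylation.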
- the remaining CpG loci had to meet one of three criteria in order to be included in the plasma panel:
- Differentially methylated CpG lies within a cluster consisting of at least two further differentially methylated CpG loci, all CpG loci of the cluster are either hypo- or hypermethylated, the distance between the CpG loci is 2 to 20 nucleotides,
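The cluster criterion can be sketched as a single pass over the differentially methylated CpG loci of one chromosome. The sketch is one possible reading of the criterion (at least three loci in total, neighboring loci 2 to 20 nucleotides apart, all changed in the same direction) and is not the script used in the exemplary embodiments; the function name and example data are hypothetical.

```python
def find_clusters(loci, min_size=3, min_gap=2, max_gap=20):
    """Group differentially methylated CpG loci of one chromosome into clusters.

    loci: list of (position, difference), where the sign of `difference`
    marks hyper- (positive) or hypomethylation (negative).
    """
    clusters = []
    current = []
    for position, difference in sorted(loci):
        if current:
            previous_position, previous_difference = current[-1]
            gap = position - previous_position
            same_direction = (difference > 0) == (previous_difference > 0)
            if not (min_gap <= gap <= max_gap and same_direction):
                # The running cluster ends here; keep it if it is large enough.
                if len(current) >= min_size:
                    clusters.append(current)
                current = []
        current.append((position, difference))
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Hypothetical loci: three hypermethylated neighbors and two isolated loci.
loci = [(100, 0.20), (110, 0.30), (125, 0.25), (300, 0.40), (310, -0.20)]
clusters = find_clusters(loci)
```

Requiring several neighboring loci that change in the same direction makes a selected region less sensitive to noise at a single CpG.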
- the panel should also contain prognostic information. That is why it was expanded to include 33 CpG loci that were recorded in a clinical study.
- the title of the study was: “Comprehensive characterization of non-small cell lung cancer (NSCLC) by integrated clinical and molecular analysis”.
- the HM 450K data set made available contained information on the DNA methylation status of a total of 41 lung carcinomas.
- the patients were classified according to their survival. 28 patients were counted in the prognostically favorable group (survival time longer than 15 months) and 13 in the unfavorable group (survival time shorter than 13 months).
- the 33 CpG loci included in the panel were able to separate the two groups based on the DNA methylation pattern and thus contained information relevant to the prognosis.
- the inventive set of methylation markers, the plasma panel, contained 630 differentially methylated regions (Tab. 1). It was synthesized by the company “Roche” (Switzerland) and sent on dry ice. This was a “SeqCap Epi Enrichment Kit” (“Roche”, Switzerland) that was synthesized according to customer requirements and was not commercially available. According to the manufacturer, the panel was suitable for the analysis of tissue samples as well as circulating, cell-free DNA.
- the DZL provided blood plasma from 12 patients. Of these, three patients were healthy or tumor-free (control group) and nine suffered from non-small-cell lung carcinoma of various stages (tumor group).
- the validation took place in several steps. First, the validation material, the circulating, cell-free DNA, was prepared. The extraction from the plasma, quantification, quality control (QC) and bisulfite conversion took place as already described in Sections 1.1.2.1-1.1.2.3.
- the finished library was subjected to a QC and quantified (see Section 1.1.2.2) and then sequenced on the “MiSeq” (“Illumina”, USA) (see Section 1.2.2).
- the sequencing data were saved in the “FastQ” format and then had to be analyzed (see Section 1.2.3).
- the bioinformatics pipeline from Section 1.1.2.5 was adapted for this, as this time not the entire methylome but only the 638 specific regions of the plasma panel should be analyzed.
- the results were then used to develop a classifier that subsequently interpreted the DNA methylation pattern and provided diagnostically and clinically relevant information about the patient's state of health (see Section 1.2.3.3).
- Samples from a patient who is to be diagnosed with lung tumors can also be analyzed according to the same principle. Here, however, the samples are not pooled for analysis.
- the “SeqCap Epi Enrichment Kit” was used to extract and enrich 630 differentially methylated regions from the entire zfDNA methylome.
- One of the components of the kit was the designed plasma panel (see Tab. 1).
- the 12 WGBS libraries produced were pooled in equimolar amounts within the various groups and initially prepared for a hybridization reaction. In the case of diagnostic samples, either individual samples are hybridized or pools of samples, each provided with a “bar code”, are used. For this purpose, 1 µg of the WGBS library pool was pipetted together with 10 µL “Bisulfite Capture Enhancer”, 1 µL “SeqCap HE Universal Oligo” and 1 µL “SeqCap HE Index Oligo” into a 1.5 mL reaction vessel with a small hole in the lid. The sample was evaporated in a vacuum concentrator until a clear whitish pellet could be seen.
- “Hybridization Buffer” and 3 µL “Hybridization Component A” were added directly to the pellet, mixed for 10 s, briefly centrifuged and incubated at 95 °C for 10 min. The sample was then transferred to a 0.2 mL reaction vessel, 4.5 µL capture probes were added, mixed well and incubated in a thermal cycler at 47 °C for 72 hours. The lid of the thermal cycler was preheated to 57 °C.
- the “capture probes” were specially synthesized for this project. They contained 638 different oligonucleotides which were complementary to the differentially methylated regions investigated (see Table 1) and which specifically bound them in the course of the hybridization reaction. Enrichment and washing of the hybridized "capture probes"
- the bound “capture probes” were enriched and washed several times. Several washing buffers and the “capture beads” were prepared for this according to the manufacturer's instructions.
- the hybridized sample was mixed with 100 µL of “capture beads”, mixed briefly and incubated for 45 min at 47 °C in the thermal cycler.
- the lid of the thermal cycler was preheated to 57 ° C. To prevent the beads from settling, the samples were briefly removed from the thermal cycler every 15 minutes and mixed.
- the “capture beads” used here were streptavidin beads that interacted with the biotinylated “capture probes”.
- the samples were removed from the thermal cycler and the “capture beads” were subjected to several washing steps.
- the beads were separated from the buffer each time at room temperature using the “DynaMag TM -PCR” magnet (“Thermo Fisher Scientific”, USA).
- the second part of the washing protocol was carried out completely at room temperature, so the buffers used for this had to be preheated to room temperature.
- the “Capture Beads” previously washed at 47 ° C were dissolved in 200 ml of simple “Wash Buffer I”, mixed for 2 min and pelleted with the aid of a magnet.
- the supernatant was discarded, 200 ml of simple “Wash Buffer II” were added to the beads, mixed for 1 min and pelleted again using a magnet.
- the supernatant was discarded, the beads dissolved in 200 ml "Wash Buffer III", briefly mixed and finally separated from the supernatant on the magnet.
- Amplification of the enriched differentially methylated regions
- After washing, the enriched, differentially methylated regions were amplified.
- 25 µL of 2x “KAPA HiFi HotStart Ready Mix” (“Roche”, Switzerland) and 5 µL of “Post LM PCR oligonucleotides” (“Roche”, Switzerland) were added to the 20 µL of eluate, mixed well and amplified using the following PCR program in a thermal cycler with a preheated lid:
- Step 1: 45 s at 98 °C
- Step 2: 15 s at 98 °C
- Step 3: 30 s at 60 °C
- Step 4: 30 s at 72 °C
- Step 5: repeat steps 1-4 a further 15 times
- Step 6: 60 s at 72 °C
- Step 7: hold at 4 °C
- The amplified regions were subsequently purified, e.g. using the "AmpureXP" beads ("Beckman Coulter", USA).
- The beads were first brought to room temperature.
- The sample was transferred to a 1.5 mL reaction vessel.
- 50 µL of dH2O and 180 µL of “AmpureXP” beads were added to 50 µL of sample.
- The sample was mixed briefly, incubated for 15 min at room temperature, briefly centrifuged and placed on the “DynaMag™-2” magnet (“Thermo Fisher Scientific”, USA). The supernatant was discarded and the beads were washed twice with 200 µL of freshly prepared 80% ethanol each time. The beads were then dried for 15 minutes at room temperature.
- The NGS library of enriched, differentially methylated regions was sequenced on the “MiSeq”.
- The library produced was first diluted to 4 nM and denatured. For this, 5 µL of the 4 nM library were transferred to a 1.5 mL reaction vessel, combined with 5 µL of 0.2 N NaOH, briefly mixed, centrifuged for 1 min at 280 g and incubated for 5 min at room temperature. The denatured library was then diluted with 990 µL “Buffer HT1” (“Illumina”, USA) and mixed well again. This resulted in a 20 pM library, which was then diluted to 4 pM with “Buffer HT1” and spiked with 10% “PhiX” (“Illumina”, USA).
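The dilution series above can be checked with the relation C1 x V1 = C2 x V2. The following is a minimal sketch of that arithmetic only; the function name `dilute` and the variable names are illustrative and not part of the described protocol.

```python
def dilute(conc, volume_ul, added_ul):
    """Concentration after adding diluent, from C1 * V1 = C2 * V2."""
    return conc * volume_ul / (volume_ul + added_ul)

# 5 µL of the 4 nM library plus 5 µL of 0.2 N NaOH
after_naoh_nM = dilute(4.0, 5.0, 5.0)

# 10 µL of denatured library plus 990 µL Buffer HT1, converted from nM to pM
after_ht1_pM = dilute(after_naoh_nM, 10.0, 990.0) * 1000.0

# Remaining fold dilution needed to reach the 4 pM loading concentration
fold_to_4pM = after_ht1_pM / 4.0

print(f"after NaOH: {after_naoh_nM:.1f} nM")
print(f"after HT1:  {after_ht1_pM:.1f} pM")
print(f"further dilution: {fold_to_4pM:.1f}-fold")
```

The intermediate values reproduce the 20 pM stated in the text, from which a fivefold dilution gives 4 pM.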
- The DNA methylation rates within the sequenced regions were calculated with the "BAT_calling" module and filtered with the "BAT_filter_vcf" module according to the CpG context and a coverage of at least eightfold (see Section 1.1.2.5.3). Finally, the data were annotated against the regions of the plasma panel. The calls were:
  gzip tmp.vcf
  perl BAT_filter_vcf -vcf tmp.vcf.gz -out $o -context CG -MDP_min 8 -MDP_max 200
  rm tmp.vcf.gz
  done
  bedtools unionbedg -filler NA -header -names <sample_1> ...
- The DNA methylation pattern of a patient is to be analyzed with the help of the plasma panel. From this it is to be concluded whether the patient has a malignant lung tumor. If so, information about the entity of the tumor and the prognosis of the affected patient is to be derived from the DNA methylation profile. This can be done on the basis of the correlation between the methylation pattern present in the patient and the methylation markers important according to the invention.
- For this purpose, a classifier can be created that is able to quickly and reliably interpret the results of the pipeline described in Sections 1.2.3.1 and 1.2.3.2.
- Classification, also known as predictive modeling, is an example of supervised learning.
- The aim of a classifier is first to build a model from variables (e.g. DNA methylation patterns) and an annotation; the model is later able to correctly classify the variables of independent samples (Fig. 4).
- The “Qlucore Omics Explorer” software offers several options for using DNA methylation data to create an optimal classifier for the respective question.
- kNN: a class is assigned based on the k nearest neighbors.
- SVM: each object is described by a vector in a vector space. Within the vector space, a hyperplane is placed so that it acts as a separating surface between the groups and divides them into two classes.
- RT: consists of several uncorrelated decision trees generated during the learning process. Each tree makes a decision; the class with the most votes determines the final classification.
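To illustrate the kNN principle described above, here is a minimal sketch of a majority vote among the k nearest neighbors. It is not the “Qlucore Omics Explorer” implementation; the methylation rates, labels and the choice of Euclidean distance are assumptions made purely for illustration.

```python
import math
from collections import Counter

def knn_predict(train, labels, sample, k=3):
    """Assign the class held by the majority of the k nearest training samples."""
    # Euclidean distance between methylation-rate vectors (an assumed metric)
    distances = sorted(
        (math.dist(features, sample), label)
        for features, label in zip(train, labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical methylation rates at two marker positions per sample
train = [(0.90, 0.80), (0.85, 0.90), (0.80, 0.75),
         (0.10, 0.20), (0.15, 0.10), (0.20, 0.25)]
labels = ["tumor", "tumor", "tumor", "control", "control", "control"]

print(knn_predict(train, labels, (0.70, 0.80)))
print(knn_predict(train, labels, (0.20, 0.10)))
```

With real data, the vectors would hold the methylation rates at the panel positions selected for the respective question.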
- The CpG loci that enabled a reliable classification of lung tumors based on malignancy and entity were then selected.
- The bioinformatic analyses described in Section 1.1.1 were carried out, which resulted in 287 CpG loci. These loci were included in a set of methylation markers preferred according to the invention, the plasma panel (Tab. 1).
- Every single cell-free, circulating DNA sample was quantified after extraction and subjected to strict quality control.
- The total amount of extracted DNA was 10 to 30 ng per sample, of which 1 ng was analyzed with the "Agilent 2100 Bioanalyzer".
- The samples showed a clear peak at approx. 167 bp.
- The peaks at 35 and 10,380 bp corresponded to the lower and upper markers (not shown).
- The zfDNA samples were used to produce WGBS libraries.
- The finished libraries were again quantified and then subjected to a quality control using the "Agilent 2100 Bioanalyzer". All samples showed a clear peak at approx. 300 bp and thus met the sequencing requirements.
- The WGBS libraries produced were sent on dry ice to “TATAA Biocenter”, pooled there and, depending on the sample, sequenced with an average coverage of eight- to tenfold on a “NextSeq 500” platform.
- The raw data were delivered in the "FastQ" format.
- The quality of the raw data was checked using the "FastQC" software. Since the samples were sequenced in 76 PE mode, the read length was 76 bp, as expected. Within a read, the content of adapters and unidentifiable signals was 0%. The accuracy of the sequencing was given in “Phred” values. Each “Phred” value describes how precisely the nucleotides were read in the course of the sequencing. The raw data showed a “Phred” score of over 30, which corresponds to an accuracy of more than 99.9%. Furthermore, only a very small number of k-mers could be detected. Sequences with a minimum length of two nucleotides that are repeated over and over again in the raw data are called k-mers. The number of PCR duplicates was almost 0%. The proportion of PCR duplicates is determined by calculating the percentage of deduplicated sequences and comparing it with the number of all sequences. A small number of k-mers and PCR duplicates indicates good library and sequencing quality.
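The relationship between a “Phred” score and the read accuracy quoted above follows from the standard definition Q = -10 · log10(P), where P is the probability of an incorrect base call. A short sketch (the function name is illustrative):

```python
def phred_to_accuracy(q):
    """Base-call accuracy implied by a Phred quality score q."""
    error_probability = 10 ** (-q / 10)
    return 1.0 - error_probability

for q in (10, 20, 30, 40):
    print(f"Q{q}: accuracy {phred_to_accuracy(q):.4%}")
```

A score of 30 thus corresponds to one expected error in 1,000 bases, i.e. the 99.9% accuracy mentioned in the text.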
- A base composition typical for WGBS was observed.
- Most of the unmethylated cytosines had been replaced by thymines.
- The thymine content of the raw data was therefore approx. 50% and the cytosine content almost 0%.
- The adenine and guanine composition was not affected by the bisulfite conversion and was 25% each.
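The effect of bisulfite conversion on base composition can be illustrated in silico. The sketch below is a deliberate simplification: it treats a single read from the original strand, assumes complete conversion of unmethylated cytosines, and uses a made-up sequence with equal base frequencies.

```python
from collections import Counter

def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Read unmethylated C as T; methylated C (given by index) stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def base_fractions(seq):
    """Fraction of each of the four bases in a sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}

original = "ACGTACGTACGT"              # hypothetical read, 25% of each base
converted = bisulfite_convert(original)

print(converted)
print(base_fractions(converted))
```

Because most cytosines in the genome are unmethylated, the converted read ends up with roughly twice the thymine share and almost no cytosine, while adenine and guanine are unchanged.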
- The WGBS raw data were then processed using the "Cutadapt" software (see Section 1.1.2.5.2). The processing removed both overrepresented sequences and the 10 nt long overhang at the beginning of Read 2.
- Mapping efficiency: this determines what percentage of reads can be assigned to the reference genome. In this case, the mapping efficiency of the “Segemehl” algorithm was 98% to 99%, so the data were suitable for all further analyses.
- The alignments of the control, adenocarcinoma and squamous cell carcinoma groups were next loaded into the "BAT_calling" module.
- The module determined the DNA methylation rates of the respective cytosines.
- The cytosines that were within a CpG region and had a coverage of at least eightfold were then identified using the "BAT_filtering" module and used for all further analyses.
- The filtering was carried out according to a DNA methylation difference of at least 15%.
- The number of differentially methylated CpG loci in the plasma of lung cancer patients was 18,000 (FIG. 7A). Furthermore, 44,000 CpG loci were identified which, depending on the entity, were differentially methylated in adenocarcinoma and squamous cell carcinoma patients (FIG. 7B). These loci were subjected to further analyses as described in Section 1.1.2.5.5 and used for the creation of the plasma panel.
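The two filters described above (coverage of at least eightfold, methylation difference of at least 15%) can be sketched as follows. This is not the BAT code: the data layout (a dictionary from locus to a pair of methylation rate and coverage) is hypothetical, and the 15% threshold is interpreted here as an absolute difference in methylation rate, which is an assumption.

```python
def differential_loci(group_a, group_b, min_coverage=8, min_difference=0.15):
    """Return loci covered in both groups whose methylation rates differ enough.

    group_a, group_b: dict mapping locus -> (methylation_rate, coverage).
    The returned dict maps each selected locus to rate_a - rate_b.
    """
    selected = {}
    for locus in group_a.keys() & group_b.keys():
        rate_a, cov_a = group_a[locus]
        rate_b, cov_b = group_b[locus]
        if cov_a < min_coverage or cov_b < min_coverage:
            continue  # insufficient sequencing depth in at least one group
        if abs(rate_a - rate_b) >= min_difference:
            selected[locus] = rate_a - rate_b
    return selected

# Hypothetical example values
tumor = {"chr1:1000": (0.80, 12), "chr1:2000": (0.50, 20), "chr1:3000": (0.90, 5)}
control = {"chr1:1000": (0.30, 10), "chr1:2000": (0.45, 15), "chr1:3000": (0.10, 9)}

print(differential_loci(tumor, control))
```

A positive value marks a locus that is more strongly methylated in the first group, a negative value one that is less strongly methylated.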
- The finished set of methylation markers, i.e. the finished plasma panel, contained 630 differentially methylated regions (Tab. 1). Oligonucleotides hybridizing with these differentially methylated regions were synthesized as "capture probes" and thus represent a means of diagnosing lung tumors.
- The extracted zfDNA samples were quantified as described in Section 1.1.2.2 and subjected to quality control. For this purpose, 1 ng of each sample was analyzed with the "Agilent 2100 Bioanalyzer". All zfDNA samples used showed a clear peak at approx. 167 bp. The samples were then bisulfite-converted and used to produce NGS libraries. As described in Section 1.2.1, the libraries were created in two steps.
- WGBS libraries were created that contained information about the entire zfDNA methylome. All 12 WGBS libraries produced showed a clear, large peak at approx. 300 bp. The larger 300 to 1,000 bp peaks were so-called daisy chains, i.e. ssDNA fragments hybridized to one another. According to the manufacturer, they affect neither the subsequent hybridization reaction nor the actual sequencing and therefore do not have to be eliminated.
- The WGBS libraries produced were quantified, pooled in equimolar amounts and processed with the "SeqCap Epi Enrichment Kit".
- The kit used here contained the so-called “capture probes”, which were specially synthesized for this purpose.
- The “capture probes” hybridize specifically to the 638 regions of the plasma panel (see Tab. 1).
- The “capture probes”, including the bound differentially methylated regions, were enriched, washed and amplified.
- The amplified library was then quantified and subjected to a quality control (e.g. "Agilent 2100 High Sensitivity DNA Kit").
- The finished library had a high peak at approx. 300 bp and thus met the sequencing requirements of the “MiSeq”.
- The sequencing on the "MiSeq" was optimized. Sequencing was carried out in 76 PE mode. The first 76 bp of the sequenced DNA fragments were thus read from both ends.
- The library was diluted to 4 pM.
- The libraries described here were unbalanced. Libraries whose AT or GC content is less than 40% or more than 60% are referred to as unbalanced. Due to their composition, such libraries usually have an unsatisfactory sequencing quality.
- To compensate, the library can be spiked with "PhiX Control V3". The concentration of "PhiX" has to be adjusted individually depending on the library. The optimal concentration of “PhiX Control V3” was 10% in the present case.
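The 40% to 60% criterion for a balanced library can be expressed as a simple check on the GC content (the AT content is its complement). The sketch below uses made-up reads; the function names and thresholds as parameters are illustrative.

```python
def gc_fraction(reads):
    """Fraction of G and C bases across all reads."""
    total_bases = sum(len(read) for read in reads)
    gc_bases = sum(read.count("G") + read.count("C") for read in reads)
    return gc_bases / total_bases

def is_unbalanced(reads, low=0.40, high=0.60):
    """True if the GC (and therefore AT) content lies outside 40% to 60%."""
    gc = gc_fraction(reads)
    return gc < low or gc > high

bisulfite_like = ["TTAGGTTATTGA", "ATTTGAGTTTAT"]   # hypothetical C-depleted reads
balanced_like = ["ACGTACGTACGT", "GATCGATCGATC"]    # hypothetical balanced reads

print(gc_fraction(bisulfite_like), is_unbalanced(bisulfite_like))
print(gc_fraction(balanced_like), is_unbalanced(balanced_like))
```

Bisulfite-converted reads are depleted of cytosine and therefore typically fall below the lower threshold, which is why a balanced control library is added.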
- The read length was 76 bp.
- The content of adapters and unidentifiable signals within a read was 0%.
- The raw data showed a “Phred” score of over 30, which corresponds to a sequencing accuracy of more than 99.9%.
- The base composition (thymine content at approx. 50%, cytosine content at almost 0%, adenine and guanine content at 25% each) indicated successful bisulfite conversion.
- The first 10 nt of the second read were an overhang created by the enzyme “Adaptase”. The deviation of the experimentally determined GC content from the theoretically calculated one was also due to the bisulfite conversion.
- PCR duplicates: the number of PCR duplicates was approx. 15%. The number of deduplicated sequences differed greatly from the total. However, this is not uncommon for a panel. In contrast to genome-wide sequencing, only a small part of the genome is sequenced in a panel. This leads to a very low library complexity and accordingly to the creation of PCR duplicates. The number of k-mers was very small and did not interfere with further evaluation.
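The duplicate rate discussed above compares the number of distinct sequences with the number of all sequences. A minimal sketch, assuming that exact sequence identity is used as the duplicate criterion (quality-control and deduplication tools may instead use alignment positions or only a subset of reads):

```python
def duplicate_rate(reads):
    """Share of reads that are copies of a sequence already seen."""
    if not reads:
        raise ValueError("no reads given")
    return 1.0 - len(set(reads)) / len(reads)

# Hypothetical reads: 20 in total, 17 distinct
reads = [f"READ{i}" for i in range(17)] + ["READ0", "READ1", "READ2"]

print(f"duplicate rate: {duplicate_rate(reads):.0%}")
```

In a small capture panel the same fragments are amplified and sequenced repeatedly, so a noticeably higher rate than in genome-wide data is to be expected.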
- The processed sequencing data were then loaded into the "Bisulfite Analysis Toolkit".
- The alignment was carried out with “Segemehl” against the “HG19” reference genome.
- The mapping efficiency was at least 90%. This means that at least 90% of the raw data could be assigned to the reference genome.
- The mean coverage, i.e. the sequencing depth, was 10- to 30-fold depending on the sample.
- Next, the DNA methylation was to be detected.
- The 12 alignments were loaded into the “BAT_calling” module.
- The positions determined were then first annotated against the “HG19” reference genome using “Bedtools”.
- The methylated positions were filtered with the "BAT_filtering" module for a coverage of at least eightfold.
- For creating a classifier, only those positions were selected that were, on the one hand, in a CpG region and, on the other hand, listed in the plasma panel (Tab. 1).
- The determined zfDNA methylation rates were used to create a classifier. As described in Section 1.2.3.3, the "Qlucore Omics Explorer" software was used for this, which contained the following classification algorithms: “k-Nearest Neighbors Algorithm” (kNN), “Support Vector Machines” (SVM) and “Random Trees” (RT).
- kNN: k-Nearest Neighbors Algorithm
- SVM: Support Vector Machines
- RT: Random Trees
- The plasma panel was designed in such a way that it should be optimally able to provide the information relating to the malignancy, the entity and the stage of a tumor. These questions could be answered reliably by choosing a suitable classifier. Information about the prognosis should also be available.
- The correctness of a classifier was given in values between 0 and 1, where 0 corresponded to an accuracy of 0% and 1 to an accuracy of 100%.
- The complexity indicated how many differentially methylated positions or markers had to be analyzed for the classifier to achieve this accuracy. The fewer markers that had to be evaluated, the more suitable the classifier was for the clinic, because the error rate, time and costs of the method increase with the number of positions to be analyzed.
- The first question was whether a patient generally suffered from a malignant lung tumor.
- Both the kNN and the RT algorithm delivered an accuracy of 100%.
- The RT algorithm required 237 differentially methylated positions included in the panel for the classification.
- The kNN algorithm, on the other hand, needed only 10 positions, which qualifies it as optimal for this question (FIG. 9A). In 9 of the 10 positions there is stronger methylation in the tumor tissue, in one weaker methylation.
- The SVM algorithm managed to differentiate the late tumor stages with 80% accuracy using 523 positions (FIG. 9C). The positions evaluated are partly more methylated in the early and partly in the late stages.
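The selection logic described above (highest accuracy first, then lowest complexity) can be written down in a few lines. The figures for kNN and RT are those stated in the text for the malignancy question; the tuple layout and the tie-breaking rule as coded here are an illustrative reading, not a procedure specified in the source.

```python
# (algorithm, accuracy between 0 and 1, number of positions required)
results = [
    ("kNN", 1.0, 10),
    ("RT", 1.0, 237),
]

def choose_classifier(candidates):
    """Prefer the highest accuracy; among equals, the fewest positions."""
    return min(candidates, key=lambda c: (-c[1], c[2]))

best_name, best_accuracy, best_positions = choose_classifier(results)
print(best_name, best_accuracy, best_positions)
```

Both algorithms reach the same accuracy here, so the smaller marker set decides in favor of kNN.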
- CpG loci were selected that were within a cluster consisting of at least two further differentially methylated CpG loci. All CpG loci in the cluster were either hypo- or hypermethylated. The distance between the CpG loci was two to 20 nt.
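The cluster criterion above (a locus together with at least two further loci, all hypo- or all hypermethylated, spaced two to 20 nt apart) can be sketched as a single pass over sorted positions. The input format and the reading of "distance" as the gap between neighboring loci are assumptions; positions are hypothetical and refer to one chromosome.

```python
def find_clusters(loci, min_gap=2, max_gap=20, min_size=3):
    """Group neighboring CpG loci with the same direction of change.

    loci: iterable of (position, direction), direction being "hyper" or "hypo".
    A cluster needs at least min_size loci (one locus plus two further ones).
    """
    clusters, current = [], []
    for position, direction in sorted(loci):
        if (current
                and min_gap <= position - current[-1][0] <= max_gap
                and direction == current[-1][1]):
            current.append((position, direction))
        else:
            if len(current) >= min_size:
                clusters.append(current)
            current = [(position, direction)]
    if len(current) >= min_size:
        clusters.append(current)
    return clusters

loci = [(100, "hyper"), (110, "hyper"), (125, "hyper"),
        (300, "hypo"), (310, "hyper"),
        (500, "hypo"), (505, "hypo")]

for cluster in find_clusters(loci):
    print(cluster)
```

Only the first three loci qualify in this example: the pair at 300 and 310 differs in direction, and the pair at 500 and 505 lacks a third locus.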
- Tab. 1 Set of methylation markers (plasma panel, 630 differentially methylated regions). The "Tumor” column indicates whether increased (hypermethylated) or decreased (hypomethylated) methylation was identified in tumor tissue.
- A. 350 regions that detect a malignant tumor of the lung.
- B. 247 regions that distinguish the most common lung carcinomas (adeno- and squamous cell carcinoma) from one another.
- Tab. 2 The kNN algorithm used ten positions in order to be able to differentiate the lung cancer patients from the healthy subjects.
- The "Tumor" column indicates whether increased (+) or decreased (-) methylation was identified in tumor tissue.
- Tab. 3 The RT algorithm analyzed ten positions to determine the entity of a tumor. All positions in adenocarcinoma were hypermethylated compared to squamous cell carcinoma.
- Tab. 4 For staging (determining the tumor stage), the SVM algorithm analyzed 523 positions. Some positions are more methylated in the late stage
- Tab. 5 Exemplary oligonucleotides (capture targets) for markers on chromosome 1 that can be used in the method according to the invention.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19195688.7A EP3789505A1 (fr) | 2019-09-05 | 2019-09-05 | Method and means for diagnosing lung cancer |
PCT/EP2020/074775 WO2021043986A1 (fr) | 2019-09-05 | 2020-09-04 | Means and methods for diagnosing lung cancer |
Publications (1)
Publication Number | Publication Date |
---|---|
EP4025713A1 true EP4025713A1 (fr) | 2022-07-13 |
Family
ID=67874380
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19195688.7A Withdrawn EP3789505A1 (fr) | 2019-09-05 | 2019-09-05 | Method and means for diagnosing lung cancer |
EP20764417.0A Pending EP4025713A1 (fr) | 2019-09-05 | 2020-09-04 | Means and methods for diagnosing lung cancer |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19195688.7A Withdrawn EP3789505A1 (fr) | 2019-09-05 | 2019-09-05 | Method and means for diagnosing lung cancer |
Country Status (3)
Country | Link |
---|---|
US (1) | US20230203590A1 (fr) |
EP (2) | EP3789505A1 (fr) |
WO (1) | WO2021043986A1 (fr) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
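CN113106151A (zh) * | 2021-03-25 | 2021-07-13 | 杭州瑞普基因科技有限公司 | Nucleic acid composition and kit for detecting methylation of small pulmonary nodules based on qPCR |
CN114277154B (zh) * | 2022-01-27 | 2022-11-29 | 武汉康录生物技术股份有限公司 | Detection kit for lung cancer diagnosis and non-invasive screening of early lung cancer |
CN115274124B (zh) * | 2022-07-22 | 2023-11-14 | 江苏先声医学诊断有限公司 | Data-driven dynamic optimization method for a targeted panel and classification model for early tumor screening |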
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019068082A1 (fr) * | 2017-09-29 | 2019-04-04 | Arizona Board Of Regents On Behalf Of The University Of Arizona | DNA methylation biomarkers for cancer diagnosis |
-
2019
- 2019-09-05 EP EP19195688.7A patent/EP3789505A1/fr not_active Withdrawn
-
2020
- 2020-09-04 WO PCT/EP2020/074775 patent/WO2021043986A1/fr unknown
- 2020-09-04 US US17/639,804 patent/US20230203590A1/en active Pending
- 2020-09-04 EP EP20764417.0A patent/EP4025713A1/fr active Pending
Also Published As
Publication number | Publication date |
---|---|
EP3789505A1 (fr) | 2021-03-10 |
US20230203590A1 (en) | 2023-06-29 |
WO2021043986A1 (fr) | 2021-03-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: UNKNOWN |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20220214 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20231109 |