WO2024085608A1 - Method for diagnosing cancer using exon-junction information of RNA in blood - Google Patents
Method for diagnosing cancer using exon-junction information of RNA in blood
- Publication number
- WO2024085608A1 (PCT/KR2023/016067)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cancer
- exon
- blood
- isolated
- individual
- Prior art date
- 206010028980 Neoplasm Diseases 0.000 title claims abstract description 202
- 201000011510 cancer Diseases 0.000 title claims abstract description 198
- 210000004369 blood Anatomy 0.000 title claims abstract description 98
- 239000008280 blood Substances 0.000 title claims abstract description 98
- 108091032973 (ribonucleotides)n+m Proteins 0.000 title claims abstract description 95
- 238000000034 method Methods 0.000 title claims abstract description 62
- 238000003745 diagnosis Methods 0.000 title claims abstract description 40
- 102000040650 (ribonucleotides)n+m Human genes 0.000 title claims abstract description 12
- 210000004027 cell Anatomy 0.000 claims abstract description 47
- 108091092259 cell-free RNA Proteins 0.000 claims abstract description 41
- 210000001808 exosome Anatomy 0.000 claims abstract description 41
- 239000002773 nucleotide Substances 0.000 claims abstract description 9
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 9
- 239000000203 mixture Substances 0.000 claims description 59
- 238000010804 cDNA synthesis Methods 0.000 claims description 49
- 108020004635 Complementary DNA Proteins 0.000 claims description 47
- 239000002299 complementary DNA Substances 0.000 claims description 47
- 206010009944 Colon cancer Diseases 0.000 claims description 15
- 208000029742 colonic neoplasm Diseases 0.000 claims description 15
- 239000003795 chemical substances by application Substances 0.000 claims description 14
- 238000009007 Diagnostic Kit Methods 0.000 claims description 12
- 238000012706 support-vector machine Methods 0.000 claims description 11
- 230000002194 synthesizing effect Effects 0.000 claims description 11
- 208000014829 head and neck neoplasm Diseases 0.000 claims description 10
- 238000007481 next generation sequencing Methods 0.000 claims description 9
- 238000013528 artificial neural network Methods 0.000 claims description 8
- 210000000349 chromosome Anatomy 0.000 claims description 8
- 238000004422 calculation algorithm Methods 0.000 claims description 7
- 206010061424 Anal cancer Diseases 0.000 claims description 5
- 208000007860 Anus Neoplasms Diseases 0.000 claims description 5
- 206010005003 Bladder cancer Diseases 0.000 claims description 5
- 206010005949 Bone cancer Diseases 0.000 claims description 5
- 208000018084 Bone neoplasm Diseases 0.000 claims description 5
- 208000003174 Brain Neoplasms Diseases 0.000 claims description 5
- 206010006187 Breast cancer Diseases 0.000 claims description 5
- 208000026310 Breast neoplasm Diseases 0.000 claims description 5
- 208000000461 Esophageal Neoplasms Diseases 0.000 claims description 5
- 208000022072 Gallbladder Neoplasms Diseases 0.000 claims description 5
- 206010023825 Laryngeal cancer Diseases 0.000 claims description 5
- 206010058467 Lung neoplasm malignant Diseases 0.000 claims description 5
- 208000003445 Mouth Neoplasms Diseases 0.000 claims description 5
- 206010061902 Pancreatic neoplasm Diseases 0.000 claims description 5
- 208000000821 Parathyroid Neoplasms Diseases 0.000 claims description 5
- 208000015634 Rectal Neoplasms Diseases 0.000 claims description 5
- 208000000453 Skin Neoplasms Diseases 0.000 claims description 5
- 208000005718 Stomach Neoplasms Diseases 0.000 claims description 5
- 206010043515 Throat cancer Diseases 0.000 claims description 5
- 208000024770 Thyroid neoplasm Diseases 0.000 claims description 5
- 206010062129 Tongue neoplasm Diseases 0.000 claims description 5
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 claims description 5
- 208000002495 Uterine Neoplasms Diseases 0.000 claims description 5
- 239000004480 active ingredient Substances 0.000 claims description 5
- 201000011165 anus cancer Diseases 0.000 claims description 5
- 201000006491 bone marrow cancer Diseases 0.000 claims description 5
- 201000007455 central nervous system cancer Diseases 0.000 claims description 5
- 208000025997 central nervous system neoplasm Diseases 0.000 claims description 5
- 201000004101 esophageal cancer Diseases 0.000 claims description 5
- 201000010175 gallbladder cancer Diseases 0.000 claims description 5
- 206010017758 gastric cancer Diseases 0.000 claims description 5
- 201000005787 hematologic cancer Diseases 0.000 claims description 5
- 208000024200 hematopoietic and lymphoid system neoplasm Diseases 0.000 claims description 5
- 206010023841 laryngeal neoplasm Diseases 0.000 claims description 5
- 201000004962 larynx cancer Diseases 0.000 claims description 5
- 208000012987 lip and oral cavity carcinoma Diseases 0.000 claims description 5
- 201000007270 liver cancer Diseases 0.000 claims description 5
- 208000014018 liver neoplasm Diseases 0.000 claims description 5
- 201000005202 lung cancer Diseases 0.000 claims description 5
- 208000020816 lung neoplasm Diseases 0.000 claims description 5
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 claims description 5
- 208000026037 malignant tumor of neck Diseases 0.000 claims description 5
- 208000026045 malignant tumor of parathyroid gland Diseases 0.000 claims description 5
- 201000001441 melanoma Diseases 0.000 claims description 5
- 208000008443 pancreatic carcinoma Diseases 0.000 claims description 5
- 206010038038 rectal cancer Diseases 0.000 claims description 5
- 201000001275 rectum cancer Diseases 0.000 claims description 5
- 201000000849 skin cancer Diseases 0.000 claims description 5
- 201000011549 stomach cancer Diseases 0.000 claims description 5
- 201000002510 thyroid cancer Diseases 0.000 claims description 5
- 201000006134 tongue cancer Diseases 0.000 claims description 5
- 201000005112 urinary bladder cancer Diseases 0.000 claims description 5
- 206010046766 uterine cancer Diseases 0.000 claims description 5
- 238000010801 machine learning Methods 0.000 claims description 4
- 238000003066 decision tree Methods 0.000 claims description 2
- 238000003064 k means clustering Methods 0.000 claims description 2
- 238000007477 logistic regression Methods 0.000 claims description 2
- 238000007637 random forest analysis Methods 0.000 claims description 2
- 239000012472 biological sample Substances 0.000 claims 4
- 238000011222 transcriptome analysis Methods 0.000 claims 1
- 238000011319 anticancer therapy Methods 0.000 abstract 1
- 102100026561 Filamin-A Human genes 0.000 description 35
- 101000913549 Homo sapiens Filamin-A Proteins 0.000 description 35
- 101000738771 Homo sapiens Receptor-type tyrosine-protein phosphatase C Proteins 0.000 description 28
- 102100037422 Receptor-type tyrosine-protein phosphatase C Human genes 0.000 description 28
- 239000000523 sample Substances 0.000 description 24
- 239000000090 biomarker Substances 0.000 description 23
- 108090000623 proteins and genes Proteins 0.000 description 22
- 230000000875 corresponding effect Effects 0.000 description 19
- 101001078143 Homo sapiens Integrin alpha-IIb Proteins 0.000 description 16
- 102100025306 Integrin alpha-IIb Human genes 0.000 description 16
- 238000004458 analytical method Methods 0.000 description 15
- 108700024394 Exon Proteins 0.000 description 12
- 108020004999 messenger RNA Proteins 0.000 description 11
- 238000002955 isolation Methods 0.000 description 10
- 238000010200 validation analysis Methods 0.000 description 10
- 101000598025 Homo sapiens Talin-1 Proteins 0.000 description 9
- 102100036977 Talin-1 Human genes 0.000 description 9
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 9
- 101001051767 Homo sapiens Protein kinase C beta type Proteins 0.000 description 8
- 101000890836 Homo sapiens TRAF3-interacting JNK-activating modulator Proteins 0.000 description 8
- 102100024923 Protein kinase C beta type Human genes 0.000 description 8
- 102100040128 TRAF3-interacting JNK-activating modulator Human genes 0.000 description 8
- 238000001514 detection method Methods 0.000 description 8
- 201000010099 disease Diseases 0.000 description 8
- 150000007523 nucleic acids Chemical class 0.000 description 8
- 102100026750 60S ribosomal protein L5 Human genes 0.000 description 7
- 102100026234 Cytokine receptor common subunit gamma Human genes 0.000 description 7
- 108020004414 DNA Proteins 0.000 description 7
- 101000691083 Homo sapiens 60S ribosomal protein L5 Proteins 0.000 description 7
- 101001055227 Homo sapiens Cytokine receptor common subunit gamma Proteins 0.000 description 7
- 101001043809 Homo sapiens Interleukin-7 receptor subunit alpha Proteins 0.000 description 7
- 101001098833 Homo sapiens Proprotein convertase subtilisin/kexin type 6 Proteins 0.000 description 7
- 102100021593 Interleukin-7 receptor subunit alpha Human genes 0.000 description 7
- 102100038946 Proprotein convertase subtilisin/kexin type 6 Human genes 0.000 description 7
- 238000012549 training Methods 0.000 description 7
- 102100036630 60S ribosomal protein L7a Human genes 0.000 description 6
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 6
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 6
- 102100027706 Heterogeneous nuclear ribonucleoprotein D-like Human genes 0.000 description 6
- 101000853243 Homo sapiens 60S ribosomal protein L7a Proteins 0.000 description 6
- 101001081145 Homo sapiens Heterogeneous nuclear ribonucleoprotein D-like Proteins 0.000 description 6
- 101001015059 Homo sapiens Integrin beta-5 Proteins 0.000 description 6
- 101001109719 Homo sapiens Nucleophosmin Proteins 0.000 description 6
- 101000743264 Homo sapiens RNA-binding protein 6 Proteins 0.000 description 6
- 102100033010 Integrin beta-5 Human genes 0.000 description 6
- 102100022678 Nucleophosmin Human genes 0.000 description 6
- 102100038150 RNA-binding protein 6 Human genes 0.000 description 6
- 238000001035 drying Methods 0.000 description 6
- 239000012634 fragment Substances 0.000 description 6
- 101150077246 gas5 gene Proteins 0.000 description 6
- 239000003550 marker Substances 0.000 description 6
- 238000011282 treatment Methods 0.000 description 6
- 102100032411 60S ribosomal protein L18 Human genes 0.000 description 5
- 102100040924 60S ribosomal protein L6 Human genes 0.000 description 5
- 102100029712 E3 ubiquitin-protein ligase TRIM58 Human genes 0.000 description 5
- 101001087985 Homo sapiens 60S ribosomal protein L18 Proteins 0.000 description 5
- 101000673524 Homo sapiens 60S ribosomal protein L6 Proteins 0.000 description 5
- 101000795365 Homo sapiens E3 ubiquitin-protein ligase TRIM58 Proteins 0.000 description 5
- 101000980823 Homo sapiens Leukocyte surface antigen CD53 Proteins 0.000 description 5
- 101001008498 Homo sapiens Luc7-like protein 3 Proteins 0.000 description 5
- 102100024221 Leukocyte surface antigen CD53 Human genes 0.000 description 5
- 102100027434 Luc7-like protein 3 Human genes 0.000 description 5
- 230000008859 change Effects 0.000 description 5
- 239000003153 chemical reaction reagent Substances 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 238000011528 liquid biopsy Methods 0.000 description 5
- 230000035945 sensitivity Effects 0.000 description 5
- 239000000126 substance Substances 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- 102100031571 40S ribosomal protein S16 Human genes 0.000 description 4
- 102100037710 40S ribosomal protein S21 Human genes 0.000 description 4
- 102100037663 40S ribosomal protein S8 Human genes 0.000 description 4
- 102100026112 60S acidic ribosomal protein P2 Human genes 0.000 description 4
- 102100035916 60S ribosomal protein L11 Human genes 0.000 description 4
- 102100031854 60S ribosomal protein L14 Human genes 0.000 description 4
- 102100021308 60S ribosomal protein L23 Human genes 0.000 description 4
- 102100041029 60S ribosomal protein L9 Human genes 0.000 description 4
- 102100022005 B-lymphocyte antigen CD20 Human genes 0.000 description 4
- 102100036952 Cytoplasmic protein NCK2 Human genes 0.000 description 4
- 102100039328 Endoplasmin Human genes 0.000 description 4
- 102100031411 GAS2-like protein 1 Human genes 0.000 description 4
- 108010013942 GMP Reductase Proteins 0.000 description 4
- 102100021188 GMP reductase 1 Human genes 0.000 description 4
- 102100021186 Granulysin Human genes 0.000 description 4
- 102100040485 HLA class II histocompatibility antigen, DRB1 beta chain Human genes 0.000 description 4
- 108010039343 HLA-DRB1 Chains Proteins 0.000 description 4
- 101000706746 Homo sapiens 40S ribosomal protein S16 Proteins 0.000 description 4
- 101001097814 Homo sapiens 40S ribosomal protein S21 Proteins 0.000 description 4
- 101001097439 Homo sapiens 40S ribosomal protein S8 Proteins 0.000 description 4
- 101000691878 Homo sapiens 60S acidic ribosomal protein P2 Proteins 0.000 description 4
- 101001073740 Homo sapiens 60S ribosomal protein L11 Proteins 0.000 description 4
- 101000704267 Homo sapiens 60S ribosomal protein L14 Proteins 0.000 description 4
- 101000675833 Homo sapiens 60S ribosomal protein L23 Proteins 0.000 description 4
- 101000672886 Homo sapiens 60S ribosomal protein L9 Proteins 0.000 description 4
- 101000897405 Homo sapiens B-lymphocyte antigen CD20 Proteins 0.000 description 4
- 101001024712 Homo sapiens Cytoplasmic protein NCK2 Proteins 0.000 description 4
- 101000812663 Homo sapiens Endoplasmin Proteins 0.000 description 4
- 101000922847 Homo sapiens GAS2-like protein 1 Proteins 0.000 description 4
- 101001040751 Homo sapiens Granulysin Proteins 0.000 description 4
- 101000998139 Homo sapiens Interleukin-32 Proteins 0.000 description 4
- 101000972143 Homo sapiens Maturin Proteins 0.000 description 4
- 101000918983 Homo sapiens Neutrophil defensin 1 Proteins 0.000 description 4
- 102100033501 Interleukin-32 Human genes 0.000 description 4
- 102100022448 Maturin Human genes 0.000 description 4
- 102100029494 Neutrophil defensin 1 Human genes 0.000 description 4
- 108091028043 Nucleic acid sequence Proteins 0.000 description 4
- -1 antibodies Proteins 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 102000039446 nucleic acids Human genes 0.000 description 4
- 108020004707 nucleic acids Proteins 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 238000012216 screening Methods 0.000 description 4
- 102100039882 40S ribosomal protein S17 Human genes 0.000 description 3
- 102100033449 40S ribosomal protein S24 Human genes 0.000 description 3
- 102100024088 40S ribosomal protein S7 Human genes 0.000 description 3
- 102100027271 40S ribosomal protein SA Human genes 0.000 description 3
- 102100025643 60S ribosomal protein L12 Human genes 0.000 description 3
- 102100035322 60S ribosomal protein L24 Human genes 0.000 description 3
- 102100031552 Coactosin-like protein Human genes 0.000 description 3
- 102100032756 Cysteine-rich protein 1 Human genes 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 101000812077 Homo sapiens 40S ribosomal protein S17 Proteins 0.000 description 3
- 101000656669 Homo sapiens 40S ribosomal protein S24 Proteins 0.000 description 3
- 101000690200 Homo sapiens 40S ribosomal protein S7 Proteins 0.000 description 3
- 101000694288 Homo sapiens 40S ribosomal protein SA Proteins 0.000 description 3
- 101000575173 Homo sapiens 60S ribosomal protein L12 Proteins 0.000 description 3
- 101000660926 Homo sapiens 60S ribosomal protein L24 Proteins 0.000 description 3
- 101000945426 Homo sapiens CB1 cannabinoid receptor-interacting protein 1 Proteins 0.000 description 3
- 101000940352 Homo sapiens Coactosin-like protein Proteins 0.000 description 3
- 101000942084 Homo sapiens Cysteine-rich protein 1 Proteins 0.000 description 3
- 101000840275 Homo sapiens Interferon alpha-inducible protein 27, mitochondrial Proteins 0.000 description 3
- 101001049181 Homo sapiens Killer cell lectin-like receptor subfamily B member 1 Proteins 0.000 description 3
- 101000636811 Homo sapiens Neudesin Proteins 0.000 description 3
- 101000711369 Homo sapiens Probable ribosome biogenesis protein RLP24 Proteins 0.000 description 3
- 101000979599 Homo sapiens Protein NKG7 Proteins 0.000 description 3
- 101000662909 Homo sapiens T cell receptor beta constant 1 Proteins 0.000 description 3
- 101000798076 Homo sapiens T cell receptor delta constant Proteins 0.000 description 3
- 101000946860 Homo sapiens T-cell surface glycoprotein CD3 epsilon chain Proteins 0.000 description 3
- 101000915738 Homo sapiens Zinc finger Ran-binding domain-containing protein 2 Proteins 0.000 description 3
- 101000614806 Homo sapiens cAMP-dependent protein kinase type II-beta regulatory subunit Proteins 0.000 description 3
- 102100034343 Integrase Human genes 0.000 description 3
- 102100029604 Interferon alpha-inducible protein 27, mitochondrial Human genes 0.000 description 3
- 102100023678 Killer cell lectin-like receptor subfamily B member 1 Human genes 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 102100031903 Neudesin Human genes 0.000 description 3
- 102100023370 Protein NKG7 Human genes 0.000 description 3
- 238000002123 RNA extraction Methods 0.000 description 3
- 102100037272 T cell receptor beta constant 1 Human genes 0.000 description 3
- 102100032272 T cell receptor delta constant Human genes 0.000 description 3
- 102100035794 T-cell surface glycoprotein CD3 epsilon chain Human genes 0.000 description 3
- 102100028956 Zinc finger Ran-binding domain-containing protein 2 Human genes 0.000 description 3
- 239000007853 buffer solution Substances 0.000 description 3
- 102100021205 cAMP-dependent protein kinase type II-beta regulatory subunit Human genes 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 230000002068 genetic effect Effects 0.000 description 3
- ZJYYHGLJYGJLLN-UHFFFAOYSA-N guanidinium thiocyanate Chemical compound SC#N.NC(N)=N ZJYYHGLJYGJLLN-UHFFFAOYSA-N 0.000 description 3
- 238000005304 joining Methods 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 150000002739 metals Chemical class 0.000 description 3
- 238000010606 normalization Methods 0.000 description 3
- 238000003752 polymerase chain reaction Methods 0.000 description 3
- 102000004169 proteins and genes Human genes 0.000 description 3
- 239000000985 reactive dye Substances 0.000 description 3
- 230000001105 regulatory effect Effects 0.000 description 3
- 238000011269 treatment regimen Methods 0.000 description 3
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 2
- 102100026357 40S ribosomal protein S13 Human genes 0.000 description 2
- 102100037513 40S ribosomal protein S23 Human genes 0.000 description 2
- 102100022406 60S ribosomal protein L10a Human genes 0.000 description 2
- 102100024442 60S ribosomal protein L13 Human genes 0.000 description 2
- 102100037685 60S ribosomal protein L22 Human genes 0.000 description 2
- 102100021660 60S ribosomal protein L28 Human genes 0.000 description 2
- 102100040637 60S ribosomal protein L34 Human genes 0.000 description 2
- 102100036116 60S ribosomal protein L35 Human genes 0.000 description 2
- 102100022048 60S ribosomal protein L36 Human genes 0.000 description 2
- 102100040131 60S ribosomal protein L37 Human genes 0.000 description 2
- 102100035921 Arginine/serine-rich protein PNISR Human genes 0.000 description 2
- 102100027203 B-cell antigen receptor complex-associated protein beta chain Human genes 0.000 description 2
- 102100027207 CD27 antigen Human genes 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 208000005443 Circulating Neoplastic Cells Diseases 0.000 description 2
- 102100032355 Coiled-coil domain-containing protein 92 Human genes 0.000 description 2
- FBPFZTCFMRRESA-JGWLITMVSA-N D-glucitol Chemical compound OC[C@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-JGWLITMVSA-N 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 230000006820 DNA synthesis Effects 0.000 description 2
- 102100030386 Granzyme A Human genes 0.000 description 2
- 102100034154 Guanine nucleotide-binding protein G(i) subunit alpha-2 Human genes 0.000 description 2
- 102100040505 HLA class II histocompatibility antigen, DR alpha chain Human genes 0.000 description 2
- 108010067802 HLA-DR alpha-Chains Proteins 0.000 description 2
- 101000691550 Homo sapiens 39S ribosomal protein L13, mitochondrial Proteins 0.000 description 2
- 101000718313 Homo sapiens 40S ribosomal protein S13 Proteins 0.000 description 2
- 101001097953 Homo sapiens 40S ribosomal protein S23 Proteins 0.000 description 2
- 101000755323 Homo sapiens 60S ribosomal protein L10a Proteins 0.000 description 2
- 101001118201 Homo sapiens 60S ribosomal protein L13 Proteins 0.000 description 2
- 101001097555 Homo sapiens 60S ribosomal protein L22 Proteins 0.000 description 2
- 101000676271 Homo sapiens 60S ribosomal protein L28 Proteins 0.000 description 2
- 101000672659 Homo sapiens 60S ribosomal protein L34 Proteins 0.000 description 2
- 101000715818 Homo sapiens 60S ribosomal protein L35 Proteins 0.000 description 2
- 101001110263 Homo sapiens 60S ribosomal protein L36 Proteins 0.000 description 2
- 101000671735 Homo sapiens 60S ribosomal protein L37 Proteins 0.000 description 2
- 101001000549 Homo sapiens Arginine/serine-rich protein PNISR Proteins 0.000 description 2
- 101000914491 Homo sapiens B-cell antigen receptor complex-associated protein beta chain Proteins 0.000 description 2
- 101000914511 Homo sapiens CD27 antigen Proteins 0.000 description 2
- 101000797732 Homo sapiens Coiled-coil domain-containing protein 92 Proteins 0.000 description 2
- 101001009599 Homo sapiens Granzyme A Proteins 0.000 description 2
- 101001070508 Homo sapiens Guanine nucleotide-binding protein G(i) subunit alpha-2 Proteins 0.000 description 2
- 101001015004 Homo sapiens Integrin beta-3 Proteins 0.000 description 2
- 101000917826 Homo sapiens Low affinity immunoglobulin gamma Fc region receptor II-a Proteins 0.000 description 2
- 101001115417 Homo sapiens M-phase phosphoprotein 8 Proteins 0.000 description 2
- 101000830386 Homo sapiens Neutrophil defensin 3 Proteins 0.000 description 2
- 101000662902 Homo sapiens T cell receptor beta constant 2 Proteins 0.000 description 2
- 101000704170 Homo sapiens U2 snRNP-associated SURP motif-containing protein Proteins 0.000 description 2
- 102100032999 Integrin beta-3 Human genes 0.000 description 2
- XEEYBQQBJWHFJM-UHFFFAOYSA-N Iron Chemical compound [Fe] XEEYBQQBJWHFJM-UHFFFAOYSA-N 0.000 description 2
- 102100029204 Low affinity immunoglobulin gamma Fc region receptor II-a Human genes 0.000 description 2
- 102100023268 M-phase phosphoprotein 8 Human genes 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- 102100024761 Neutrophil defensin 3 Human genes 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 102100037298 T cell receptor beta constant 2 Human genes 0.000 description 2
- 102100031884 U2 snRNP-associated SURP motif-containing protein Human genes 0.000 description 2
- DZBUGLKDJFMEHC-UHFFFAOYSA-N acridine Chemical compound C1=CC=CC2=CC3=CC=CC=C3N=C21 DZBUGLKDJFMEHC-UHFFFAOYSA-N 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 230000032823 cell division Effects 0.000 description 2
- 210000000170 cell membrane Anatomy 0.000 description 2
- 238000002405 diagnostic procedure Methods 0.000 description 2
- 239000000975 dye Substances 0.000 description 2
- 210000003743 erythrocyte Anatomy 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 108091070501 miRNA Proteins 0.000 description 2
- 239000002679 microRNA Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 210000004940 nucleus Anatomy 0.000 description 2
- 238000004393 prognosis Methods 0.000 description 2
- ZCCUUQDIBDJBTK-UHFFFAOYSA-N psoralen Chemical compound C1=C2OC(=O)C=CC2=CC2=C1OC=C2 ZCCUUQDIBDJBTK-UHFFFAOYSA-N 0.000 description 2
- 238000003908 quality control method Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 238000003860 storage Methods 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 238000002560 therapeutic procedure Methods 0.000 description 2
- 210000001519 tissue Anatomy 0.000 description 2
- 238000005199 ultracentrifugation Methods 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 238000012800 visualization Methods 0.000 description 2
- 102100024682 14-3-3 protein eta Human genes 0.000 description 1
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 1
- VXGRJERITKFWPL-UHFFFAOYSA-N 4',5'-Dihydropsoralen Natural products C1=C2OC(=O)C=CC2=CC2=C1OCC2 VXGRJERITKFWPL-UHFFFAOYSA-N 0.000 description 1
- MKNQNPYGAQGARI-UHFFFAOYSA-N 4-(bromomethyl)phenol Chemical compound OC1=CC=C(CBr)C=C1 MKNQNPYGAQGARI-UHFFFAOYSA-N 0.000 description 1
- 102100026744 40S ribosomal protein S10 Human genes 0.000 description 1
- 102100023912 40S ribosomal protein S12 Human genes 0.000 description 1
- 102100033409 40S ribosomal protein S3 Human genes 0.000 description 1
- 102100022600 40S ribosomal protein S3a Human genes 0.000 description 1
- 102100034088 40S ribosomal protein S4, X isoform Human genes 0.000 description 1
- 102100023779 40S ribosomal protein S5 Human genes 0.000 description 1
- 102100033714 40S ribosomal protein S6 Human genes 0.000 description 1
- 102100023990 60S ribosomal protein L17 Human genes 0.000 description 1
- 102100021206 60S ribosomal protein L19 Human genes 0.000 description 1
- 102100037965 60S ribosomal protein L21 Human genes 0.000 description 1
- 102100023247 60S ribosomal protein L23a Human genes 0.000 description 1
- 102100040540 60S ribosomal protein L3 Human genes 0.000 description 1
- 102100040768 60S ribosomal protein L32 Human genes 0.000 description 1
- 102100028780 AP-1 complex subunit sigma-2 Human genes 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 102100038046 Alpha/beta hydrolase domain-containing protein 17A Human genes 0.000 description 1
- 102100029651 Arginine/serine-rich protein 1 Human genes 0.000 description 1
- 102100027205 B-cell antigen receptor complex-associated protein alpha chain Human genes 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 102100036301 C-C chemokine receptor type 7 Human genes 0.000 description 1
- 102100024217 CAMPATH-1 antigen Human genes 0.000 description 1
- 102100036008 CD48 antigen Human genes 0.000 description 1
- 108010065524 CD52 Antigen Proteins 0.000 description 1
- QCMYYKRYFNMIEC-UHFFFAOYSA-N COP(O)=O Chemical class COP(O)=O QCMYYKRYFNMIEC-UHFFFAOYSA-N 0.000 description 1
- 102100032537 Calpain-2 catalytic subunit Human genes 0.000 description 1
- KXDHJXZQYSOELW-UHFFFAOYSA-M Carbamate Chemical compound NC([O-])=O KXDHJXZQYSOELW-UHFFFAOYSA-M 0.000 description 1
- 102100035353 Cyclin-dependent kinase 2-associated protein 1 Human genes 0.000 description 1
- FBPFZTCFMRRESA-FSIIMWSLSA-N D-Glucitol Natural products OC[C@H](O)[C@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-FSIIMWSLSA-N 0.000 description 1
- FBPFZTCFMRRESA-KVTDHHQDSA-N D-Mannitol Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-KVTDHHQDSA-N 0.000 description 1
- 238000000018 DNA microarray Methods 0.000 description 1
- 101710085367 DNA polymerase I, thermostable Proteins 0.000 description 1
- 206010061818 Disease progression Diseases 0.000 description 1
- 206010061819 Disease recurrence Diseases 0.000 description 1
- 102100027418 E3 ubiquitin-protein ligase RNF213 Human genes 0.000 description 1
- 102100031748 E3 ubiquitin-protein ligase SIAH2 Human genes 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 241000701832 Enterobacteria phage T3 Species 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 108700039887 Essential Genes Proteins 0.000 description 1
- 102100024422 GTPase IMAP family member 7 Human genes 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- 102100030385 Granzyme B Human genes 0.000 description 1
- 102100032610 Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas Human genes 0.000 description 1
- 102100028640 HLA class II histocompatibility antigen, DR beta 5 chain Human genes 0.000 description 1
- 108010016996 HLA-DRB5 Chains Proteins 0.000 description 1
- 241000282414 Homo sapiens Species 0.000 description 1
- 101000760084 Homo sapiens 14-3-3 protein eta Proteins 0.000 description 1
- 101000639726 Homo sapiens 28S ribosomal protein S12, mitochondrial Proteins 0.000 description 1
- 101001119189 Homo sapiens 40S ribosomal protein S10 Proteins 0.000 description 1
- 101000682687 Homo sapiens 40S ribosomal protein S12 Proteins 0.000 description 1
- 101000656561 Homo sapiens 40S ribosomal protein S3 Proteins 0.000 description 1
- 101000679249 Homo sapiens 40S ribosomal protein S3a Proteins 0.000 description 1
- 101000732165 Homo sapiens 40S ribosomal protein S4, X isoform Proteins 0.000 description 1
- 101000622644 Homo sapiens 40S ribosomal protein S5 Proteins 0.000 description 1
- 101000656896 Homo sapiens 40S ribosomal protein S6 Proteins 0.000 description 1
- 101000682512 Homo sapiens 60S ribosomal protein L17 Proteins 0.000 description 1
- 101001105789 Homo sapiens 60S ribosomal protein L19 Proteins 0.000 description 1
- 101000661708 Homo sapiens 60S ribosomal protein L21 Proteins 0.000 description 1
- 101001115494 Homo sapiens 60S ribosomal protein L23a Proteins 0.000 description 1
- 101000673985 Homo sapiens 60S ribosomal protein L3 Proteins 0.000 description 1
- 101000672453 Homo sapiens 60S ribosomal protein L32 Proteins 0.000 description 1
- 101000768016 Homo sapiens AP-1 complex subunit sigma-2 Proteins 0.000 description 1
- 101000742837 Homo sapiens Alpha/beta hydrolase domain-containing protein 17A Proteins 0.000 description 1
- 101000728589 Homo sapiens Arginine/serine-rich protein 1 Proteins 0.000 description 1
- 101000914489 Homo sapiens B-cell antigen receptor complex-associated protein alpha chain Proteins 0.000 description 1
- 101000716065 Homo sapiens C-C chemokine receptor type 7 Proteins 0.000 description 1
- 101000716130 Homo sapiens CD48 antigen Proteins 0.000 description 1
- 101000867692 Homo sapiens Calpain-2 catalytic subunit Proteins 0.000 description 1
- 101000737813 Homo sapiens Cyclin-dependent kinase 2-associated protein 1 Proteins 0.000 description 1
- 101000650316 Homo sapiens E3 ubiquitin-protein ligase RNF213 Proteins 0.000 description 1
- 101000707245 Homo sapiens E3 ubiquitin-protein ligase SIAH2 Proteins 0.000 description 1
- 101000833390 Homo sapiens GTPase IMAP family member 7 Proteins 0.000 description 1
- 101001009603 Homo sapiens Granzyme B Proteins 0.000 description 1
- 101001014590 Homo sapiens Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas Proteins 0.000 description 1
- 101001014594 Homo sapiens Guanine nucleotide-binding protein G(s) subunit alpha isoforms short Proteins 0.000 description 1
- 101000935040 Homo sapiens Integrin beta-2 Proteins 0.000 description 1
- 101001018028 Homo sapiens Lymphocyte antigen 86 Proteins 0.000 description 1
- 101000956320 Homo sapiens Membrane-spanning 4-domains subfamily A member 6A Proteins 0.000 description 1
- 101001109501 Homo sapiens NKG2-D type II integral membrane protein Proteins 0.000 description 1
- 101001014610 Homo sapiens Neuroendocrine secretory protein 55 Proteins 0.000 description 1
- 101001122162 Homo sapiens Overexpressed in colon carcinoma 1 protein Proteins 0.000 description 1
- 101000992378 Homo sapiens Oxysterol-binding protein 2 Proteins 0.000 description 1
- 101001001793 Homo sapiens Pleckstrin homology domain-containing family O member 1 Proteins 0.000 description 1
- 101000797903 Homo sapiens Protein ALEX Proteins 0.000 description 1
- 101000958299 Homo sapiens Protein lyl-1 Proteins 0.000 description 1
- 101000830691 Homo sapiens Protein tyrosine phosphatase type IVA 2 Proteins 0.000 description 1
- 101001055100 Homo sapiens Repressor of RNA polymerase III transcription MAF1 homolog Proteins 0.000 description 1
- 101000654740 Homo sapiens Septin-5 Proteins 0.000 description 1
- 101000587438 Homo sapiens Serine/arginine-rich splicing factor 5 Proteins 0.000 description 1
- 101000658614 Homo sapiens Tetraspanin-33 Proteins 0.000 description 1
- 101000715069 Homo sapiens Transcription initiation factor TFIID subunit 10 Proteins 0.000 description 1
- 101000629913 Homo sapiens Translocon-associated protein subunit beta Proteins 0.000 description 1
- 101000644682 Homo sapiens Ubiquitin-conjugating enzyme E2 H Proteins 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 101710203526 Integrase Proteins 0.000 description 1
- 102100025390 Integrin beta-2 Human genes 0.000 description 1
- 239000000232 Lipid Bilayer Substances 0.000 description 1
- 102100033485 Lymphocyte antigen 86 Human genes 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 229930195725 Mannitol Natural products 0.000 description 1
- 108010052285 Membrane Proteins Proteins 0.000 description 1
- 102100038555 Membrane-spanning 4-domains subfamily A member 6A Human genes 0.000 description 1
- 206010027476 Metastases Diseases 0.000 description 1
- 102100022680 NKG2-D type II integral membrane protein Human genes 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 102100027063 Overexpressed in colon carcinoma 1 protein Human genes 0.000 description 1
- 102100032164 Oxysterol-binding protein 2 Human genes 0.000 description 1
- 102100036265 Pleckstrin homology domain-containing family O member 1 Human genes 0.000 description 1
- WCUXLLCKKVVCTQ-UHFFFAOYSA-M Potassium chloride Chemical compound [Cl-].[K+] WCUXLLCKKVVCTQ-UHFFFAOYSA-M 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 102100038231 Protein lyl-1 Human genes 0.000 description 1
- 102100024602 Protein tyrosine phosphatase type IVA 2 Human genes 0.000 description 1
- 241000205156 Pyrococcus furiosus Species 0.000 description 1
- 238000003559 RNA-seq method Methods 0.000 description 1
- 238000011530 RNeasy Mini Kit Methods 0.000 description 1
- 102100026898 Repressor of RNA polymerase III transcription MAF1 homolog Human genes 0.000 description 1
- 102100032744 Septin-5 Human genes 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- 102100029703 Serine/arginine-rich splicing factor 5 Human genes 0.000 description 1
- 102100034916 Tetraspanin-33 Human genes 0.000 description 1
- 241000205188 Thermococcus Species 0.000 description 1
- 241000589500 Thermus aquaticus Species 0.000 description 1
- 241000589498 Thermus filiformis Species 0.000 description 1
- 241000589499 Thermus thermophilus Species 0.000 description 1
- 102100036677 Transcription initiation factor TFIID subunit 10 Human genes 0.000 description 1
- 102100026229 Translocon-associated protein subunit beta Human genes 0.000 description 1
- 102100020698 Ubiquitin-conjugating enzyme E2 H Human genes 0.000 description 1
- 238000001793 Wilcoxon signed-rank test Methods 0.000 description 1
- 210000002593 Y chromosome Anatomy 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 239000002168 alkylating agent Substances 0.000 description 1
- 229940100198 alkylating agent Drugs 0.000 description 1
- 230000000692 anti-sense effect Effects 0.000 description 1
- 238000011394 anticancer treatment Methods 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- WQZGKKKJIJFFOK-VFUOTHLCSA-N beta-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 1
- 239000003124 biologic agent Substances 0.000 description 1
- 238000013529 biological neural network Methods 0.000 description 1
- 239000007844 bleaching agent Substances 0.000 description 1
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 1
- AIYUHDOJVYHVIT-UHFFFAOYSA-M caesium chloride Chemical compound [Cl-].[Cs+] AIYUHDOJVYHVIT-UHFFFAOYSA-M 0.000 description 1
- 230000005907 cancer growth Effects 0.000 description 1
- 210000003850 cellular structure Anatomy 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 239000007795 chemical reaction product Substances 0.000 description 1
- 238000002512 chemotherapy Methods 0.000 description 1
- 238000005345 coagulation Methods 0.000 description 1
- 230000015271 coagulation Effects 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- OBRMNDMBJQTZHV-UHFFFAOYSA-N cresol red Chemical compound C1=C(O)C(C)=CC(C2(C3=CC=CC=C3S(=O)(=O)O2)C=2C=C(C)C(O)=CC=2)=C1 OBRMNDMBJQTZHV-UHFFFAOYSA-N 0.000 description 1
- 230000000593 degrading effect Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 238000001784 detoxification Methods 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-K dioxido-sulfanylidene-sulfido-$l^{5}-phosphane Chemical compound [O-]P([O-])([S-])=S NAGJZTKCGNOGPW-UHFFFAOYSA-K 0.000 description 1
- 230000005750 disease progression Effects 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 230000017188 evasion or tolerance of host immune response Effects 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 238000004108 freeze drying Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 229960000789 guanidine hydrochloride Drugs 0.000 description 1
- PJJJBBJSCAKJQF-UHFFFAOYSA-N guanidinium chloride Chemical compound [Cl-].NC(N)=[NH2+] PJJJBBJSCAKJQF-UHFFFAOYSA-N 0.000 description 1
- 230000023597 hemostasis Effects 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 230000000984 immunochemical effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 239000000138 intercalating agent Substances 0.000 description 1
- 229910052742 iron Inorganic materials 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 239000000594 mannitol Substances 0.000 description 1
- 235000010355 mannitol Nutrition 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000013178 mathematical model Methods 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 230000009401 metastasis Effects 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000001590 oxidative effect Effects 0.000 description 1
- 230000005298 paramagnetic effect Effects 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 150000008300 phosphoramidites Chemical class 0.000 description 1
- 210000002381 plasma Anatomy 0.000 description 1
- 210000004623 platelet-rich plasma Anatomy 0.000 description 1
- 229920000729 poly(L-lysine) polymer Polymers 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 239000000047 product Substances 0.000 description 1
- 102000010838 rac1 GTP Binding Protein Human genes 0.000 description 1
- 108010062302 rac1 GTP Binding Protein Proteins 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 239000002683 reaction inhibitor Substances 0.000 description 1
- 230000009257 reactivity Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000002787 reinforcement Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 1
- 108090000850 ribosomal protein S14 Proteins 0.000 description 1
- 102000004314 ribosomal protein S14 Human genes 0.000 description 1
- 239000013049 sediment Substances 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 230000007781 signaling event Effects 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 239000000600 sorbitol Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 230000006641 stabilisation Effects 0.000 description 1
- 238000011105 stabilization Methods 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 150000005846 sugar alcohols Polymers 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 238000001356 surgical procedure Methods 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 231100000765 toxin Toxicity 0.000 description 1
- 108700012359 toxins Proteins 0.000 description 1
- 229960005486 vaccine Drugs 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- NLIVDORGVGAOOJ-MAHBNPEESA-M xylene cyanol Chemical compound [Na+].C1=C(C)C(NCC)=CC=C1C(\C=1C(=CC(OS([O-])=O)=CC=1)OS([O-])=O)=C\1C=C(C)\C(=[NH+]/CC)\C=C/1 NLIVDORGVGAOOJ-MAHBNPEESA-M 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Definitions
- The present invention relates to a cancer diagnosis method using exon-splicing information of RNA in blood, and more specifically to a method of analyzing the transcriptome to provide information necessary for cancer diagnosis in an individual, the method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a part thereof isolated from anucleated cells in the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cell-free RNA (cfRNA) or a part thereof isolated from the blood of the individual;
- (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a); (c) obtaining base sequence information of the cDNA; (d) comparing the cDNA base sequence information with a predetermined exon-junction library to obtain base sequence expression information at each exon-junction; and (e) determining whether the individual has cancer based on the base sequence expression information at each exon-junction.
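As a rough illustration of step (d), the sketch below counts sequencing reads that contain a junction-spanning sequence from a predetermined exon-junction library and normalizes the counts per million reads. The library layout (junction ID mapped to a short sequence spanning the exon boundary), the junction IDs, the exact substring matching, and the toy reads are all assumptions made for illustration. The source does not specify the matching or normalization procedure, and a real pipeline would typically use a splice-aware aligner instead.

```python
from collections import Counter


def count_junction_reads(reads, junction_library):
    """Count reads that contain each junction-spanning sequence.

    junction_library maps a hypothetical junction ID to a sequence that
    spans the boundary between two joined exons (an assumed layout).
    """
    counts = Counter({jid: 0 for jid in junction_library})
    for read in reads:
        for jid, jseq in junction_library.items():
            if jseq in read:  # exact substring match, a simplification
                counts[jid] += 1
    return dict(counts)


def counts_per_million(counts, total_reads):
    """Scale raw junction counts to counts per million sequenced reads."""
    if total_reads == 0:
        return {jid: 0.0 for jid in counts}
    return {jid: c * 1_000_000 / total_reads for jid, c in counts.items()}


# Hypothetical library and reads, for illustration only.
junction_library = {
    "GENE1:e1-e2": "ACGTTGCA",
    "GENE1:e1-e3": "ACGTGGAT",
}
reads = ["TTACGTTGCAAA", "CCACGTGGATCC", "GGACGTTGCATT", "AAAAAAAAAAAA"]

counts = count_junction_reads(reads, junction_library)
cpm = counts_per_million(counts, len(reads))
print(counts)
print(cpm)
```

The per-junction values produced this way are one possible form of the "base sequence expression information at each exon-junction" used in step (e).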
- CTCs circulating tumor cells
- ctDNA circulating tumor DNA
- NGS Next-Generation Sequencing
- Liquid biopsy is a non-invasive technology that is more convenient to collect than tissue biopsy and allows quick analysis with a small amount of blood, making cancer detection and monitoring possible through liquid biopsy without tissue biopsy [2].
- Liquid biopsy with existing technologies has low cancer detection sensitivity because only a small number of target molecules are present in the blood [3, 4, 5]. Therefore, effective early cancer screening requires increasing sensitivity by using biomarkers that are present in the blood in large numbers, even when cancer is present.
- The present inventors, while researching biomarkers that are present in the blood in large numbers even in cancer for effective early cancer screening, investigated the use of biomarkers in anucleate cells such as platelets, cell-derived membrane structures such as exosomes, or cell-free RNA.
- By securing transcriptome data, analyzing it, and using exon splicing information as a biomarker, it is possible to determine whether a subject has cancer or is normal.
- the present invention was completed by confirming that exon splicing information can be used as a biomarker to determine whether a subject has cancer or is normal.
- RNAs selected from the group consisting of total cfRNA (cell-free RNA) or parts thereof isolated from the blood of an individual.
- step (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a);
- Another object of the present invention is to provide a composition for diagnosing cancer comprising single or multiple exon-splicing as an active ingredient, wherein the exon-splicing is detected in (i) RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes of the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- Another object of the present invention is to provide a composition for diagnosing cancer consisting of single or multiple exon-splicing, wherein the exon-splicing is detected in (i) RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes of the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- Another object of the present invention is to provide a composition for diagnosing cancer consisting essentially of single or multiple exon-splicing, wherein the exon-splicing is detected in (i) RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes of the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- Another object of the present invention is to provide a cancer diagnostic kit containing the composition.
- Another object of the present invention is to provide a cancer diagnostic kit consisting of the above composition.
- Another object of the present invention is to provide a cancer diagnostic kit consisting essentially of the above composition.
- Another object of the present invention is to provide a composition for diagnosing cancer comprising an agent capable of detecting single or multiple exon-splicing, wherein the exon-splicing is detected in (i) total RNA or a part thereof isolated from anucleate cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- Another object of the present invention is to provide a composition for diagnosing cancer consisting of an agent capable of detecting single or multiple exon-splicing, wherein the exon-splicing is detected in (i) total RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- Another object of the present invention is to provide a composition for diagnosing cancer consisting essentially of an agent capable of detecting single or multiple exon-splicing, wherein the exon-splicing is detected in (i) total RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- Another object of the present invention is to provide the use of an agent capable of detecting single or multiple exon-junctions selected from the group consisting of the exon-junctions in Table 1 for preparing a composition for cancer diagnosis.
- Another object of the present invention is to provide a method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a part thereof isolated from anucleated cells in the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cell-free RNA (cfRNA) or a part thereof isolated from the blood of the individual;
- cfRNA total cell-free RNA
- step (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a);
- a method for diagnosing cancer including the step of determining whether the patient has cancer based on the base sequence expression information at each exon-junction.
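To make the determination step concrete, the following sketch classifies a sample from its per-junction expression values with a nearest-centroid rule. This classifier is a stand-in chosen for brevity; the text here does not state which model the inventors use. The two-junction feature vectors and the training values are hypothetical.

```python
import math


def centroid(samples):
    """Mean expression vector over a list of equally long sample vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample, cancer_centroid, normal_centroid):
    """Label a sample by whichever class centroid is closer."""
    d_cancer = euclidean(sample, cancer_centroid)
    d_normal = euclidean(sample, normal_centroid)
    return "cancer" if d_cancer < d_normal else "normal"


# Hypothetical training data: each row holds expression values at two
# exon-junctions for one individual.
cancer_train = [[9.0, 1.0], [8.0, 2.0]]
normal_train = [[2.0, 7.0], [1.0, 8.0]]

cancer_c = centroid(cancer_train)
normal_c = centroid(normal_train)
label = classify([7.5, 2.5], cancer_c, normal_c)
print(label)
```

In practice the feature vector would hold many junctions, and the decision rule would need to be trained and validated on real cohorts before any diagnostic use.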
- The present invention provides a method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a part thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a part thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a part thereof isolated from the blood of the individual;
- step (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a);
- The present invention provides a composition for diagnosing cancer comprising single or multiple exon-splicing as an active ingredient, wherein the exon-splicing is detected in (i) RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes of the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- The present invention provides a composition for diagnosing cancer consisting of single or multiple exon-splicing, wherein the exon-splicing is detected in (i) RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- The present invention provides a composition for diagnosing cancer consisting essentially of single or multiple exon-splicing, wherein the exon-splicing is detected in (i) RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes of the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- the present invention provides a cancer diagnostic kit containing the composition.
- the present invention provides a cancer diagnostic kit consisting of the above composition.
- the present invention provides a cancer diagnostic kit consisting essentially of the composition.
- The present invention provides a composition for diagnosing cancer comprising an agent capable of detecting single or multiple exon-splicing, wherein the exon-splicing is detected in (i) total RNA or a part thereof isolated from anucleate cells of the subject's blood, (ii) total RNA or a part thereof isolated from exosomes in the blood of the subject, and (iii) total cfRNA or a part thereof isolated from the blood of the subject.
- The present invention provides a composition for diagnosing cancer consisting of an agent capable of detecting single or multiple exon-splicing, wherein the exon-splicing is detected in (i) total RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- The present invention provides a composition for diagnosing cancer consisting essentially of an agent capable of detecting single or multiple exon-splicing, wherein the exon-splicing is detected in (i) total RNA or a part thereof isolated from anucleated cells of the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual.
- The present invention provides the use of an agent capable of detecting single or multiple exon-junctions selected from the group consisting of the exon-junctions in Table 1 for producing a composition for cancer diagnosis.
- The present invention provides a method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a part thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a part thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a part thereof isolated from the blood of the individual;
- step (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a);
- Platelets, known to play a major role in hemostasis and coagulation, help cancer growth, metastasis, and immune evasion, and cancer cells have been reported to change platelet expression by directly or indirectly affecting the RNA expression process of platelets. Since one cancer cell can change tens of thousands of platelets, platelet transcript information can be used in the present invention as a biomarker to help determine cancer. In particular, the RNA of platelets changed by cancer cells can show cancer-specific changes in alternative splicing patterns, so the present invention uses these patterns as a biomarker for diagnosing cancer. In addition, platelets are a representative anucleated cell in the blood and are known to be a major source of exosomes and cfRNA (Mol Oncol. 2021 Jun; 15(6): 1727-1743).
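One way to picture a change in alternative splicing pattern is to compare relative junction usage, meaning the fraction of reads from a given donor exon that join each possible acceptor exon. The sketch below computes that fraction for two hypothetical samples and reports the shift between them. The key layout (gene, donor exon, acceptor exon), the read counts, and the choice of this particular metric are assumptions for illustration and are not taken from the source.

```python
from collections import defaultdict


def junction_usage(counts):
    """Fraction of reads per junction among junctions sharing a donor exon.

    counts maps (gene, donor_exon, acceptor_exon) to a read count.
    """
    totals = defaultdict(int)
    for (gene, donor, _acceptor), c in counts.items():
        totals[(gene, donor)] += c
    usage = {}
    for key, c in counts.items():
        gene, donor, _acceptor = key
        total = totals[(gene, donor)]
        usage[key] = c / total if total else 0.0
    return usage


# Hypothetical platelet RNA junction counts for two samples.
normal_counts = {("GENE1", "e1", "e2"): 90, ("GENE1", "e1", "e3"): 10}
cancer_counts = {("GENE1", "e1", "e2"): 40, ("GENE1", "e1", "e3"): 60}

usage_normal = junction_usage(normal_counts)
usage_cancer = junction_usage(cancer_counts)

# Positive values mean the junction is used more in the second sample.
shift = {k: usage_cancer[k] - usage_normal[k] for k in usage_normal}
print(shift)
```

Here the exon-skipping junction e1-e3 rises from 10% to 60% of reads leaving e1, which is the kind of pattern shift a junction-level biomarker would pick up.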
- (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a part thereof isolated from anucleate cells in the blood of an individual, (ii) total RNA or a part thereof isolated from exosomes in the blood of the individual, and (iii) total cfRNA or a part thereof isolated from the blood of the individual;
- step (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a);
- Step (a) is a step of isolating one or more RNAs selected from the group consisting of (i) total RNA or a part thereof isolated from anucleated cells of the subject's blood, (ii) total RNA or a part thereof isolated from exosomes in the subject's blood, and (iii) total cfRNA or a part thereof isolated from the subject's blood.
- The sample may, for example, be isolated from an individual known or suspected to have a disease.
- the sample may be in the form originally isolated from the individual or may be further processed to remove or add components, such as cells, or to enrich one component compared to another component.
- a sample can be isolated or obtained from an individual and transported to a sample analysis device. Samples can be stored and shipped at desired temperatures, such as room temperature, 4°C, -20°C, and/or -80°C.
- A blood sample is collected from an individual for a liquid biopsy, and quality control (QC) indicators of the collected blood can be checked to determine whether to use it, which can increase accuracy.
- QC quality control
- one or more selected from the group consisting of anucleated cells such as platelets, exosomes, and cfRNA are separated from the collected blood sample.
- the separation method may be a method known in the art, and preferably, they may be separated through centrifugation or the like.
- In the case of cfRNA, it can be used for cDNA synthesis directly from blood, plasma, serum, or their fractions.
- the subject may be a human, mammal, animal, pet animal, service animal, or pet.
- The individual may have a disease.
- The individual may be free of disease or of detectable disease symptoms.
- the individual may have been treated with one or more therapies, e.g., any one or more of a surgery, procedure, medication, chemotherapy, antibody, vaccine, or biologic agent.
- the individual may or may not be in remission.
- The 'anucleated cell' refers to a cell that does not have a nucleus and cannot produce daughter cells through cell division.
- the anucleated cells include platelets, red blood cells, and any cells that do not possess a nucleus due to incomplete cell division, and may preferably be platelets or red blood cells, and most preferably platelets.
- The 'exosome' refers to an extracellular vesicle with a nanoscale size (e.g., 50-90 nm). Its interior is separated from the exterior by a lipid bilayer made of the cell membrane components of the cell from which it is derived, and it contains cell membrane lipids, membrane proteins, nucleic acids, and cell components.
- the origin of exosomes is not particularly limited, but may preferably be separated from blood. Exosomes mediate the transport of mRNA, miRNA, DNA, and proteins between cells and play an important role in signaling and interactions inside and outside cells.
- Exosomes can be isolated using methods known in the art without limitation, for example, ultracentrifugation, size exclusion, immunoaffinity isolation, microfluidic chips, and polymer-based methods. Additionally, exosomes can be isolated using a commercially available exosome isolation kit (e.g., Exo2DTM EV isolation kit).
- RNA isolation methods include the guanidine thiocyanate/cesium chloride ultracentrifugation method, the guanidine thiocyanate/hot phenol method, the guanidine hydrochloride method, and the acidic guanidine thiocyanate/phenol/chloroform method (Chomczynski, P. and Sacchi, N., Anal. Biochem. (1987), 162, 156-159).
- Commercially available kits and reagents can also be used, such as the RNAqueous kit (Ambion Inc., Austin, TX), Micro-to-midi total RNA purification system (Invitrogen), NucleoSpin RNA II (BD Biosciences Clontech, Palo Alto, CA), RNeasy mini kit (Qiagen), GenElute mammalian total RNA kit (Sigma-Aldrich), and Trizol LS reagent (Invitrogen).
- the isolated RNA fraction can be further purified and used as mRNA only, if necessary.
- the purification method is not particularly limited as long as it is a known RNA purification method. For example, mRNA is adsorbed to a biotinylated oligo(dT) probe and captured on streptavidin-immobilized paramagnetic particles through biotin/streptavidin binding; after a washing operation, the mRNA is eluted and thereby purified.
- alternatively, a method of adsorbing mRNA to an oligo(dT) cellulose column and then eluting and purifying it can be adopted.
- the purification process of the mRNA is not required and can be performed optionally.
- Step (b) is a step of synthesizing complementary DNA (cDNA) to the RNA isolated in step (a).
- the method of synthesizing cDNA from RNA can be performed without limitation according to methods known in the art. For example, reverse transcriptase and deoxyribonucleotides are added to the RNA to synthesize the first DNA strand using the mRNA chain as a template. The mRNA is then removed from the DNA-RNA hybrid duplex by treatment with an RNA-degrading enzyme (RNase H). Finally, the cDNA is completed by treatment with DNA polymerase, which uses the DNA strand created by reverse transcription as a template to form the second DNA strand.
- Step (c) is a step of obtaining base sequence information of the cDNA.
- analyzing base sequence information can be performed by a base sequence information analysis method known in the art.
- Sequence information analysis deciphers one strand of complementary cDNA or its individual sequence.
- a sequencing method suitable for decoding a large number of fragments, preferably at least 10,000, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 100,000, or at least 1,000,000 fragments, is preferred.
- base sequence information analysis methods known in the art may be used, but any method that can decode a large amount of sequences in order to decode the sequence of each fragment in sufficient quantity may be used without limitation.
- Analysis of the base sequence of the present invention is not limited thereto, but may be performed by Next-Generation Sequencing (NGS).
- next-generation sequencing has the advantage of being able to decode a large amount of sequences within a few hours and at a low cost.
- accuracy is very high and the decoded data can be analyzed qualitatively and quantitatively.
- the analyzed base sequence information may also be called reads.
- an appropriate adapter may be attached to analyze the base sequence of the exon junction site.
- Step (d) is a step of comparing the cDNA sequence information with a predetermined exon-junction library to obtain base sequence expression information at each exon-junction.
- step (d) expression information of the sequence generated by exon-splicing is obtained from the base sequence information obtained in step (c).
- when the nucleotide sequence information analysis in step (c) is performed by NGS, the frequency of nucleotide sequences aligned to the predetermined exon-junction library, that is, the read count, is counted. In other words, among all sequences obtained by decoding one sample, the number of reads is counted for each exon-junction in the predetermined exon-junction library.
- the expression information of the sequence created by exon-splicing is the number of base sequences (reads) that are mapped across the ends of two different exons existing in one gene, that is, the end of the upper exon and the start of the lower exon, and that contain at least one base pair of the contiguous exon region on each side.
- reads containing intron portions are not counted toward the base sequence expression information at the exon-junction, that is, the number of reads (see Figure 4).
- Each counted value can be normalized for comparison with values from other samples. This normalization divides each aggregated value by a value proportional to the decoded amount for direct quantitative comparison between samples when the decoded amount is different for each sample.
- the value proportional to the decoded amount can be any of various values, such as the total number of decoded sequences of each sample or the number of sequences mapped to housekeeping gene regions.
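- the normalization described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `normalize_counts`, the per-million scaling constant, and the toy counts are all assumptions introduced here.

```python
def normalize_counts(junction_counts, scale_factor, per=1_000_000):
    """Divide each exon-junction read count by a per-sample scale factor.

    scale_factor is a value proportional to the decoded amount, e.g. the
    total number of reads of the sample or the number of reads mapped to
    housekeeping gene regions. The per-million scaling is an assumption.
    """
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return {junction: count * per / scale_factor
            for junction, count in junction_counts.items()}


# Two hypothetical samples: sample_b was sequenced twice as deeply.
sample_a = {"junction_1": 30, "junction_2": 5}
sample_b = {"junction_1": 60, "junction_2": 10}

norm_a = normalize_counts(sample_a, scale_factor=2_000_000)
norm_b = normalize_counts(sample_b, scale_factor=4_000_000)
```

After normalization the two toy samples have the same values, so they can be compared directly even though their sequencing depths differ.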
- the predetermined exon-splicing library represents information on the exon-splicing site indicated by the genes listed in Table 1 and positional information on the corresponding chromosome.
- in Table 1, each gene and the corresponding chromosome are indicated, and the end of the exon at the upper position where exon splicing occurs (position 1) and the beginning of the exon at the lower position (position 2) are indicated by position numbers on the corresponding chromosome.
- the predetermined exon-junction library may be a junction of position 1 and position 2 in each chromosome listed in Table 1 below (see Figure 3).
- the expression information of the base sequence at the exon-junction can be characterized as sequence information (reads) aligned to a sequence that includes each base at position 1 and position 2 in Table 1 below and two or more consecutive bases in the 5' direction and/or 3' direction.
- the expression information of the base sequence at the exon-junction may be characterized as sequence information (reads) aligned to a sequence that includes each base at position 1 and position 2 in Table 1 below and 2 to 300 consecutive bases in the 5' direction and/or 3' direction.
- the exon-junction library includes a single or plural exon-junctions listed in Table 1, namely exon-junction number 1, ..., exon-junction number n-1, and exon-junction number n, where n is a natural number from 1 to 441.
- Step (e) is a step of determining whether an individual has cancer based on the base sequence expression information at each exon-junction.
- whether the subject has cancer can be determined by comparing the base sequence expression information at the subject's exon-junctions obtained through steps (a) to (d) with a previously secured database of base sequence expression levels at each exon-junction. For example, if the base sequence expression level at a specific exon-junction that is up-regulated in cancer patients according to the previously secured database is increased in the subject's base sequence expression information compared to the normal control group, it can be determined that the subject has cancer. Such a determination can be made using base sequence expression information at a single exon-junction or at multiple exon-junctions.
- whether the subject has cancer can also be determined by applying the base sequence expression information at each exon-junction obtained through steps (a) to (d) above to a pre-trained cancer determination model.
- the subject's cancer determination score is extracted from the cancer determination model, and a heat map visualization of the base sequence expression information at the subject's exon-junctions and importance information for individual exon-junctions can be provided.
- the determination may be a determination of whether the individual has one type of cancer or two or more types of cancer.
- whether the individual has two or more types of cancer can be determined simultaneously or sequentially using information obtained from one sample isolated from the individual.
- the discrimination model can be trained using public data (e.g., GSE68086) and used after it has been validated.
- the entire set is divided into a training set and a validation set at a ratio of 6:4; the cancer determination model is trained on the training set using the acquired exon-junction library features, and it can be used after its performance has been checked on the validation set.
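- a 6:4 division of this kind can be sketched as below. The source does not say how samples were assigned to each set, so the stratification by class label, the fixed random seed, and the function name `stratified_split` are assumptions made only for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split sample IDs into training and validation lists.

    labels maps sample ID -> class label (e.g. "cancer" or "normal").
    Each class is split separately so both sets keep similar proportions.
    """
    by_class = defaultdict(list)
    for sample_id, label in labels.items():
        by_class[label].append(sample_id)

    rng = random.Random(seed)
    train, validation = [], []
    for label in sorted(by_class):
        ids = sorted(by_class[label])
        rng.shuffle(ids)
        cut = round(len(ids) * train_fraction)
        train.extend(ids[:cut])
        validation.extend(ids[cut:])
    return train, validation


# Hypothetical toy cohort: 10 cancer samples and 5 normal samples.
labels = {f"cancer_{i}": "cancer" for i in range(10)}
labels.update({f"normal_{i}": "normal" for i in range(5)})

train_ids, validation_ids = stratified_split(labels)
```

The model would then be fitted only on `train_ids`, with `validation_ids` held back for the performance check.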
- the discrimination model is based on the SVM (support vector machine) algorithm; exon-junction biomarker features obtained from the individual's platelet-derived transcriptome data are input into the discrimination model to determine whether the subject's sample is cancer or normal. Additionally, the discrimination model can output a discrimination score for cancer or normal as its output value.
- machine learning methods may include (1) supervised learning, (2) unsupervised learning, (3) reinforcement learning, (4) semi-supervised learning, and (5) neural networks, and more specifically may include, but are not limited to, Naive Bayes classification, logistic regression, decision trees, random forests, boosting (XGBoost, ensemble boosting, AdaBoost, Gradient Boost, LightGBM, CatBoost, etc.), perceptrons, support vector machines, quadratic classifiers, clustering (K-means clustering, Bayesian network clustering, etc.), and deep neural networks.
- a neural network refers to a learning algorithm that mimics a biological neural network.
- the algorithm may consist of an input layer, at least one hidden layer, and an output layer, and each layer may consist of at least one node. Nodes in each layer receive result values from nodes in the previous layer, perform operations based on mathematical models, output new results, and pass the new results to nodes in the next layer.
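- the layer-by-layer computation described above can be illustrated with a minimal forward pass. The layer sizes, the ReLU and sigmoid activations, and the toy weights are assumptions chosen for illustration; they are not the architecture or parameters of the model in the present application.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: each node takes a weighted sum of the previous layer's
    outputs, adds its bias, and applies the activation function."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(features, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> single output node (a score in 0..1)."""
    hidden = dense(features, hidden_w, hidden_b, relu)
    return dense(hidden, out_w, out_b, sigmoid)[0]


# Toy example: 2 input features, 2 hidden nodes, 1 output node.
score = forward(
    features=[1.0, 2.0],
    hidden_w=[[0.5, -0.25], [-1.0, 0.5]],
    hidden_b=[0.0, 0.0],
    out_w=[[1.0, 1.0]],
    out_b=[0.0],
)
```

In a trained model the weights and biases would be learned from the training data rather than written by hand.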
- the neural network in the present invention includes not only a convolutional neural network and a deep neural network, but also all types of neural networks that can generate a model using the biomarker of the present invention as a feature.
- the result of determining whether the patient has cancer can be provided by additionally integrating the discrimination score for the individual's cancer, visualization of the individual's exon-splicing data, and information on the importance of the individual's exon-splicing. For example, by inputting the exon-splicing biomarker characteristics of an individual into a pre-learned cancer determination model, the predicted probability of cancer or normal is obtained, and the cancer determination result based on this is notified. In addition, it visually shows the expression patterns of relevant biomarkers and can provide various prognoses for an individual by analyzing the importance of the individual's exon-splicing.
- the type of cancer is not particularly limited, but may be one or more selected from the group consisting of bladder cancer, bone cancer, blood cancer, breast cancer, melanoma, thyroid cancer, parathyroid cancer, bone marrow cancer, rectal cancer, throat cancer, laryngeal cancer, lung cancer, esophageal cancer, pancreatic cancer, colon cancer, stomach cancer, tongue cancer, skin cancer, brain tumor, uterine cancer, head or neck cancer, gallbladder cancer, oral cancer, anal cancer, central nervous system tumor, and liver cancer.
- the present invention also provides a composition for diagnosing cancer comprising, as an active ingredient, a single or plural exon-junctions selected from the group consisting of the exon-junctions in Table 1, or an agent capable of detecting a single or plural exon-junctions selected from the group consisting of the exon-junctions in Table 1, wherein the exon-junction is detected in (i) total RNA or part thereof isolated from anucleated cells in the blood of an individual, (ii) total RNA or part thereof isolated from exosomes in the blood of an individual, and (iii) total cfRNA or part thereof isolated from the blood of an individual; and a cancer diagnostic kit including the composition for diagnosing cancer.
- the agent capable of detecting the single or multiple exon-junctions is a primer pair capable of amplifying the single or multiple exon-junction sites, and preferably may be a primer pair that can specifically amplify a sequence that includes each base at position 1 and position 2 of each exon-junction in Table 1 and two or more consecutive bases in the 5' direction and/or 3' direction.
- a primer is a short nucleic acid sequence with a free 3' terminal hydroxyl group that can form base pairs with a complementary template and serves as the starting point for copying the template strand.
- Primers can initiate DNA synthesis in the presence of a polymerization reagent (DNA polymerase or reverse transcriptase) and four different dNTPs (deoxynucleoside triphosphates) in an appropriate buffer solution and at an appropriate temperature.
- Primers may incorporate additional features that do not change the basic nature of the primer, which serves as the starting point for DNA synthesis.
- the primers containing the base sequences of SEQ ID NOs. 1 to 7 each include base sequences having 95% or more sequence homology.
- the primer can be chemically synthesized using a phosphoramidite solid support method or other well-known methods.
- These nucleic acid sequences can also be modified using many means known in the art. Non-limiting examples of such modifications include methylation, capping, substitution of a native nucleotide with one or more analogues, and modifications between nucleotides, such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.) or charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.).
- Nucleic acids may contain one or more additional covalently linked residues, such as proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalating agents (e.g., acridine, psoralen, etc.), chelating agents (e.g., metals, radioactive metals, iron, oxidizing metals, etc.), and alkylating agents.
- the primer nucleic acid sequence may, if necessary, include a label detectable directly or indirectly by spectroscopic, photochemical, biochemical, immunochemical or chemical means.
- labels include enzymes (e.g., horseradish peroxidase, alkaline phosphatase), radioisotopes (e.g., 32P), fluorescent molecules, and chemical groups (e.g., biotin).
- the diagnostic kit can be used to detect single or multiple exon-junction sites, which are the biomarkers according to the present invention.
- the kit of the present invention may include primers, probes, and antisense nucleic acids for detecting the single or multiple exon-junction sites, as well as one or more other component compositions, solutions, or devices suitable for the analysis method.
- the kit of the present invention may be a kit containing a primer set specific for mRNA derived from the sample to be analyzed and/or cDNA complementary thereto, an appropriate amount of DNA polymerase, a dNTP mixture, a PCR buffer solution, and water.
- the PCR buffer solution may contain KCl, Tris-HCl, and MgCl2.
- components necessary for performing electrophoresis that can confirm the amplification of the PCR product may be additionally included in the kit of the present invention.
- the kit of the present invention may be a kit containing the essential elements required to perform DNA chip analysis.
- a DNA chip kit may include a substrate to which a cDNA corresponding to a gene or a fragment thereof is attached as a probe, reagents, agents, enzymes, etc. for producing a fluorescent label probe. Additionally, the substrate may further include cDNA corresponding to a quantitative control gene or a fragment thereof.
- the kit may include stabilizers and/or non-reactive dyes for experimental convenience, stabilization, and improvement of reactivity.
- the non-reactive dye material must be selected from substances that do not affect the polymerase chain reaction, and is used for analysis or identification of the polymerase chain reaction product. Substances that satisfy these conditions include water-soluble dyes such as rhodamine, TAMRA, ROX, bromophenol blue, xylene cyanol, bromocresol red, and cresol red.
- the non-reactive dye material may be included in an amount of 0.0001 to 0.01% by weight based on the total weight of the composition, and is preferably included in an amount of 0.001 to 0.005% by weight. If added in an amount exceeding 0.01% by weight based on the total weight of the composition, there is a problem in that the high concentration of water-soluble dye may act as a reaction inhibitor during the polymerase chain reaction.
- polyhydric alcohols can be used as a stabilizing material to further stabilize the kit components of the present invention, and one or more of glucose, glycerol, mannitol, galactitol, glucitol, and sorbitol can be used.
- the kit components may be provided in liquid form, and are preferably in a dried state to increase stability, ease of storage, and long-term storage.
- the drying can be performed by known drying methods such as general room temperature drying, heated drying, freeze drying, and reduced pressure drying, and any drying method can be used as long as the components of the composition are not lost.
- various DNA polymerases can also be used in the amplification step of the present invention, which may include the "Klenow" fragment of E. coli DNA polymerase I, thermostable DNA polymerase, and bacteriophage T7 DNA polymerase.
- the polymerase includes thermostable DNA polymerases obtainable from various bacterial species, such as Thermus aquaticus (Taq), Thermus thermophilus (Tth), Thermus filiformis, Thermus flavus, Thermococcus litoralis, and Pyrococcus furiosus (Pfu).
- Most of the above polymerases can be isolated from the bacteria themselves or can be purchased commercially.
- the polymerase used in the kit of the present invention can be obtained from cells expressing high levels of the cloned gene encoding the polymerase.
- the diagnostic method can be used to diagnose the presence of a condition, particularly a disease, in a particular subject, to characterize the condition (e.g., determine the stage of cancer or the heterogeneity of the cancer), to determine the efficacy of a treatment for the condition, to monitor response to treatment of the condition, or to predict or diagnose the risk of developing the condition or its subsequent course.
- the present disclosure may also be useful in determining the efficacy of a particular treatment regimen.
- a particular treatment regimen may be correlated with a change in the profile of the cancer over time. This correlation may be useful in selecting therapy.
- the present diagnostic method can be used to monitor residual disease or disease recurrence.
- the base sequence information in exon-junction according to the present invention can also be used to characterize specific types of cancer. Cancers are often heterogeneous in both composition and stage. Genetic profile data may allow characterization of specific subtypes of cancer, which may be important in diagnosing or treating such subtypes. Such information may also provide clues to the subject or practitioner regarding the prognosis of a specific type of cancer and may allow the subject or practitioner to adopt treatment options depending on the progression of the disease. Some cancers can progress to become more aggressive and genetically unstable. Other cancers may remain benign, inactive, or dormant. The methods of this disclosure may be useful in determining disease progression.
- the marker may be a sequence of a certain length that is significantly higher or lower in the cancer sample group, identified by comparing the counted and normalized values for each exon-junction type between the normal sample group and the cancer sample group. Most simply, the difference between the mean values of the normal sample group and the cancer sample group at each exon-junction site is used; alternatively, statistical techniques such as the t-test, the Mann-Whitney test, the Wilcoxon test, or Cohen's d are used to select sequences that differ significantly between the two sample groups.
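- two of the simpler comparison measures named above, the difference of group means and Cohen's d, can be sketched as follows. The pooled-standard-deviation form of Cohen's d and the toy values are assumptions for illustration; no selection threshold from the source is implied.

```python
import math
from statistics import mean, variance


def mean_difference(cancer_values, normal_values):
    """Difference between the group means of normalized junction counts."""
    return mean(cancer_values) - mean(normal_values)


def cohens_d(cancer_values, normal_values):
    """Standardized mean difference using the pooled sample variance."""
    n_c, n_n = len(cancer_values), len(normal_values)
    pooled_var = ((n_c - 1) * variance(cancer_values)
                  + (n_n - 1) * variance(normal_values)) / (n_c + n_n - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return mean_difference(cancer_values, normal_values) / math.sqrt(pooled_var)


# Hypothetical normalized counts at one exon-junction.
cancer = [8.0, 9.0, 10.0]
normal = [2.0, 3.0, 4.0]

diff = mean_difference(cancer, normal)
effect = cohens_d(cancer, normal)
```

A positive value indicates higher expression in the cancer group, and a negative value indicates lower expression.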
- the present invention can be used as a diagnostic marker by using each marker individually, all markers together, or several markers combined in the form of a panel, and reliability and efficiency can be improved by checking some markers against the overall pattern.
- the markers identified in the present invention can be used individually or as a combined marker set. Markers can be ranked and weighted according to the number of markers and their importance, and the level of likelihood of developing a disease can be selected. These algorithms belong to the present invention.
- the present invention provides a composition for diagnosing cancer comprising, as an active ingredient, a single or plural exon-junctions selected from the group consisting of the exon-junctions in Table 1, wherein the exon-junction is detected in (i) total RNA or part thereof isolated from anucleated cells in the blood of an individual, (ii) total RNA or part thereof isolated from exosomes in the blood of an individual, and (iii) total cfRNA or part thereof isolated from the blood of an individual.
- the present invention provides a cancer diagnostic kit containing the composition.
- the present invention provides a composition for cancer diagnosis comprising an agent capable of detecting a single or plural exon-junctions selected from the group consisting of the exon-junctions in Table 1, wherein the exon-junction is detected in (i) total RNA or part thereof isolated from anucleated cells in the blood of the subject, (ii) total RNA or part thereof isolated from exosomes in the blood of the subject, and (iii) total cfRNA or part thereof isolated from the blood of the subject.
- the present invention provides the use of an agent capable of detecting single or multiple exon-junctions selected from the group consisting of exon-junctions in Table 1 for preparing a composition for cancer diagnosis.
- the present invention provides a cancer diagnosis method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or part thereof isolated from anucleated cells in the blood of an individual, (ii) total RNA or part thereof isolated from exosomes in the blood of an individual, and (iii) total cfRNA (cell-free RNA) or a portion thereof isolated from the blood of an individual; (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a); (c) obtaining base sequence information of the cDNA; (d) comparing the cDNA base sequence information with a predetermined exon-junction library to obtain base sequence expression information at each exon-junction; and (e) determining whether the individual has cancer based on the base sequence expression information at each exon-junction.
- the term “comprising” is used in the same sense as “including” or “characterized by,” and does not exclude additional components or method steps that are not specifically mentioned in the composition or method according to the present invention. Additionally, the term “consisting of” means excluding additional elements, steps, or ingredients that are not separately stated.
- the term “essentially consisting of” means that, in the scope of a composition or method, it may include substances or steps that do not substantially affect its basic characteristics in addition to the described substances or steps.
- the method of the present invention can provide information necessary for the diagnosis of cancer, monitoring of treatment regimens, and prognosis of cancer patients, and can be usefully used in anticancer treatment.
- Figure 1 is a flow chart of the process of screening 441 exon-splicing libraries.
- Figure 2 shows an example of the characteristics of 441 selected exon-junction libraries.
- Figure 3 shows the definition of exon-junction.
- Figure 4 shows the process of counting the number of reads in exon-junction.
- Figure 5a shows an example of the expression pattern for 441 exon-junction libraries of learning dataset samples used when learning a cancer determination model.
- Figure 5b shows an example of the expression pattern for 441 exon-junction libraries of validation dataset samples used when learning a cancer determination model.
- Figure 6a shows the AUC score of a Support Vector Machine (SVM) model trained with 1,072 genes as features in a previous study, presented to explain the performance of the cancer/normal discrimination model based on the 441 exon-junction library according to an embodiment of the present application.
- Figure 6b shows the AUC score of a DNN model trained using the 441 exon-junction libraries according to the present application as features, presented to explain the performance of the cancer/normal discrimination model based on the 441 exon-junction library according to an embodiment of the present application.
- Figure 7 shows an example diagram for comparing the performance of a model using a 441 exon-splicing library and a model using 1,072 genes from a previous study.
- Figure 8 shows an example of quantitative information for the exon-junction with the largest difference between cancer and normal samples among exon-junctions whose expression is lowered in cancer samples compared to normal samples.
- Figure 9a shows an example diagram of the normalized expression values, in the training dataset samples, of the exon-junction with the largest difference between cancer and normal samples among exon-junctions whose expression is lowered in cancer samples compared to normal samples.
- Figure 9b shows an example diagram of the normalized expression values, in the validation dataset samples, of the exon-junction with the largest difference between cancer and normal samples among the exon-junctions, selected from the learning dataset, whose expression is higher in cancer samples compared to normal samples.
- Figure 10 is an example diagram confirming the performance of a cancer determination model using all or part of the 441 exon-junction library according to the present application.
- biomarker refers to an indicator that can detect changes in the body using proteins, DNA, RNA, metabolites, etc., and more specifically is a term that includes all or part of the gene sequences represented by 'SEQ ID NO: 1 to 882', or the 'exon-junction library' expressed as 'exon-junction library 1 to 441'.
- exon-junction library refers to a combination of some of the gene sequences claimed herein. It is preferable that two of the gene sequences claimed in the present application are used together, and the corresponding combinations are summarized in Table 2.
- 'exon-junction library 1' means 'SEQ ID NO: 1 and SEQ ID NO: 2'.
- 'exon-junction library 441' means 'SEQ ID NO: 881 and SEQ ID NO: 882'.
- RNAlater (ThermoFisher)
- Sequencing data was produced in paired-end FASTQ format using Illumina equipment according to the manufacturer's instructions. Adapter sequences and low-quality bases were removed from the produced data, and the sequenced reads were mapped to the reference genome to create a SAM file.
- the generated SAM file contains the chromosome number and location in the reference genome for each read. Since the SAM file is very large, it was converted to a BAM file before use. Additionally, in order to use only reads that were accurately mapped to the reference genome, reads that were not primary alignments were removed from the BAM file.
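- the primary-alignment filter can be sketched on SAM text records using the FLAG field. The bit values 0x4 (unmapped), 0x100 (secondary), and 0x800 (supplementary) come from the SAM format; treating unmapped and supplementary records as removed along with secondary ones is an assumption, since the source only says that non-primary alignments were removed. The helper names and the example records are hypothetical.

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def is_primary(sam_record):
    """True if the record is a mapped, primary alignment (FLAG is column 2)."""
    flag = int(sam_record.split("\t")[1])
    return not (flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY))


def keep_primary(sam_lines):
    """Keep header lines (starting with '@') and primary alignment records."""
    return [line for line in sam_lines
            if line.startswith("@") or is_primary(line)]


# Hypothetical SAM lines: a header, one primary record, and three to drop.
sam_lines = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t50M\t=\t200\t150\t*\t*",   # primary
    "r1\t355\tchr2\t500\t0\t50M\t=\t600\t150\t*\t*",   # 355 = 99 + 0x100
    "r2\t2048\tchr1\t300\t60\t20M30H\t*\t0\t0\t*\t*",  # supplementary
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",                # unmapped
]

filtered = keep_primary(sam_lines)
```

On a BAM file the same filter is commonly applied with the `-F` flag-exclusion option of `samtools view`.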
- for the number of reads at each exon-junction, among the selected reads, reads that span the ends of two different exons in one gene, that is, the end of the exon at the upper position and the beginning of the exon at the lower position, and that contain at least one base pair of the contiguous exon region on each side were counted; reads containing intron portions were not counted.
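- one way to implement this counting is to read the splice gaps (CIGAR operation `N`) of each aligned read and compare them with the library coordinates. The sketch below assumes 1-based coordinates, that position 1 is the last base of the upper exon and position 2 the first base of the lower exon, and a library keyed by (chromosome, position 1, position 2); the function names and example reads are hypothetical.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def spliced_junctions(pos, cigar):
    """Yield (last base of upper exon, first base of lower exon) per N gap.

    pos is the 1-based leftmost aligned reference position of the read.
    """
    ref = pos  # reference coordinate of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            yield (ref - 1, ref + length)
            ref += length
        elif op in "MD=X":
            ref += length
        # I, S, H and P do not consume reference positions


def count_junction_reads(alignments, library):
    """Count reads whose splice gap matches a junction in the library.

    A read that runs contiguously into the intron has no matching N gap,
    so it is not counted.
    """
    counts = Counter({key: 0 for key in library})
    for chrom, pos, cigar in alignments:
        for p1, p2 in spliced_junctions(pos, cigar):
            if (chrom, p1, p2) in library:
                counts[(chrom, p1, p2)] += 1
    return counts


library = {("chr1", 1000, 2001)}
alignments = [
    ("chr1", 951, "50M1000N50M"),  # spans the junction
    ("chr1", 991, "10M1000N90M"),  # spans the junction
    ("chr1", 981, "50M"),          # runs into the intron
    ("chr1", 951, "50M500N50M"),   # different junction
]
counts = count_junction_reads(alignments, library)
```

Because an `N` gap is flanked by aligned bases, every counted read contains at least one exon base on each side of the junction.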
- public platelet transcriptome data (GSE68086) was used, and the entire set (283 samples) was divided at a ratio of about 6:4 into a training dataset (175 samples) and a validation dataset (108 samples). Only the training dataset was used for biomarker screening and for training the cancer determination model, and the performance of the trained cancer determination model was confirmed with the validation dataset.
- an exon-junction library corresponding to 441 exon-junctions was derived; the library consists of 882 exon-junction points, namely the 3' junction points (position 1) of the 441 upper-position exons and the 5' junction points (position 2) of the 441 lower-position exons.
- Figure 2 shows an example of the characteristics of the 441 exon-junction libraries derived as described above; with log2FoldChange and the Mann-Whitney test FDR as the x and y axes, respectively, it indicates how significantly each of the 441 derived exon-junction libraries differs between cancer and normal.
- the threshold values used for biomarker selection, False Discovery Rate (FDR) 0.05 and log2FoldChange 1.4, -1.4, are indicated by dotted lines.
- each dot represents all exon-junctions used in the exon-junction library discovery analysis, of which exon-junction libraries with increased expression (upregulated) in cancer samples compared to normal samples with FDR less than 0.05 and log2FoldChange more than 1.4 are in red. , FDR 0.05 or less, log2FoldChange -1.4 or less, exon-splicing libraries whose expression is lowered (downregulated) are shown in blue. As the absolute value of Log2FoldChange increases, it means that the change in expression value in cancer samples and normal samples is greater.
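The selection rule (FDR threshold 0.05, log2FoldChange thresholds 1.4 and -1.4) can be sketched as follows. The source does not state how the FDR was computed from the Mann-Whitney p-values, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names and the inclusive comparisons.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_junctions(log2fc, pvalues, fdr_cut=0.05, lfc_cut=1.4):
    """Label each junction 'up', 'down', or 'ns' (not selected)."""
    fdr = benjamini_hochberg(pvalues)
    labels = []
    for lfc, q in zip(log2fc, fdr):
        if q <= fdr_cut and lfc >= lfc_cut:
            labels.append("up")      # drawn in red in Figure 2
        elif q <= fdr_cut and lfc <= -lfc_cut:
            labels.append("down")    # drawn in blue in Figure 2
        else:
            labels.append("ns")
    return labels
```

The p-values themselves would come from a per-junction Mann-Whitney test between cancer and normal training samples, which is not shown here.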
- SEQ ID NOs: 1 to 882 are defined as follows.
- Odd sequence numbers represent the 150-base sequence in the 5' direction, including position 1, of each exon-junction site specified in Table 1 above.
- SEQ ID NO: 1 represents the 150-base sequence in the 5' direction, including the base at position 1, of exon-junction number 1 in Table 1.
- SEQ ID NO: 881 represents the 150-base sequence in the 5' direction, including the base at position 1, of exon-junction number 441 in Table 1.
- Even sequence numbers represent the 150-base sequence in the 3' direction, including position 2, of each exon-junction site specified in Table 1 above.
- SEQ ID NO: 2 represents the 150-base sequence in the 3' direction, including the base at position 2, of exon-junction number 1 in Table 1.
- SEQ ID NO: 882 represents the 150-base sequence in the 3' direction, including the base at position 2, of exon-junction number 441 in Table 1.
- That is, the 3' terminal base among the 150 bases of each odd sequence number is the base corresponding to position 1 in Table 1, and the 5' terminal base among the 150 bases of each even sequence number is the base corresponding to position 2 in Table 1.
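The numbering scheme above implies a simple mapping between exon-junction numbers (1 to 441) and SEQ ID NOs (1 to 882). The helper names below are hypothetical and only restate that arithmetic.

```python
N_JUNCTIONS = 441  # size of the exon-junction library in Table 1


def seq_ids_for_junction(junction_no):
    """Return (odd SEQ ID for position 1, even SEQ ID for position 2)."""
    if not 1 <= junction_no <= N_JUNCTIONS:
        raise ValueError("junction number must be between 1 and 441")
    return 2 * junction_no - 1, 2 * junction_no


def junction_for_seq_id(seq_id):
    """Return (junction number, 'position1' or 'position2') for a SEQ ID NO."""
    if not 1 <= seq_id <= 2 * N_JUNCTIONS:
        raise ValueError("SEQ ID NO must be between 1 and 882")
    side = "position1" if seq_id % 2 == 1 else "position2"
    return (seq_id + 1) // 2, side
```

For example, exon-junction number 441 maps to SEQ ID NOs 881 and 882, matching the definitions above.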
- The exon-junction biomarker for cancer diagnosis may be a base sequence containing the 3' terminal base of an odd sequence number (i.e., the base corresponding to position 1 in Table 1 above) and the 5' terminal base of the paired even sequence number (i.e., the base corresponding to position 2 in Table 1 above).
- The exon-junction biomarker for cancer diagnosis includes the 3' terminal base of the odd sequence number (i.e., the base corresponding to position 1 in Table 1 above) and the 5' terminal base of the even sequence number (i.e., the base corresponding to position 2 in Table 1 above), and is contiguous in the 5' direction of the odd sequence number from position 1 and/or in the 3' direction of the even sequence number from position 2.
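A junction-spanning sequence of this kind can be assembled from a pair of odd and even sequences. The sketch below assumes both sequences are stored 5' to 3', so that the position 1 base is the last character of the odd sequence and the position 2 base is the first character of the even sequence; the function name and arguments are hypothetical.

```python
def junction_spanning_sequence(odd_seq, even_seq, n_up, n_down):
    """Join the last n_up bases of the odd-numbered sequence (ending at
    position 1) to the first n_down bases of the even-numbered sequence
    (starting at position 2)."""
    if not 1 <= n_up <= len(odd_seq):
        raise ValueError("n_up must be between 1 and len(odd_seq)")
    if not 1 <= n_down <= len(even_seq):
        raise ValueError("n_down must be between 1 and len(even_seq)")
    return odd_seq[-n_up:] + even_seq[:n_down]
```

With n_up = n_down = 1 the result contains only the two junction bases; larger values extend the sequence contiguously on either side of the junction.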
- Table 2 below shows the base sequences of SEQ ID NOs: 1 to 882.
- the exon-junction numbers in Table 2 below correspond to the exon-junction numbers in Table 1 above.
Exon-junction number | SEQ ID NO | Base sequence |
1 | 1 | GCCGGAGAGCTGGTGCTTGGGGCTCCTGGCGGCTATTATTTCTTAGGTACGTGCCCATCCGTACACCTCCCTCCCTTCTCGCGGCCCAAGGAGACCGCTTTGGGCTTCACACCCGCTGTCCCTCCCGCCCTAGGTCTCCTGGCCCAGGCT |
1 | 2 | CCGCCGACTCAAGGCCCCGCCCCTGTCCCCCAGCCCTCCTCCGGGCTCGCGCGCGCCTCCCTTCACCCCTGCGCTGACCCCTCCTCCTTGTCTCCTGCAGGCTGGGACAAGCGTTACTGTGAAGCGGGCTTCAGCTCCGTGGTCACTCAG |
2 | 3 | TTGATCCTGCTATTGTCATCTCTCCCAGTGGGAAGTACAATGCTGTCAAGCTTGGTAAATATGAAGATTCAAATTCAGTGACATGTTCAGTTCAACACGACAATAAAACTGTGCACTCCACTGACTTTGAAGTGAAGACAGATTCTACAG |
2 | 4 | ATCACGTAAAACCAAAGGAAACTGA
- Platelet-derived transcriptome data, in particular exon-junction count data, are used to determine cancer.
- An exon-junction refers to the joining of the end (3' part) of the upstream exon and the beginning (5' part) of the downstream exon, for two different exons within one gene (Figure 3).
- Exon-junction count data are obtained by counting, among the selected reads, those containing at least one contiguous base pair of exon sequence from the extreme ends of two different exons, that is, from the end of the upstream exon and the beginning of the downstream exon. The two exons need not be immediately adjacent on the reference genome.
- Each exon that is skipped must be skipped in its entirety, not in part. For example, when there are exons 1, 2, and 3, and the exon-junction connecting exons 1 and 3 is counted as a disease-related marker, no part of exon 2 may be included in the region to which the read is mapped. In addition, reads containing untranslated intron portions are not counted as exon-junction reads (Figure 4).
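One way to implement this counting rule is to read splice gaps from the CIGAR string of each spliced alignment. The sketch below is an assumption about the implementation, not the source's code: it treats position 1 as the last aligned base before an `N` (skipped region) operation and position 2 as the first aligned base after it, in 1-based reference coordinates, and requires an exact match to a library junction. Chromosome names and function names are hypothetical.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance the reference position


def junctions_in_read(pos, cigar):
    """Return (position1, position2) for every N operation in the CIGAR.

    pos is the 1-based leftmost reference coordinate of the alignment.
    """
    junctions = []
    ref = pos  # reference coordinate of the next aligned base
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((ref - 1, ref + length))
        if op in REF_CONSUMING:
            ref += length
    return junctions


def count_library_junctions(reads, library):
    """Count reads supporting each library junction.

    reads: iterable of (chrom, pos, cigar)
    library: set of (chrom, position1, position2) taken from Table 1
    """
    counts = Counter()
    for chrom, pos, cigar in reads:
        for p1, p2 in junctions_in_read(pos, cigar):
            if (chrom, p1, p2) in library:
                counts[(chrom, p1, p2)] += 1
    return counts
```

Because a match requires a single gap running exactly from position 1 to position 2, a read that also aligns to part of a skipped exon (two shorter gaps) or that runs into the intron (no gap at that site) is not counted, which is consistent with the rule stated above.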
- The biomarkers of Example 1 above are extracted from the exon-junction count data and applied to a pre-trained cancer determination model.
- The model outputs a discrimination score between cancer and normal, and the subject's exon-junction information can be visualized, its importance analyzed, and the result reported to the subject.
- The expression patterns of the 441 exon-junctions in the samples used to train the cancer determination model are depicted for the training dataset (FIG. 5a) and the validation dataset (FIG. 5b).
- Rows and columns represent the 441 exon-junctions and the samples, respectively. The expression value of each exon-junction in each sample is indicated by color, and samples and exon-junctions with similar patterns were clustered.
- The performance of the cancer determination model using the 441 exon-junction library of the present invention was compared with that of existing markers (an SVM model based on 1,072 genes).
- The model for the existing markers is a Support Vector Machine (SVM) model trained with 1,072 genes as features in a previous study [6] that used the same dataset.
- The AUC score of that model, as reported in the previous study, is shown in Figure 6a.
- The AUC score of the SVM model trained using the 441 exon-junctions as features is shown in Figure 6b.
- Figure 8 is an example of quantitative information of exon-junctions with the largest difference between cancer and normal samples among exon-junctions whose expression is lowered in cancer samples compared to normal samples.
- This shows the reference genome mapping results of three cancer samples (red, top three) and three normal samples (blue, bottom three) using the Integrative Genomics Viewer (IGV) program.
- IGV is a program that can visualize integrated genome datasets. It loads data in various formats, such as sequencing data, and shows the results of mapping to the reference genome.
- the exon-junction with lower expression in cancer samples and the largest difference from normal samples is the region 22,549,683 to 22,550,556 of the gene TRAC, and the region is indicated in the Refseq Genes track at the top of Figure 8.
- the six tracks at the bottom of the Refseq Genes track show the depth of reads actually mapped to that region for the samples loaded in each track. Through this, it was found that the number of reads mapped to the corresponding region differed between cancer and normal samples.
- Figure 9 is an example of the normalized expression value of an exon-junction with the largest difference between cancer and normal samples among exon-junctions whose expression is lowered in cancer samples compared to normal samples.
- FIG. 9A shows the graph for the training dataset.
- FIG. 9B shows the graph for the validation dataset.
- Each graph is a bar graph in which all samples are sorted by the log2CPM value of the exon-junction, with cancer samples shown in red and normal samples shown in blue, to compare expression values between cancer and normal samples.
- Figure 10 shows an example of the performance of a cancer determination model using all or part of the 441 exon-junction library according to the present application.
- The Shapley value indicates the extent to which a feature influenced the result.
- The exon-junction with the smallest Shapley value, that is, the least influence on the model, was removed, and a cancer determination model was then trained using only the remaining 440 exon-junctions as features.
- In this way, cancer determination models were trained while removing, one at a time, the exon-junction with the least impact on the model, and the performance was plotted in a graph.
- The exon-junction numbers were assigned according to the degree of influence on the cancer determination model (number 1 is the exon-junction with the highest impact).
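The one-at-a-time removal procedure can be sketched as a generic loop. Model training and Shapley-value computation are passed in as callbacks because the source does not specify them beyond "SVM" and "Shapley value"; the names `train_and_score` and `importance` are hypothetical placeholders.

```python
def backward_elimination(features, train_and_score, importance):
    """Repeatedly drop the least influential feature and record performance.

    features: list of feature (exon-junction) identifiers
    train_and_score(features) -> (model, score), e.g. validation AUC
    importance(model, features) -> dict mapping feature to influence,
        e.g. mean absolute Shapley value (assumption)
    Returns (history, removed): history holds (n_features, score) per
    round; removed lists features in the order they were dropped.
    """
    remaining = list(features)
    history, removed = [], []
    while remaining:
        model, score = train_and_score(remaining)
        history.append((len(remaining), score))
        if len(remaining) == 1:
            break
        influence = importance(model, remaining)
        weakest = min(remaining, key=lambda f: influence[f])
        remaining.remove(weakest)
        removed.append(weakest)
    return history, removed
```

Plotting `history` gives performance as a function of the number of retained exon-junctions, which is the kind of graph described for Figure 10.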
- Best, M. G., et al. “RNA-Seq of tumor-educated platelets enables blood-based pan-cancer, multiclass, and molecular pathway cancer diagnostics.” Cancer Cell 28.5 (2015): 666-676.
Abstract
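The present invention relates to a method for diagnosing cancer using exon-junction information of RNA in blood, and more specifically to a method of analyzing a transcriptome to provide information necessary for cancer diagnosis in an individual, in which RNA isolated from anucleated cells or exosomes of the individual's blood, or cfRNA, is isolated, its transcriptome data are obtained, and the presence of cancer is then determined using sequence expression information at exon-junctions. The method of the present invention can provide information necessary for the diagnosis of cancer and can therefore be usefully applied to anticancer therapy.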
Description
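This application claims priority to Korean Patent Application No. 10-2022-0133331, filed on October 17, 2022, the entire specification of which is incorporated herein by reference.
The present invention relates to a method for diagnosing cancer using exon-junction information of RNA in blood, and more specifically to a method of analyzing a transcriptome to provide information necessary for cancer diagnosis in an individual, comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a portion thereof isolated from the individual's blood; (b) synthesizing complementary DNA (cDNA) for the RNA isolated in step (a); (c) obtaining sequence information of the cDNA; (d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and (e) determining the presence of cancer on the basis of the sequence expression information at each exon-junction.
Liquid biopsy technology, which uses Next-Generation Sequencing (NGS) to analyze nucleic acid information from exosomes, circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and the like in a patient's blood, is being introduced into the diagnosis and treatment of cancer [1].
Liquid biopsy is a non-invasive technique that is more convenient to collect than tissue biopsy and allows rapid analysis from a small amount of blood, making it possible to detect and monitor cancer by liquid biopsy without tissue biopsy [2].
However, because few molecules are present in the blood, liquid biopsy has low cancer detection sensitivity with existing techniques [3, 4, 5]. Therefore, for effective early cancer screening, there is a need to increase sensitivity by using biomarkers that are present in the blood in large numbers of molecules even in cancer.
Accordingly, while researching ways to use biomarkers present in large numbers of molecules in the blood even in cancer for effective early cancer screening, the present inventors confirmed that whether a subject has cancer or is normal can be determined by obtaining transcriptome data from anucleated cells such as platelets, cell-derived membrane structures such as exosomes, or cell-free RNA, analyzing the data, and using exon-junction information as a biomarker, in particular when the analysis uses a machine learning algorithm based on a pre-trained cancer determination model, and thereby completed the present invention.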
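Accordingly, an object of the present invention is to provide a method of analyzing a transcriptome to provide information necessary for cancer diagnosis in an individual, the method comprising:
(a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a portion thereof isolated from the individual's blood;
(b) synthesizing complementary DNA (cDNA) for the RNA isolated in step (a);
(c) obtaining sequence information of the cDNA;
(d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and
(e) determining the presence of cancer on the basis of the sequence expression information at each exon-junction.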
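Another object of the present invention is to provide a composition for diagnosing cancer comprising one or more exon-junctions as an active ingredient, wherein the exon-junctions are detected in (i) RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
Another object of the present invention is to provide a composition for diagnosing cancer consisting of one or more exon-junctions, wherein the exon-junctions are detected in (i) RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
Another object of the present invention is to provide a composition for diagnosing cancer consisting essentially of one or more exon-junctions, wherein the exon-junctions are detected in (i) RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
Another object of the present invention is to provide a cancer diagnostic kit comprising the composition.
Another object of the present invention is to provide a cancer diagnostic kit consisting of the composition.
Another object of the present invention is to provide a cancer diagnostic kit consisting essentially of the composition.
Another object of the present invention is to provide a composition for diagnosing cancer comprising an agent capable of detecting one or more exon-junctions, wherein the exon-junctions are detected in (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
Another object of the present invention is to provide a composition for diagnosing cancer consisting of an agent capable of detecting one or more exon-junctions, wherein the exon-junctions are detected in (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
Another object of the present invention is to provide a composition for diagnosing cancer consisting essentially of an agent capable of detecting one or more exon-junctions, wherein the exon-junctions are detected in (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
Another object of the present invention is to provide the use of an agent capable of detecting one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1, for preparing a composition for diagnosing cancer.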
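Another object of the present invention is to provide a method for diagnosing cancer, the method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a portion thereof isolated from the individual's blood;
(b) synthesizing complementary DNA (cDNA) for the RNA isolated in step (a);
(c) obtaining sequence information of the cDNA;
(d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and
(e) determining the presence of cancer on the basis of the sequence expression information at each exon-junction.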
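To achieve the above object, the present invention provides a method of analyzing a transcriptome to provide information necessary for cancer diagnosis in an individual, the method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a portion thereof isolated from the individual's blood;
(b) synthesizing complementary DNA (cDNA) for the RNA isolated in step (a);
(c) obtaining sequence information of the cDNA;
(d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and
(e) determining the presence of cancer on the basis of the sequence expression information at each exon-junction.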
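To achieve another object of the present invention, the present invention provides a composition for diagnosing cancer comprising one or more exon-junctions as an active ingredient, wherein the exon-junctions are detected in (i) RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
The present invention also provides a composition for diagnosing cancer consisting of one or more exon-junctions, wherein the exon-junctions are detected in (i) RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
The present invention also provides a composition for diagnosing cancer consisting essentially of one or more exon-junctions, wherein the exon-junctions are detected in (i) RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
To achieve another object of the present invention, the present invention provides a cancer diagnostic kit comprising the composition.
The present invention also provides a cancer diagnostic kit consisting of the composition.
The present invention also provides a cancer diagnostic kit consisting essentially of the composition.
To achieve another object of the present invention, the present invention provides a composition for diagnosing cancer comprising an agent capable of detecting one or more exon-junctions, wherein the exon-junctions are detected in (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
The present invention also provides a composition for diagnosing cancer consisting of an agent capable of detecting one or more exon-junctions, wherein the exon-junctions are detected in (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
The present invention also provides a composition for diagnosing cancer consisting essentially of an agent capable of detecting one or more exon-junctions, wherein the exon-junctions are detected in (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.
To achieve another object of the present invention, the present invention provides the use of an agent capable of detecting one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1, for preparing a composition for diagnosing cancer.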
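To achieve another object of the present invention, the present invention provides a method for diagnosing cancer, the method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of an individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA) or a portion thereof isolated from the individual's blood;
(b) synthesizing complementary DNA (cDNA) for the RNA isolated in step (a);
(c) obtaining sequence information of the cDNA;
(d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and
(e) determining the presence of cancer on the basis of the sequence expression information at each exon-junction.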
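Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art. The following references provide one of skill with general definitions of many of the terms used in this specification: Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY (2nd ed. 1994); THE CAMBRIDGE DICTIONARY OF SCIENCE AND TECHNOLOGY (Walker ed., 1988); and Hale & Marham, THE HARPER COLLINS DICTIONARY OF BIOLOGY.
Hereinafter, the present invention will be described in detail.
Platelets, known to play a major role in hemostasis and coagulation, have been reported to assist cancer growth, metastasis, and immune evasion, and cancer cells have been reported to alter platelet expression by directly or indirectly affecting the RNA expression process of platelets. Since a single cancer cell can alter tens of thousands of platelets, platelet transcriptome information can be used in the present invention as a biomarker that aids cancer determination. In particular, because the alternative splicing pattern of RNA in platelets altered by cancer cells can change in a cancer-specific manner, the present invention seeks to use it as a biomarker for diagnosing cancer. In addition, platelets are representative anucleated cells in the blood and are known to be a major source of exosomes and cfRNA (Mol Oncol. 2021 Jun; 15(6): 1727-1743).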
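Accordingly, the present invention provides a method of analyzing a transcriptome to provide information necessary for cancer diagnosis in an individual, the method comprising:
(a) isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood;
(b) synthesizing complementary DNA (cDNA) for the RNA isolated in step (a);
(c) obtaining sequence information of the cDNA;
(d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and
(e) determining the presence of cancer on the basis of the sequence expression information at each exon-junction.
Step (a) is a step of isolating one or more RNAs selected from the group consisting of (i) total RNA or a portion thereof isolated from anucleated cells of the individual's blood, (ii) total RNA or a portion thereof isolated from exosomes of the individual's blood, and (iii) total cfRNA or a portion thereof isolated from the individual's blood.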
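The sample may be, for example, one isolated from an individual with known or suspected disease. The sample may be in the form originally isolated from the individual, or may be further processed to remove or add components such as cells, or to enrich one component relative to another. The sample may be isolated or obtained from the individual and transported to a sample analysis device. The sample may be preserved and shipped at a desired temperature, for example, room temperature, 4°C, -20°C, and/or -80°C.
For example, a blood sample is collected from the individual for liquid biopsy; at this point, quality check (QC) indicators of the collected blood can be examined to decide whether to use it, which can increase the accuracy of the determination. Then, one or more selected from the group consisting of anucleated cells such as platelets, exosomes, and cfRNA are isolated from the collected blood sample. Isolation may be performed by methods known in the art, preferably by centrifugation or the like. In the case of cfRNA, blood, plasma, serum, or fractions thereof may be used directly for cDNA synthesis.
The individual may be a human, mammal, animal, companion animal, service animal, or pet. The individual may have a disease. The individual may be free of disease or of detectable disease symptoms. The individual may have been treated with one or more therapies, for example, any one or more of surgery, treatment, medication, chemotherapy, antibodies, vaccines, or biologics. The individual may or may not be in remission.
In the present invention, 'anucleated cell' means a cell that has no nucleus and cannot produce daughter cells through cell division. Anucleated cells include platelets, red blood cells, and any cells that lack a nucleus owing to incomplete cell division; they may preferably be platelets or red blood cells, and most preferably platelets.
In the present invention, 'exosome' means an extracellular vesicle with a vesicular structure of nanometer size (e.g., 50-90 nm). It has a structure in which the inside and outside of the exosome are separated by a lipid bilayer composed of cell membrane components of the cell of origin, and it carries cell membrane lipids, cell membrane proteins, nucleic acids, cellular components, and the like. The origin of the exosome is not particularly limited in the present invention, but it may preferably be isolated from blood. Exosomes mediate the transport of mRNA, miRNA, DNA, and proteins between cells and play an important role in intra- and extracellular signaling and interaction. Exosomes can be isolated using any method known in the art without limitation, for example, ultracentrifugation isolation, size exclusion, immunoaffinity isolation, microfluidics chips, and polymeric methods. Commercially available exosome isolation kits (e.g., Exo2D™ EV isolation kit) may also be used.
RNA can be isolated from anucleated cells and/or exosomes by various methods known in the art. Examples of RNA isolation methods include, but are not limited to, the guanidine thiocyanate-cesium chloride ultracentrifugation method, the guanidine thiocyanate-hot phenol method, the guanidine hydrochloride method, and the acid guanidine thiocyanate-phenol-chloroform method (Chomczynski, P. and Sacchi, N., Anal. Biochem. (1987), 162, 156-159). Commercially available RNA extraction reagents (for example, RNAqueous kit (Ambion Inc., Austin, TX), Micro-to-midi total RNA purification system (Invitrogen), NucleoSpin RNA II (BD Biosciences Clontech, Palo Alto, CA), RNeasy mini kit (Qiagen), GenElute mammalian total RNA kit (Sigma-Aldrich), and Trizol LS reagent (Invitrogen)) may also be used according to the protocols supplied with the reagents. Specific methods for the RNA isolation known in the art are disclosed in Joseph Sambrook, et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001), which is incorporated herein by reference.
The isolated RNA fraction may, if necessary, be further purified to mRNA alone before use. The purification method is not particularly limited as long as it is a known RNA purification method. For example, mRNA can be purified by adsorbing it to a biotinylated oligo(dT) probe, capturing it on streptavidin-immobilized paramagnetic particles via the biotin/streptavidin bond, washing, and then eluting the mRNA. Alternatively, mRNA may be adsorbed to an oligo(dT) cellulose column and then eluted for purification. For the method of the present invention, however, this mRNA purification step is optional rather than essential.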
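Step (b) is a step of synthesizing complementary DNA (cDNA) for the RNA isolated in step (a).
cDNA can be synthesized from RNA by any method known in the art without limitation. For example, reverse transcriptase and deoxyribonucleotides are added to the RNA to copy a first DNA strand using the mRNA strand as a template. RNase H is then applied to remove the mRNA from the DNA-RNA hybrid double strands. DNA polymerase is then applied to form the second DNA strand using the DNA strand produced by reverse transcription as a template, completing the template; cDNA can be synthesized in this way.
Step (c) is a step of obtaining sequence information of the cDNA.
In one aspect of the present invention, the sequence information may be analyzed by any sequence analysis method known in the art. Sequence analysis reads the sequence of one strand of the complementary cDNA or of each strand. Because sequencing reads a large number of fragments, preferably at least 10,000, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 100,000, or at least 1,000,000 fragments, a sequencing method suited to this scale is preferred.
Sequence analysis may use any sequencing method known in the art, without limitation, provided it is capable of high-throughput sequencing so that the sequence of each fragment is read in sufficient quantity. In the present invention, the sequence analysis may be performed by, but is not limited to, Next-Generation Sequencing (NGS). NGS has the advantage that a large number of sequences can be read within a few hours at low cost; when a sufficient number of sequences is read, accuracy is very high, and the data can be analyzed both qualitatively and quantitatively.
In the present invention, the analyzed sequence information may also be referred to as reads.
Meanwhile, an appropriate adapter may be attached for sequence analysis of the exon-junction site.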
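Step (d) is a step of comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction.
In step (d), expression information of sequences produced by exon-junctions is obtained from the sequence information obtained in step (c). For example, when the sequence analysis in step (c) is performed by NGS, the frequency of sequences aligned to the predetermined exon-junction library, that is, the read count, is counted. In other words, across all sequences obtained by sequencing one sample, the number of reads of the corresponding sequence is counted for each distinct exon-junction type by comparison with the predetermined exon-junction library.
The expression information of sequences produced by exon-junctions, that is, the read count, is the number of sequences (reads) mapped to the extreme ends of two different exons within one gene, that is, reads containing at least one contiguous base pair of exon sequence from the end of the upstream exon and from the beginning of the downstream exon; the two exons need not be immediately adjacent on the reference genome. Sequences containing untranslated intron portions are not counted as sequence expression information at the exon-junction, that is, as read counts (see FIG. 4).
Each counted value may be normalized for comparison with the values of other samples. When the amount sequenced differs between samples, this normalization divides each counted value by a value proportional to the amount sequenced, to allow direct quantitative comparison between samples. The value proportional to the amount sequenced can be any of various values, such as the total number of sequences read for each sample or the number of sequences mapped to housekeeping gene regions.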
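The normalization described here can be sketched as dividing each junction count by a per-sample size factor. The per-million scaling and the pseudocount in the log2 transform are assumptions chosen to match the log2CPM values mentioned for FIG. 9; the function names are hypothetical.

```python
import math


def normalize_counts(junction_counts, size_factor, scale=1_000_000):
    """Divide each junction read count by a per-sample size factor.

    size_factor may be, for example, the total number of reads in the
    sample or the number of reads mapped to housekeeping genes.
    """
    if size_factor <= 0:
        raise ValueError("size_factor must be positive")
    return {j: c * scale / size_factor for j, c in junction_counts.items()}


def log2_cpm(junction_counts, total_reads, pseudocount=1.0):
    """log2 of counts per million; the pseudocount (an assumption) avoids
    taking the logarithm of zero for junctions with no reads."""
    cpm = normalize_counts(junction_counts, total_reads)
    return {j: math.log2(v + pseudocount) for j, v in cpm.items()}
```

Using the same size factor definition for every sample keeps the normalized values directly comparable across the training and validation datasets.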
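In the present invention, the predetermined exon-junction library represents information on the exon-junction sites indicated by the genes listed in Table 1 and their positions on the corresponding chromosomes. In Table 1 below, each gene and its chromosome are shown, and the end of the upstream exon (position 1) and the beginning of the downstream exon (position 2) that form the exon-junction are given as position numbers on that chromosome. That is, the predetermined exon-junction library in the present invention may be the junctions of position 1 and position 2 on each chromosome listed in Table 1 below (see FIG. 3).
In one aspect of the present invention, the expression information of the sequence at the exon-junction, that is, the read count, may be characterized as sequence information (reads) aligned to a sequence that includes the bases at position 1 and position 2 in Table 1 below and 2 or more contiguous bases in the 5' direction and/or 3' direction.
In another aspect of the present invention, the expression information of the sequence at the exon-junction, that is, the read count, may be characterized as sequence information (reads) aligned to a sequence that includes the bases at position 1 and position 2 in Table 1 below and 2 to 300 contiguous bases in the 5' direction and/or 3' direction.
In another aspect of the present invention, the expression information of the sequence at the exon-junction, that is, the read count, may be characterized as sequence information (reads) aligned to a sequence that includes the bases at position 1 and position 2 in Table 1 below and 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, or 300 contiguous bases in the 5' direction and/or 3' direction.
Exon-junction number | Gene | Chromosome | Strand | Position1 | Position2 |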
1 | ITGA2B | 17 | - | 44385209 | 44385286 |
2 | TRDC | 14 | + | 22463210 | 22463774 |
3 | TRAF3IP3 | 1 | + | 209779374 | 209780470 |
4 | FCGR2A | 1 | + | 161506591 | 161509820 |
5 | PCSK6 | 15 | - | 101389564 | 101393212 |
6 | TRAC | 14 | + | 22549682 | 22550557 |
7 | PTPRC | 1 | + | 198713072 | 198716682 |
8 | IFI27 | 14 | + | 94114880 | 94115781 |
9 | TRBC1 | 7 | + | 142792539 | 142792692 |
10 | RPL5 | 1 | + | 92832117 | 92833389 |
11 | TLN1 | 9 | - | 35713043 | 35713196 |
12 | SPARC | 5 | - | 151673216 | 151686865 |
13 | HLA-DRB1 | 6 | - | 32579104 | 32580247 |
14 | LUC7L3 | 17 | + | 50744813 | 50745720 |
15 | LOC728975 | 11 | - | 65181072 | 65181225 |
16 | TRBC1 | 7 | + | 142792080 | 142792522 |
17 | GAS5 | 1 | - | 173865894 | 173866177 |
18 | TRDC | 14 | + | 22464323 | 22465533 |
19 | LUC7L3 | 17 | + | 50719831 | 50736960 |
20 | HBD | 11 | - | 5234213 | 5234342 |
21 | TLN1 | 9 | - | 35714081 | 35714239 |
22 | TRBC1 | 7 | + | 142792798 | 142793121 |
23 | DEFA1 | 8 | - | 6977879 | 6978460 |
24 | GAS5 | 1 | - | 173866206 | 173866528 |
25 | OSBP2 | 22 | + | 30893566 | 30893638 |
26 | CRIP1 | 14 | + | 105488388 | 105488471 |
27 | HLA-DRB1 | 6 | - | 32580856 | 32581557 |
28 | TRAF3IP3 | 1 | + | 209781458 | 209782056 |
29 | GNLY | 2 | + | 85694470 | 85695320 |
30 | GZMA | 5 | + | 55108394 | 55110021 |
31 | HSP90B1 | 12 | + | 103947430 | 103947633 |
32 | RPS17 | 15 | - | 82538985 | 82539981 |
33 | TRAC | 14 | + | 22550664 | 22551605 |
34 | CD3D | 11 | - | 118339227 | 118339451 |
35 | FCGR2A | 1 | + | 161510074 | 161510834 |
36 | RPL21 | 13 | + | 27256334 | 27256436 |
37 | U2SURP | 3 | + | 143014409 | 143016257 |
38 | GIMAP7 | 7 | + | 150514945 | 150519934 |
39 | NKG7 | 19 | - | 51371835 | 51371940 |
40 | KLRK1 | 12 | - | 10373231 | 10378132 |
41 | IL2RG | X | - | 71110688 | 71110897 |
42 | KLRB1 | 12 | - | 9595421 | 9598046 |
43 | TRIM58 | 1 | + | 247860712 | 247864705 |
44 | TLN1 | 9 | - | 35713298 | 35713953 |
45 | ITGA2B | 17 | - | 44385076 | 44385164 |
46 | NKG7 | 19 | - | 51372074 | 51372161 |
47 | U2SURP | 3 | + | 143012353 | 143014311 |
48 | HLA-DRB1 | 6 | - | 32581838 | 32584109 |
49 | TLN1 | 9 | - | 35714687 | 35714760 |
50 | TRDC | 14 | + | 22463839 | 22464204 |
51 | IGHM | 14 | - | 105852263 | 105854405 |
52 | RPL22 | 1 | - | 6186816 | 6192930 |
53 | ITGA2B | 17 | - | 44384354 | 44384538 |
54 | DEFA3 | 8 | - | 7016862 | 7018222 |
55 | IFI27 | 14 | + | 94115942 | 94116442 |
56 | KLRB1 | 12 | - | 9598161 | 9598499 |
57 | TRIM58 | 1 | + | 247857666 | 247860617 |
58 | DEFA1B | 8 | - | 6996996 | 6997577 |
59 | IL7R | 5 | + | 35860990 | 35867306 |
60 | IL7R | 5 | + | 35867463 | 35871056 |
61 | IL2RG | X | - | 71111050 | 71111425 |
62 | ZYX | 7 | + | 143382447 | 143382593 |
63 | PTPRC | 1 | + | 198732390 | 198732480 |
64 | SEPTIN5 | 22 | + | 19718818 | 19719602 |
65 | TLN1 | 9 | - | 35715187 | 35716390 |
66 | SIAH2 | 3 | - | 150742698 | 150762433 |
67 | RSRP1 | 1 | - | 25242705 | 25243550 |
68 | RPL23A | 17 | + | 28720030 | 28720707 |
69 | ITGA2B | 17 | - | 44384585 | 44384948 |
70 | LUC7L3 | 17 | + | 50740345 | 50741102 |
71 | TRAF3IP3 | 1 | + | 209780606 | 209781345 |
72 | FLNA | X | - | 154354711 | 154354825 |
73 | FLNA | X | - | 154361787 | 154361979 |
74 | GNAS | 20 | + | 58889353 | 58895612 |
75 | PTPRC | 1 | + | 198709824 | 198712953 |
76 | DEFA1B | 8 | - | 6997763 | 6999123 |
77 | IL7R | 5 | + | 35874542 | 35875512 |
78 | LY86 | 6 | + | 6625012 | 6626293 |
79 | TLN1 | 9 | - | 35711792 | 35712005 |
80 | FLNA | X | - | 154366224 | 154366308 |
81 | FLNA | X | - | 154362148 | 154362242 |
82 | LCK | 1 | + | 32279747 | 32279841 |
83 | ITGA2B | 17 | - | 44381061 | 44383493 |
84 | CRIP1 | 14 | + | 105488260 | 105488331 |
85 | PLEKHO1 | 1 | + | 150150287 | 150150912 |
86 | TLN1 | 9 | - | 35714373 | 35714574 |
87 | RBM6 | 3 | + | 50068764 | 50070455 |
88 | HLA-DRB5 | 6 | - | 32519651 | 32521905 |
89 | RPS10 | 6 | - | 34417547 | 34418369 |
90 | CAPN2 | 1 | + | 223745439 | 223746997 |
91 | DEFA1 | 8 | - | 6978646 | 6980013 |
92 | CDK2AP1 | 12 | - | 123267282 | 123271564 |
93 | MPHOSPH8 | 13 | + | 19633961 | 19642115 |
94 | RPL34 | 4 | + | 108620600 | 108621951 |
95 | RBM6 | 3 | + | 50070552 | 50075201 |
96 | ITGA2B | 17 | - | 44383946 | 44384085 |
97 | IL2RG | X | - | 71108695 | 71109228 |
98 | GAS5 | 1 | - | 173865547 | 173865857 |
99 | IFI27 | 14 | + | 94111773 | 94114851 |
100 | CD3D | 11 | - | 118339906 | 118340375 |
101 | MTURN | 7 | + | 30135298 | 30146177 |
102 | ZYX | 7 | + | 143381779 | 143382248 |
103 | FLNA | X | - | 154359406 | 154359484 |
104 | RPL7A | 9 | + | 133349042 | 133349551 |
105 | PNN | 14 | + | 39177916 | 39179091 |
106 | IL2RG | X | - | 71109390 | 71110156 |
107 | PTPRC | 1 | + | 198708261 | 198709687 |
108 | TAF10 | 11 | - | 6611818 | 6611958 |
109 | PCSK6 | 15 | - | 101384425 | 101389464 |
110 | HSP90B1 | 12 | + | 103932418 | 103932826 |
111 | PCSK6 | 15 | - | 101398576 | 101427892 |
112 | IGHM | 14 | - | 105851974 | 105852148 |
113 | CCDC92 | 12 | - | 123943493 | 123944272 |
114 | SSR2 | 1 | - | 156015069 | 156018270 |
115 | GAS5 | 1 | - | 173864304 | 173864484 |
116 | HSP90B1 | 12 | + | 103934287 | 103937695 |
117 | GAS5 | 1 | - | 173864704 | 173865229 |
118 | ITGA2B | 17 | - | 44386131 | 44389286 |
119 | LCK | 1 | + | 32279994 | 32280079 |
120 | RPL32 | 3 | - | 12840242 | 12841494 |
121 | RPL12 | 9 | - | 127449708 | 127450731 |
122 | CD53 | 1 | + | 110894401 | 110894960 |
123 | CD53 | 1 | + | 110897892 | 110899124 |
124 | PRKCB | 16 | + | 24035547 | 24092791 |
125 | PTPRC | 1 | + | 198749549 | 198750492 |
126 | HBD | 11 | - | 5233092 | 5233991 |
127 | UBE2H | 7 | - | 129879642 | 129880895 |
128 | CRIP1 | 14 | + | 105488516 | 105488663 |
129 | CD3E | 11 | + | 118314494 | 118315486 |
130 | PCSK6 | 15 | - | 101432100 | 101443556 |
131 | PRKCB | 16 | + | 23837406 | 23988508 |
132 | ITGA2B | 17 | - | 44380153 | 44380246 |
133 | LCK | 1 | + | 32280210 | 32285514 |
134 | MS4A1 | 11 | + | 60464344 | 60465921 |
135 | TLN1 | 9 | - | 35714876 | 35715059 |
136 | GNAI2 | 3 | + | 50236453 | 50252100 |
137 | IL32 | 16 | + | 3068239 | 3068990 |
138 | RPL7A | 9 | + | 133349700 | 133349912 |
139 | PTPRC | 1 | + | 198752772 | 198754269 |
140 | DAP | 5 | - | 10748271 | 10761014 |
141 | RPS16 | 19 | - | 39433761 | 39435607 |
142 | RPL6 | 12 | - | 112408338 | 112408420 |
143 | RPS23 | 5 | - | 82277852 | 82278320 |
144 | CD53 | 1 | + | 110892533 | 110894327 |
145 | CD48 | 1 | - | 160679131 | 160681202 |
146 | ITGA2B | 17 | - | 44384138 | 44384311 |
147 | RPS10-NUDT3 | 6 | - | 34418424 | 34421730 |
148 | MPHOSPH8 | 13 | + | 19642270 | 19646443 |
149 | IL7R | 5 | + | 35871213 | 35873480 |
150 | KLRB1 | 12 | - | 9598653 | 9599767 |
151 | LUC7L3 | 17 | + | 50743810 | 50744652 |
152 | PTPRC | 1 | + | 198744203 | 198748109 |
153 | RBM6 | 3 | + | 50066502 | 50068690 |
154 | RPL10A | 6 | + | 35469529 | 35470179 |
155 | COTL1 | 16 | - | 84590262 | 84617501 |
156 | DEFA3 | 8 | - | 7016099 | 7016676 |
157 | IL7R | 5 | + | 35857059 | 35860852 |
158 | NPM1 | 5 | + | 171392816 | 171392914 |
159 | RPS7 | 2 | + | 3575888 | 3576487 |
160 | MS4A1 | 11 | + | 60466157 | 60466959 |
161 | IGHM | 14 | - | 105854737 | 105854917 |
162 | RPL18 | 19 | - | 48615447 | 48615877 |
163 | ITGA2B | 17 | - | 44383704 | 44383894 |
164 | RPS14 | 5 | - | 150447735 | 150449703 |
165 | ITGB2 | 21 | - | 44886430 | 44886736 |
166 | ATP6V1G2-DDX39B | 6 | - | 31530450 | 31530779 |
167 | RBM6 | 3 | + | 50075330 | 50077008 |
168 | PTPRC | 1 | + | 198702530 | 198703298 |
169 | CD3E | 11 | + | 118312866 | 118313707 |
170 | RPL6 | 12 | - | 112405376 | 112405853 |
171 | TLN1 | 9 | - | 35712124 | 35712835 |
172 | GNAI2 | 3 | + | 50252142 | 50252397 |
173 | IL7R | 5 | + | 35875587 | 35875983 |
174 | RPS16 | 19 | - | 39433569 | 39433665 |
175 | IL2RG | X | - | 71108346 | 71108599 |
176 | PTPRC | 1 | + | 198752371 | 198752594 |
177 | NPM1 | 5 | + | 171391799 | 171392710 |
178 | CD27 | 12 | + | 6451014 | 6451268 |
179 | ITGA2B | 17 | - | 44380645 | 44380879 |
180 | RPLP2 | 11 | + | 810039 | 810234 |
181 | MS4A6A | 11 | - | 60173129 | 60175402 |
182 | IL2RG | X | - | 71110295 | 71110504 |
183 | GAS5 | 1 | - | 173864506 | 173864675 |
184 | HNRNPDL | 4 | - | 82427564 | 82428018 |
185 | ITGA2B | 17 | - | 44380301 | 44380386 |
186 | RPS8 | 1 | + | 44778129 | 44778576 |
187 | RPL22 | 1 | - | 6193054 | 6197652 |
188 | ITGA2B | 17 | - | 44380490 | 44380600 |
189 | RPL24 | 3 | - | 101682492 | 101682771 |
190 | ITGA2B | 17 | - | 44385716 | 44385824 |
191 | PTPRC | 1 | + | 198734425 | 198735127 |
192 | IL32 | 16 | + | 3067613 | 3067984 |
193 | SELL | 1 | - | 169707449 | 169708417 |
194 | GAS2L1 | 22 | + | 29310734 | 29310827 |
195 | PTPRC | 1 | + | 198703372 | 198704472 |
196 | CD27 | 12 | + | 6450352 | 6450541 |
197 | RBM6 | 3 | + | 50062108 | 50065031 |
198 | RPS8 | 1 | + | 44776774 | 44777614 |
199 | RPL23 | 17 | - | 38853105 | 38853698 |
200 | RPS10-NUDT3 | 6 | - | 34421807 | 34424669 |
201 | FLNA | X | - | 154364957 | 154365136 |
202 | RPL34 | 4 | + | 108622024 | 108622105 |
203 | PTPRC | 1 | + | 198742367 | 198744054 |
204 | RPL24 | 3 | - | 101681215 | 101682429 |
205 | CD3E | 11 | + | 118313874 | 118314448 |
206 | PCSK6 | 15 | - | 101382209 | 101384322 |
207 | NENF | 1 | + | 212433120 | 212442565 |
208 | RPL9 | 4 | - | 39454649 | 39454864 |
209 | TRAF3IP3 | 1 | + | 209777487 | 209778111 |
210 | CCR7 | 17 | - | 40555818 | 40558893 |
211 | ITGA2B | 17 | - | 44385921 | 44386010 |
212 | FLNA | X | - | 154364372 | 154364526 |
213 | NPM1 | 5 | + | 171392978 | 171400153 |
214 | RPL6 | 12 | - | 112406890 | 112408240 |
215 | HNRNPDL | 4 | - | 82428446 | 82429248 |
216 | PCSK6 | 15 | - | 101366332 | 101370335 |
217 | GZMB | 14 | - | 24631214 | 24631858 |
218 | NCK2 | 2 | + | 105816613 | 105855048 |
219 | ZRANB2 | 1 | - | 71078565 | 71078656 |
220 | PTPRC | 1 | + | 198639341 | 198692347 |
221 | COTL1 | 16 | - | 84566955 | 84590105 |
222 | FLNA | X | - | 154360587 | 154361308 |
223 | RPS24 | 10 | + | 78040225 | 78040615 |
224 | ITGB5 | 3 | - | 124819834 | 124821313 |
225 | RPS16 | 19 | - | 39433418 | 39433522 |
226 | RPLP2 | 11 | + | 812633 | 812760 |
227 | RPS8 | 1 | + | 44777789 | 44778000 |
228 | FLNA | X | - | 154366470 | 154366562 |
229 | PTPRC | 1 | + | 198735252 | 198741869 |
230 | TRBC2 | 7 | + | 142801961 | 142802105 |
231 | FLNA | X | - | 154358568 | 154358984 |
232 | HLA-DRA | 6 | + | 32443921 | 32444652 |
233 | PTPRC | 1 | + | 198722476 | 198728340 |
234 | FLNA | X | - | 154364165 | 154364259 |
235 | PTPRC | 1 | + | 198718302 | 198722416 |
236 | NENF | 1 | + | 212444442 | 212445830 |
237 | RPS5 | 19 | + | 58394595 | 58394682 |
238 | HNRNPDL | 4 | - | 82424883 | 82426037 |
239 | PCSK6 | 15 | - | 101427980 | 101429987 |
240 | RPL6 | 12 | - | 112406037 | 112406294 |
241 | HLA-DRB1 | 6 | - | 32580270 | 32580746 |
242 | IL2RG | X | - | 71107921 | 71108277 |
243 | ITGA2B | 17 | - | 44385335 | 44385551 |
244 | RPL3 | 22 | - | 39318592 | 39319595 |
245 | PNISR | 6 | - | 99401630 | 99402540 |
246 | RPS3 | 11 | + | 75399577 | 75400694 |
247 | PTPRC | 1 | + | 198706952 | 198708133 |
248 | IGHM | 14 | - | 105855234 | 105855480 |
249 | NKG7 | 19 | - | 51372307 | 51372379 |
250 | SON | 21 | + | 33546379 | 33549476 |
251 | RPL17 | 18 | - | 49488566 | 49489359 |
252 | GNLY | 2 | + | 85695423 | 85695958 |
253 | FLNA | X | - | 154352447 | 154352553 |
254 | MTURN | 7 | + | 30135298 | 30157438 |
255 | FLNA | X | - | 154365259 | 154365349 |
256 | RPSA | 3 | + | 39412061 | 39412274 |
257 | TRAF3IP3 | 1 | + | 209778173 | 209779315 |
258 | FLNA | X | - | 154362578 | 154362661 |
259 | RPL9 | 4 | - | 39457681 | 39458194 |
260 | MS4A1 | 11 | + | 60463121 | 60464288 |
261 | RPL11 | 1 | + | 23693913 | 23694660 |
262 | PTPRC | 1 | + | 198742026 | 198742232 |
263 | MAF1 | 8 | + | 144105766 | 144105869 |
264 | ITGB5 | 3 | - | 124873531 | 124886931 |
265 | RPL18 | 19 | - | 48615946 | 48616079 |
266 | PTPRC | 1 | + | 198704498 | 198706734 |
267 | HSP90B1 | 12 | + | 103932942 | 103933956 |
268 | MTURN | 7 | + | 30157595 | 30158986 |
269 | RPL13 | 16 | + | 89560712 | 89560940 |
270 | TRIM58 | 1 | + | 247868063 | 247875900 |
271 | GAS2L1 | 22 | + | 29310998 | 29311462 |
272 | CD79B | 17 | - | 63929475 | 63929770 |
273 | SELL | 1 | - | 169696553 | 169701560 |
274 | TRAF3IP3 | 1 | + | 209775736 | 209777352 |
275 | ZRANB2 | 1 | - | 71076877 | 71078457 |
276 | RPL35 | 9 | - | 124860264 | 124861419 |
277 | PRKAR2B | 7 | + | 107045214 | 107070281 |
278 | HNRNPDL | 4 | - | 82427304 | 82427433 |
279 | FLNA | X | - | 154357623 | 154358199 |
280 | CD52 | 1 | + | 26318071 | 26320171 |
281 | RPL11 | 1 | + | 23692759 | 23693807 |
282 | RPS23 | 5 | - | 82276518 | 82277693 |
283 | GAS2L1 | 22 | + | 29310546 | 29310638 |
284 | SELL | 1 | - | 169703440 | 169704568 |
285 | CD79B | 17 | - | 63929324 | 63929434 |
286 | RPL24 | 3 | - | 101682907 | 101685818 |
287 | CD79A | 19 | + | 41880738 | 41880867 |
288 | ZYX | 7 | + | 143382685 | 143382801 |
289 | RPS21 | 20 | + | 62387388 | 62387611 |
290 | RPL14 | 3 | + | 40457991 | 40458642 |
291 | HLA-DRA | 6 | + | 32440032 | 32442448 |
292 | RPL12 | 9 | - | 127450804 | 127451281 |
293 | RPL18 | 19 | - | 48617423 | 48617791 |
294 | ZRANB2 | 1 | - | 71072548 | 71076795 |
295 | IL32 | 16 | + | 3068010 | 3068180 |
296 | COTL1 | 16 | - | 84617583 | 84617838 |
297 | RPSA | 3 | + | 39408724 | 39410754 |
298 | FLNA | X | - | 154359905 | 154359990 |
299 | IL7R | 5 | + | 35873648 | 35874449 |
300 | RPL9 | 4 | - | 39454225 | 39454533 |
301 | PRKCB | 16 | + | 24113069 | 24123835 |
302 | TRAF3IP3 | 1 | + | 209773019 | 209775349 |
303 | RNF213 | 17 | + | 80263778 | 80273241 |
304 | RPL5 | 1 | + | 92837633 | 92840551 |
305 | RPL7A | 9 | + | 133351071 | 133351262 |
306 | RPL7A | 9 | + | 133350319 | 133350597 |
307 | FLNA | X | - | 154354043 | 154354151 |
308 | TRBC2 | 7 | + | 142801427 | 142801944 |
309 | FLNA | X | - | 154353204 | 154353296 |
310 | TRIM58 | 1 | + | 247867867 | 247867963 |
311 | RPL37 | 5 | - | 40832573 | 40834181 |
312 | RPS21 | 20 | + | 62387674 | 62387843 |
313 | RPL14 | 3 | + | 40458736 | 40461407 |
314 | RPS13 | 11 | - | 17074466 | 17075097 |
315 | RPS8 | 1 | + | 44776140 | 44776675 |
316 | MS4A1 | 11 | + | 60467060 | 60468250 |
317 | FLNA | X | - | 154362332 | 154362418 |
318 | TSPAN33 | 7 | + | 129167560 | 129167773 |
319 | RPL10A | 6 | + | 35470351 | 35470580 |
320 | RPL5 | 1 | + | 92840639 | 92841766 |
321 | GZMA | 5 | + | 55105618 | 55107794 |
322 | PRKAR2B | 7 | + | 107122004 | 107128212 |
323 | ITGB5 | 3 | - | 124821474 | 124841383 |
324 | RPL19 | 17 | + | 39201319 | 39202317 |
325 | RPL5 | 1 | + | 92833660 | 92834779 |
326 | RPL14 | 3 | + | 40461661 | 40461939 |
327 | RPS4X | X | - | 72272772 | 72273232 |
328 | CCDC92 | 12 | - | 123942785 | 123943347 |
329 | FLNA | X | - | 154359154 | 154359246 |
330 | FLNA | X | - | 154355072 | 154357434 |
331 | MTURN | 7 | + | 30146299 | 30157438 |
332 | RPL28 | 19 | + | 55386693 | 55387930 |
333 | FLNA | X | - | 154359646 | 154359732 |
334 | RPS6 | 9 | - | 19379618 | 19380190 |
335 | PRKCB | 16 | + | 24092947 | 24094163 |
336 | GMPR | 6 | + | 16254735 | 16274415 |
337 | RPS13 | 11 | - | 17075623 | 17077168 |
338 | RPL11 | 1 | + | 23695908 | 23696344 |
339 | RPL5 | 1 | + | 92836392 | 92837456 |
340 | ITGB3 | 17 | + | 47291088 | 47292139 |
341 | PTPRC | 1 | + | 198750626 | 198752249 |
342 | PTPRC | 1 | + | 198728448 | 198729137 |
343 | PTPRC | 1 | + | 198732556 | 198734196 |
344 | LUC7L3 | 17 | + | 50741731 | 50743706 |
345 | PTPRC | 1 | + | 198754404 | 198755906 |
346 | CD53 | 1 | + | 110896733 | 110897809 |
347 | RPL23 | 17 | - | 38850214 | 38850362 |
348 | SELL | 1 | - | 169701688 | 169703255 |
349 | TRAF3IP3 | 1 | + | 209775489 | 209775599 |
350 | PRKCB | 16 | + | 24094297 | 24112973 |
351 | RPS3A | 4 | + | 151099714 | 151100485 |
352 | RPS12 | 6 | + | 132817061 | 132817480 |
353 | YWHAH | 22 | + | 31944820 | 31956139 |
354 | RPS21 | 20 | + | 62388364 | 62388457 |
355 | RPL7A | 9 | + | 133350727 | 133351002 |
356 | PTP4A2 | 1 | - | 31919658 | 31937987 |
357 | FLNA | X | - | 154358355 | 154358445 |
358 | IL32 | 16 | + | 3067415 | 3067554 |
359 | RPL11 | 1 | + | 23694791 | 23695798 |
360 | NENF | 1 | + | 212442625 | 212444339 |
361 | RPS24 | 10 | + | 78037304 | 78040204 |
362 | PTPRC | 1 | + | 198729171 | 198731617 |
363 | PTPRC | 1 | + | 198731726 | 198732300 |
364 | DAP | 5 | - | 10683571 | 10748175 |
365 | FLNA | X | - | 154354483 | 154354616 |
366 | GMPR | 6 | + | 16238780 | 16246842 |
367 | GNLY | 2 | + | 85696056 | 85697506 |
368 | SRSF5 | 14 | + | 69770540 | 69770995 |
369 | PRKCB | 16 | + | 23988590 | 24032136 |
370 | RPL36 | 19 | + | 5691453 | 5691532 |
371 | RPL37 | 5 | - | 40834265 | 40834471 |
372 | PTPRC | 1 | + | 198734232 | 198734328 |
373 | RPLP2 | 11 | + | 811645 | 812535 |
374 | ITGB3 | 17 | + | 47307637 | 47310139 |
375 | HNRNPDL | 4 | - | 82426129 | 82426463 |
376 | RPL12 | 9 | - | 127447726 | 127447877 |
377 | PNISR | 6 | - | 99408271 | 99409173 |
378 | RPL5 | 1 | + | 92833458 | 92833545 |
379 | RPL7A | 9 | + | 133350052 | 133350240 |
380 | AP1S2 | X | - | 15846011 | 15852346 |
381 | FLNA | X | - | 154352924 | 154353001 |
382 | PRKCB | 16 | + | 24032247 | 24035419 |
383 | NCK2 | 2 | + | 105745138 | 105816430 |
384 | RPL13 | 16 | + | 89561063 | 89561227 |
385 | RPL6 | 12 | - | 112406342 | 112406747 |
386 | RPS16 | 19 | - | 39435708 | 39435848 |
387 | NPM1 | 5 | + | 171400210 | 171400839 |
388 | RPS7 | 2 | + | 3576630 | 3577710 |
389 | FLNA | X | - | 154353457 | 154353554 |
390 | FLNA | X | - | 154352675 | 154352772 |
391 | PTPRC | 1 | + | 198699704 | 198702387 |
392 | RPS10-NUDT3 | 6 | - | 34424840 | 34425072 |
393 | RPL23 | 17 | - | 38850475 | 38852604 |
394 | RPLP2 | 11 | + | 810357 | 811597 |
395 | FLNA | X | - | 154362784 | 154364022 |
396 | RPS24 | 10 | + | 78035720 | 78037194 |
397 | RPL18 | 19 | - | 48616824 | 48617316 |
398 | FLNA | X | - | 154364719 | 154364821 |
399 | TRIM58 | 1 | + | 247864935 | 247867845 |
400 | NCK2 | 2 | + | 105855289 | 105881328 |
401 | NPM1 | 5 | + | 171407774 | 171410527 |
402 | ITGB5 | 3 | - | 124848558 | 124859242 |
403 | FLNA | X | - | 154361570 | 154361670 |
404 | FLNA | X | - | 154353727 | 154353915 |
405 | NCK2 | 2 | + | 105882049 | 105892982 |
406 | RPS21 | 20 | + | 62387914 | 62388309 |
407 | RPSA | 3 | + | 39411777 | 39411896 |
408 | NPM1 | 5 | + | 171400925 | 171405302 |
409 | PTPRC | 1 | + | 198748199 | 198749416 |
410 | ABHD17A | 19 | - | 1880115 | 1881235 |
411 | CD53 | 1 | + | 110895055 | 110896653 |
412 | GAS2L1 | 22 | + | 29308738 | 29310439 |
413 | RPL9 | 4 | - | 39458309 | 39458394 |
414 | RPS17 | 15 | - | 82536881 | 82538306 |
415 | RPL28 | 19 | + | 55388048 | 55388243 |
416 | RPL35 | 9 | - | 124858067 | 124860183 |
417 | PRKAR2B | 7 | + | 107070316 | 107121952 |
418 | RPL18 | 19 | - | 48616202 | 48616726 |
419 | FLNA | X | - | 154354291 | 154354381 |
420 | RPS17 | 15 | - | 82538371 | 82538880 |
421 | ITGB5 | 3 | - | 124841551 | 124848309 |
422 | FLNA | X | - | 154365486 | 154366024 |
423 | RPL36 | 19 | + | 5690600 | 5691319 |
424 | GNLY | 2 | + | 85697677 | 85698564 |
425 | RPL23 | 17 | - | 38852732 | 38853022 |
426 | RBM6 | 3 | + | 50065126 | 50066242 |
427 | C12orf75 | 12 | + | 105330937 | 105348602 |
428 | FLNA | X | - | 154366639 | 154366732 |
429 | RAC1 | 7 | + | 6374770 | 6387212 |
430 | RPL14 | 3 | + | 40461506 | 40461608 |
431 | GMPR | 6 | + | 16246961 | 16250284 |
432 | RPL5 | 1 | + | 92834913 | 92836190 |
433 | DAP | 5 | - | 10681169 | 10683529 |
434 | GMPR | 6 | + | 16250367 | 16254562 |
435 | CD3D | 11 | - | 118339494 | 118339775 |
436 | HNRNPDL | 4 | - | 82428179 | 82428278 |
437 | PRKCB | 16 | + | 23836348 | 23837375 |
438 | LYL1 | 19 | - | 13099734 | 13100657 |
439 | FLNA | X | - | 154366850 | 154367397 |
440 | ITGB5 | 3 | - | 124817710 | 124819739 |
441 | RPS7 | 2 | + | 3580260 | 3580805 |
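In one aspect of the present invention, the exon-junction library comprises one or more of the exon-junctions listed in Table 1 above, wherein the one or more exon-junctions are exon-junction No. 1, …, exon-junction No. n-1 and exon-junction No. n, and n is a natural number from 1 to 441.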
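Step (e) determines whether the individual has cancer on the basis of the sequence expression information at each exon-junction.
In one embodiment of the present invention, whether cancer is present can be determined by comparing the subject's exon-junction sequence expression information, obtained through steps (a) to (d), with a previously established database of sequence expression levels at each exon-junction. For example, if the expression level at a particular exon-junction that the database identifies as up-regulated in cancer patients is increased in the subject relative to the normal control group, the subject can be determined to have cancer. This determination can be made using the sequence expression information of one or more exon-junctions.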
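The database comparison described above can be sketched as follows. This is a minimal illustration, not the method of the specification: the junction names, the expression values, the `direction` table, and the use of the control-group mean as the reference level are all assumptions made for the example.

```python
from statistics import mean

def junctions_consistent_with_cancer(subject, control_db, direction):
    """Return the junctions whose subject value departs from the control mean
    in the direction that the reference database records for cancer."""
    hits = []
    for junction, value in subject.items():
        reference = mean(control_db[junction])  # assumed reference: control mean
        if direction[junction] == "up" and value > reference:
            hits.append(junction)
        elif direction[junction] == "down" and value < reference:
            hits.append(junction)
    return hits

# Hypothetical normalized expression values for three junctions.
control_db = {"J1": [1.0, 1.2, 0.8], "J2": [5.0, 5.5, 4.5], "J3": [2.0, 2.0, 2.0]}
direction = {"J1": "up", "J2": "down", "J3": "up"}
subject = {"J1": 3.1, "J2": 2.0, "J3": 1.5}

hits = junctions_consistent_with_cancer(subject, control_db, direction)
```

How many consistent junctions are required before a subject is called positive is a design choice that the text leaves open; a trained model replaces such a fixed rule.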
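Preferably, whether the subject has cancer is determined by applying the sequence expression information at each exon-junction, obtained from the subject through steps (a) to (d), to a pre-trained cancer discrimination model.
In addition, a cancer discrimination score for the subject can be extracted from the cancer discrimination model, and a heat-map visualization of the subject's exon-junction sequence expression information, per-individual exon-junction importance information, and the like can be provided.
In the present invention, determining whether cancer is present may mean determining the presence of one type of cancer or of two or more types of cancer. Preferably, the presence of two or more types of cancer is determined, and this determination can be made simultaneously or sequentially using information obtained from a single sample isolated from the individual.
In one embodiment of the present invention, the discrimination model may be a model trained on public data (for example, GSE68086) and then validated. In general, the full set is divided into a training set and a validation set at a ratio of 6:4; the cancer discrimination model is trained on the training set using the features of the obtained exon-junction library, and its performance is confirmed on the validation set before use.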
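A 6:4 division into training and validation sets can be sketched as below. The random shuffle, the fixed seed, and the sample identifiers are assumptions for illustration; the text does not state how samples were assigned to each set.

```python
import random

def split_6_4(sample_ids, seed=0):
    """Shuffle the sample IDs and split them 6:4 into training and validation sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)   # seeded so the split is reproducible
    n_train = round(len(ids) * 0.6)
    return ids[:n_train], ids[n_train:]

# Hypothetical sample identifiers.
train_ids, valid_ids = split_6_4([f"S{i}" for i in range(10)])
```

Only the training portion should be used for marker selection and model fitting, so that the validation portion gives an unbiased performance estimate.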
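In one example of the present invention, the discrimination model is based on the SVM (support vector machine) algorithm; exon-junction biomarker features were obtained from the individual's platelet-derived transcriptome data and input into the discrimination model, which made it possible to determine whether the subject's sample was cancerous or normal. The discrimination model may also output a discrimination score for cancer or normal status.
The use of a discrimination model trained with the SVM algorithm is only one example, and the invention should be construed as covering every machine-learning method or type that can be used to train a cancer discrimination model. For example, machine-learning methods may include (1) supervised learning, (2) unsupervised learning, (3) reinforcement learning, (4) semi-supervised learning, and (5) neural networks. More specifically, they may include, without limitation, Naive Bayes classification, logistic regression, decision trees, random forests, boosting (XGBoost, ensemble boosting, AdaBoost, Gradient Boost, LightGBM, CatBoost, etc.), perceptrons, support vector machines, quadratic classifiers, clustering (K-means clustering, Bayesian network clustering, etc.), and deep neural networks.
In the present invention, a neural network means a learning algorithm that mimics a biological neural network. The algorithm may consist of an input layer, at least one hidden layer, and an output layer, and each layer may consist of at least one node. The nodes of each layer receive the outputs of the nodes in the previous layer, perform an operation based on a mathematical model to produce new outputs, and pass those outputs to the nodes of the next layer. The neural network of the present invention includes not only convolutional neural networks and deep neural networks but every kind of neural network capable of generating a model that uses the biomarkers of the present invention as features.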
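The layer-by-layer computation described above can be illustrated with a small forward pass. The weights, the layer sizes, and the choice of ReLU and sigmoid activations are hypothetical; the specification does not fix an architecture.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One layer: each node takes a weighted sum of the previous layer's outputs."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(features, layers):
    """Pass the features through every layer in order and return the final outputs."""
    values = features
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values

# Hypothetical network: 3 input features -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.5, 0.0], [0.0, 1.0, 1.0]], [0.0, -1.0], relu),
    ([[1.0, -1.0]], [0.0], sigmoid),
]
score = forward([2.0, 1.0, 0.5], layers)[0]
```

In a real model the input vector would hold one normalized expression value per exon-junction, and the weights would be learned from the training set.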
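The result of determining whether cancer is present may additionally be provided together with the individual's cancer discrimination score, a visualization of the individual's exon-junction data, and the individual's exon-junction importance information. For example, inputting an individual's exon-junction biomarker features into the pre-trained cancer discrimination model yields a predicted probability of cancer or normal status, and the discrimination result based on that probability is reported. The expression patterns of the biomarkers can also be shown visually, and analyzing the individual's exon-junction importance can provide various prognostic information for that individual.
In the present invention, the type of cancer is not particularly limited, but may be one or more selected from the group consisting of bladder cancer, bone cancer, blood cancer, breast cancer, melanoma, thyroid cancer, parathyroid cancer, bone marrow cancer, rectal cancer, throat cancer, laryngeal cancer, lung cancer, esophageal cancer, pancreatic cancer, colorectal cancer, gastric cancer, tongue cancer, skin cancer, brain tumor, uterine cancer, head or neck cancer, gallbladder cancer, oral cancer, colon cancer, cancer of the anal region, central nervous system tumor, and liver cancer.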
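The present invention also provides a composition for diagnosing cancer comprising, as an active ingredient, one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1, or an agent capable of detecting one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1, wherein the exon-junctions are detected in (i) RNA isolated from anucleated cells in the blood of an individual, or a part thereof, (ii) total RNA isolated from exosomes in the blood of an individual, or a part thereof, and (iii) total cfRNA isolated from the blood of an individual, or a part thereof; and a cancer diagnostic kit comprising the composition for diagnosing cancer.
In the present invention, the agent capable of detecting the one or more exon-junctions may be a primer pair capable of amplifying the one or more exon-junction sites, preferably a primer pair capable of specifically amplifying a sequence that contains the bases at position 1 and position 2 of each exon-junction in Table 1 together with two or more contiguous bases in the 5' direction and/or the 3' direction.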
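As used in the present invention, the term primer means a short nucleic acid sequence that has a free 3' hydroxyl group, can form base pairs with a complementary template, and serves as the starting point for copying the template strand. A primer can initiate DNA synthesis in an appropriate buffer and at an appropriate temperature in the presence of reagents for polymerization (DNA polymerase or reverse transcriptase) and four different dNTPs (deoxynucleoside triphosphates).
A primer may incorporate additional features that do not change its basic property of acting as the initiation point of DNA synthesis. In the present invention, the primers comprising the nucleotide sequences of SEQ ID NOs: 1 to 7 each encompass nucleotide sequences having 95% or more sequence homology.
In the present invention, the primers can be chemically synthesized using the phosphoramidite solid-support method or other well-known methods. Such nucleic acid sequences can also be modified by many means known in the art. Non-limiting examples of such modifications include methylation, capping, substitution of one or more natural nucleotides with an analog, and internucleotide modifications, for example modification to uncharged linkages (e.g., methyl phosphonate, phosphotriester, phosphoramidate, carbamate) or charged linkages (e.g., phosphorothioate, phosphorodithioate). The nucleic acid may contain one or more additional covalently linked moieties, for example proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine), intercalators (e.g., acridine, psoralen), chelators (e.g., metals, radioactive metals, iron, oxidative metals), and alkylating agents.
In addition, the primer nucleic acid sequence of the present invention may, if necessary, include a label that is directly or indirectly detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Examples of labels include enzymes (e.g., horseradish peroxidase, alkaline phosphatase), radioisotopes (e.g., 32P), fluorescent molecules, and chemical groups (e.g., biotin).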
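In the present invention, the diagnostic kit can be used to detect the one or more exon-junction sites that are the biomarkers according to the present invention. The kit of the present invention may include primers, probes, and antisense nucleic acids for detecting the one or more exon-junction sites, as well as one or more other component compositions, solutions, or devices suitable for the analysis method.
As one specific example, the kit of the present invention may be a kit for performing PCR that contains a primer set specific for mRNA derived from the sample to be analyzed and/or its complementary cDNA, an appropriate amount of DNA polymerase, a dNTP mixture, PCR buffer, and water. The PCR buffer may contain KCl, Tris-HCl, and MgCl2. Components required for electrophoresis to confirm amplification of the PCR product may additionally be included in the kit of the present invention.
As another specific example, the kit of the present invention may be a kit containing the essential elements for performing a DNA chip assay. A DNA chip kit may include a substrate to which cDNA corresponding to a gene or a fragment thereof is attached as a probe, and reagents, agents, enzymes, and the like for preparing fluorescently labeled probes. The substrate may additionally include cDNA corresponding to a quantitative control gene or a fragment thereof.
The kit may also include a stabilizer and/or a non-reactive dye for experimental convenience, stabilization, and improved reactivity.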
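The non-reactive dye must be selected from substances that do not affect the polymerase chain reaction, and is intended for analysis or identification using the polymerase chain reaction product. Water-soluble dyes that satisfy this condition include rhodamine, TAMRA, ROX, bromophenol blue, xylene cyanol, bromocresol red, and cresol red. The non-reactive dye may be included at 0.0001 to 0.01 wt% of the total weight of the composition, preferably at 0.001 to 0.005 wt%. If it is added at more than 0.01 wt% of the total weight of the composition, the high concentration of water-soluble dye may act as a reaction inhibitor during the polymerase chain reaction.
In addition, polyhydric alcohols can be used as stabilizers to further stabilize the components of the kit of the present invention, and one or more of glucose, glycerol, mannitol, galactitol, glucitol, and sorbitol may be used.
The kit components may be provided in liquid form, but are preferably in a dried state to increase stability, ease of storage, and long-term storability. Drying can be performed by known methods such as ordinary room-temperature drying, heated drying, freeze-drying, and vacuum drying, and any drying method may be used as long as the components of the composition are not lost.
Various DNA polymerases can also be used in the amplification step of the present invention, including, without limitation, the Klenow fragment of E. coli DNA polymerase I, thermostable DNA polymerases, and bacteriophage T7 DNA polymerase. Preferably, the polymerase is a thermostable DNA polymerase obtainable from various bacterial species, including Thermus aquaticus (Taq), Thermus thermophilus (Tth), Thermus filiformis, Thermus flavus, Thermococcus litoralis, and Pyrococcus furiosus (Pfu). Most of these polymerases can be isolated from the bacteria themselves or purchased commercially. The polymerase used in the kit of the present invention can also be obtained from cells that express high levels of a cloned gene encoding the polymerase.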
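Assays for cancer diagnosis
This diagnostic method can be used to diagnose the presence of a condition, particularly a disease, in a given subject; to characterize the condition (for example, to determine the stage of a cancer or its heterogeneity); to confirm the efficacy of a therapeutic agent for the condition; to monitor the response to treatment of the condition; or to predict or diagnose the risk of developing the condition or its sequelae. The present disclosure may also be useful for determining the efficacy of a particular treatment regimen. In another example, a particular treatment regimen may correlate with changes in the cancer's profile over time. Such correlations can be useful in selecting a therapy. Additionally, if the cancer is observed to be in remission after treatment, this diagnostic method can be used to monitor residual disease or recurrence.
The sequence information at the exon-junctions according to the present invention can also be used to characterize specific forms of cancer. Cancers are often heterogeneous in both composition and stage. Genetic profile data can allow characterization of specific cancer subtypes, which can be important for diagnosing or treating those subtypes. This information can also give the subject or practitioner clues about the prognosis of a specific type of cancer and allow them to adapt treatment options as the disease progresses. Some cancers progress to become more aggressive and genetically unstable. Other cancers remain benign, inactive, or dormant. The methods of the present disclosure can be useful for determining disease progression.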
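Markers and panels
In the present invention, markers may be selected by comparing the counted and normalized values for each exon-junction type between the normal sample group and the cancer sample group, and choosing as markers the sequences of a defined length whose values are significantly higher or lower in the cancer sample group. Most simply, the difference between the mean values of the normal and cancer sample groups at each exon-junction site is used; alternatively, various statistical methods such as the t-test, Mann-Whitney test, Wilcoxon test, or Cohen's d are used to select sequences that differ significantly between the two sample groups.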
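Two of the statistics named above can be sketched with the standard library alone. The example below computes a two-sided Mann-Whitney U test using the normal approximation with tie correction and no continuity correction, plus Cohen's d with a pooled standard deviation. These are textbook formulations chosen for illustration; the specification does not say which variants or software were used, and the expression values are hypothetical.

```python
import math
from statistics import mean, variance

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U for group a, two-sided p-value from the normal approximation)."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    combined = list(a) + list(b)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_sizes = {}
    for v in combined:
        tie_sizes[v] = tie_sizes.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_sizes.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:                      # every value identical: no evidence of a shift
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2))

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    pooled = math.sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))
    return (mean(a) - mean(b)) / pooled

# Hypothetical normalized values for one exon-junction.
cancer = [5.1, 6.0, 5.5, 6.2, 5.8]
normal = [3.0, 3.4, 2.9, 3.8, 3.1]
u_stat, p_value = mann_whitney_u(cancer, normal)
effect = cohens_d(cancer, normal)
```

For small groups an exact U distribution is more accurate than the normal approximation, so a statistics package would normally be preferred in practice.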
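In the present invention, the markers can be used as diagnostic markers individually, all together, or with several markers combined in the form of a panel display, and some markers can be confirmed to improve reliability and efficiency through the overall pattern. The markers identified in the present invention can be used individually or as a combined marker set. Markers can be ranked and weighted according to their number and importance, and the level of likelihood of developing the disease can be assigned. Such algorithms belong to the present invention.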
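The present invention also provides a composition for diagnosing cancer comprising, as an active ingredient, one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1, wherein the exon-junctions are detected in (i) RNA isolated from anucleated cells in the blood of an individual, or a part thereof, (ii) total RNA isolated from exosomes in the blood of an individual, or a part thereof, and (iii) total cfRNA isolated from the blood of an individual, or a part thereof.
In addition, the present invention provides a cancer diagnostic kit comprising the composition.
In addition, the present invention provides a composition for diagnosing cancer comprising an agent capable of detecting one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1, wherein the exon-junctions are detected in (i) total RNA isolated from anucleated cells in the blood of an individual, or a part thereof, (ii) total RNA isolated from exosomes in the blood of an individual, or a part thereof, and (iii) total cfRNA isolated from the blood of an individual, or a part thereof.
In addition, the present invention provides the use of an agent capable of detecting one or more exon-junctions selected from the group consisting of the exon-junctions of Table 1 for the manufacture of a composition for diagnosing cancer.
In addition, the present invention provides a method for diagnosing cancer, comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA isolated from anucleated cells in the blood of an individual, or a part thereof, (ii) total RNA isolated from exosomes in the blood of an individual, or a part thereof, and (iii) total cfRNA (cell-free RNA) isolated from the blood of an individual, or a part thereof; (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a); (c) obtaining sequence information of the cDNA; (d) comparing the cDNA sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and (e) determining whether cancer is present on the basis of the sequence expression information at each exon-junction.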
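In this specification, the term “comprising” is used with the same meaning as “including” or “characterized by”, and does not exclude additional components, method steps, or the like that are not specifically mentioned in the composition or method according to the present invention. The term “consisting of” means excluding additional elements, steps, components, or the like that are not separately described. The term “consisting essentially of” means that, within the scope of the composition or method, substances or steps that do not substantially affect its basic characteristics may be included in addition to the described substances or steps.
The above means for solving the problem are merely examples that fall within the range understandable to a person skilled in the art, and all means within the same scope as the technical idea contained in the above means should be construed as included.
Accordingly, the method of the present invention can provide the information needed for cancer diagnosis, monitoring of treatment regimens, and the prognosis of cancer patients, and can therefore be usefully applied to anticancer treatment.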
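FIG. 1 is a flow chart of the process for selecting the library of 441 exon-junctions.
FIG. 2 shows an example of the characteristics of the selected library of 441 exon-junctions.
FIG. 3 shows the definition of an exon-junction.
FIG. 4 shows the process of counting the number of reads at an exon-junction.
FIG. 5a shows an example of the expression patterns, across the library of 441 exon-junctions, of the training dataset samples used to train the cancer discrimination model.
FIG. 5b shows an example of the expression patterns, across the library of 441 exon-junctions, of the validation dataset samples used with the cancer discrimination model.
FIG. 6a shows the AUC score of a support vector machine (SVM) model trained in a prior study with 1,072 genes as features, presented to describe the performance of the cancer-versus-normal discrimination model based on the library of 441 exon-junctions according to one example of the present application.
FIG. 6b shows the AUC score of a DNN model trained with the library of 441 exon-junctions of the present application as features, presented to describe the performance of the cancer-versus-normal discrimination model based on the library of 441 exon-junctions according to one example of the present application.
FIG. 7 shows an example comparing the performance of the model based on the library of 441 exon-junctions with that of the prior-study model using 1,072 genes.
FIG. 8 shows an example of the quantitative information for the exon-junction that, among the exon-junctions with lower expression in cancer samples than in normal samples, differs most between cancer and normal samples.
FIG. 9a shows, for the training dataset samples, an example of the normalized expression values of the exon-junction that, among the exon-junctions with lower expression in cancer samples than in normal samples, differs most between cancer and normal samples.
FIG. 9b shows, for the validation dataset samples, an example of the normalized expression values of the exon-junction that, among the exon-junctions selected with the training dataset as having higher expression in cancer samples than in normal samples, differs most between cancer and normal samples.
FIG. 10 shows an example confirming the performance of cancer discrimination models that use all or part of the library of 441 exon-junctions of the present application.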
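Examples of the present application are described in detail below with reference to the accompanying drawings so that a person of ordinary skill in the art to which the present application pertains can readily practice them. The present application can, however, be embodied in many different forms and is not limited to the examples described here.
Throughout this specification, when a part is said to “comprise” a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.
Throughout this specification, terms of degree such as “about” and “substantially” are used to mean at, or close to, the stated value when manufacturing and material tolerances inherent in the stated meaning are presented, and are used to prevent unscrupulous infringers from unfairly exploiting disclosures in which exact or absolute values are stated to aid understanding of the present application. Throughout this specification, the term “step of (doing) ~” or “step of ~” does not mean “step for ~”.
Throughout this specification, a “biomarker” is an indicator with which changes in the body can be detected using proteins, DNA, RNA, metabolites, and the like. More specifically, the term covers all or part of the gene sequences represented by 'SEQ ID NOs: 1 to 882', or the “exon-junction library” represented by 'exon-junction libraries 1 to 441'.
Throughout this specification, an “exon-junction library” means a combination of some of the gene sequences claimed in the present application. The gene sequences claimed in the present application are preferably used two at a time, and the resulting combinations are summarized in Table 2. For example, throughout this specification 'exon-junction library 1' means 'SEQ ID NO: 1 and SEQ ID NO: 2', and 'exon-junction library 441' means 'SEQ ID NO: 881 and SEQ ID NO: 882'.
Preferred examples are presented below to aid understanding of the present invention. The following examples are provided only to make the present invention easier to understand, and the content of the present invention is not limited by them.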
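Experimental methods
1. Isolation of platelets from blood
6 mL of blood was collected in a specimen container containing EDTA, transferred to a 15 mL conical tube, and centrifuged at 120 × g for 20 minutes. 1.3 mL of the upper platelet-rich plasma layer was transferred to an e-tube and centrifuged at 360 × g for 20 minutes, after which the supernatant was removed to obtain the platelet pellet. 30 µL of RNAlater (ThermoFisher) was added to the platelets, the pellet was gently resuspended, and the sample was kept at 4 °C for one day and then stored at -80 °C until further analysis.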
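2. Total RNA isolation and cDNA synthesis
Total RNA was isolated from the RNAlater-treated platelet samples using, for example, the mirVana miRNA Isolation Kit (ThermoFisher). After confirming that the amount of RNA obtained was at least 500 pg, that the RIN value was 6 or higher, and that the 5S/28S/18S peaks were clearly present, cDNA was synthesized for preparation of the RNA sequencing library.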
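3. NGS sequencing
Paired-end sequencing data in FASTQ format were produced on Illumina instruments according to the manufacturer's instructions. Adapter sequences and low-quality bases were removed from the data, and the sequenced reads were mapped to the reference genome to generate a SAM file. The SAM file contains, for each read, the chromosome number and position in the reference genome. Because SAM files are very large, the SAM file was converted to a BAM file for use. In addition, to use only reads mapped accurately to the reference genome, reads that were not primary alignments were removed from the BAM file.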
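The primary-alignment filter can be illustrated on SAM text records using the FLAG field defined in the SAM format (0x4 unmapped, 0x100 secondary, 0x800 supplementary). This is a sketch only: treating supplementary and unmapped records as "not primary" is an assumption, the example records are invented, and in practice the filter would be applied to the BAM file with an alignment toolkit rather than to text lines.

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

def is_primary_mapped(sam_line):
    """True if the record is mapped and is neither secondary nor supplementary."""
    flag = int(sam_line.split("\t")[1])        # FLAG is the second SAM column
    return not flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY)

def filter_primary(lines):
    """Keep header lines and primary mapped alignments; drop everything else."""
    for line in lines:
        if line.startswith("@") or is_primary_mapped(line):
            yield line

# Invented records: header, primary, secondary, supplementary, unmapped.
records = [
    "@HD\tVN:1.6",
    "r1\t99\t1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF",
    "r2\t355\t1\t100\t0\t4M\t=\t200\t104\tACGT\tFFFF",
    "r3\t2147\t1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",
]
kept = list(filter_primary(records))
```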
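4. Exon-junction counting
The exon-junction count was obtained by counting, among the selected reads, those reads that contain at least one contiguous exonic base pair extending from each of the two outermost ends of two different exons within one gene, namely the end of the upstream exon and the start of the downstream exon. Reads that contain intronic sequence, which is not translated, were not counted.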
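One way to picture this counting step is to read spliced gaps from the CIGAR string of each alignment, where the `N` operation marks a skipped reference region. The sketch below assumes that position 1 and position 2 in Table 1 are 1-based genomic coordinates of the last exonic base before the gap and the first exonic base after it; that interpretation, the function names, and the example reads are assumptions, not statements from the specification. The junction used is No. 122 (CD53, 110894401 and 110894960) from Table 1.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def junctions_in_read(pos, cigar):
    """Yield (position 1, position 2) for each spliced gap in one alignment.
    pos is the 1-based leftmost reference coordinate of the alignment."""
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":                       # skipped region, i.e. a spliced intron
            yield (ref - 1, ref + length)   # flanking exonic bases
            ref += length
        elif op in "MD=X":                  # operations that consume the reference
            ref += length
        # I, S, H and P do not advance the reference coordinate

def count_junctions(alignments, library):
    """Count reads supporting each junction of the library."""
    counts = Counter()
    for pos, cigar in alignments:
        for junction in junctions_in_read(pos, cigar):
            if junction in library:
                counts[junction] += 1
    return counts

library = {(110894401, 110894960)}          # exon-junction No. 122 (CD53)
alignments = [
    (110894382, "20M558N30M"),              # spliced across the junction
    (110894392, "10M558N40M"),              # spliced across the junction
    (110894380, "50M"),                     # runs into the intron: not counted
]
counts = count_junctions(alignments, library)
```

A read aligned contiguously past position 1 carries no `N` operation, so it is ignored, which mirrors the rule that reads containing intronic sequence are not counted.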
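Example 1. Biomarker selection
The present invention used public platelet transcriptome data (GSE68086). The full set (283 samples) was divided at a ratio of 6:4 into a training dataset (175 samples) and a validation dataset (108 samples). Only the training dataset was used for biomarker selection and for training the cancer discrimination model, and the performance of the trained model was confirmed on the validation dataset.
To discover biomarkers with diagnostic power, the training dataset was normalized to CPM (counts per million mapped reads) values for the exon-junctions; exon-junctions whose log2CPM value was 0 in all samples were excluded, and exon-junctions on the Y chromosome were excluded so that sex would not introduce differences. After batch-invariant normalization, exon-junctions with zero variance across all training samples were excluded, and the Mann-Whitney test was performed on the remaining exon-junctions. The 441 exon-junctions that satisfied the thresholds of a False Discovery Rate (FDR) of 0.05 or less and an absolute log2FoldChange of 1.4 or more were designated as exon-junction biomarkers. FIG. 1 is a flow chart of the process for selecting the 441 exon-junction biomarkers.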
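The first filtering steps can be sketched as follows. The text does not give the exact transform, so `log2(CPM + 1)` is assumed here because it maps a zero count to a log2CPM of 0; the junction names, counts, and mapped-read totals are hypothetical, and the batch-invariant normalization step is not reproduced.

```python
import math

def log2_cpm(junction_counts, mapped_reads):
    """Convert one sample's raw junction counts to log2(CPM + 1) (assumed transform)."""
    return {j: math.log2(c / mapped_reads * 1_000_000 + 1)
            for j, c in junction_counts.items()}

def keep_junction(junction, chromosome, matrix):
    """Drop Y-chromosome junctions and junctions that are 0 in every sample."""
    if chromosome[junction] == "Y":
        return False
    return any(sample[junction] != 0 for sample in matrix)

# Hypothetical raw counts and mapped-read totals for two samples.
samples = [
    ({"J1": 10, "J2": 0, "J3": 5}, 1_000_000),
    ({"J1": 40, "J2": 0, "J3": 5}, 2_000_000),
]
chromosome = {"J1": "1", "J2": "5", "J3": "Y"}

matrix = [log2_cpm(counts, total) for counts, total in samples]
kept = [j for j in chromosome if keep_junction(j, chromosome, matrix)]
```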
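In this way, the present study derived an exon-junction library corresponding to 441 exon-junctions. The library consists of 882 exon-junction points: the 3' junction points of the 441 upstream exons (position 1) and the 5' junction points of the 441 downstream exons (position 2).
FIG. 2 shows an example of the characteristics of the library of 441 exon-junctions derived as above. With log2FoldChange on the x-axis and the Mann-Whitney test FDR on the y-axis, it shows the degree to which the 441 exon-junctions differ significantly between cancer and normal samples. The thresholds used for biomarker selection, a False Discovery Rate (FDR) of 0.05 and log2FoldChange values of 1.4 and -1.4, are drawn as dotted lines.
As a result, the 441 exon-junctions shown differed significantly between cancer samples and normal samples. Each point represents one of the exon-junctions used in the library discovery analysis. Exon-junctions with an FDR of 0.05 or less and a log2FoldChange of 1.4 or more, whose expression is higher in cancer samples than in normal samples (up-regulated), are shown in red; exon-junctions with an FDR of 0.05 or less and a log2FoldChange of -1.4 or less, whose expression is lower (down-regulated), are shown in blue. A larger absolute log2FoldChange means a larger change in expression between cancer and normal samples, so points further in the positive x direction have higher expression in cancer samples than in normal samples, and points further in the negative x direction have lower expression in cancer samples. Because the y value of the graph increases as the FDR decreases, exon-junctions whose expression differs more significantly and more strongly lie toward the upper right or upper left of the plot. The gene information for the library of 441 exon-junctions is given in Table 1 above; 156 of them are up-regulated and 285 are down-regulated.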
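The thresholds drawn in the plot can be expressed as a small classification rule. The sketch below adjusts p-values with the Benjamini-Hochberg procedure, which is an assumption since the text reports FDR values without naming the adjustment method, and then labels each junction using the stated cut-offs of FDR 0.05 and log2FoldChange ±1.4. The p-values and fold changes are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(log2_fold_change, fdr, fc_cut=1.4, fdr_cut=0.05):
    """Label a junction as up-regulated, down-regulated, or not selected."""
    if fdr <= fdr_cut and log2_fold_change >= fc_cut:
        return "up"
    if fdr <= fdr_cut and log2_fold_change <= -fc_cut:
        return "down"
    return "ns"

# Hypothetical test results for four junctions.
p_values = [0.001, 0.02, 0.04, 0.5]
fold_changes = [2.0, -1.8, 3.0, 0.1]
fdrs = benjamini_hochberg(p_values)
labels = [classify(fc, q) for fc, q in zip(fold_changes, fdrs)]
```

The third junction has a large fold change but an adjusted value just above 0.05, so it is not selected, which shows that both thresholds must hold together.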
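In the present invention, SEQ ID NOs: 1 to 882 are defined as follows. Each odd-numbered SEQ ID NO represents the 150 bases that include position 1 of the corresponding exon-junction site in Table 1 and extend in the 5' direction. For example, SEQ ID NO: 1 represents the 150 bases that include the base at position 1 of exon-junction No. 1 in Table 1 and extend in the 5' direction, and SEQ ID NO: 881 represents the 150 bases that include the base at position 1 of exon-junction No. 441 and extend in the 5' direction. Each even-numbered SEQ ID NO represents the 150 bases that include position 2 of the corresponding exon-junction site in Table 1 and extend in the 3' direction. For example, SEQ ID NO: 2 represents the 150 bases that include the base at position 2 of exon-junction No. 1 in Table 1 and extend in the 3' direction, and SEQ ID NO: 882 represents the 150 bases that include the base at position 2 of exon-junction No. 441 and extend in the 3' direction. In other words, the 3'-terminal base of the 150 bases in each odd-numbered SEQ ID NO is the base at position 1 in Table 1, and the 5'-terminal base of the 150 bases in each even-numbered SEQ ID NO is the base at position 2 in Table 1.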
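The numbering rule above means that exon-junction No. n corresponds to SEQ ID NOs 2n-1 and 2n. A short sketch of the mapping in both directions follows; the function names are hypothetical.

```python
def seq_ids_for_junction(n):
    """Return (odd SEQ ID NO for position 1, even SEQ ID NO for position 2)."""
    if not 1 <= n <= 441:
        raise ValueError("exon-junction number must be between 1 and 441")
    return 2 * n - 1, 2 * n

def junction_for_seq_id(seq_id):
    """Return (exon-junction number, which position the sequence contains)."""
    if not 1 <= seq_id <= 882:
        raise ValueError("SEQ ID NO must be between 1 and 882")
    side = "position 1" if seq_id % 2 == 1 else "position 2"
    return (seq_id + 1) // 2, side

first = seq_ids_for_junction(1)
last = seq_ids_for_junction(441)
```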
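In the present invention, the exon-junction biomarker for cancer diagnosis may be a nucleotide sequence that necessarily includes the 3'-terminal base of the odd-numbered SEQ ID NO (i.e., the base at position 1 in Table 1) and the 5'-terminal base of the even-numbered SEQ ID NO (i.e., the base at position 2 in Table 1), and that additionally includes one or more contiguous bases in the 5' direction of the odd-numbered SEQ ID NO from position 1 and/or in the 3' direction of the even-numbered SEQ ID NO from position 2.
In one specific embodiment of the present invention, the exon-junction biomarker for cancer diagnosis may be a nucleotide sequence that necessarily includes the 3'-terminal base of the odd-numbered SEQ ID NO (i.e., the base at position 1 in Table 1) and the 5'-terminal base of the even-numbered SEQ ID NO (i.e., the base at position 2 in Table 1), and that additionally includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, or 298 contiguous bases in the 5' direction of the odd-numbered SEQ ID NO from position 1 and/or in the 3' direction of the even-numbered SEQ ID NO from position 2.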
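The construction of such a marker can be sketched by joining the tail of the odd-numbered sequence to the head of the even-numbered sequence. Concatenating the two flanks into one spliced sequence is an interpretation made for this example, and the short toy sequences and the function name are hypothetical. With 150-base flanks, at most 149 extra bases are available on each side, so the largest possible number of additional bases is 149 + 149 = 298, which agrees with the upper bound listed above.

```python
def junction_marker(odd_seq, even_seq, extra_5p, extra_3p):
    """Build a marker containing the position 1 and position 2 bases plus
    extra_5p bases upstream of position 1 and extra_3p bases downstream of position 2.

    odd_seq ends at position 1; even_seq starts at position 2.
    """
    if extra_5p < 0 or extra_3p < 0 or extra_5p + extra_3p < 1:
        raise ValueError("at least one additional base is required")
    if extra_5p > len(odd_seq) - 1 or extra_3p > len(even_seq) - 1:
        raise ValueError("not enough flanking sequence")
    upstream = odd_seq[len(odd_seq) - 1 - extra_5p:]   # ends with the position 1 base
    downstream = even_seq[:1 + extra_3p]               # starts with the position 2 base
    return upstream + downstream

# Toy flanks: 'G' is the position 1 base, 'T' is the position 2 base.
marker = junction_marker("AAACCG", "TGGCAA", extra_5p=2, extra_3p=1)
longest = junction_marker("A" * 150, "C" * 150, extra_5p=149, extra_3p=149)
```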
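Table 2 below shows the nucleotide sequences of SEQ ID NOs: 1 to 882. The exon-junction numbers in Table 2 correspond to the exon-junction numbers in Table 1.
Exon-junction No. | SEQ ID NO | Nucleotide sequence | SEQ ID NO | Nucleotide sequence |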
1 | 1 | GCCGGAGAGCTGGTGCTTGGGGCTCCTGGCGGCTATTATTTCTTAGGTACGTGCCCATCCGTACACCTCCCTCCCTTCTCGCGGCCCAAGGAGACCGCTTTGGGCTTCACACCCGCTGTCCCTCCCGCCCTAGGTCTCCTGGCCCAGGCT | 2 | CCGCCGACTCAAGGCCCCGCCCCTGTCCCCCAGCCCTCCTCCGGGCTCGCGCGCGCCTCCCTTCACCCCTGCGCTGACCCCTCCTCCTTGTCTCCTGCAGGCTGGGACAAGCGTTACTGTGAAGCGGGCTTCAGCTCCGTGGTCACTCAG |
2 | 3 | TTGATCCTGCTATTGTCATCTCTCCCAGTGGGAAGTACAATGCTGTCAAGCTTGGTAAATATGAAGATTCAAATTCAGTGACATGTTCAGTTCAACACGACAATAAAACTGTGCACTCCACTGACTTTGAAGTGAAGACAGATTCTACAG | 4 | ATCACGTAAAACCAAAGGAAACTGAAAACACAAAGCAACCTTCAAAGAGCTGCCATAAACCCAAAGGTTAGTTCAAATCAAAGGGCCAACTTCAGAATCAAGGGTTAAAGCAAACTCTGTAATTGTCCACTGGGGCCAAAATGTATCAGA |
3 | 5 | TAATAACCAAGGTTCTAAGCAAAGTTCTGAAAAGAAAACTTTTTGTAGTAAATATGCTAGCATAGACAAGTTCCTTGTGTTTTCCAACAGGTTTGCTTCAAAATCAATCCTTACAGCTTCAAGAACAGGAGAAACTCTTAACAAAGAAAG | 6 | ATCAGGCTTTGCCCGTGTGGAGTCCAAAGTCCTTCCCTAACGAAGTGGAGCCTGAGGGTACAGGGAAGGAGAAAGACTGGGATCTCAGAGACCAGCTGCAAAAGAAGACTTTGCAGCTCCAGGCCAAGGAAAAGGAGGTGAGAGGGTGAC |
4 | 7 | CCATTCAGTGGTTCCACAATGGGAATCTCATTCCCACCCACACGCAGCCCAGCTACAGGTTCAAGGCCAACAACAATGACAGCGGGGAGTACACGTGCCAGACTGGCCAGACCAGCCTCAGCGACCCTGTGCATCTGACTGTGCTTTCCG | 8 | AATGGCTGGTGCTCCAGACCCCTCACCTGGAGTTCCAGGAGGGAGAAACCATCATGCTGAGGTGCCACAGCTGGAAGGACAAGCCTCTGGTCAAGGTCACATTCTTCCAGAATGGAAAATCCCAGAAATTCTCCCATTTGGATCCCACCT |
5 | 9 | GTCACCACGGATCTGCGTCAGCGCTGTACCGATGGCCACACTGGGACCTCAGTCTCTGCCCCCATGGTGGCGGGCATCATCGCCTTGGCTCTAGAAGCAAAGTAAGTTCCCACTTACCTTTTTCTAAAAAAAAAAAATGTTTAGATTGTG | 10 | TACTGCTCGTGCGATGGCTACACCAACAGCATCTACACCATCTCCGTCAGCAGCGCCACCGAGAATGGCTACAAGCCCTGGTACCTGGAAGAGTGTGCCTCCACCCTGGCCACCACCTACAGCAGTGGGGCCTTTTATGAGCGAAAAATC |
6 | 11 | GGCATGGAAAGGCTGTAGTTGTTCACCTGCCCAAGAACTAGGAGGTCTGGGGTGGGAGAGTCAGCCTGCTCTGGATGCTGAAAGAATGTCTGTTTTTCCTTTTAGAAAGTTCCTGTGATGTCAAGCTGGTCGAGAAAAGCTTTGAAACAG | 12 | ATACGAACCTAAACTTTCAAAACCTGTCAGTGATTGGGTTCCGAATCCTCCTCCTGAAAGTGGCCGGGTTTAATCTGCTCATGACGCTGCGGCTGTGGTCCAGCTGAGGTGAGGGGCCTTGAAGCTGGGAGTGGGGTTTAGGGACGCGGG |
7 | 13 | TATTACATAACATTCTTATTCTTTTAACAGGTCCAGGAGAGCCTCAGATTATTTTTTGTAGAAGTGAAGCTGCACATCAAGGAGTAATTACCTGGAATCCCCCTCAAAGATCATTTCATAATTTTACCCTCTGTTATATAAAAGAGACAG | 14 | AAAAAGATTGCCTCAATCTGGATAAAAACCTGATCAAATATGATTTGCAAAATTTAAAACCTTATACGAAATATGTTTTATCATTACATGCCTACATCATTGCAAAAGTGCAACGTAATGGAAGTGCTGCAATGTGTCATTTCACAACTA |
8 | 15 | CCCTTCTTGTGGCTCCCAACCTGGGGCAGCCCCCTGCCTCCCTTTAGATGGGCAATCGGCTTAGAAAGTGGAGGGGAAGCCAGTGTGGATCTACTCACAGAATGTTCTTTTGGTTTCCAGCCAGGATTGCTACAGTTGTGATTGGAGGAG | 16 | TTGTGGCCATGGCGGCTGTGCCCATGGTGCTCAGTGCCATGGGCTTCACTGCGGCGGGAATCGCCTCGTCCTCCATAGCAGCCAAGATGATGTCCGCGGCGGCCATTGCCAATGGGGGTGGAGTTGCCTCGGGCAGCCTTGTGGCTACTC |
9 | 17 | AGGAGGTGCTGGGCTGTCAGAGGAAGCTGGTCTGGGCCTGGGAGTCTGTGCCAACTGCAAATCTGACTTTACTTTTAATTGCCTATGAAAATAAGGTCTCTCATTTATTTTCCTCTCCCTGCTTTCTTTCAGACTGTGGCTTTACCTCGG | 18 | TGTCCTACCAGCAAGGGGTCCTGTCTGCCACCATCCTCTATGAGATCCTGCTAGGGAAGGCCACCCTGTATGCTGTGCTGGTCAGCGCCCTTGTGTTGATGGCCATGGTAAGCAGGAGGGCAGGATGGGGCCAGCAGGCTGGAGGTGACA |
10 | 19 | CGCCCTCTCTCTTTCACACGTCACTGGCGTGACCGTCCGCGCTACATACTGCGCCTGCGCAAGGGCTGTGGCCCTTTTCCCACCCCCTAGCGCCGCTGGGCCTGCAGGTCTCTGTCGAGCAGCGGACGCCGGTCTCTGTTCCGCAGGATG | 20 | GGGTTTGTTAAAGTTGTTAAGAATAAGGCCTACTTTAAGAGATACCAAGTGAAATTTAGAAGACGACGAGGTACTGTCACCTTTTTGTGTTTACAATATTAATCTGCTTTGCAGATGCAGTGGAGTATCCTTTCTACAATTATTTTTTTC |
11 | 21 | GTATTGCAGCTCGGGATGTGGCAGGTGGGCTGCGGTCACTGGCCCAGGCCGCTAGGGGAGTCGCTGCACTGACGTCAGATCCTGCAGTGCAGGCCATTGTACTTGATACGGCCAGTGATGTGCTGGACAAGGCCAGCAGCCTCATTGAGG | 22 | CTCACCTTCTCCTTTCTCAAGCCCAATTCTTCCCCCTTCATCCTTAGATGGAGAAGTGTACCCAGGACCTGGGCAACAGCACCAAAGCCGTGAGCTCAGCCATCGCCCAGCTACTGGGAGAGGTTGCCCAGGGCAATGAGAATTATGCAG |
12 | 23 | GTATCTGTGGGAGCTAATCCTGTCCAGGTGGAAGTAGGAGAATTTGATGATGGTGCAGAGGAAACCGAAGAGGAGGTGGTGGCGGAAAGTATGTCCCTTCCCTGTAACTTGGCACATCCAAGCTGCCCTTGGCTGCCTGGGCCTGGGGCA | 24 | GGAGGGACCACGGGGTGGAGGGGAGATAGACCCAGCCCAGAGCTCTGAGTGGTTTCCTGTTGCCTGTCTCTAAACCCCTCCACATTCCCGCGGTCCTTCAGACTGCCCGGAGAGCGCGCTCTGCCTGCCGCCTGCCTGCCTGCCACTGAG |
13 | 25 | GATTCCTGAGCTGAAATGCAGATGACCACATTCAAGGAAGAACTTTCTGCCCCGGCTTTGCAGGATGAAAAGCTTTCCTGCTTGGCAGTTATTCTTCCACAAGAGAGGGCTTTCTCAGGACCTGGTTGCTACTGGTTCGGCAACTGCAGA | 26 | CCCTGGTTGGTGTGGGTTGTGGTGTTAGAGAAATCTCAGGTGGGAGATCTGGGGCTGGGACATTGTGTTGGAGGACAGATTTGCTTCAATAACTTTTAAGTGTATATCTTTTCCTCTTTTTCCCAGGACACTCTGGACTTCAGCCAACAG |
14 | 27 | TTTGCTGCACAAGAAAAACAAATGGAAGTTTGTGAAGTATGTGGAGCCTTTTTAATAGTAGGAGATGCCCAGTCCCGGGTAGATGACCATTTGATGGGAAAACAACACATGGGCTATGCCAAAATTAAAGCTACTGTAGAAGAATTAAAA | 28 | GAAAAGTTAAGGAAAAGAACCGAAGAACCTGATCGTGATGAGCGTCTAAAAAAGGAGAAGCAAGAAAGAGAAGAAAGAGAAAAAGAACGGGAGAGAGAAAGGGAAGAAAGAGAAAGGAAAAGACGAAGGGAAGAGGAAGAAAGAGAAAAA |
15 | 29 | GCTCTCAGCCCACCCACCTGGAAGCGCCCATGTGTCACCTTACCCAGGAGAGGGCGGCAGAGGCTGCGAGCAGGACGGCAGCCCCCTCTCCCCACCCCCAGGACCCTGAGATCCTGCTTCACGGGCTGCAAGAAGTTGGGGGGCCAGGAT | 30 | CTGGCAGCGAGCAGACCCCTGCCGGACACTCAGCAAACGGCAGCCTCACCCCGCAGGGCCGCGCCACTCCCCTTCCCCACCCCACCGCCGCGTCCCGGCTCAGCGCTCCCCGGGGAACGCAGGGGGACCGGGCTCGCTGCGTGACCTTGG |
16 | 31 | TGAGGGTCTCGGCCACCTTCTGGCAGAACCCCCGCAACCACTTCCGCTGTCAAGTCCAGTTCTACGGGCTCTCGGAGAATGACGAGTGGACCCAGGATAGGGCCAAACCCGTCACCCAGATCGTCAGCGCCGAGGCCTGGGGTAGAGCAG | 32 | ACTGTGGCTTTACCTCGGGTAAGTAAGCCCTTCCTTTTCCTCTCCCTCTCTCATGGTTCTTGACCTAGAACCAAGGCATGAAGAACTCACAGACACTGGAGGGTGGAGGGTGGGAGAGACCAGAGCTACCTGTGCACAGGTACCCACCTG |
17 | 33 | GTGGAGTCCAACTTGCCTGGACCAGCTTAATGGTTCTGGTAAGTATTAATGAAAACAGTAGATAGACTTAATGAAAATGCTGATGGTGATATGCTTACTGCTGAGCTAATGGCTTAAGGCTTGGCTGATGAATACTGACTGTATTTTCCT | 34 | AGATGTACTATCTGTCTGATGTATCTGGGGTAGTTGTGGTTTGCTGTTAATGGTTAAGCAGTGTACCACCAATCTACCATTAAAATATTTTTTGCTGACAATTTTGTATTAAAATTACAGGCATTAGACAGAAAGCTGGAAGTTGAAATG |
18 | 35 | CCTGAAGCCACCCTCTTCACTCTTTTTCAGCCATAGTTCATACCGAGAAGGTGAACATGATGTCCCTCACAGTGCTTGGGCTACGAATGCTGTTTGCAAAGACTGTTGCCGTCAATTTTCTCTTGACTGCCAAGTTATTTTTCTTGTAAG | 36 | GCTGACTGGCATGAGGAAGCTACACTCCTGAAGAAACCAAAGGCTTACAAAAATGCATCTCCTTGGCTTCTGACTTCTTTGTGATTCAAGTTGACCTGTCATAGCCTTGTTAAAATGGCTGCTAGCCAAACCACTTTTTCTTCAAAGACA |
19 | 37 | TTCGTTGGCGGGTGCCTGGGCTGGTGGGAACAGCCGCCCGAAGGAAGCACCATGATTTCGGCCGCGCAGTTGTTGGATGAGTTAATGGGCCGGGACCGAAACCTAGCCCCGGACGAGAAGCGCAGCAACGTGCGGTGGGACCACGAGAGC | 38 | GTTTGTAAATATTATCTCTGTGGTTTTTGTCCTGCGGAATTGTTCACAAATACACGTTCTGATCTTGGTAAGTGAATTTTCTGTGTAACTTTTATCAAATTTATGATATTTAAAATGTTGAATAGGAGTGGTGAAAGGAAAAAAACTGAT |
20 | 39 | ATTACTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCTCTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAGGTGCTAGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAA | 40 | TGTTGCTTACACTTTCTTCTGACATAACAGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGAGGAGAAGACTGCTGTCAATGCCCTGTGGGGCAAAGTGAACGTGGATGCAGTTGGTGGTGAGGCCCTGGGCAG |
21 | 41 | GCTCAGGAAGCATGTGGACCTTTGGAGATGGATTCTGCACTGAGTGTGGTACAGAATCTAGAGAAAGATCTACAGGAAGTGAAGGCAGCAGCTCGAGATGGCAAGCTTAAACCCTTACCTGGGGAGACAGTAAGTATGTTTAAGACCTCA | 42 | CCTATTCCCCAACAGCCAGGTGGGAAGATGGTGGCAGCTGCAAAGGCCTCAGTGCCAACGATTCAGGACCAGGCTTCAGCCATGCAGCTGAGTCAGTGTGCCAAGAACCTGGGCACCGCGCTGGCTGAACTCCGGACGGCTGCCCAGAAG |
22 | 43 | ACCTGTGCACAGGTACCCACCTGTCCTTCCTCCGTGCCAACAGTGTCCTACCAGCAAGGGGTCCTGTCTGCCACCATCCTCTATGAGATCCTGCTAGGGAAGGCCACCCTGTATGCTGTGCTGGTCAGCGCCCTTGTGTTGATGGCCATG | 44 | GTCAAGAGAAAGGATTTCTGAAGGCAGCCCTGGAAGTGGAGTTAGGAGCTTCTAACCCGTCATGGTTTCAATACACATTCTTCTTTTGCCAGCGCTTCTGAAGAGCTGCTCTCACCTCTCTGCATCCCAATAGATATCCCCCTATGTGCA |
23 | 45 | GCTCAAGGAAAAACATGGCCTGCTATTGCAGAATACCAGCGTGCATTGCAGGAGAACGTCGCTATGGAACCTGCATCTACCAGGGAAGACTCTGGGCATTCTGCTGCTGAGCTTGCAGAAAAAGAAAAATGAGCTCAAAATTTGCTTTGA | 46 | CCATTCTCCTGGTGGCCCTGCAGGCCCAGGCTGAGCCACTCCAGGCAAGAGCTGATGAGGTTGCTGCAGCCCCGGAGCAGATTGCAGCGGACATCCCAGAAGTGGTTGTTTCCCTTGCATGGGACGAAAGCTTGGCTCCAAAGCATCCAG |
24 | 47 | GCATTAGACAGAAAGCTGGAAGTTGAAATGGTAAGTGAAACTGTATCCAAGTAAGCAGGTAACTGGGCAAACTTCCTACGGCACAAATGGCTTTTTAGTTACCTCCTAGTGCTGAATGCATTAAATAAATGGCGGATTCTTGTCTTGTTA | 48 | CTAGAATGATGAGGATCTTAACCACCATTATCTTAACTGAGGCACCCAAAATGGTGAGTTGGGGAACATAGAGAGTACACCTAAGTTCACATGAAGTTGTTTCTTCCCAGGTCCTAAAGAGCAAGCCTAACTCAAGCCATTGGCACACAG |
25 | 49 | GCCCTGCATGGGGGGGCATGACCTCTGACCTGTCCCCTGCCTCCAGGTGCCATCCACTTAGAATTCCAGGCCAGTGGGAATCACTACGTGTGGAGGAAGAGCACCTCAACTGTTCACAACATCATCGTGGGCAAGCTCTGGATCGACCAG | 50 | TCAGGGGACATCGAGATTGTGAACCATAAGACCAATGACCGGTGCCAGCTGAAGTTCCTGCCCTACAGCTACTTCTCCAAAGAGGCAGCCCGGAAGGTAAGCAGGACCAGCCACCTCTAAGCACCCCAGGGGGCCCAGGGCAGAGTCTGC |
26 | 51 | GACCTCTGGGGGCCACGCTGAGGTAGGTGGGACCCACCCTGGTGGCAGGGGCCAGGGGTGATGGCACCCCCTCACGGCCCTTCTCTTTGCAGCACGAAGGCAAACCCTACTGCAACCACCCCTGCTACGCAGCCATGTTTGGGCCTAAAG | 52 | GCTTTGGGCGGGGCGGAGCCGAGAGCCACACTTTCAAGTAAACCAGGTAGGTAGGACCCCACCCCCTATCCTGCCTCCTGGTTCCACCCTCGGGATGGGGATGCCCCCTCCCAGGGAGGCCTGACCACTCGTGGGCCCCAAAGGAGGCCG |
27 | 53 | GAGCACGGTCTGAATCTGCACAGAGCAAGATGCTGAGTGGAGTCGGGGGCTTTGTGCTGGGCCTGCTCTTCCTTGGGGCCGGGCTGTTCATCTACTTCAGGAATCAGAAAGGTGAGGAGCCTTTGGGAGCTGGCTCTCTCCATAGGCTTT | 54 | AGGCTGGGATGGTGTCCACAGGCCTGATCCAGAATGGAGACTGGACCTTCCAGACCCTGGTGATGCTGGAAACAGTTCCTCGAAGTGGAGAGGTTTACACCTGCCAAGTGGAGCACCCAAGCGTGACAAGCCCTCTCACAGTGGAATGGA |
28 | 55 | AAGTGACAGTGATGACTTTGGTGATGTTCTCCCCAGTGCAGAGAACTGCATTCAGAATTAGACAACCTCAGTGACGAGTATCTCTCCTGCCTGCGTAAGCTGCAGCACTGTCGAGAAGAGCTGAACCAGAGCCAGCAGCTGCCTCCCAGA | 56 | AGGCAATGTGGGCGATGGCTCCCAGTGCTGATGGTGGTGATTGCTGCAGCACTGGCAGTGTTCCTGGCCAATAAAGACAACCTGATGATCTGAATAATTTGTGACAACTGCCTTGGGTGAAAATCAGAAGCAAGCAACTCAGCGAAAAAC |
29 | 57 | AGATGACATACAAAAAGGGCAGGACCTGAGAAAGATTAAGCTGCAGGCTCCCTGCCCATAAAACAGGGTGTGAAAGGCATCTCAGCGGCTGCCCCACCATGGCTACCTGGGCCCTCCTGCTCCTTGCAGCCATGCTCCTGGGCAACCCAG | 58 | GTCTGGTCTTCTCTCGTCTGAGCCCTGAGTACTACGACCTGGCAAGAGCCCACCTGCGTGATGAGGAGAAATCCTGCCCGTGCCTGGCCCAGGAGGGCCCCCAGGTACGTGTTGGCTCTCTGCTCACCTGCCACAGTCCCTCTCCTTTCC |
30 | 59 | GCATCTTGGTCCGATACTCTGAGAGAAGTCAATATCACCATCATAGACAGAAAAGTCTGCAATGATCGAAATCACTATAATTTTAACCCTGTGATTGGAATGAATATGGTTTGTGCTGGAAGCCTCCGAGGTGGAAGAGACTCGTGCAAT | 60 | GGAGATTCTGGAAGCCCTTTGTTGTGCGAGGGTGTTTTCCGAGGGGTCACTTCCTTTGGCCTTGAAAATAAATGCGGAGACCCTCGTGGGCCTGGTGTCTATATTCTTCTCTCAAAGAAACACCTCAACTGGATAATTATGACTATCAAG |
31 | 61 | CATTAATGGGCCCATAAATGTTGTGTTTAGGTGGAAGAAGAGCCCGAAGAAGAACCTGAAGAGACAGCAGAAGACACAACAGAAGACACAGAGCAAGACGAAGATGAAGAAATGGATGTGGGAACAGATGAAGAAGAAGAAACAGCAAAG | 62 | GAATCTACAGCTGAAAAAGATGAATTGTAAATTATACTCTCACCATTTGGATCCTGTGTGGAGAGGGAATGTGAAATTTACATCATTTCTTTTTGGGAGAGACTTGTTTTGGATGCCCCCTAATCCCCTTCTCCCCTGCACTGTAAAATG |
32 | 63 | TTATGTCACGCATCTGATGAAGCGAATTCAGAGAGGCCCAGTAAGAGGTATCTCCATCAAGCTGCAGGAGGAGGAGAGAGAAAGGAGAGACAATTATGTTCCTGAGGTAAACTTTCTGGATATTTGGGCTTCTGGCTAATCCTCAAATGA | 64 | CCGCGTTCGCACCAAAACCGTGAAGAAGGCGGCCCGGGTCATCATAGAAAAGTACTACACGCGCCTGGGCAACGACTTCCACACGAACAAGCGCGTGTGCGAGGAGATCGCCATTATCCCCAGCAAAAAGCTCCGCAACAAGATAGCAGG |
33 | 65 | GACATGCAAGCCCATAACCGCTGTGGCCTCTTGGTTTTACAGATACGAACCTAAACTTTCAAAACCTGTCAGTGATTGGGTTCCGAATCCTCCTCCTGAAAGTGGCCGGGTTTAATCTGCTCATGACGCTGCGGCTGTGGTCCAGCTGAG | 66 | ATCTGCAAGATTGTAAGACAGCCTGTGCTCCCTCGCTCCTTCCTCTGCATTGCCCCTCTTCTCCCTCTCCAAACAGAGGGAACTCTCCTACCCCCAAGGAGGTGAAAGCTGCTACCACCTCTGTGCCCCCCCGGCAATGCCACCAACTGG |
34 | 67 | CCCCTCCGAGATCGAGATGATGCTCAGTACAGCCACCTTGGAGGAAACTGGGCTCGGAACAAGTGAACCTGAGACTGGTGGCTTCTAGAAGCAGCCATTACCAACTGTACCTTCCCTTCTTGCTCAGCCAATAAATATATCCTCTTTCAC | 68 | AGAACTTGTGTGTTCATATCACTCATGCAGACTTCTGAGGGTGTGGGAGGGTGGATCTCACAGTCCCATCTGCTAGGCCATTGATGTCTCTCTCTGGTTCTTCTAGCTGCCGACACACAAGCTCTGTTGAGGAATGACCAGGTCTATCAG |
35 | 69 | TCCAGAATGGAAAATCCCAGAAATTCTCCCATTTGGATCCCACCTTCTCCATCCCACAAGCAAACCACAGTCACAGTGGTGATTACCACTGCACAGGAAACATAGGCTACACGCTGTTCTCATCCAAGCCTGTGACCATCACTGTCCAAG | 70 | TGCCCAGCATGGGCAGCTCTTCACCAATGGGGATCATTGTGGCTGTGGTCATTGCGACTGCTGTAGCAGCCATTGTTGCTGCTGTAGTGGCCTTGATCTACTGCAGGAAAAAGCGGATTTCAGGTTTGTAGCTCCTCCCAGTCCCTTTTG |
36 | 71 | GGCAAGATTCTTGCCAAGAGAATTAATGTGCGTATTGAGCACATTAAGCACTCTAAGAGCCGAGATAGCTTCCTGAAACGTGTGAAGGAAAATGATCAGAAAAAGAAAGAAGCCAAAGAGAAAGGTACCTGGGTTCAACTAAAGCGCCAG | 72 | CCTGCTCCACCCAGAGAAGCACACTTTGTGAGAACCAATGGGAAGGAGCCTGAGCTGCTGGAACCTATTCCCTATGAATTCATGGCATAATAGGTGTTAAAAAAAAAAATAAAGGACCTCTGGGCTACAAAAATGTTTCTCTTCATTGAG |
37 | 73 | TTTAGATAGCATTAAGAATCTATATGTAAAAGTATGCTTTTTATTTCTTAGCCTCTTCTGGAAAACAAACTTAAAGCATTCAGTATTGGAAAAATGAGTACAGCTAAGCGAACTTTAAGTAAAAAGGAACAGGAAGAATTAAAGAAAAAG | 74 | GAGGATGAAAAGGCAGCTGCTGAGATTTATGAGGAGTTTCTTGCTGCTTTTGAAGGAAGTGATGGTAATAAAGTGAAAACATTTGTGCGAGGGGGTGTTGTTAATGCAGCTAAAGGTAAGTTTATAAAGTATAACTGCTAATAAAGCATA |
38 | 75 | TTTTGTGTGTGTGGCTCCACCCAGCCTGAGCTTCCTGACTGAGAGGTTTTGGTGGCGGTACAGACACTTTTAACTCACAGTAAAAGCAAAAGCAACAGCTCAAGCAGCCTCCTTGGAGAAAACCTGAAAATTCAACTTGTTCAAGAGAAG | 76 | GTCTTGTACGTGCCTAAGTTCTAGAGCCTCCTGACGTGAGCATGGCTGAGAGTGAGGACCGCTCCCTGAGGATCGTTCTGGTAGGGAAAACTGGAAGTGGGAAAAGTGCAACAGCGAACACCATCCTTGGAGAGGAAATCTTTGATTCTA |
39 | 77 | GTGCCCTGAGCCTGGGTGCTCACTGTGGCGGTCCCCGTCCTGGCTATGAAACCTTGTGAGCAGAAGGCAAGAGCGGCAAGATGAGTTTTGAGCGTTGTATTCCAAAGGCCTCATCTGGAGCCTCGGGAAAGTCTGGTCCCACATCTGCCC | 78 | TGTCCTTGTCCCCAGCCATCTCCATGGTGGTGGCCATGGCGGTGTACACCAGCGAGCGGTGGGACCAGCCTCCACACCCCCAGATCCAGACCTTCTTCTCCTGGTCCTTCTACCTGGGCTGGGTCTCAGCTATCCTCTTGCTCTGTACAG |
40 | 79 | ACTAACAATAATTGAAATGCAGAAGGGAGACTGTGCACTCTATGCCTCGAGCTTTAAAGGCTATATAGAAAACTGTTCAACTCCAAATACGTACATCTGCATGCAAAGGACTGTGTAAAGATGATCAACCATCTCAATAAAAGCCAGGAA | 80 | TTATCAACACTGATTTTACTTATAGTTTATTTGTGGTTTCAAACAGGATTTACTTAAACTGGTGAAGTCATATCATTGGATGGGACTAGTACACATTCCAACAAATGGATCTTGGCAGTGGGAAGATGGCTCCATTCTCTCACCCAACCT |
41 | 81 | GTACAAGAACTCGGATAATGATAAAGTCCAGAAGTGCAGCCACTATCTATTCTCTGAAGAAATCACTTCTGGCTGTCAGTTGCAAAAAAAGGAGATCCACCTCTACCAAACATTTGTTGTTCAGCTCCAGGACCCACGGGAACCCAGGAG | 82 | CTTCCTGACCACTATGCCCACTGACTCCCTCAGTGTTTCCACTCTGCCCCTCCCAGAGGTTCAGTGTTTTGTGTTCAATGTCGAGTACATGAATTGCACTTGGAACAGCAGCTCTGAGCCCCAGCCTACCAACCTCACTCTGCATTATTG |
42 | 83 | CTTAGAAATTAGAGGTGATGCTAAAGAAAACAGCTGTATTTCCATCTCACAGACATCTGTGTATTCTGAGTACTGTAGTACAGAAATCAGATGGATCTGCCAAAAAGAACTAACACCTGTGAGAAATAAAGTGTATCCTGACTCTTGACT | 84 | AAGCAGATTCAATATTTTTCTGTTCCATTTATAGATACACACACAGAACCTGATACGTGACAAAGCAATTCTGTTTTGGATTGGATTAAATTTTTCATTATCAGAAAAGAACTGGAAGTGGATAAACGGCTCTTTTTTAAATTCTAATGA |
43 | 85 | TGACGTTAGGTACAGATTGAGGGCATCTGTAACAGCTGAAATGTTCCCAAACAGGTAAAGCTCCAGATGGCTCTGGAACTTATGAGGAAAGAGTTGGAGGACGCCTTGACTCAGGAGGCCAACGTGGGGAAAAAGACTGTCATTTGGAAG | 86 | GAGAAAGTGGAAATGCAGAGGCAGCGCTTCAGATTGGAGTTTGAGAAGCATCGTGGCTTTCTGGCCCAGGAGGAGCAACGGCAGCTGAGGCGGCTGGAGGCGGAGGAGCGAGCGACGCTGCAGAGACTGCGGGAGAGCAAGAGCCGGCTG |
44 | 87 | ATGGAGAAGTGTACCCAGGACCTGGGCAACAGCACCAAAGCCGTGAGCTCAGCCATCGCCCAGCTACTGGGAGAGGTTGCCCAGGGCAATGAGAATTATGCAGGTATGTGGGCAGAGAGCCAGGCATGGGGCATATTGTGAGGGAGGTAG | 88 | AAAACCCCTTTCTTATCATAGGCTCAGGAAGCATGTGGACCTTTGGAGATGGATTCTGCACTGAGTGTGGTACAGAATCTAGAGAAAGATCTACAGGAAGTGAAGGCAGCAGCTCGAGATGGCAAGCTTAAACCCTTACCTGGGGAGACA |
45 | 89 | GTCTCCTGGCCCAGGCTCCAGTTGCGGATATTTTCTCGAGTTACCGCCCAGGCATCCTTTTGTGGCACGTGTCCTCCCAGAGCCTCTCCTTTGACTCCAGCAACCCAGAGTACTTCGACGGCTACTGGGGTAACACCGCCATTCCAGACT | 90 | AGCGGGCTTCAGCTCCGTGGTCACTCAGGCGAGTAGGGAGCAAAAGCGCAGTGGGGGCGGCTCCCAAACAGGGCCCCCTCTCACCCTCAGGACTTCCCTTCCAGGCCGGAGAGCTGGTGCTTGGGGCTCCTGGCGGCTATTATTTCTTAG |
46 | 91 | CCATCTCCATGGTGGTGGCCATGGCGGTGTACACCAGCGAGCGGTGGGACCAGCCTCCACACCCCCAGATCCAGACCTTCTTCTCCTGGTCCTTCTACCTGGGCTGGGTCTCAGCTATCCTCTTGCTCTGTACAGGTGACTATCCTGCCC | 92 | CAGGCTACATCCACGTGACGCAGACCTTCAGCATTATGGCTGTTCTGTGGGCCCTGGTGTCCGTGAGCTTCCTGGTCCTGTCCTGCTTCCCCTCACTGTTCCCCCCAGGCCACGGCCCGCTTGTCTCAACCACCGCAGCCTTTGCTGCAG |
47 | 93 | GTTTTACTTTTCCTGAAGATGGATGCATCTGGACCCTCAGATAGTGATATGCCAAGTCGGACACGACCTAAGAGCCCAAGAAAACATAATTATAGGAATGAAAGTGCCCGTGAAAGCCTTTGTGATTCTCCTCATCAGAATCTCTCAAGA | 94 | CCTCTTCTGGAAAACAAACTTAAAGCATTCAGTATTGGAAAAATGAGTACAGCTAAGCGAACTTTAAGTAAAAAGGAACAGGAAGAATTAAAGAAAAAGGTAATGTTGAAAATGTATTTTGAATTATCCTTGGAAATGAATGTGTCTAAG |
48 | 95 | TCCAACCTAAGGTGACTGTATATCCTTCAAAGACCCAGCCCCTGCAGCACCACAACCTCCTGGTCTGCTCTGTGAGTGGTTTCTATCCAGGCAGCATTGAAGTCAGGTGGTTCCTGAACGGCCAGGAAGAGAAGGCTGGGATGGTGTCCA | 96 | GGGAGTTCCGGGCGGTGACGGAGCTGGGGCGGCCTGACGCTGAGTACTGGAACAGCCAGAAGGACATCCTGGAGCAGGCGCGGGCCGCGGTGGACACCTACTGCAGACACAACTACGGGGTTGTGGAGAGCTTCACAGTGCAGCGGCGAG |
49 | 97 | GCAGTGGCAGAGCAGATTCCACTGCTGGTGCAGGGCGTCCGAGGAAGCCAAGCCCAGCCTGACAGCCCCAGCGCTCAGCTTGCCCTCATTGCTGCCAGCCAGAGCTTCCTGCAGGCAAGGCACCCCCTCTGCACTTCTCTGACCTGACCT | 98 | ATGGGCTTGGTCTGACTACTCTTGTCTTCACAGCATGCAGCCAAGCAGGCTGCAGCCTCAGCCACACAGACCATCGCTGCAGCTCAGCACGCAGCCTCTACCCCCAAGGCCTCTGCCGGCCCCCAGCCCCTGCTGGTGCAGAGCTGCAAG |
50 | 99 | AAAGTCTCCTCCAGTTTTAAAAGCCTACAATCCTGTGAGCCTCTTCATTCCCAATGTAACCCTGACCACTGCTGTTTGTTCCAGATCACGTAAAACCAAAGGAAACTGAAAACACAAAGCAACCTTCAAAGAGCTGCCATAAACCCAAAG | 100 | CCATAGTTCATACCGAGAAGGTGAACATGATGTCCCTCACAGTGCTTGGGCTACGAATGCTGTTTGCAAAGACTGTTGCCGTCAATTTTCTCTTGACTGCCAAGTTATTTTTCTTGTAAGGTAAGAATTAGCCGCTTCTTATTCCTATCT |
51 | 101 | AGGGGGAGGTGAGCGCCGACGAGGAGGGCTTTGAGAACCTGTGGGCCACCGCCTCCACCTTCATCGTCCTCTTCCTCCTGAGCCTCTTCTACAGTACCACCGTCACCTTGTTCAAGGTAGCACGGCTGTGGCACAGGGAGGAGGGTGCAG | 102 | CTGAGCCCCAGGCCCCAGGCCGGTACTTCGCCCACAGCATCCTGACCGTGTCCGAAGAGGAATGGAACACGGGGGAGACCTACACCTGCGTGGTGGCCCATGAGGCCCTGCCCAACAGGGTCACCGAGAGGACCGTGGACAAGTCCACCG |
52 | 103 | GTATTTGAAATATCTCACCAAAAAATATTTGAAGAAGAATAATCTACGTGACTGGTTGCGCGTAGTTGCTAACAGCAAAGAGAGTTACGAATTACGTTACTTCCAGATTAACCAGGACGAAGAAGAGGAGGAAGACGAGGATTAAATTTC | 104 | TTGGTCAATTTAATGATTTCTACAGGAGCAGTTTTTGCAAGAAAGGATCAAAGTGAACGGAAAAGCTGGGAACCTTGGTGGAGGGGTGGTGACCATCGAAAGGAGCAAGAGCAAGATCACCGTGACATCCGAGGTGCCTTTCTCCAAAAG |
53 | 105 | AATATGTCGTCGGTGCCCCCACTTGGAGCTGGACCCTGGGAGCGGTAAGTGCCCCCACCACTGGGCCTCCCGAAGCCCCTTATCCCAGTTCTCAGGCTGACAACTCCTGAGCGCCCCCCACCCCCGCCCCGCCTCCACCAAACCACCCTT | 106 | TGCAGGGCTGGGGCTGAGTGGCCTTAATCTCCTCCTTCTTTGCCCTCCGTCCCCTCTGTGCTTCCTCCCCTGGAAAAGACTAATTTGCGCCCTTGTCCTCAGGGTACTCGGTGGCCGTGGGCGAGTTCGACGGGGATCTCAACACTACAG |
54 | 107 | GTGACCCCAGCCATGAGGACCCTCGCCATCCTTGCTGCCATTCTCCTGGTGGCCCTGCAGGCCCAGGCTGAGCCACTCCAGGCAAGAGCTGATGAGGTTGCTGCAGCCCCGGAGCAGATTGCAGCGGACATCCCAGAAGTGGTTGTTTCC | 108 | GAAAGTAACCCCGGAAATTAGGACACCTCATCCCAAAAGACCTTTAAATAGGGGAAGTCCACTTGTGCACGGCTGCTCCTTGCTATAGAAGACCTGGGACAGAGGACTGCTGTCTGCCCTCTCTGGTCACCCTGCCTAGCTAGAGGATCT |
55 | 109 | CGGCTGTGCCCATGGTGCTCAGTGCCATGGGCTTCACTGCGGCGGGAATCGCCTCGTCCTCCATAGCAGCCAAGATGATGTCCGCGGCGGCCATTGCCAATGGGGGTGGAGTTGCCTCGGGCAGCCTTGTGGCTACTCTGCAGTCACTGG | 110 | GAGCAACTGGACTCTCCGGATTGACCAAGTTCATCCTGGGCTCCATTGGGTCTGCCATTGCGGCTGTCATTGCGAGGTTCTACTAGCTCCCTGCCCCTCGCCCTGCAGAGAAGAGAACCATGCCAGGGGAGAAGGCACCCAGCCATCCTG |
56 | 111 | ATACACACACAGAACCTGATACGTGACAAAGCAATTCTGTTTTGGATTGGATTAAATTTTTCATTATCAGAAAAGAACTGGAAGTGGATAAACGGCTCTTTTTTAAATTCTAATGAGTGAGTATTAGATGAGCTAACTTTAATATTCAAT | 112 | CCGGGTCTCTTAAACTGCCCAATATATTGGCAGCAACTCCGAGAGAAATGCTTGTTATTTTCTCACACTGTCAACCCTTGGAATAACAGTCTAGCTGATTGTTCCACCAAAGAATCCAGCCTGCTGCTTATTCGAGATAAGGATGAATTG |
57 | 113 | CCCGGGGCGCGGCGATGCGCGCGGCACGGCGAGGACCTGAGCCGCTTCTGCGAGGAGGACGAGGCGGCGCTGTGCTGGGTGTGCGACGCCGGCCCCGAGCACAGGACGCACCGCACGGCGCCGCTGCAGGAGGCCGCCGGCAGCTACCAG | 114 | GTAAAGCTCCAGATGGCTCTGGAACTTATGAGGAAAGAGTTGGAGGACGCCTTGACTCAGGAGGCCAACGTGGGGAAAAAGACTGTCATTTGGAAGGTAAGACCATGTTGGGGCTTTAGGAGGCTTGCCTGTTTGAAGGATCCAGATTCG |
58 | 115 | GCTCAAGGAAAAACATGGCCTGCTATTGCAGAATACCAGCGTGCATTGCAGGAGAACGTCGCTATGGAACCTGCATCTACCAGGGAAGACTCTGGGCATTCTGCTGCTGAGCTTGCAGAAAAAGAAAAATGAGCTCAAAATTTGCTTTGA | 116 | CCATTCTCCTGGTGGCCCTGCAGGCCCAGGCTGAGCCACTCCAGGCAAGAGCTGATGAGGTTGCTGCAGCCCCGGAGCAGATTGCAGCGGACATCCCAGAAGTGGTTGTTTCCCTTGCATGGGACGAAAGCTTGGCTCCAAAGCATCCAG |
59 | 117 | TTCCTCCCCAGGAGACTTGGAAGATGCAGAACTGGATGACTACTCATTCTCATGCTATAGCCAGTTGGAAGTGAATGGATCGCAGCACTCACTGACCTGTGCTTTTGAGGACCCAGATGTCAACATCACCAATCTGGAATTTGAAATATG | 118 | TGGGGCCCTCGTGGAGGTAAAGTGCCTGAATTTCAGGAAACTACAAGAGATATATTTCATCGAGACAAAGAAATTCTTACTGATTGGAAAGAGCAATATATGTGTGAAGGTTGGAGAAAAGAGTCTAACCTGCAAAAAAATAGACCTAAC |
60 | 119 | TCGTGGAGGTAAAGTGCCTGAATTTCAGGAAACTACAAGAGATATATTTCATCGAGACAAAGAAATTCTTACTGATTGGAAAGAGCAATATATGTGTGAAGGTTGGAGAAAAGAGTCTAACCTGCAAAAAAATAGACCTAACCACTATAG | 120 | TTAAACCTGAGGCTCCTTTTGACCTGAGTGTCGTCTATCGGGAAGGAGCCAATGACTTTGTGGTGACATTTAATACATCACACTTGCAAAAGAAGTATGTAAAAGTTTTAATGCACGATGTAGCTTACCGCCAGGAAAAGGATGAAAACA |
61 | 121 | ATTTCTTCCTGACCACTATGCCCACTGACTCCCTCAGTGTTTCCACTCTGCCCCTCCCAGAGGTTCAGTGTTTTGTGTTCAATGTCGAGTACATGAATTGCACTTGGAACAGCAGCTCTGAGCCCCAGCCTACCAACCTCACTCTGCATT | 122 | GACAGACTACACCCAGGGAATGAAGAGCAAGCGCCATGTTGAAGCCATCATTACCATTCACATCCCTCTTATTCCTGCAGCTGCCCCTGCTGGGAGTGGGGCTGAACACGACAATTCTGACGCCCAATGGGAATGAAGACACCACAGCTG |
62 | 123 | GGTGCTCTGGGAGGTGCCTTCCCGCCGCCCCCTCCCCCGATCGAGGAATCATTTCCCCCTGCGCCTCTGGAGGAGGAGATCTTCCCTTCCCCGCCGCCTCCTCCGGAGGAGGAGGGAGGGCCTGAGGCCCCCATACCGCCCCCACCACAG | 124 | CCCAGGGAGAAGGTGAGCAGTATTGATTTGGAGATCGACTCTCTGTCCTCACTGCTGGATGACATGACCAAGAATGATCCTTTCAAAGCCCGGGTAAGGGACCGGAGAGTAGGAAAAGCAGGGCTCAGGGCCAGAGAGACTGGGCATAGA |
63 | 125 | AAGGGGGAAATTATTTTTCCTGAATCTGCTGTGATCCAAGAAATCGTTGTTTCTTTCAGAGCATCCCGCGGGTGTTCAGCAAGTTTCCTATAAAGGAAGCTCGAAAGCCCTTTAACCAGAATAAAAACCGTTATGTTGACATTCTTCCTT | 126 | ATGATTATAACCGTGTTGAACTCTCTGAGATAAACGGAGATGCAGGGTCAAACTACATAAATGCCAGCTATATTGATGTGAGTAAAAATTTGCATTTTTCTTATACCTACATATTTCATTCAGCTCCTTGTTTGTCTTGGTAAAATTTTA |
64 | 127 | CGCCGGGCTCTGGCGGCCTGACCGGGCCTGGGGTCCGAGCGTGCCCCCGGGCCTGGGGGGGTCGCCGCGATGGACTCGCTGGCAGCGCCCCAGGACCGCCTGGTGGAGCAGCTGCTGTCGCCGCGGACCCAGGCCCAGAGGCGGCTCAAG | 128 | GACATTGACAAGCAGTACGTGGGCTTCGCCACACTGCCCAACCAGGTGCACCGCAAGTCGGTGAAGAAAGGCTTTGACTTCACACTCATGGTGGCTGGTGAGTGGGCCAGGCTCCTCGGGGGAGTGGCTGGGGTCACTGGCCAGCCAAGC |
65 | 129 | GGAGCAGCTGCCCACCCTGACAGTGAGGAGCAGCAGCAGCGGCTGCGGGAGGCAGCTGAGGGGCTGCGCATGGCCACCAATGCAGCTGCGCAGAATGCCATCAAGAAAAAGCTGGTGCAGCGCCTGGAGGTGAGGCTGGGAGTTTCACCA | 130 | GCCCGCATCCTGGCCCAAGCCACATCTGACCTGGTCAATGCCATCAAGGCTGATGCTGAGGGGGAAAGTGATCTGGAGAACTCCCGCAAGCTCTTAAGTGCTGCCAAGATCCTAGCTGATGCCACAGCCAAGATGGTAGAGGCTGCCAAG |
66 | 131 | TATGCCACCACGGGCTGTTCCCTGACCCTGCACCATACGGAGAAACCAGAACATGAAGACATATGTGAATACCGTCCCTACTCCTGCCCATGTCCTGGTGCTTCCTGCAAGTGGCAGGGGTCCCTGGAAGCTGTGATGTCCCATCTCATG | 132 | CCTATTCTGCAGTGCCAGGCCGGGCACCTGGTGTGTAACCAATGCCGCCAGAAGTTGAGCTGCTGCCCGACGTGCAGGGGCGCCCTGACGCCCAGCATCAGGAACCTGGCTATGGAGAAGGTGGCCTCGGCAGTCCTGTTTCCCTGTAAG |
67 | 133 | AATTCTGTAGCAAAGCCAATACAAAAATCAGCTAAAGCTGCCACAGAAGAGGCATCTTCAAGATCACCAAAAATAGATCAGAAAAAAAGTCCATATGGACTGTGGATACCTATCTAAAAGAAGAAAACTGATGGCTAAGTTTGCATGAAA | 134 | TTTAATATAGGATTTAGAAACCAAGGGTATGTGTTTTAAAATTACACTTTTTCTTAACCTGTCTAGCTGTCGGAAAAGGTAACAGAAGATGGAACTCGAAATCCCAATGAAAAACCTACCCAGCAAAGAAGCATAGCTTTTAGCTCTAAT |
68 | 135 | CCGTTCCCAGAGGGCGCCGCTCTGCAAATTACCCAATCAGCTCTAAGTACAAAGCATCGCGAGTCTTTAGTGCTCTTTGGCGCTATAAGCCCGTGGGAACGAGCATTGGAGACCCTTTTCACAAGATGGCGCCGAAAGCGAAGAAGGAAG | 136 | CTCCTGCCCCTCCTAAAGCTGAAGCCAAAGCGAAGGCTTTAAAGGCCAAGAAGGCAGTGTTGAAAGGTGTCCACAGCCACAAAAAGAAGAAGATCCGCACGTCACCCACCTTCCGGCGGCCGAAGACACTGCGACTCCGGAGACAGCCCA |
69 | 137 | GGTACTCGGTGGCCGTGGGCGAGTTCGACGGGGATCTCAACACTACAGGCAAGAAATCCACTTAGGGCGGGAGTTGGGTAGCCCAGCCCGGGGAGGAGCGCCTTCCTGAAATCTCCCCTATGTAGGGAAATCTTCCTGCACACACATTTT | 138 | CCGCTGTCCCTCCCGCCCTAGGTCTCCTGGCCCAGGCTCCAGTTGCGGATATTTTCTCGAGTTACCGCCCAGGCATCCTTTTGTGGCACGTGTCCTCCCAGAGCCTCTCCTTTGACTCCAGCAACCCAGAGTACTTCGACGGCTACTGGG |
70 | 139 | GATAAAGTGAATCCTTTTTCTTTTTAAAAAGAAAAATAACTCTTTTTTTTTGGCAAGAAAAGGTTGCTAATAATCACAGATAATTTATACAATTATATTTTTTCCCCCAGGTCCGTGTGAAAAAATTCATGATGAAAATCTACGAAAACA | 140 | GTATGAGAAGAGCTCTCGTTTCATGAAAGTTGGCTATGAGAGAGATTTTTTGCGATACTTACAGAGCTTACTTGCAGAAGTAGAACGTAGGATCAGACGAGGCCATGCTCGTTTGGCATTATCTCAAAACCAGCAGTCTTCTGGGGTAAG |
71 | 141 | GCTGCTTTTTTAGATCAGGCTTTGCCCGTGTGGAGTCCAAAGTCCTTCCCTAACGAAGTGGAGCCTGAGGGTACAGGGAAGGAGAAAGACTGGGATCTCAGAGACCAGCTGCAAAAGAAGACTTTGCAGCTCCAGGCCAAGGAAAAGGAG | 142 | TGCAGAGAACTGCATTCAGAATTAGACAACCTCAGTGACGAGTATCTCTCCTGCCTGCGTAAGCTGCAGCACTGTCGAGAAGAGCTGAACCAGAGCCAGCAGCTGCCTCCCAGAGTAAGAGGGTCTCTCCTTCCCATAAAGCCCTGGATG |
72 | 143 | GCTCTGGCTGGGGACCAGCCCTCGGTGCAGCCCCCTCTACGGTCTCAGCAGCTGGCCCCACAGTACACCTACGCCCAGGGCGGCCAGCAGACTTGGGTACGGCCTGGCCAGCTAGGGACACTGGGGCTAGCCAGCTGGGTGTTCTGTGAG | 144 | ACGCCTGATGGCTCAGAGGTGGATGTGGACGTGGTGGAGAATGAGGACGGCACTTTCGACATCTTCTACACGGCCCCCCAGCCGGGCAAATACGTCATCTGTGTGCGCTTTGGTGGCGAGCACGTGCCCAACAGCCCCTTCCAAGTGACG |
73 | 145 | GGTCCAGTAGGCGTCAATGTCACTTATGGAGGGGATCCCATCCCTAAGAGCCCTTTCTCAGTGGCAGTATCTCCAAGCCTGGACCTCAGCAAGATCAAGGTGTCTGGCCTGGGAGAGAGTAAGTAGTTGGGGCCCTTGTCGCAAAGGCCT | 146 | ACCCACTTCACAGTAAATGCCAAAGCTGCTGGCAAAGGCAAGCTGGACGTCCAGTTCTCAGGACTCACCAAGGGGGATGCAGTGCGAGATGTGGACATCATCGACCACCATGACAACACCTACACAGTCAAGTACACGCCTGTCCAGCAG |
74 | 147 | GCTGCGTCAGGTGGCTGGCCGGCGCGGCGCTCCCCTGCTCTCTGGCTCCGGGCTGCGGCGCGGCGGCTGGAGCGAGCCCCTGTCCCGGCGCGGGGCGGCGGCGGGCGGCCGGCAGGCGCTGCCTTGCGTGTGAGTGCACCTCACTCACAT | 148 | GTGCTGGAGAATCTGGTAAAAGCACCATTGTGAAGCAGATGAGGATCCTGCATGTTAATGGGTTTAATGGAGAGTAAGTGTCAAATCTGTGCAGGGGGGCACCAAGTAAGAGGAACAGACTTTATACTAACCTTTAGGAAGTATAGGTGG |
75 | 149 | TCTTTATTTCAGGTAATATGATATTTGATAATAAAGAAATTAAATTAGAAAACCTTGAACCCGAACATGAGTATAAGTGTGACTCAGAAATACTCTATAATAACCACAAGTTTACTAACGCAAGTAAAATTATTAAAACAGATTTTGGGA | 150 | GTCCAGGAGAGCCTCAGATTATTTTTTGTAGAAGTGAAGCTGCACATCAAGGAGTAATTACCTGGAATCCCCCTCAAAGATCATTTCATAATTTTACCCTCTGTTATATAAAAGAGACAGGTAATTTGTGTAGAATTTAATTTCATCAGA |
76 | 151 | GTGACCCCAGCCATGAGGACCCTCGCCATCCTTGCTGCCATTCTCCTGGTGGCCCTGCAGGCCCAGGCTGAGCCACTCCAGGCAAGAGCTGATGAGGTTGCTGCAGCCCCGGAGCAGATTGCAGCGGACATCCCAGAAGTGGTTGTTTCC | 152 | GAAAGTAACCCCGGAAATTAGGACACCTCATCCCAAAAGACCTTTAAATAGGGGAAGTCCACTTGTGCACGGCTGCTCCTTGCTATAGAAGACCTGGGACAGAGGACTGCTGTCTGCCCTCTCTGGTCACCCTGCCTAGCTAGAGGATCT |
77 | 153 | CCTACCCCCACTGCATGGCTACTGAATGCTCACCACAATCTATTCTTGCTTTCCAGGGGAGATGGATCCTATCTTACTAACCATCAGCATTTTGAGTTTTTTCTCTGTCGCTCTGTTGGTCATCTTGGCCTGTGTGTTATGGAAAAAAAG | 154 | GATTAAGCCTATCGTATGGCCCAGTCTCCCCGATCATAAGAAGACTCTGGAACATCTTTGTAAGAAACCAAGAAAAGTGAGTGTTTTTGGTGCTTAAAAAGTGTTGTGTTGGCAACATCCCAGTGGCCAAGAATGATATTCCAGGACAAG |
78 | 155 | TTTGAATATGTTTGCAAAATATACGATGTACTCGCAACTAATCTATTGTTTTCTTCTTCGTAGATCCATTACAAGATTTTGGCTTTTCTGTTGAAAAGTGTTCCAAGCAATTAAAATCAAATATCAACATTAGATTTGGAATTATTCTGA | 156 | GAGAGGACATCAAAGAGCTTTTTCTTGACCTAGCTCTCATGTCTCAAGGCTCATCTGTTTTGAATTTCTCCTATCCCATCTGTGAGGCGGCTCTGCCCAAGTTTTCTTTCTGTGGAAGAAGGAAAGGAGGTAAGCCATCTGTCTTGCTCA |
79 | 157 | CTTCCTCCTAGCACTGGGACATTTCAAGAAGCTCAGAGCCGGTTGAATGAAGCTGCTGCTGGGCTGAATCAGGCAGCCACAGAACTGGTGCAGGCCTCTCGGGGAACCCCTCAGGACCTGGCTCGAGCCTCAGGCCGATTTGGACAGGAC | 158 | TGGGCAACCCTGTCTCCTTTCTCACCCCAGGTGGCTAAAGCAGTGACCCAGGCTCTGAACCGCTGTGTCAGCTGCCTACCTGGCCAGCGCGATGTGGATAATGCCCTGAGGGCAGTTGGAGATGCCAGCAAGCGACTCCTGAGTGACTCG |
80 | 159 | GAGCTGGCACGGGCGAGGTCGAGGTTGTGATCCAGGACCCCATGGGACAGAAGGGCACGGTAGAGCCTCAGCTGGAGGCCCGGGGCGACAGCACATACCGCTGCAGCTACCAGCCCACCATGGAGGGCGTCCACACCGTGCACGTCACGT | 160 | TTGCTGGCCAGCACATCGCCAAGAGCCCCTTCGAGGTGTACGTGGATAAGTCACAGGGTGACGCCAGCAAAGTGACAGCCCAAGGTCCCGGCCTGGAGCCCAGTGGCAACATCGCCAACAAGACCACCTACTTTGAGATCTTTACGGCAG |
81 | 161 | GTGTCGAGCTTGGCAAGCCCACCCACTTCACAGTAAATGCCAAAGCTGCTGGCAAAGGCAAGCTGGACGTCCAGTTCTCAGGACTCACCAAGGGGGATGCAGTGCGAGATGTGGACATCATCGACCACCATGACAACACCTACACAGTCA | 162 | GATGTCATTGCGGATGACGTCTGCCCTCCTCTAAGGCCTTCTCCTCCCACTGCCTGCAGGCCACGCCCACCAGCCCCATCCGAGTCAAGGTGGAGCCCTCTCATGACGCCAGTAAGGTGAAGGCCGAGGGCCCTGGCCTCAGTCGCACTG |
82 | 163 | CAGTTGGAGGAGAAAGGTCTGGGGGCCTCCCCCTGGGGCAACTTGGGCCAGCAACTCTTGCTTCTGCCCACAGGGAGTCTAGTGGATTTTCTCAAGACCCCTTCAGGCATCAAGTTGACCATCAACAAACTCCTGGACATGGCAGCCCAA | 164 | ATTGCAGAAGGCATGGCATTCATTGAAGAGCGGAATTATATTCATCGTGACCTTCGGGCTGCCAACATTCTGGTGTCTGACACCCTGAGCTGCAAGATTGCAGACTTTGGCCTAGCACGCCTCATTGAGGACAACGAGTACACAGCCAGG |
83 | 165 | ACATTGCAGTGGCTGCCCCCTACGGGGGTCCCAGTGGCCGGGGCCAAGTGCTGGTGTTCCTGGGTCAGAGTGAGGGGCTGAGGTCACGTCCCTCCCAGGTCCTGGACAGCCCCTTCCCCACAGGCTCTGCCTTTGGCTTCTCCCTTCGAG | 166 | CCGAAGTGGGGCGTGTGTATTTGTTCCTGCAGCCGCGAGGCCCCCACGCGCTGGGTGCCCCCAGCCTCCTGCTGACTGGCACACAGCTCTATGGGCGATTCGGCTCTGCCATCGCACCCCTGGGCGACCTCGACCGGGATGGCTACAATG |
84 | 167 | TGGGTAGGCGCCGCGTCCTGCAGCGTCTCACCGGGGCCTGTCTGTGCCTCTGCAGCCGAGAGGGTGACCTCTCTGGGCAAGGACTGGCATCGGCCCTGCCTGAAGTGCGAGAAATGTGGGAAGACGCTGACCTCTGGGGGCCACGCTGAG | 168 | CACGAAGGCAAACCCTACTGCAACCACCCCTGCTACGCAGCCATGTTTGGGCCTAAAGGTATGCTCCCGTCATCCCCACCCCACCCCACCCCACAGCCTCCTCCACCCCAGCCTGTTGACTTTTTCCACCTTCTCTGCAGGCTTTGGGCG |
85 | 169 | AGCGGCGGCGCCGGGGCAGCTCCGACGCCCTCCCGCGGGGAAGGAGCCCCCGCGGTGCCGCCGAGGCCCCGACGCGGGGCCGCCCCTCGGCTCGCCGCCCCGCGCCCGCGCCCGCTGGGAATGATGAAGAAGAACAATTCCGCCAAGCGG | 170 | GGACCTCAGGATGGAAACCAGCAGCCTGCACCGCCCGAGAAGGTCGGCTGGGTCCGGAAATTCTGCGGGAAAGGGATTTTCAGGGAGATTTGGAAAAACCGCTATGTGGTGCTGAAAGGGGACCAGCTCTACATCTCTGAGAAGGAGGTG |
86 | 171 | CCAGGTGGGAAGATGGTGGCAGCTGCAAAGGCCTCAGTGCCAACGATTCAGGACCAGGCTTCAGCCATGCAGCTGAGTCAGTGTGCCAAGAACCTGGGCACCGCGCTGGCTGAACTCCGGACGGCTGCCCAGAAGGTATGGAAGCTGGTT | 172 | AGACTTGTGGGTCTCTTATGACATTTTCACCTACAGGCAGTGGCAGAGCAGATTCCACTGCTGGTGCAGGGCGTCCGAGGAAGCCAAGCCCAGCCTGACAGCCCCAGCGCTCAGCTTGCCCTCATTGCTGCCAGCCAGAGCTTCCTGCAG |
87 | 173 | TTGGGAAAGGCCTAGAAGCATCTCTAGGACCATTGTTTCTTAGACCTATACTCATAGAATTGCCTCTCTTCTCAGCAAAACCTGGAAATCCACCGGAAGATAAAACAGTCTGAGCAGGAGCTAGCCTATCTGGAAAGGAGAGAACGAGAG | 174 | GGAAAGTTTAAAGGAAGAGGAAATGATCGCAGGGAAAAGCTCCAGTCTTTTGACTCTCCAGAAAGGAAACGGATTAAGTACTCCAGGGAAACTGACAGGTAAGCCAGGAACTCTTCATTCAGCCTAGGCCTCAAGCCTAATGATAAAACC |
88 | 175 | TTGAGCCTAAGGTGACTGTGTATCCTGCAAGGACCCAGACCCTGCAGCACCACAACCTCCTGGTCTGCTCTGTGAATGGTTTCTATCCAGGCAGCATTGAAGTCAGGTGGTTCCGGAACAGCCAGGAAGAGAAGGCTGGGGTGGTGTCCA | 176 | GGGAGTACCGGGCGGTGACGGAGCTGGGGCGGCCTGACGCTGAGTACTGGAACAGCCAGAAGGACTTCCTGGAAGACAGGCGCGCCGCGGTGGACACCTACTGCAGACACAACTACGGGGTTGGTGAGAGCTTCACAGTGCAGCGGCGAG |
89 | 177 | AGAGGCGGATTTGGTCGTGGACGTGGTCAGCCACCTCAGTAAAATTGGAGAGGATTCTTTTGCATTGAATAAACTTACAGCCAAAAAACCTTAATCTTTTGTCCATTTTGTTTGCATTGTGCAGCCTGAACAGGAACAGTTTAAGTGTCA | 178 | TTTTCCTGCAAGATTGGAGTCAGTGAGTTGTCTTCCCTTGTTCTATAGTAGATTAGATCATATGATGATTCTAAATCGATGTTTCACTTTCTAGCTGGTGCCGACAAGAAAGCCGAGGCTGGGGCTGGGTCAGCAACCGAATTCCAGTTT |
90 | 179 | CTCTCTTGTTCTGCAGTTCTGGCAATACGGCGAGTGGGTGGAGGTGGTGGTGGATGACAGGCTGCCCACCAAGGACGGGGAGCTGCTCTTTGTGCATTCAGCCGAAGGGAGCGAGTTCTGGAGCGCCCTGCTGGAGAAGGCATACGCCAA | 180 | GATCAACGGATGCTATGAAGCGCTATCAGGGGGTGCCACCACTGAGGGCTTCGAAGACTTCACCGGAGGCATTGCTGAGTGGTATGAGTTGAAGAAGCCCCCTCCCAACCTGTTCAAGATCATCCAGAAAGCTCTGCAAAAAGGCTCTCT |
91 | 181 | GTGACCCCAGCCATGAGGACCCTCGCCATCCTTGCTGCCATTCTCCTGGTGGCCCTGCAGGCCCAGGCTGAGCCACTCCAGGCAAGAGCTGATGAGGTTGCTGCAGCCCCGGAGCAGATTGCAGCGGACATCCCAGAAGTGGTTGTTTCC | 182 | GAAAGTAACCCCGGAAATTAGGACACCTCATCCCAAAAGACCTTTAAATAGGGGAAGTCCACTTGTGCACGGCTGCTCCTTGCTATAGAAGACCTGGGACAGAGGACTGCTGTCTGCCCTCTCTGGTCACCCTGCCTAGCTAGAGGATCT |
92 | 183 | CTGGGAGTGTCCACTCGCCTTCCACCAGCATGGCAACGTCTTCACAGTACCGCCAGCTGCTCAGTGACTACGGGCCACCGTCCCTAGGCTACACCCAGGTATGTCAATGGGGGTGATGGCATGGTGGGAGGGCCAGGGGGAGACATGCTT | 184 | CGCGGGGCTGCAGCGCTACCGCCCGGCCTCGCCGCCGCCGCCGCCGCCCTCGCGGCCTGGCCCCGCCGCGCCCGGCGCGCCCGCCGCCCGGGGGGATGTCTTACAAACCGAACTTGGCCGCGCACATGCCCGCCGCCGCCCTCAACGCCG |
93 | 185 | GAGGAGTTGGCCGAAGTCGAAGAAGGAGTTGGAGTAGTGGGCGAAGATAATGACGCAGCCGCGAGAGGAGCGGAGGCCTTTGGCGACAGTGAGGAGGACGGAGAGGATGTGTTCGAGGTGGAGAAGATCCTGGACATGAAGACCGAGGGG | 186 | GGTAAAGTTCTTTACAAAGTTCGCTGGAAAGGCTATACATCGGATGATGATACCTGGGAGCCCGAGATTCACCTGGAGGACTGTAAAGAAGTGCTTCTTGAATTTAGGAAGAAAATTGCAGAGAACAAAGCCAAAGCAGTCAGGAAGGAT |
94 | 187 | CTGCATCTCTGCCCACGTCGGAGAGGTGCGTCGGCTTCCGTACAACACGGATACTCTCTCTCTGACGCAACTTCCTGTCCTGCGCAATTCTATTTGACCTTTGAACTGGCAAAGGCTTTTTTCTTCCTCTTCCGGGGACGTTGTCTGCAG | 188 | GCACTCAGAATGGTCCAGCGTTTGACATACCGACGTAGGCTTTCCTACAATACAGCCTCTAACAAAACTAGGCTGTAAGTATTTCTGAAAATTTTAAGTATATATTGTCATTTACTCTACAAAATGCTGACCTACTGACTGTTTCACTTT |
95 | 189 | CCCCAATTCCCTCAGGTGGCAATCTCAGGTCTGCTCTTCTGCTTACCAACAGGGAAAGTTTAAAGGAAGAGGAAATGATCGCAGGGAAAAGCTCCAGTCTTTTGACTCTCCAGAAAGGAAACGGATTAAGTACTCCAGGGAAACTGACAG | 190 | TGATCGTAAACTTGTTGATAAAGAAGATATCGACACTAGCAGCAAAGGAGGCTGTGTCCAACAGGCTACTGGCTGGAGGAAAGGGACAGGCCTGGGATATGGCCATCCTGGATTGGCTTCATCAGAGGAGGTAAAATGGTTTCCATCTTT |
96 | 191 | ATGGCGTCGTATTTTGGGCATTCAGTGGCTGTCACTGACGTCAACGGGGATGGGTGAGGAGGGACATGCCCCACCCCTACCCAGTTGGGTCCCAAATTACCAGAGCTGCCCTCTGTCTCCCTTTCCTAGCCCTAGTCTCACGTATCCACT | 192 | CCCCCGCCCCGCCTCCACCAAACCACCCTTTCTCACCTGGAGTGGGAGGTTGCTTTGGGTACAAGAATGATGCTCTCGCCTGCGCTGTCCGTGCAGGTGGAAATTTTGGATTCCTACTACCAGAGGCTGCATCGGCTGCGCGGAGAGCAG |
97 | 193 | AGAATCCTTTCCTGTTTGCATTGGAAGCCGTGGTTATCTCTGTTGGCTCCATGGGATTGATTATCAGCCTTCTCTGTGTGTATTTCTGGCTGGAACGGTGAGATTTGGAGAAGCCCAGAAAAATGAGGGGAACGGTAGCTGACAATAGCA | 194 | ATTATAGACATAAGTTCTCCTTGCCTAGTGTGGATGGGCAGAAACGCTACACGTTTCGTGTTCGGAGCCGCTTTAACCCACTCTGTGGAAGTGCTCAGCATTGGAGTGAATGGAGCCACCCAATCCACTGGGGGAGCAATACTTCAAAAG |
98 | 195 | CTCCTGGTAACGTTTTTATCCATGGATGACTTGCTTGGGTAAGGACATGAAGACAGTTCCTGTCATACCTTTTAAAGGTACATGTTTTATTGATGTTAACGTTAATTGATTGAGCTACTGTTAGTGATGATTTTAAAATTAAAGCAGATG | 196 | CTGACTGAACATGAAGGTCTTAATTAGCTCTAACTGACTAAAGGCATTTGTTAGTTTTGGCAGGGGGTGAACACTCATCTGTGGCTATTCTAAGACCACTCTTATTTCTTAGGTGGAGTCCAACTTGCCTGGACCAGCTTAATGGTTCTG |
99 | 197 | GGTCTGGCTGAAGTTGAGGATCTCTTACTCTCTAGGCCACGGAATTAACCCGAGCAGGCATGGAGGCCTCTGCTCTCACCTCATCAGCAGTGACCAGTGTGGCCAAAGTGGTCAGGGTGGCCTCTGGCTCTGCCGTAGTTTTGCCCCTGG | 198 | CCAGGATTGCTACAGTTGTGATTGGAGGAGGTGAGTCTGTGGGGAAGGGGCTCAAGTAACCACCTGCCCCTAGGGAGGTGGACTTGGGGAGCAGCTGGCCTTGTCCATGCCAATGTTTCCCTCACATGGGTGGTCAGGGGAGGAGGTGGG |
100 | 199 | TGTGCCAGAGCTGTGTGGAGCTGGATCCAGCCACCGTGGCTGGCATCATTGTCACTGATGTCATTGCCACTCTGCTCCTTGCTTTGGGAGTCTTCTGCTTTGCTGGACATGAGACTGGAAGGCTGTCTGGGGGTTAGTGGAAGAGCAGAG | 200 | CATGGGTAGAGGGAACGGTGGGAACACTGCTCTCAGACATTACAAGACTGGACCTGGGAAAACGCATCCTGGACCCACGAGGAATATATAGGTGTAATGGGACAGATATATACAAGGACAAAGAATCTACCGTGCAAGTTCATTATCGAA |
101 | 201 | CAGCTGGCCGACGTTGCGGAGAAATGGTGCTCCAACACGCCCTTCGAGCTCATCGCCACCGAGGAGACCGAACGCAGGATGGATTTCTACGCCGACCCCGGCGTCTCCTTCTATGTGCTGTGTCCGGACAACGGCTGCGGCGACAATTTT | 202 | CACGTGTGGAGTGAGAGCGAGGACTGCCTGCCTTTCTTGCAGCTAGCACAGGATTACATCTCCTCCTGCGGCAAGAAGACGCTCCACGAAGTCCTGGAAAAAGTCTTCAAGTCTTTCAGACCTGTAGGTGCCTGCTTGGCTTCTCACCAC |
102 | 203 | ACGCCCCGCAGAAGAAGTTCGGCCCTGTGGTGGCCCCAAAGCCCAAAGTGAATCCCTTCCGGCCCGGGGACAGCGAGCCTCCCCCGGCACCCGGGGCCCAGCGCGCACAGATGGGCCGGGTGGGCGAGATTCCCCCGCCGCCCCCGGAAG | 204 | ACTTTCCCCTGCCTCCACCTCCCCTTGCTGGGGATGGCGACGATGCAGAGGGTGCTCTGGGAGGTGCCTTCCCGCCGCCCCCTCCCCCGATCGAGGAATCATTTCCCCCTGCGCCTCTGGAGGAGGAGATCTTCCCTTCCCCGCCGCCTC |
103 | 205 | GGGAGCTGGCACGGGCGGCCTGGGCCTGGCTGTAGAGGGCCCCTCCGAGGCCAAGATGTCCTGCATGGATAACAAGGACGGCAGCTGCTCGGTCGAGTACATCCCTTATGAGGCTGGCACCTACAGCCTCAACGTCACCTATGGTGGCCA | 206 | GGACGTGACCTATGACGGCAGTCCCGTGCCCAGCAGCCCCTTCCAGGTGCCCGTGACCGAGGGCTGCGACCCCTCCCGGGTGCGTGTCCACGGGCCAGGCATCCAAAGTGGCACCACCAACAAGCCCAACAAGTTCACTGTGGAGACCAG |
104 | 207 | GGTTTAACTGACGTTTTCTTTCTGCCCAGCCGAAAGGAAAGAAGGCCAAGGGAAAGAAGGTGGCTCCGGCCCCAGCTGTCGTGAAGAAGCAGGAGGCTAAGAAAGTGGTGAATCCCCTGTTTGAGAAAAGGCCTAAGAATTTTGGCATTG | 208 | GACAGGACATCCAGCCCAAAAGAGACCTCACCCGCTTTGTGAAATGGCCCCGCTATATCAGGTTGCAGCGGCAGAGAGCCATCCTCTATAAGCGGCTGAAAGTGCCTCCTGCGATTAACCAGTTCACCCAGGCCCTGGACCGCCAAACAG |
105 | 209 | GTATTTCTTAGAAAATGATGGGTTTAAATGAAATGGATCCTGTTGACAGTAAATTTTCTTATTCTGTTCTTTAGGAACCGGCGAATATTTGGCTTGTTGATGGGTACCCTTCAAAAATTTAAACAAGAATCCACTGTTGCTACTGAAAGG | 210 | CAAAAGCGGCGCCAGGAAATTGAACAAAAACTTGAAGTTCAGGCAGAAGAAGAGAGAAAGCAGGTTGAAAATGAAAGGAGAGAACTGTTTGAAGAGAGGCGTGCTAAACAGACAGAACTGCGGCTTTTGGAACAGAAAGTTGAGCTTGCG |
106 | 211 | GAACAATCAGTGGATTATAGACATAAGTTCTCCTTGCCTAGTGTGGATGGGCAGAAACGCTACACGTTTCGTGTTCGGAGCCGCTTTAACCCACTCTGTGGAAGTGCTCAGCATTGGAGTGAATGGAGCCACCCAATCCACTGGGGGAGC | 212 | ATATCTCCAGTGATCCCCTGGGCTCCAGAGAACCTAACACTTCACAAACTGAGTGAATCCCAGCTAGAACTGAACTGGAACAACAGATTCTTGAACCACTGTTTGGAGCACTTGGTGCAGTACCGGACTGACTGGGACCACAGCTGGACT |
107 | 213 | ATTTCTGTATCTTCTTGTCAGGGGTTGAAAAGTTTCAGTTACATGATTGTACACAAGTTGAAAAAGCAGATACTACTATTTGTTTAAAATGGAAAAATATTGAAACCTTTACTTGTGATACACAGAATATTACCTACAGATTTCAGTGTG | 214 | GTAATATGATATTTGATAATAAAGAAATTAAATTAGAAAACCTTGAACCCGAACATGAGTATAAGTGTGACTCAGAAATACTCTATAATAACCACAAGTTTACTAACGCAAGTAAAATTATTAAAACAGATTTTGGGAGTGAGTATGTTA |
108 | 215 | CTCCGGTGTCGGCGGGTGGCGCGGCGCCCCCGGAGGGGGCCATATCTAACGGGGTTTACGTACTGCCGAGCGCGGCCAACGGAGACGTGAAGCCCGTGGTGTCCAGCACGCCTTTGGTGGACTTCTTGATGCAGCTGGAAGATTACACGC | 216 | CGGCTCCCGCCGCGCTGCCCTCCAGCACCGCCGCGGAGAACAAGGCCAGCCCCGCGGGGACAGCGGGGGGACCTGGGGCTGGAGCAGCTGCTGGGGGCACGGGACCCTTGGCGGCGCGGGCCGGGGAGCCAGCTGAGCGGCGTGGGGCGG |
109 | 217 | CAGCCAGTTAACCTGGAGGGACGTCCAGCACCTGCTAGTGAAGACATCCCGGCCGGCCCACCTGAAAGCGAGCGACTGGAAAGTGAACGGCGCGGGTCATAAAGGTGCGGCAGTGGCGTTCTGGTGGACCATTGGGTGGCCCTGGAATGT | 218 | AGTTAGCCTGTCTGCCATCACTGCCTCACTGTGCTTCTCTCTCCCCCAGGTCACCACGGATCTGCGTCAGCGCTGTACCGATGGCCACACTGGGACCTCAGTCTCTGCCCCCATGGTGGCGGGCATCATCGCCTTGGCTCTAGAAGCAAA |
110 | 219 | TTCCTTAGAGAGGAAGAAGCTATTCAGTTGGATGGATTAAATGCATCACAAATAAGAGAACTTAGAGAGAAGTCGGAAAAGTTTGCCTTCCAAGCCGAAGTTAACAGAATGATGAAACTTATCATCAATTCATTGTATAAAAATAAAGAG | 220 | ATTTTCCTGAGAGAACTGATTTCAAATGCTTCTGATGCTTTAGATAAGATAAGGCTAATATCACTGACTGATGAAAATGCTCTTTCTGGAAATGAGGAACTAACAGTCAAAATTAAGGTAAGTGTAAGGCAGTTTTTCTTTCTTTTAAAG |
111 | 221 | GCATCCGCATGCTGGACGGCGATGTCACAGATGTGGTCGAGGCAAAGTCGCTGGGCATCAGACCCAACTACATCGACATTTACAGTGCCAGCTGGGGGCCGGACGACGACGGCAAGACGGTGGACGGGCCCGGCCGACTGGCTAAGCAGG | 222 | GCACTGAATTCACTGAAACTTGCTGGGCTGCGTCCTCACTTGGTTTTTTCCTTTGTTTCAGACACGGCACTCGTTGTGCGGGAGAAGTTGCTGCTTCAGCAAACAATTCCTACTGCATCGTGGGCATAGCGTACAATGCCAAAATAGGAG |
112 | 223 | GTGAAATGATCCCAACAGAAGAACATCGGAGACCAGAGAGAGGAACTCAAAGGGGCGCTGCCTCCGGGTCTGGGGTCCTGGCCTGCGTGGCCTGTTGGCACGTGTTTCTCTTCCCCGCCCGGCCTCCAGTTGTGTGCTCTCACACAGGCT | 224 | CTGACCCTATTCCCCCGTGCTGTGTCTCCTGCAGAGGGGGAGGTGAGCGCCGACGAGGAGGGCTTTGAGAACCTGTGGGCCACCGCCTCCACCTTCATCGTCCTCTTCCTCCTGAGCCTCTTCTACAGTACCACCGTCACCTTGTTCAAG |
113 | 225 | GTCCTCTGGATGTCAGCATGGCAGCCACAAACCTGGAGAACCAGCTGCACAGCGCACAGAAGAACCTCCTGTTCCTTCAGCGGGAGCATGCCAGCACGCTCAAGGGGCTGCACTCCGAGATCAGGCGGCTGCAGCAGCACTGCACAGGTA | 226 | GAGGTTCTGTATTTACAATAATAATTTATAAGCAAATAACTCACATTTCATCCTCAGTATTTTTCAGTGCTGTACAAGCGTCTTGAATTACTCTGGTAGCTTTTCCAGAAAGACCCATGACTTCACCACATTTCTCGAGTTACGATGAAG |
114 | 227 | TGCTAGCAATGTCTCCCACACTGTGGTCCTGCGCCCTCTCAAGGCTGGTTATTTCAACTTCACCTCGGCAACAATTACTTACCTGGCCCAGGAGGATGGGCCCGTTGTGGTGAGTTGCCCAAACCCTTAGCTGGATGGAATTTGGATCTG | 228 | TAATTTTATATCCATTACTTACTAACCCTTTTTGTTTCATCCATTTTCTAGTGCTGCATTAGACGTGGAACTATCTGATGATTCCTTCCCTCCAGAAGACTTTGGCATTGTGTCTGGAATGCTCAATGTCAAATGGGACCGGATTGCCCC |
115 | 229 | GGTTTTTAATGACCACAACAAGCAAGCATGCAGCTTACTGCTTGAAAGGTGAGGATTGGAAATGTTGGGACTATTATAATTGCAGAATACATGATGATCTCAATCCAACTTGAACTCTCTCACTGATTACTTGATGACAATAAAATATCT | 230 | GTTTTGGTGGCATATACACCTTAATCTGTAGATGGGAGTGATTAGCTGTTTAAAAGTTAAAATGTGACTGAGAAGGAAATTGAGTAGGGCAAATTTTAAATGGGTATTATTTTTCATCTTCAAACAGGCAGACCTGTTATCCTAAACTAG |
116 | 231 | TTTCTATTCCGCCTTCCTTGTAGCAGATAAGGTTATTGTCACTTCAAAACACAACAACGATACCCAGCACATCTGGGAGTCTGACTCCAATGAATTTTCTGTAATTGCTGACCCAAGAGGAAACACTCTAGGACGGGGAACGACAATTAC | 232 | CCTTGTCTTAAAAGAAGAAGCATCTGATTACCTTGAATTGGATACAATTAAAAATCTCGTCAAAAAATATTCACAGTTCATAAACTTTCCTATTTATGTATGGAGCAGCAAGGTAAATCTATATTGATTAAAAACTTATATGTATTACCT |
117 | 233 | GACTCAGAATTCATGATTGAAGAAATGCAGGTTAGTTTAAACTTTGAAGGAAATTTTTAAGGTGGCAAAAGGTTTTGGTGGCATATACACCTTAATCTGTAGATGGGAGTGATTAGCTGTTTAAAAGTTAAAATGTGACTGAGAAGGAAA | 234 | AAATGGAGATTAATCTTAAACTGAAACAGTAGTTGGGAAATCTTTTAGAAATCCACCTATTACTACCTATTGGTAAAGGAGATTAAATTTCTACAGGTATGGAGAGTCGGCTTGACTACACTGTGTGGAGCAAGTTTTAAAGAAGCAAAG |
118 | 235 | AGTGGCCATCGTGGTGGGCGCCCCGCGGACCCTGGGCCCCAGCCAGGAGGAGACGGGCGGCGTGTTCCTGTGCCCCTGGAGGGCCGAGGGCGGCCAGTGCCCCTCGCTGCTCTTTGACCTCCGTGAGTCCCAGGCAAGGAGAGCAAGGTT | 236 | TCTGGAGTGGGTGCTGCTGCTCTTGGGACCTTGTGCTGCCCCTCCAGCCTGGGCCTTGAACCTGGACCCAGTGCAGCTCACCTTCTATGCAGGCCCCAATGGCAGCCAGTTTGGATTTTCACTGGACTTCCACAAGGACAGCCATGGGAG |
119 | 237 | CAGAAGGCATGGCATTCATTGAAGAGCGGAATTATATTCATCGTGACCTTCGGGCTGCCAACATTCTGGTGTCTGACACCCTGAGCTGCAAGATTGCAGACTTTGGCCTAGCACGCCTCATTGAGGACAACGAGTACACAGCCAGGGAGG | 238 | GGGCCAAGTTTCCCATTAAGTGGACAGCGCCAGAAGCCATTAACTACGGGACATTCACCATCAAGTCAGATGTGTGGTCTTTTGGGATCCTGCTGACGGAAATTGTCACCCACGGCCGCATCCCTTACCCAGGTTAGAGCCAAGGGCAGG |
120 | 239 | GCATCATGGCCGCCCTCAGACCCCTTGTGAAGCCCAAGATCGTCAAAAAGAGAACCAAGAAGTTCATCCGGCACCAGTCAGACCGATATGTCAAAATTAAGGTATGTGGTCCTGGGATGGAAATGGGTGTGGGGTGAAGAAAAGAGTTTC | 240 | TCTTTATTTTATTTAAAAGAGCCGGAGCCGGAAGTGCTTGCCTTTTTCCCTGCTAGGACCCAGGGGTTACGACCCATCAGCCCTTGCGCGCCACCGTCCCTTCTCTCTTCCTCGGCGCTGCCTACGGAGGTGGCAGCCATCTCCTTCTCG |
121 | 241 | TCTCCAAAAAAAGTTGGTGATGACATTGCCAAGGCAACGGGTGACTGGAAGGGCCTGAGGATTACAGTGAAACTGACCATTCAGAACAGACAGGCCCAGGTATTTGCTTGTGCTTGGTTTCGGGAGAGGAGGGTGGGGGGACAGGTAGCA | 242 | CAACCCCGGAAAGACGCTGAGAGGGCTGTGGCTCGGGGCTCCCTCTGCACAGACACTAACTCTTCTTTTCCCCCAGTATACCTGAGGTGCACCGGAGGTGAAGTCGGTGCCACTTCTGCCCTGGCCCCCAAGATCGGCCCCCTGGGTCTG |
122 | 243 | GATGCTCCCAGAGCTGAGTGGGAGTGGGACGAGAATGGGGATCAGTGCTGTGAGAATGTATCTGCTTTGTCCCAGTTCTTCATCCTGCTGCTGATTATCCTCCTTGCTGAGGTGACCTTGGCCATCCTGCTCTTTGTATATGAACAGAAG | 244 | CTGAATGAGTATGTGGCTAAGGGTCTGACCGACAGCATCCACCGTTACCACTCAGACAATAGCACCAAGGCAGCGTGGGACTCCATCCAGTCATTTGTGAGTACAGGTGGAATCCTCTTCAGATCAGCCCAGACTTCATTTTCAAGCCTA |
123 | 245 | CAAATACTTCCTCTTGATATGGTGGAATTATAGAGTAGTATCATTTGTAACTGAAATGTCTTCTAGGGTTGCTATGCGAAAGCAAGACTGTGGTTTCATTCCAATTTCCTGTATATCGGAATCATCACCATCTGTGTATGTGTGATTGAG | 246 | GTGTTGGGGATGTCCTTTGCACTGACCCTGAACTGCCAGATTGACAAAACCAGCCAGACCATAGGGCTATGATCTGCAGTAGTCCTGTGGTGAAGAGACTTGTTTCATCTCCGGAAATGCAAAACCATTTATAGCATGAAGCCCTACATG |
124 | 247 | CCCCTCTCTCTGCCCTCACAGCCTGCATGATGAATGTGCACAAGCGCTGCGTGATGAATGTTCCCAGCCTGTGTGGCACGGACCACACGGAGCGCCGCGGCCGCATCTACATCCAGGCCCACATCGACAGGGACGTCCTCATTGTCCTCG | 248 | TAAGAGATGCTAAAAACCTTGTACCTATGGACCCCAATGGCCTGTCAGATCCCTACGTAAAACTGAAACTGATTCCCGATCCCAAAAGTGAGAGCAAACAGAAGACCAAAACCATCAAATGCTCCCTCAACCCTGAGTGGAATGAGACAT |
125 | 249 | GATTTATTTCACATAGATGACTATAACAGAGTGCCACTTAAACATGAGCTGGAAATGAGTAAAGAGAGTGAGCATGATTCAGATGAATCCTCTGATGATGACAGTGATTCAGAGGAACCAAGCAAATACATCAATGCATCTTTTATAATG | 250 | AGCTACTGGAAACCTGAAGTGATGATTGCTGCTCAGGGACCACTGAAGGAGACCATTGGTGACTTTTGGCAGATGATCTTCCAAAGAAAAGTCAAAGTTATTGTTATGCTGACAGAACTGAAACATGGAGACCAGGTTTGTACTTTTGAG |
126 | 251 | CTCTTGGGCAATGTGCTGGTGTGTGTGCTGGCCCGCAACTTTGGCAAGGAATTCACCCCACAAATGCAGGCTGCCTATCAGAAGGTGGTGGCTGGTGTGGCTAATGCCCTGGCTCACAAGTACCATTGAGATCCTGGACTGTTTCCTGAT | 252 | ATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAGGTGCTAGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACTTTTTCTCAGCTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGG |
127 | 253 | CACCATATGAAGGCGGAGTATGGAAAGTTAGAGTGGACCTACCTGATAAATACCCTTTCAAATCTCCATCTATAGGTATGTTACTACTTGGTTTTTCTCCTTAGAGAGTTTTGAAATCTAAGGGGGAAAAATCTTACATTTCAGGGGAGG | 254 | ATGCCTGGGGTGTTATTGGTAATTCTGCCAGATACTGTAAAGCCTGTGTAATTTCCTTTACCGTCCACCTTAGCATCGAGAGTAAACATGAGGTTACGATCCTGGGAGGACTTAATGAATTTGTAGTGAAGTTTTATGGACCACAAGGAA |
128 | 255 | GCAGCCATGTTTGGGCCTAAAGGTATGCTCCCGTCATCCCCACCCCACCCCACCCCACAGCCTCCTCCACCCCAGCCTGTTGACTTTTTCCACCTTCTCTGCAGGCTTTGGGCGGGGCGGAGCCGAGAGCCACACTTTCAAGTAAACCAG | 256 | GTGGTGGAGACCCCATCCTTGGCTGCTTGCAGGGCCACTGTCCAGGCAAATGCCAGGCCTTGTCCCCAGATGCCCAGGGCTCCCTTGTTGCCCCTAATGCTCTCAGTAAACCTGAACACTTGGAAAACCTGTGTGTGTACATGCGCGTGT |
129 | 257 | ACAATGGGGTTTGCCATTCTCTATCTGGGTCTCACTGGCACAGACAGTGCTGCAAGATTGGTTCCCTCATGGGAATGAAATGTTTCCCCTCCTTCCTCCGCAGGACAAAACAAGGAGAGGCCACCACCTGTTCCCAACCCAGACTATGAG | 258 | CCCATCCGGAAAGGCCAGCGGGACCTGTATTCTGGCCTGAATCAGAGACGCATCTGACCCTCTGGAGAACACTGCCTCCCGCTGGCCCAGGTCTCCTCTCCAGTCCCCCTGCGACTCCCTGTTTCCTGGGCTAGTCTTGGACCCCACGAG |
130 | 259 | GTGAAATGGCTCCAGCAACAGGAAGTGAAACGAAGGGTGAAGAGACAGGTGCGAAGTGACCCGCAGGCCCTTTACTTCAACGACCCCATTTGGTCCAACATGTGGTACCTGGTGAGTAGGACAGGACCTCTGTCTGCCCCAGGACACTTG | 260 | TTCGTGAGACTATTAATTGATGGCATTCTGCTTCTTGTCTTGCAGATTGGAAACCTGGAAGATTACTACCATTTTTATCACAGCAAAACCTTTAAAAGATCAACCTTGAGTAGCAGAGGCCCTCACACCTTCCTCAGAATGGACCCCCAG |
131 | 261 | GCTGGGAGTTTGCACCTGGGGTACAGAGGCAGGGAGGAAGGCGGGTGACTCTGTGGGTAACTAGCTGGAGGCTGGGCCCCCCGGGCTGCCTGACATACACCTCCTTCTGCTTTTGCAGGGGCTTCGGGAAGCAGGGATTCCAGTGCCAAG | 262 | TTTGCTGCTTTGTGGTGCACAAGCGGTGCCATGAATTTGTCACATTCTCCTGCCCTGGCGCTGACAAGGGTCCAGCCTCCGATGTAAGTAATGGGCATCGATTGCTTTTCTCTGTCCACAGTCAATGCTGCCTTGTGATTAAATGTGAGT |
132 | 263 | CCCTAAATGCCGAGCTGCAGCTGGACCGGCAGAAGCCCCGCCAGGGCCGGCGGGTGCTGCTGCTGGGCTCTCAACAGGCAGGCACCACCCTGAACCTGGATCTGGGCGGAAAGCACAGCCCCATCTGCCACACCACCATGGCCTTCCTTC | 264 | CCGTGAGCTGGTGAGGAGGCAGAGGGCATGGGCCTTAAAGGATCTGGGACCTCAGAAAGGCTCCAACCCCTGAGCCCCACTTACGTCTTTGCAGCTTCAACATCCAGATGTGTGTTGGAGCCACTGGGCACAACATTCCTCAGAAGCTAT |
133 | 265 | ACTCTGCCTCCTCCTTAGGGGCCAAGTTTCCCATTAAGTGGACAGCGCCAGAAGCCATTAACTACGGGACATTCACCATCAAGTCAGATGTGTGGTCTTTTGGGATCCTGCTGACGGAAATTGTCACCCACGGCCGCATCCCTTACCCAG | 266 | GGATGACCAACCCGGAGGTGATTCAGAACCTGGAGCGAGGCTACCGCATGGTGCGCCCTGACAACTGTCCAGAGGAGCTGTACCAACTCATGAGGCTGTGCTGGAAGGAGCGCCCAGAGGACCGGCCCACCTTTGACTACCTGCGCAGTG |
134 | 267 | CCAGTTATCAATGTTAAAAAGTGATCTCCCTCTCTCCTCTATCTCCTGTCTTGCCCACCCCCTCTCCATCTCCCCCACCTCTCTTTTTTACAGTATATTATTTCCGGATCACTCCTGGCAGCAACGGAGAAAAACTCCAGGAAGTGTTTG | 268 | GTCAAAGGAAAAATGATAATGAATTCATTGAGCCTCTTTGCTGCCATTTCTGGAATGATTCTTTCAATCATGGACATACTTAATATTAAAATTTCCCATTTTTTAAAAATGGAGAGTCTGAATTTTATTAGAGCTCACACACCATATATT |
135 | 269 | CATGCAGCCAAGCAGGCTGCAGCCTCAGCCACACAGACCATCGCTGCAGCTCAGCACGCAGCCTCTACCCCCAAGGCCTCTGCCGGCCCCCAGCCCCTGCTGGTGCAGAGCTGCAAGGTAAGACTCTAGGAAGGATGTGGGAGTGGAAGA | 270 | GTCTTTCCACCTCTCCCTCAGGGAGCAGCTGCCCACCCTGACAGTGAGGAGCAGCAGCAGCGGCTGCGGGAGGCAGCTGAGGGGCTGCGCATGGCCACCAATGCAGCTGCGCAGAATGCCATCAAGAAAAAGCTGGTGCAGCGCCTGGAG |
136 | 271 | GCCGGGCGGCGGCCGGGCCGGCGGACGGCGGGATGGGCTGCACCGTGAGCGCCGAGGACAAGGCGGCGGCCGAGCGCTCTAAGATGATCGACAAGAACCTGCGGGAGGACGGAGAGAAGGCGGCGCGGGAGGTGAAGTTGCTGCTGTTGG | 272 | GTGCTGGGGAGTCAGGGAAGAGCACCATCGTCAAGCAGATGAAGTAAGTGCTGTATTCCAGAGGCAGTGCTCAAACTCCAGCTTCCCCTCTTCACCCTCTGGGCCTGCACTGCCCCCGACTACAGGCCCAGCCAGTCTTAGCCAGGCCCA |
137 | 273 | CCCCTTGGGAATCACCTGGACCAGTGGGGGCCACAGTGGGAAGGGGGCAGGCAGGAGCAGCATGAACCCCCTGTGCCCTCCTCTCCCCAGGACGACTTCAAAGAGGGCTACCTGGAGACAGTGGCGGCTTATTATGAGGAGCAGCACCCA | 274 | GAGCTCACTCCTCTACTTGAAAAAGAAAGAGATGGATTACGGTGCCGAGGCAACAGATCCCCTGTCCCGGATGTTGAGGATCCCGCAACCGAGGAGCCTGGGGAGAGCTTTTGTGACAAGGTCATGAGATGGTTCCAGGCCATGCTGCAG |
138 | 275 | GACAGGACATCCAGCCCAAAAGAGACCTCACCCGCTTTGTGAAATGGCCCCGCTATATCAGGTTGCAGCGGCAGAGAGCCATCCTCTATAAGCGGCTGAAAGTGCCTCCTGCGATTAACCAGTTCACCCAGGCCCTGGACCGCCAAACAG | 276 | CTACTCAGCTGCTTAAGCTGGCCCACAAGTACAGACCAGAGACAAAGCAAGAGAAGAAGCAGAGACTGTTGGCCCGGGCCGAGAAGAAGGCTGCTGGCAAAGGGGACGTCCCAACGAAGAGACCACCTGTCCTTCGAGCAGGTGAGTAGG |
139 | 277 | CCAATATACAAACTGGAGTGTGGAGCAGCTTCCTGCAGAACCCAAGGAATTAATCTCTATGATTCAGGTCGTCAAACAAAAACTTCCCCAGAAGAATTCCTCTGAAGGGAACAAGCATCACAAGAGTACACCTCTACTCATTCACTGCAG | 278 | GGATGGATCTCAGCAAACGGGAATATTTTGTGCTTTGTTAAATCTCTTAGAAAGTGCGGAAACAGAAGAGGTAGTGGATATTTTTCAAGTGGTAAAAGCTCTACGCAAAGCTAGGCCAGGCATGGTTTCCACATTCGTAAGTATCCTTCA |
140 | 279 | TGAAAGCTGGTGGAATGCGAATTGTGCAGAAACACCCACATACAGGAGACACCAAAGAAGAGAAAGACAAGGATGACCAGGAATGGGAAAGCCCCAGGTGGGATGATGCTAGCGACTCTTGAGCATGTTTTCCAAAAACCCTATTCGGTT | 280 | CGCTCCCCGGCGCTCACACCTGAGCTCACTCGCGCACGCCCGCCCGGCCCGAGAACCGCGCCGCCGCCTCGGCCCCGCGGAAGCCCCGCCGCGTCATGTCTTCGCCTCCCGAAGGGAAACTAGAGACTAAAGCTGGACACCCGCCCGCCG |
141 | 281 | CTGCTGGAGCCAGTTCTGCTTCTCGGCAAGGAGCGATTTGCTGGTGTAGACATCCGTGTCCGTGTAAAGGGTGGTGGTCACGTGGCCCAGATTTATGGTGAGTCCCAGGAACTGGGCGCATGGAGGAGGTGGCTCTGGGAGGGAGGCCTT | 282 | AGCTGGAGCCGGAGCTCACGGGGCCCCTGTTTCTCTTGTATCTTACAGAAGACAGCGACAGCTGTGGCGCACTGCAAACGCGGCAATGGTCTCATCAAGGTGAACGGGCGGCCCCTGGAGATGATTGAGCCGCGCACGCTACAGTACAAG |
142 | 283 | GTTGAAAAGAAAAAGAAGGAGAAGGTTCTCGCAACTGTTACAAAACCAGTTGGTGGTGACAAGAACGGCGGTACCCGGGTGGTTAAACTTCGCAAAATGGTAAGATGTGGGGACTGTAAATTGGATTTTCTGTTTATGCTTGAATACTGT | 284 | GGTAACCTCAAAGCTAAAAAGCCCAAGAAGGGGAAGCCCCATTGCAGCCGCAACCCTGTCCTTGTCAGAGGAATTGGCAGGTATTCCCGATCTGCCATGTATTCCAGAAAGGCCATGTACAAGAGGAAGTACTCAGCCGCTAAATCCAAG |
143 | 285 | GCAAGTGTCGTGGACTTCGTACTGCTAGGAAGCTCCGTAGTCACCGACGAGACCAGAAGTGGCATGATAAACAGTATAAGAAAGCTCATTTGGGCACAGCCCTAAAGGCCAACCCTTTTGGAGGTGCTTCTCATGCAAAAGGAATCGTGC | 286 | TTTTCAAAGGAGAGACCCCAGCCTCGGGTCAGGCGCGGCGCAGACAGCGGCGCGGGGTCCTTGGCTGGGCGGGGCTTGCTCGCGGTGGCTTGTGGCTCCTTCCTGCGGTGCTTCTCTCTTTCGCTCAGGCCCGTGGCGCCGACAGGATGG |
144 | 287 | CTGATCCACAACAACTTCGGAGTGCTCTTCCATAACCTCCCCTCCCTCACGCTGGGCAATGTGTTTGTCATCGTGGGCTCTATTATCATGGTAGTTGCCTTCCTGGGCTGCATGGGCTCTATCAAGGAAAACAAGTGTCTGCTTATGTCG | 288 | TTCTTCATCCTGCTGCTGATTATCCTCCTTGCTGAGGTGACCTTGGCCATCCTGCTCTTTGTATATGAACAGAAGGTAAGTTATAAAGACAACAACTTATTGTCTTAATACTGAAAGTGGGGAGTATGCAGTGGAGAAGTTGGTACAAAG |
145 | 289 | CCCGGTCCTTTGGAGTAGAATGGATTGCAAGTTGGCTAGTGGTCACGGTGCCCACCATTCTTGGCCTGTTACTTACCTGAGATGAGCTCTTTTAACTCAAGCGAAACTTCAAGGCCAGAAGATCTTGCCTGTTGGTGATCATGCTCCTCA | 290 | ACAAAAGGCCCTTCCCAAAGGAGCTCCAGAACAGTGTGCTTGAAACCACCCTTATGCCACATAATTACTCCAGGTGTTATACTTGCCAAGTCAGCAATTCTGTGAGCAGCAAGAATGGCACGGTCTGCCTCAGTCCACCCTGTACCCTGG |
146 | 291 | GTGGAAATTTTGGATTCCTACTACCAGAGGCTGCATCGGCTGCGCGGAGAGCAGGTGGGGGCCAGGTCCCAGTGGGCGTGGCTGGGTGGAGGGGGAACTGAGACTTCAGAATATTTCATGGGAGGTGAGGGCCCATTTCTTAAAGAGGAT | 292 | GGAAATCTTCCTGCACACACATTTTTCCCTGGGTGCAGAACGGGGAGCGGGAAGTGGGTAGGTTCTAAGGCTCTCATTCCCTGAGCCTGGCTCTCCCTATCGCCAGAATATGTCGTCGGTGCCCCCACTTGGAGCTGGACCCTGGGAGCG |
147 | 293 | CTGGTGCCGACAAGAAAGCCGAGGCTGGGGCTGGGTCAGCAACCGAATTCCAGTTTGTGAGTATCTTCCTATTTGTTTTCCATGAGCCATCACTTGTTCTGGCCTCAGTCTGGTTGCTCTGCAAGTTGTGGGGATGTCATATAGTATGGG | 294 | CCAGTTTTCTTGGCTTTAAGGGACAGAGTTCTCACATTGCCCTGTGTTCACAGTGTGGTTTGATTTACATAGGTCTGGAGGGTGAGCGACCTGCGAGACTCACAAGAGGGGAAGCTGACAGAGATACCTACAGACGGAGTGCTGTGCCAC |
148 | 295 | GTTCTTTACAAAGTTCGCTGGAAAGGCTATACATCGGATGATGATACCTGGGAGCCCGAGATTCACCTGGAGGACTGTAAAGAAGTGCTTCTTGAATTTAGGAAGAAAATTGCAGAGAACAAAGCCAAAGCAGTCAGGAAGGATATTCAG | 296 | AGACTATCCTTAAATAACGACATATTTGAGGCGAACTCTGATAGCGATCAGCAAAGTGAGACAAAAGAAGATACTTCCCCAAAGAAGAAAAAGAAAAAATTGAGGCAGAGAGAAGAGAAAAGCCCAGATGATCTGAAAAAGAAAAAAGCA |
149 | 297 | GAGGCTCCTTTTGACCTGAGTGTCGTCTATCGGGAAGGAGCCAATGACTTTGTGGTGACATTTAATACATCACACTTGCAAAAGAAGTATGTAAAAGTTTTAATGCACGATGTAGCTTACCGCCAGGAAAAGGATGAAAACAAATGGACG | 298 | CATGTGAATTTATCCAGCACAAAGCTGACACTCCTGCAGAGAAAGCTCCAACCGGCAGCAATGTATGAGATTAAAGTTCGATCCATCCCTGATCACTATTTTAAAGGCTTCTGGAGTGAATGGAGTCCAAGTTATTACTTCAGAACTCCA |
150 | 299 | AGAGACCGGGTCTCTTAAACTGCCCAATATATTGGCAGCAACTCCGAGAGAAATGCTTGTTATTTTCTCACACTGTCAACCCTTGGAATAACAGTCTAGCTGATTGTTCCACCAAAGAATCCAGCCTGCTGCTTATTCGAGATAAGGATG | 300 | AAAGTAACAAATATAATATATCCACTCCACCACACATTTCAGCATTTAATACTCTTGTTTTTCCTTATGTACTAGTGACATCCTTAATACAGAAATCATCAATAGAAAAATGCAGTGTGGACATTCAACAGAGCAGGAATAAAACAACAG |
151 | 301 | GGTTTGAAAGAAATCTTAACTGTTTTTTTCCCCTACTCTTCTAAGATTGAAGAATTAGGGTCTGAAGGAAAAGTAGAAGAAGCCCAGGGGATGATGAAATTAGTTGAGCAATTAAAAGAAGAGAGAGAACTGCTAAGGTCCACAACGTCG | 302 | ACAATTGAAAGCTTTGCTGCACAAGAAAAACAAATGGAAGTTTGTGAAGTATGTGGAGCCTTTTTAATAGTAGGAGATGCCCAGTCCCGGGTAGATGACCATTTGATGGGAAAACAACACATGGGCTATGCCAAAATTAAAGCTACTGTA |
152 | 303 | GCCCAGTACATCTTGATCCATCAGGCTTTGGTGGAATACAATCAGTTTGGAGAAACAGAAGTGAATTTGTCTGAATTACATCCATATCTACATAACATGAAGAAAAGGGATCCACCCAGTGAGCCGTCTCCACTAGAGGCTGAATTCCAG | 304 | AGACTTCCTTCATATAGGAGCTGGAGGACACAGCACATTGGAAATCAAGAAGAAAATAAAAGTAAAAACAGGAATTCTAATGTCATCCCATGTATGTAGTTTATTTTTTTATTTTTTGTATCAGATAAAGTTAAGCTCTTTTGGATTTGT |
153 | 305 | CAGCCCCAGAAGCGAGAGGAGCAAACCAAGAAGGAGAATGAAGAAGACAAACTCACTGACTGGAATAAACTGGCTTGTCTGCTTTGCAGAAGGCAGTTTCCCAATAAAGAAGTTCTGATCAAACACCAGCAGCTGTCAGACCTGCACAAG | 306 | CAAAACCTGGAAATCCACCGGAAGATAAAACAGTCTGAGCAGGAGCTAGCCTATCTGGAAAGGAGAGAACGAGAGGTAAACTTTGGTGACCTATTACTCCCTTGACCTCAGCTCTTTTTGCTTTCTGATATAGACTTCATAGGCTGTGCT |
154 | 307 | GGCTTAAGTCCACTCCCCGCCCTAAGTTCTCTGTGTGTGTCCTGGGGGACCAGCAGCACTGTGACGAGGCTAAGGCCGTGGATATCCCCCACATGGACATCGAGGCGCTGAAAAAACTCAACAAGAATAAAAAACTGGTCAAGAAGCTGG | 308 | CCAAGAAGTATGATGCGTTTTTGGCCTCAGAGTCTCTGATCAAGCAGATTCCACGAATCCTCGGCCCAGGTTTAAATAAGGCAGGAAAGTTCCCTTCCCTGCTCACACACAACGAAAACATGGTGGCCAAAGTGGATGAGGTGAAGTCCA |
155 | 309 | ATGACGTCCGGTTGTTTGCCTTCGTGCGCTTCACCACCGGGGATGCCATGAGCAAGAGGTCCAAGTTTGCCCTCATCACGTGGATCGGTGAGAACGTCAGCGGGCTGCAGCGCGCCAAAACCGGGACGGACAAGACCCTGGTGAAGGAGG | 310 | AAGCGCCGCGTCGCGCGGCCACCAGCGCTGATGTGTGTGTGTGTTTTTTTCTTCTCCCAACCCAAAGGGTGACTTTTAAATATGACGGCTCCACCATCGTCCCCGGCGAGCAGGGAGCGGAGTACCAGCACTTCATCCAGCAGTGCACAG |
156 | 311 | GCTCAAGGAAAAACATGGACTGCTATTGCAGAATACCAGCGTGCATTGCAGGAGAACGTCGCTATGGAACCTGCATCTACCAGGGAAGACTCTGGGCATTCTGCTGCTGAGCTTGCAGAAAAAGAAAAATGAGCTCAAAATTTGCTTTGA | 312 | CCATTCTCCTGGTGGCCCTGCAGGCCCAGGCTGAGCCACTCCAGGCAAGAGCTGATGAGGTTGCTGCAGCCCCGGAGCAGATTGCAGCGGACATCCCAGAAGTGGTTGTTTCCCTTGCATGGGACGAAAGCTTGGCTCCAAAGCATCCAG |
157 | 313 | CCTCTTACTCTCATTCATTTCATACACACTGGCTCACACATCTACTCTCTCTCTCTATCTCTCTCAGAATGACAATTCTAGGTACAACTTTTGGCATGGTTTTTTCTTTACTTCAAGTCGTTTCTGGAGAAAGTGGCTATGCTCAAAATG | 314 | GAGACTTGGAAGATGCAGAACTGGATGACTACTCATTCTCATGCTATAGCCAGTTGGAAGTGAATGGATCGCAGCACTCACTGACCTGTGCTTTTGAGGACCCAGATGTCAACATCACCAATCTGGAATTTGAAATATGGTGAGGGATGG |
158 | 315 | CTTGCTGCTTGAGTTTTATAATGTCTAATAAATTGTATTTTAGCTGTGGAGGAAGATGCAGAGTCAGAAGATGAAGAGGAGGAGGATGTGAAACTCTTAAGTATATCTGGAAAGCGGTCTGCCCCTGGAGGTGGTAGCAAGGTTCCACAG | 316 | AAAAAAGTAAAACTTGCTGCTGATGAAGATGATGACGATGATGATGAAGAGGATGATGATGAAGAGTAAGTATGATTTTAGAAACTTGATATACTTCCGGAATCTTGACAAAAAAAGGAATTTGACATAGTTATATGCATGAGGGTTTTA |
159 | 317 | GAGGCGGCCGCGCGTGTGTTGGGCCCGGGGTGCTCGGACGCGCGCTCAGGGTCGGTCCTGCTGTTCGTTGCTTCTTAGGCTCTTCTGGAGCTGGAGATGAACTCGGACCTCAAGGCTCAGCTCAGGGAGCTGAATATTACGGCAGCTAAG | 318 | GAAATTGAAGTTGGTGGTGGTCGGAAAGCTATCATAATCTTTGTTCCCGTTCCTCAACTGAAATCTTTCCAGAAAATCCAAGTCCGGCTAGTACGCGAATTGGAGAAAAAGTTCAGTGGGAAGCATGTCGTCTTTATCGCTCAGGTATCT |
160 | 319 | AAAATTTCCCATTTTTTAAAAATGGAGAGTCTGAATTTTATTAGAGCTCACACACCATATATTAACATATACAACTGTGAACCAGCTAATCCCTCTGAGAAAAACTCCCCATCTACCCAATACTGTTACAGCATACAATCTCTGTTCTTG | 320 | GGCATTTTGTCAGTGATGCTGATCTTTGCCTTCTTCCAGGAACTTGTAATAGCTGGCATCGTTGAGAATGAATGGAAAAGAACGTGCTCCAGACCCAAATCTGTAAGTAGTAGCCCCTCTGGCCAAAACCTCCCTCTAGAAAATCCACAT |
161 | 321 | GGGTGGCCCTGCACAGGCCCGATGTCTACTTGCTGCCACCAGCCCGGGAGCAGCTGAACCTGCGGGAGTCGGCCACCATCACGTGCCTGGTGACGGGCTTCTCTCCCGCGGACGTCTTCGTGCAGTGGATGCAGAGGGGGCAGCCCTTGT | 322 | TCTCCGAGAGCCACCCCAATGCCACTTTCAGCGCCGTGGGTGAGGCCAGCATCTGCGAGGATGACTGGAATTCCGGGGAGAGGTTCACGTGCACCGTGACCCACACAGACCTGCCCTCGCCACTGAAGCAGACCATCTCCCGGCCCAAGG |
162 | 323 | ACCCTACGTCCGCTCCAAGGGCCGGAAGTTCGAGCGTGCCAGAGGCCGACGGGCCAGCCGAGGCTACAAAAACTAACCCTGGATCCTACTCTCTTATTAAAAAGATTTTTGCTGACAGTGCTCTGTGTGTGTTATTGGGGGATGGGTTGG | 324 | CCTTCACCCTCCTGGATCTGGGAGGCCAGAAGCTGGGCGCCAGATCCCTGTCTCACCCGGTTCTCCTTCCCCTTCCCTAGGTCCTCGCAAGGGCCGAGAGGTGTACCGGCATTTCGGCAAGGCCCCAGGAACCCCGCACAGCCACACCAA |
163 | 325 | GAGGCATGATCTGCTGGTGGGCGCTCCACTGTATATGGAGAGCCGGGCAGACCGAAAACTGGCCGAAGTGGGGCGTGTGTATTTGTTCCTGCAGCCGCGAGGCCCCCACGCGCTGGGTGCCCCCAGCCTCCTGCTGACTGGCACACAGCT | 326 | AACTGAGACTTCAGAATATTTCATGGGAGGTGAGGGCCCATTTCTTAAAGAGGATGCTTGTCCAGCGGCGTGAATGATGGTGCTCCTCATCTTGCAGATGGCGTCGTATTTTGGGCATTCAGTGGCTGTCACTGACGTCAACGGGGATGG |
164 | 327 | AAATGGCACCTCGAAAGGGGAAGGAAAAGAAGGAAGAACAGGTCATCAGCCTCGGACCTCAGGTGGCTGAAGGAGAGAATGTATTTGGTGTCTGCCATATCTTTGCATCCTTCAATGACACTTTTGTCCATGTCACTGATCTTTCTGGCA | 328 | ACTCTTTGGGAGGAATAATGCCGGCGTCTTCCGGAACCCGACCTCGCCCCGTGACCTCAGAGGTATACTTCCGGGACACGGAAGTGACCCCCGTCGCTCCGCCCCCTCCCACTCTCTCTTTCCGGTGTGGAGTCTGGAGACGACGTGCAG |
165 | 329 | GATAATCCCCTTTTCAAGAGCGCCACCACGACGGTCATGAACCCCAAGTTTGCTGAGAGTTAGGAGCACTTGGTGAAGACAAGGCCGTCAGGACCCACCATGTCTGCCCCATCACGCGGCCGAGACATGGCTTGCCACAGCTCTTGAGGA | 330 | AACATCGCCGCCATCGTCGGGGGCACCGTGGCAGGCATCGTGCTGATCGGCATTCTCCTGCTGGTCATCTGGAAGGCTCTGATCCACCTGAGCGACCTCCGGGAGTACAGGCGCTTTGAGAAGGAGAAGCTCAAGTCCCAGTGGAACAAT |
166 | 331 | TTGAACAGACACGGTAGAAGACTCGCCCATTTTGGAATGTGACCGTCTGTCCTTCAGGAGAGGACACCAGGGTGGGGGTGAAGGAGACACTACTGCCCCCACCCCTGACAGCCCCCACCCCATGGCTTCCATCTTTTGCATCACCACCAC | 332 | AGGTGGCCAGAGCAGGCCGGTTTGGCACCAAGGGCTTGGCTATCACATTTGTGTCCGATGAGAATGATGCCAAGATCCTCAATGATGTGCAGGATCGCTTTGAGGTCAATATTAGTGAGCTGCCTGATGAGATAGACATCTCCTCCTACA |
167 | 333 | GGTGTCTGCTTCTTTTGCAGTGATCGTAAACTTGTTGATAAAGAAGATATCGACACTAGCAGCAAAGGAGGCTGTGTCCAACAGGCTACTGGCTGGAGGAAAGGGACAGGCCTGGGATATGGCCATCCTGGATTGGCTTCATCAGAGGAG | 334 | GCTGAAGGCCGGATGAGGGGCCCCAGTGTTGGAGCCTCAGGAAGAACCAGCAAAAGACAGTCCAACGAGACTTACCGAGATGCTGTTCGAAGAGTCATGTTTGCTCGATATAAAGAACTCGATTAAGAAAGGAGACAAGTTCCATGGGAT |
168 | 335 | TTGCAGATGTCCCAGGAGAGAGGAGTACAGCCAGCACCTTTCCTACAGACCCAGTTTCCCCATTGACAACCACCCTCAGCCTTGCACACCACAGCTCTGCTGCCTTACCTGCACGCACCTCCAACACCACCATCACAGCGAACACCTCAG | 336 | ATGCCTACCTTAATGCCTCTGAAACAACCACTCTGAGCCCTTCTGGAAGCGCTGTCATTTCAACCACAACAATAGGTGATATTACCCTCAGTCAGGCAGCCACACCATCCCCATGTGCCTGGTGATGTGCTCTCACAAGGGCCTTCCACC |
169 | 337 | GTGATGAGGATGATAAAAACATAGGCAGTGATGAGGATCACCTGTCACTGAAGGAATTTTCAGAATTGGAGCAAAGTGGTTATTATGTCTGCTACCCCAGAGGAAGCAAACCAGAAGATGCGAACTTTTATCTCTACCTGAGGGCAAGAG | 338 | TGTGTGAGAACTGCATGGAGATGGATGTGATGTCGGTGGCCACAATTGTCATAGTGGACATCTGCATCACTGGGGGCTTGCTGCTGCTGGTTTACTACTGGAGCAAGAATAGAAAGGCCAAGGCCAAGCCTGTGACACGAGGAGCGGGTG |
170 | 339 | AAATATGAGATTACGGAGCAGCGCAAGATTGATCAGAAAGCTGTGGACTCACAAATTTTACCAAAAATCAAAGCTATTCCTCAGCTCCAGGGCTACCTGCGATCTGTGTTTGCTCTGACGAATGGAATTTATCCTCACAAATTGGTGTTC | 340 | ACACACCAGAAATTTGTCATTGCCACTTCAACCAAAATCGATATCAGCAATGTAAAAATCCCAAAACATCTTACTGATGCTTACTTCAAGAAGAAGAAGCTGCGGAAGCCCAGACACCAGGAAGGTGAGATCTTCGACACAGAAAAAGAG |
171 | 341 | GTGGCTAAAGCAGTGACCCAGGCTCTGAACCGCTGTGTCAGCTGCCTACCTGGCCAGCGCGATGTGGATAATGCCCTGAGGGCAGTTGGAGATGCCAGCAAGCGACTCCTGAGTGACTCGGTAGGAGGACGGTAGGGGGTGGGGGAACGT | 342 | GTCGCTGCACTGACGTCAGATCCTGCAGTGCAGGCCATTGTACTTGATACGGCCAGTGATGTGCTGGACAAGGCCAGCAGCCTCATTGAGGAGGCGAAAAAGGCAGCTGGCCATCCAGGGGACCCTGAGAGCCAGCAGCGGCTTGCCCAG |
172 | 343 | TGGGAAGGTTAGTTCTGCCTCCTGGGCTACAGGTGTCTGGGCATTTGTTCTGTGCCTGTGGAGCCCCTCTGGGCCTGCCCCCTGACCACCTGTGCCCTCTGTTCCAGGTGCTGGGGAGTCAGGGAAGAGCACCATCGTCAAGCAGATGAA | 344 | GATCATCCACGAGGATGGCTACTCCGAGGAGGAATGCCGGCAGTACCGGGCGGTTGTCTACAGCAACACCATCCAGTCCATCATGGCCATTGTCAAAGCCATGGGCAACCTGCAGATCGACTTTGCCGACCCCTCCAGAGCGGTATGTGC |
173 | 345 | AGAATAAGTGGGAAGACTCAGTGTGCCTGTGCCCTCTGCCATTCACTTCATCTATCAATGTTCTCTGATTTCAGGATTAAGCCTATCGTATGGCCCAGTCTCCCCGATCATAAGAAGACTCTGGAACATCTTTGTAAGAAACCAAGAAAA | 346 | AATTTAAATGTGAGTTTCAATCCTGAAAGTTTCCTGGACTGCCAGATTCATAGGGTGGATGACATTCAAGCTAGAGATGAAGTGGAAGGTTTTCTGCAAGATACGTTTCCTCAGCAACTAGAAGAATCTGAGAAGCAGAGGCTTGGAGGG |
174 | 347 | CTATCCGTCAGTCCATCTCCAAAGCCCTGGTGGCCTATTACCAGAAATGTGAGTGAGCATGGGTCCTTCCCATGAGGTAGATGGGTGTGTGGGGATCAAGTCAAGGACTCTGTGTGATTATCTAAATCCTCGTCCCTGCTCTTCTTGCCA | 348 | GCTAAGCTGCCCAGCATGTAACTTAAATCCCTGTTCATTCCCCATTCCTTTAGCTGCTGGAGCCAGTTCTGCTTCTCGGCAAGGAGCGATTTGCTGGTGTAGACATCCGTGTCCGTGTAAAGGGTGGTGGTCACGTGGCCCAGATTTATG |
175 | 349 | GACGATGCCCCGAATTCCCACCCTGAAGAACCTAGAGGATCTTGTTACTGAATACCACGGGAACTTTTCGGTGAGAACGCTGTCATAAGCATGCTGCAGTCTATCAACTGCCAACTGCCTGCCAGCAAGACAGACAGAGTGTGGGGGTGG | 350 | GTAGTAGGGGCACAACAAATATAAGGTCCACTTTGCTTTTCTTTTTTCTATAGAGAATCCTTTCCTGTTTGCATTGGAAGCCGTGGTTATCTCTGTTGGCTCCATGGGATTGATTATCAGCCTTCTCTGTGTGTATTTCTGGCTGGAACG |
176 | 351 | ATTGTTGAATTGTCTTCTTTTATCTAGGAAATCTGTGCTCAGTACTGGGGAGAAGGAAAGCAAACATATGGAGATATTGAAGTTGACCTGAAAGACACAGACAAATCTTCAACTTATACCCTTCGTGTCTTTGAACTGAGACATTCCAAG | 352 | AGGAAAGACTCTCGAACTGTGTACCAGTACCAATATACAAACTGGAGTGTGGAGCAGCTTCCTGCAGAACCCAAGGAATTAATCTCTATGATTCAGGTCGTCAAACAAAAACTTCCCCAGAAGAATTCCTCTGAAGGGAACAAGCATCAC |
177 | 353 | ATGTTTATTGTTCATTTTCTTCACATGTTTAGTGATGAAAAATTTCTCCCTTCTAGGTTTCCCTTGGGGGCTTTGAAATAACACCACCAGTGGTCTTAAGGTTGAAGTGTGGTTCAGGGCCAGTGCATATTAGTGGACAGCACTTAGTAG | 354 | CTGTGGAGGAAGATGCAGAGTCAGAAGATGAAGAGGAGGAGGATGTGAAACTCTTAAGTATATCTGGAAAGCGGTCTGCCCCTGGAGGTGGTAGCAAGGTTCCACAGGTAGAGATGGCAATTTTATTATAGGTTTTGTATTATAGCTTTT |
178 | 355 | CTAACCCCTGTGTGTCCCCTCCTATTACAGCCCAAAGATCCCTGTGCAGCTCCGATTTTATTCGCATCCTTGTGATCTTCTCTGGAATGTTCCTTGTTTTCACCCTGGCCGGGGCCCTGTTCCTCCATCAACGAAGGAAATATAGATCAA | 356 | ACAAAGGAGAAAGTCCTGTGGAGCCTGCAGAGCCTTGTCATTACAGCTGCCCCAGGGAGGAGGAGGGCAGCACCATCCCCATCCAGGAGGATTACCGAAAACCGGAGCCTGCCTGCTCCCCCTGAGCCAGCACCTGCGGGAGCTGCACTA |
179 | 357 | ACCTGATCGTGGGAGCTTACGGGGCCAACCAGGTGGCTGTGTACAGGTGAGCACTGGCTCCAGGGGCGGGATGGGGAAGGTCCTGTGCCATCAAGAGGAGGCCAGGCCAGGAGGAGCCACAATGGCAAGCCTACCCCATCACCCTATCCC | 358 | GTGGCCGGGGCCAAGTGCTGGTGTTCCTGGGTCAGAGTGAGGGGCTGAGGTCACGTCCCTCCCAGGTCCTGGACAGCCCCTTCCCCACAGGCTCTGCCTTTGGCTTCTCCCTTCGAGGTGCCGTAGACATCGATGACAACGGATACCCAG |
180 | 359 | CGCCTCATGCGGCGCCGCGCACGGGTCCCAGAGCCTTCTGGGTAGCGGTTTAACCCCGCCTCTTGCGTCGGCGCCTTCCTTTTCCTCCCTGTCGCCACCGAGGTCGCACGCGTGAGACTTCTCCGCCGCCTCCGCCGCAGACGCCGCCGC | 360 | GATGCGCTACGTCGCCTCCTACCTGCTGGCTGCCCTAGGGGGCAACTCCTCCCCCAGCGCCAAGGACATCAAGAAGATCTTGGACAGCGTGGGTATCGAGGCGGACGACGACCGGCTCAACAAGGTAGCGGCCGCCCTTGCCCCGCAGCC |
181 | 361 | GGAACTCTCTCTCTGATGCTGATTTGCACTCTGCTGGAATTCTGCCTAGCTGTGCTCACTGCTGTGCTGCGGTGGAAACAGGCTTACTCTGACTTCCCTGGGGTGAGTGTGCTGGCCGGCTTCACTTAACCTTGCCTAGTGTATCTTATC | 362 | ATCCTGTCTGTCAAACAGGCCACCTTAAATCCTGCCTCACTGCAGTGTGAGTTGGACAAAAATAATATACCAACAAGAAGTTATGTTTCTTACTTTTATCATGATTCACTTTATACCACGGACTGCTATACAGCCAAAGCCAGTCTGGCT |
182 | 363 | TGATCCCCTGGGCTCCAGAGAACCTAACACTTCACAAACTGAGTGAATCCCAGCTAGAACTGAACTGGAACAACAGATTCTTGAACCACTGTTTGGAGCACTTGGTGCAGTACCGGACTGACTGGGACCACAGCTGGACTGTGAGTGACT | 364 | GCAGCCACTATCTATTCTCTGAAGAAATCACTTCTGGCTGTCAGTTGCAAAAAAAGGAGATCCACCTCTACCAAACATTTGTTGTTCAGCTCCAGGACCCACGGGAACCCAGGAGACAGGCCACACAGATGCTAAAACTGCAGAATCTGG |
183 | 365 | GCAGACCTGTTATCCTAAACTAGGTGAGTCAGCTTTTGGTACATGTGATGATTTTCAGTGTAACCAATGATGTAATGATTCTGCCAAATGAAATATAATGATATCACTGTAAAACCGTTCCATTTTGATTCTGAGGTTACTCTACTAACA | 366 | GGTAACATTCTAGTTTATGCCCCGAAAAGGGGAATATAGCCATTCTATAATGTTTGGAGATTTTGGATTACTCCTAATTGTATGCAAGTTGTCTTACTGTGTATTGTCCCTTAATTTCAGGACTCAGAATTCATGATTGAAGAAATGCAG |
184 | 367 | ATTGAAAATATTGAACTTCCCATGGATACAAAAACAAATGAAAGAAGAGGATTTTGTTTTATCACATATACTGATGAAGAGCCAGTAAAAAAATTGTTAGAAAGCAGATACCATCAAATTGGTTCTGGGAAGGTAAAGCCATTTAAGCAC | 368 | AAAGAACACAAACTGGATGGCAAATTGATAGATCCCAAAAGGGCCAAAGCTTTAAAAGGGAAAGAACCTCCCAAAAAGGTTTTTGTGGGTGGATTGAGCCCGGATACTTCTGAAGAACAAATTAAAGAATATTTTGGAGCCTTTGGAGAG |
185 | 369 | CTTCAACATCCAGATGTGTGTTGGAGCCACTGGGCACAACATTCCTCAGAAGCTATGTGAGTGGCATGAAGGGGGCAGGAGGGAGGTGGGCTTGGACTCCCCCGGAGGCTGGCCAGGGAGGTCCTGACTCTTCTGCTTGCCCTGCCAGCC | 370 | GAGGAGCCACAATGGCAAGCCTACCCCATCACCCTATCCCATCAGAGCTCAGCCAGTGGTGAAGGCCTCTGTCCAGCTACTGGTGCAAGATTCACTGAATCCTGCTGTGAAGAGCTGTGTCCTACCTCAGACCAAGACACCCGTGAGCTG |
186 | 371 | TATTTGTATCCCCTTTTCAGACTCCTGAGGAAGAAGAGATTTTAAACAAAAAACGATCTAAAAAAATTCAGAAGAAATATGATGAAAGGAAAAAGAATGCCAAAATCAGCAGTCTCCTGGAGGAGCAGTTCCAGCAGGGCAAGCTTCTTG | 372 | CGTGCATCGCTTCAAGGCCGGGACAGTGTGGCCGAGCAGATGGCTATGTGCTAGAGGGCAAAGAGTTGGAGTTCTATCTTAGGAAAATCAAGGCCCGCAAAGGCAAATAAATCCTTGTTTTGTCTTCACCCATGTAATAAAGGTGTTTAT |
187 | 373 | GAGCAGTTTTTGCAAGAAAGGATCAAAGTGAACGGAAAAGCTGGGAACCTTGGTGGAGGGGTGGTGACCATCGAAAGGAGCAAGAGCAAGATCACCGTGACATCCGAGGTGCCTTTCTCCAAAAGGTACAGGAGGGAAGTGTGTGTGTGG | 374 | ACTGACTGAAACTTCATCTCTGTTATCATTTGTGTATTTTCTTAGAAAAAGCTTGTGGTGAAGGGGGGCAAAAAAAAGAAGCAAGTTCTGAAGTTCACTCTTGATTGCACCCACCCTGTAGAAGATGGAATCATGGATGCTGCCAATTTT |
188 | 375 | AGCTCAGCCAGTGGTGAAGGCCTCTGTCCAGCTACTGGTGCAAGATTCACTGAATCCTGCTGTGAAGAGCTGTGTCCTACCTCAGACCAAGACACCCGTGAGCTGGTGAGGAGGCAGAGGGCATGGGCCTTAAAGGATCTGGGACCTCAG | 376 | GTGAAACCTCCAGTGGGGGAGGTGGTGTGGGGAACCCCTGGGAAGATGAGATGAGGATCCCATGCCCTAATCGCCAATTCTGACCCATTCCTCGATGTCTATAGACCTGATCGTGGGAGCTTACGGGGCCAACCAGGTGGCTGTGTACAG |
189 | 377 | GGCTGCTAAGGAAGCAAAAAAGGCTAAGCAAGCATCTAAAAAGACTGCAATGGCTGCTGCTAAGGTAATTATGGGGTTTCTTTACTTTCTTGAACAATACAACAGGAAAATTTTCTTTTTTTGAGACGGAGTCTTGCTCTGTTGCCCAGG | 378 | CTTTGTTTTGCAGGAAGAAATTCAAAAGAAAAGAACCCGCCGAGCAGTCAAATTCCAGAGGGCCATTACTGGTGCATCTCTTGCTGATATAATGGCCAAGAGGAATCAGAAACCTGAAGTTAGAAAGGCTCAACGAGAACAAGCTATCAG |
190 | 379 | GCCTGCGCCCCCTGGCAGCACTGGAACGTCCTAGAAAAGACTGAGGAGGCTGAGAAGACGCCCGTAGGTAGCTGCTTTTTGGCTCAGCCAGAGAGCGGCCGCCGCGCCGAGTACTCCCCCTGTCGCGGGAACACCCTGAGCCGCATTTAC | 380 | AGGGACGTGGACTGCCGGGCTTCAGCGCCCCACCCCTTCTTGTGCCTTCCAGGTGATGAGACCCGAAATGTAGGCTCCCAAACTTTACAAACCTTCAAGGCCCGCCAAGGACTGGGGGCGTCGGTCGTCAGCTGGAGCGACGTCATTGTG |
191 | 381 | TAGCACTTTTAAGAAAATTTTTCTTATCAGCTTTTATTTGTTTACCTCCTAGGTCCCAGGGATGAAACTGTTGATGATTTCTGGAGGATGATTTGGGAACAGAAAGCCACAGTTATTGTCATGGTCACTCGATGTGAAGAAGGAAACAGG | 382 | AACAAGTGTGCAGAATACTGGCCGTCAATGGAAGAGGGCACTCGGGCTTTTGGAGATGTTGTTGTAAAGATCAACCAGCACAAAAGATGTCCAGATTACATCATTCAGAAATTGAACATTGTAAATGTGAGTTTGCTTTTTACATAATTT |
192 | 383 | GCGTGTGACACTGAGGACACTGTGGGACACCTGGGACCCTGGAGGGACAAGGATCCGGCCCTTTGGTGCCAACTCTGCCTCTCTTCACAGCACCAGGCCATAGAAAGATTTTATGATAAAATGCAAAATGCAGAATCAGGACGTGGACAG | 384 | GTGATGTCGAGCCTGGCAGAGCTGGAGGTGAGCCGTGGCCTCCCCCTCCACCAAGCTTAGTCCCTGGGTCTTAGGCTCCACAGGACACTGGGTCTGGGCCCCGGGTCCCCTTGGGAATCACCTGGACCAGTGGGGGCCACAGTGGGAAGG |
193 | 385 | CTTCTTGCCAGCCCTGGTCATGCAGTGGCCATGGAGAATGTGTAGAAATCATCAATAATTACACCTGCAACTGTGATGTGGGGTACTATGGGCCCCAGTGTCAGTTTGGTAAGTCTCTTTCCTTTCTTTGCTTCTTCTTAGGTAAAGTCA | 386 | CTGAAGAAGCAGAGAACTGGGGAGATGGTGAGCCCAACAACAAGAAGAACAAGGAGGACTGCGTGGAGATCTATATCAAGAGAAACAAAGATGCAGGCAAATGGAACGATGACGCCTGCCACAAACTAAAGGCAGCCCTCTGTTACACAG |
194 | 387 | GTGGTGGGCTGCGGGGCGCCCGGGGCACAGCCGTGACCTGCCCACACCTGCAGGTGCTGAGGAGCCACGTGATGGTGCGAGTGGGTGGTGGCTGGGACACGCTGGAGCATTACCTGGACAAGCACGACCCGTGCCGCTGCTCCTCCACTG | 388 | CTCATCGCCCACCCCAGCCGAGGGTCTGCACCTTTTCTCCACAGAGGGTGTCGCCCACCACCAGTCCCCGCCCTGCTAGCCCAGTCCCTGGGAGTGAGCGCCGGGGCTCCCGGCCTGAGATGACTCCCGTTAGCTTACGAAGCACAAAGG |
195 | 389 | TGCTAAAGAGTTTTTCTTTCACCTTTTAATATAACGAATTAATTAGCTTTTATTCTTCTATTCATTTTCTTGCAGATGCCTACCTTAATGCCTCTGAAACAACCACTCTGAGCCCTTCTGGAAGCGCTGTCATTTCAACCACAACAATAG | 390 | CTACTACTCCATCTAAGCCAACATGTGGTAAGTTTATTTACTTAGAATCAGCATACCTCACTTTGGAATAGCACTTTAATTACATCTTTCTTTATTCCAAGCTTTCAGGACCCACTAGTAAGCTAAACTCACTGGCTCTAATTTCTCACC |
196 | 391 | CCAATGCTGAGTGTGCCTGTCGCAATGGCTGGCAGTGCAGGGACAAGGAGTGCACCGAGTGTGATCCTCTTCCAAACCCTTCGCTGACCGCTCGGTCGTCTCAGGCCCTGAGCCCACACCCTCAGCCCACCCACTTACCTTATGTCAGTG | 392 | AGATGCTGGAGGCCAGGACAGCTGGGCACATGCAGACTCTGGCTGACTTCAGGCAGCTGCCTGCCCGGACTCTCTCTACCCACTGGCCACGTGAGTTTTCTCCTTAATCCCCACCGCTAGAGAGAATGCATACACGAGGGGCCAGGAGGG |
197 | 393 | CAGCAAGAAGTCTATGTGCCCCAGGATCCTGGATTACCTGAGGAAGAAGAGATCAAGGAAAAAAAACCCACCAGTCAAGGAAAGTCAAGTAGCAAGAAGGAAATGTCTAAAAGAGATGGCAAGGAGAAAAAAGACAGAGGAGTGACGAGG | 394 | TTTCAGGAAAATGCCAGTGAAGGGAAGGCCCCTGCAGAAGACGTCTTTAAGAAGCCCCTGCCTCCTACTGTGAAGAAGGAAGAGAGTCCCCCTCCAGTAAGACCAACATTGATCCCCTGGACCTAGGGCTGGGGCTGGGGATGGTTCCGA |
198 | 395 | TTGCCTTGCTCTCCTTGGTAACCTAGTTCCTGTAACCTTGTGTTTTCCAGATTGGCCCCCGCCGCATCCACACAGTCCGTGTGCGGGGAGGTAACAAGAAATACCGTGCCCTGAGGTTGGACGTGGGGAATTTCTCCTGGGGCTCAGAGT | 396 | GTTGTACTCGTAAAACAAGGATCATCGATGTTGTCTACAATGCATCTAATAACGAGCTGGTTCGTACCAAGACCCTGGTGAAGAATTGCATCGTGCTCATCGACAGCACACCGTACCGACAGTGGTACGAGTCCCACTATGCGCTGCCCC |
199 | 397 | GACGTGGTGGGTCCTCTGGTGCGAAATTCCGGATTTCCTTGGGTCTTCCGGTAGGAGCTGTAATCAATTGTGCTGACAACACAGGTGAGGTCTTTGCACGTTGCTATACTCCCCCTTTTAAAAGCACTCAATGGGCCTGTGGCTAATGAC | 398 | ACGTAATAAGGCAGCGCCCAGAGGCGGAAGAGGCCGGTTTTTGCTCCGGCCACGTGAGGAGGGTGGGCGGGGCGTTAAAGTTCATATCCCAGTGTCCTTTGAATCGACTTCCTTTTTTCTTTTTTCCGGCGTTCAAGATGTCGAAGCGAG |
200 | 399 | GTCTGGAGGGTGAGCGACCTGCGAGACTCACAAGAGGGGAAGCTGACAGAGATACCTACAGACGGAGTGCTGTGCCACGTGAGTAAATGCATCACCTATATTAGGGGTGTTGGGGTGAAATGTCTGGATTCTCACAGCTGGCTCTGGCTG | 400 | TGAAGGAACAGTTTGCCTGGAGACATTTCTACTGGTACCTTACCAATGAGGGTATCCAGTATCTCCGTGATTACCTTCATCTGCCCCCGGAGATTGTGCCTGCCACCCTACGCCGTAGCCGTCCAGAGACTGGCAGGCCTCGGCCTAAAG |
201 | 401 | TCCCTTCGAAGTGAAGGTGGGCACCGAGTGTGGCAATCAGAAGGTACGGGCCTGGGGCCCTGGGCTGGAGGGCGGCGTCGTTGGCAAGTCAGCAGACTTTGTGGTGGAGGCTATCGGGGACGACGTGGGCACGCTGGGTAAGTTGGAGGC | 402 | GCCTGAGGCCCTCCTTGTCTTGGCAGAGGGAGAGGAGCGCGTGAAGCAGAAGGACCTGGGGGATGGCGTGTATGGCTTCGAGTATTACCCCATGGTCCCTGGAACCTATATCGTCACCATCACGTGGGGTGGTCAGAACATCGGGCGCAG |
202 | 403 | GACCACATGATACTGTTTTGAGATTTTATTTACTTTTACAATGGAAAGATTTGATGTTACTCTATTCTTAATTTAGGCACTCAGAATGGTCCAGCGTTTGACATACCGACGTAGGCTTTCCTACAATACAGCCTCTAACAAAACTAGGCT | 404 | GTCCCGAACCCCTGGTAATAGAATTGTTTACCTTTATACCAAGAAGGTTGGGAAAGCACCAAAATCTGCATGTGGTGTGTGCCCAGGCAGACTTCGAGGGGTAAGTGTACCTTTTACTGTGTGCAGCCTAACAAGTCTTGAACTTACTGA |
203 | 405 | GCTTGGTTTGCCAGTGCTGGTGTTGGGCGCACAGGAACCTATATCGGAATTGATGCCATGCTAGAAGGCCTGGAAGCCGAGAACAAAGTGGATGTTTATGGTTATGTTGTCAAGCTAAGGCGACAGAGATGCCTGATGGTTCAAGTAGAG | 406 | GCCCAGTACATCTTGATCCATCAGGCTTTGGTGGAATACAATCAGTTTGGAGAAACAGAAGTGAATTTGTCTGAATTACATCCATATCTACATAACATGAAGAAAAGGGATCCACCCAGTGAGCCGTCTCCACTAGAGGCTGAATTCCAG |
204 | 407 | GCACCTACAAAGGCAGCACCTAAGCAAAAGATTGTGAAGCCTGTGAAAGTTTCAGCTCCCCGAGTTGGTGGAAAACGCTAAACTGGCAGATTAGATTTTTAAATAAAGATTGGATTATAACTCTAGGTTGTGCTGGATTTTTTTTTTTTC | 408 | ACTTGTCCAGTACAGTCTAACTCTAATAATAAGTTGTACCACTAAGGAGTAAAGTGCTTTTGCCTTAAGTTACTTTTACCCCACAGGGCTGCTAAGGAAGCAAAAAAGGCTAAGCAAGCATCTAAAAAGACTGCAATGGCTGCTGCTAAG |
205 | 409 | AGATGGATGTGATGTCGGTGGCCACAATTGTCATAGTGGACATCTGCATCACTGGGGGCTTGCTGCTGCTGGTTTACTACTGGAGCAAGAATAGAAAGGCCAAGGCCAAGCCTGTGACACGAGGAGCGGGTGCTGGCGGCAGGCAAAGGG | 410 | GACAAAACAAGGAGAGGCCACCACCTGTTCCCAACCCAGACTATGAGGTAACGTGGGATAGAAATGGGCCAGGACGCTGGAGGGGATGTCCCTCCAGGGGGGAAGGAAACAGATGGGATGGCCCATCTTGTCTGCCAGATGCCTCAAAGC |
206 | 411 | TTAGCCATTTCTATGGATTTGGTTTGGTGGACGCAGAAGCTCTCGTTGTGGAGGCAAAGAAGTGGACAGCAGTGCCATCGCAGCACATGTGTGTGGCCGCCTCGGACAAGAGACCCAGGTAAGGCTCTGCTGTGGCATCGGTGACTTCTC | 412 | CCGTTGTTGAGCTGTGTGGACTCTAGGGTGTGTTGTGTCATTGCAGCAGCCAGTTAACCTGGAGGGACGTCCAGCACCTGCTAGTGAAGACATCCCGGCCGGCCCACCTGAAAGCGAGCGACTGGAAAGTGAACGGCGCGGGTCATAAAG |
207 | 413 | CTGCGGCCGCTGGCAGCGCTGGCCCTGGTCCTGGCGCTGGCCCCGGGGCTGCCCACAGCCCGGGCCGGGCAGACACCGCGCCCTGCCGAGCGGGGGCCCCCAGTGCGGCTTTTCACCGAGGAGGAGCTGGCCCGCTATGGCGGGGAGGAG | 414 | GAAGATCAGCCCATCTACTTGGCAGTGAAGGGAGTGGTGTTTGATGTCACCTCCGGAAAGGGTAAGTGGTGTGGCATTTTGAATCTTCATTTCCAGGGAGCACAGAAGCCAGAGTGAGCAGCACTTGGAGGTGTGAGGAAAGGGAGGGAA |
208 | 415 | CGGCTTTGATTCAGCAAGCCACAACAGTTAAAAACAAGGATATCAGGAAATTTTTGGATGGTATCTATGTCTCTGAAAAAGGAACTGTTCAGCAGGCTGATGAATAAGATCTAAGAGGTAAGTTCTTACAGTGTCTTAAGTTTTATTACT | 416 | TAGTGCCTCTGCAATTTAAATATTTTTTACACAGATTTGATGCTGTGCAAATGCCCTCTCCCCTTTTAGGTGTTGCTTGTTCAGTATCTCAAGCCCAGAAAGATGAATTAATCCTTGAAGGAAATGACATTGAGCTTGTTTCAAATTCAG |
209 | 417 | CCTCCTCCTTACAGGGAGCAGATAGCAGGGACTTACAGATGAACCAGGCCCTGCGATTTTTGGAAAATGAGCACCAGCAACTGCAGGCCAAGATTGAATGCCTGCAAGGGGACAGAGACCTGTGCAGCTTGGATACCCAGGACCTACAAG | 418 | ATCAACTAAAAAGGTCAGAGGCAGAGAAACTCACCCTGGTGACCAGAGTACAGCAGTTGCAGGGTAAGTTCGCTTTCCAGATTCTGAAAGTCCACAGGGTTTTCCTGGGGTCCTGGCCCACAAAAGGCACCCAGAGTAGGGACTAAGGGC |
210 | 419 | GTATGCCTGTGTCAAGATGAGGTCACGGACGATTACATCGGAGACAACACCACAGTGGACTACACTTTGTTCGAGTCTTTGTGCTCCAAGAAGGACGTGCGGAACTTTAAAGCCTGGTTCCTCCCTATCATGTACTCCATCATTTGTTTC | 420 | GGGAGGAAAGTTCCCAACAGCGTCTCCCCCTCCACTGCTTTCTTTAATAACAAAGACTTGTCCCTGCCAAGCAATAACTTTCTCGCCTTGTCTCCTACAGGGAAACCAATGAAAAGCGTGCTGGTGGTGGCTCTCCTTGTCATTTTCCAG |
211 | 421 | GTGATGAGACCCGAAATGTAGGCTCCCAAACTTTACAAACCTTCAAGGCCCGCCAAGGACTGGGGGCGTCGGTCGTCAGCTGGAGCGACGTCATTGTGGTGGGCCCCGCGGTACAGGGCACAGGGAACAATCGGGGGCAGGGACACCTGG | 422 | GCGGCGCTCACCCAGCTTTCCTATGCAGAGTGGCCATCGTGGTGGGCGCCCCGCGGACCCTGGGCCCCAGCCAGGAGGAGACGGGCGGCGTGTTCCTGTGCCCCTGGAGGGCCGAGGGCGGCCAGTGCCCCTCGCTGCTCTTTGACCTCC |
212 | 423 | GTGAAGGCACGTGGGCCTGGATTGGAGAAGACAGGTGTGGCCGTCAACAAGCCAGCAGAGTTCACAGTGGATGCCAAGCACGGTGGCAAGGCCCCACTTCGGGTCCAAGTCCAGGTAGAGCACCCACGGGTGTTGGGGGCAGGGCAGGTG | 424 | GACAAGGGCGACGGCTCCTGTGATGTGCGCTACTGGCCGCAGGAGGCTGGCGAGTATGCCGTTCACGTGCTGTGCAACAGCGAAGACATCCGCCTCAGCCCCTTCATGGCTGACATCCGTGACGCGCCCCAGGACTTCCACCCAGACAGG |
213 | 425 | ATTTTATTATAGGTTTTGTATTATAGCTTTTAGTTTGGTGATAGAACAGCTCTTGTTCATGAGTACGTATCTTTTCTTTTAAAAGAAAAAAGTAAAACTTGCTGCTGATGAAGATGATGACGATGATGATGAAGAGGATGATGATGAAGA | 426 | TGATGATGATGATGATTTTGATGATGAGGAAGCTGAAGAAAAAGCGCCAGTGAAGAAAGTGAGTAGATACAATGCTACAAGGTTGTTAAACTAACAATAGAAATGGTGATTTTTTAGTGCTATTTGCTTGTTTTGTAGTTAAGGGAAGCT |
214 | 427 | CCTAGATATTATCCTACTGAAGATGTGCCTCGAAAGCTGTTGAGCCACGGCAAAAAACCCTTCAGTCAGCACGTGAGAAAACTGCGAGCCAGCATTACCCCCGGGACCATTCTGATCATCCTCACTGGACGCCACAGGGGCAAGGTGAGA | 428 | CTTAACCTTAATTGGCATTCTCTTACTGTTGATGCATTTGTGTCCTTGTAGGTTGAAAAGAAAAAGAAGGAGAAGGTTCTCGCAACTGTTACAAAACCAGTTGGTGGTGACAAGAACGGCGGTACCCGGGTGGTTAAACTTCGCAAAATG |
215 | 429 | TAAAATGTTTATTGGAGGCTTGAGCTGGGATACAAGCAAAAAAGATCTGACAGAGTACTTGTCTCGATTTGGGGAAGTTGTAGACTGCACAATTAAAACAGATCCAGTCACTGGGAGATCAAGAGGATTTGGATTTGTGCTTTTCAAAGA | 430 | CGCCGCCGCTGCTGCCGCGACCCGGACTGCGCGCCAGCACCCCCCTGCCGACAGCTCCGTCACTATGGAGGATATGAACGAGTACAGCAATATAGAGGAATTCGCAGAGGGATCCAAGATCAACGCGAGCAAGAATCAGCAGGATGACGG |
216 | 431 | GTTGCTGGATCTTTCCAATGAAGGGTTTACAAACTGGGAATTCATGACTGTCCACTGCTGGGGAGAAAAGGCTGAAGGGCAGTGGACCTTGGAAATCCAAGATCTGCCATCCCAGGTCCGCAACCCGGAGAAGCAAGGTCAGTGGCTCTT | 432 | GACCAGCGCCTGCGCGGAGCACTCGGACCAGCGGGTGGTCTACTTGGAGCACGTGGTGGTTCGCACCTCCATCTCACACCCACGCCGAGGAGACCTCCAGATCTACCTGGTTTCTCCCTCGGGAACCAAGTCTCAACTTCTGGCAAAGAG |
217 | 433 | GGGGACTCTGGAGGCCCTCTTGTGTGTAACAAGGTGGCCCAGGGCATTGTCTCCTATGGACGAAACAATGGCATGCCTCCACGAGCCTGCACCAAAGTCTCAAGCTTTGTACACTGGATAAAGAAAACCATGAAACGCTACTAACTACAG | 434 | GCCCCCCTGGGAAAACACTCACACACACTACAAGAGGTGAAGATGACAGTGCAGGAAGATCGAAAGTGCGAATCTGACTTACGCCATTATTACGACAGTACCATTGAGTTGTGCGTGGGGGACCCAGAGATTAAAAAGACTTCCTTTAAG |
218 | 435 | AGATTATAATTCTCTGCTGAGATTTGAGTTGGATTTGAGGATTTGGAGAATCCCTGCAGCTTTGTAACTTCAGAGGTGTAATTAGCTGAAAACATCATCGTTTTGAAGAGTTCTGCGTTTTGCCAGTCACCTCTCAACTGTGTGCCAAAG | 436 | AAGGACTCCATGAAAGATGACAGAAGAAGTTATTGTGATAGCCAAGTGGGACTACACCGCCCAGCAGGACCAGGAGCTGGACATCAAGAAGAACGAGCGGCTGTGGTTGCTGGACGACTCCAAGACGTGGTGGCGGGTGAGGAACGCGGC |
219 | 437 | AGAAAACAACTGAGGCCAAGATGATGAAAGCTGGGGGCACTGAAATAGGAAAGACACTTGCAGAAAAGAGCCGAGGCCTATTTAGTGCTAATGACTGGCAATGTAAAACGTATGTTTTTTAAATTATTGTCTGCTCTTTCTTCCAAAATA | 438 | TTAGAAGTTATGAATTCCAGATATGTAGTGAGGACAAGTTAAAATGTAAAATTTTACAAATTTAAATTTTTATAAATGCTTTTTAATCTGTTTTTAGATGTGGAAATGTAAACTTTGCTAGAAGAACCAGCTGTAATCGATGTGGTCGGG |
220 | 439 | CTTCTACAGAGATAACAATTATTTTGCTTTTCAGAAGGACGCATGCTGTTTCTTAGGGACACGGCTGACTTCCAGATATGACCATGTATTTGTGGCTTAAACTCTTGGCATTTGGCTTTGCCTTTCTGGACACAGAAGTATTTGTGACAG | 440 | GGCAAAGCCCAACACCTTCCCCCACTGGTAAGAATTAATATTTATATTTTTACTAATTTTATTTTCTTGTTGCAAAGTTTATATATTTAACTACAATTTTCTATTATTAACACTGAAATTATTTTTAAGGATAAATTTTATAATCATGAG |
221 | 441 | AATTTCGCTAAGGAGTTTGTGATCAGTGATCGGAAGGAGCTGGAGGAAGATTTCATCAAGAGCGAGCTGAAGAAGGCGGGGGGAGCCAATTACGACGCCCAGACGGAGTAACCCCAGCCCCCGCCACACCACCCCTTGCCAAAGTCATCT | 442 | CGGTTGTTTGCCTTCGTGCGCTTCACCACCGGGGATGCCATGAGCAAGAGGTCCAAGTTTGCCCTCATCACGTGGATCGGTGAGAACGTCAGCGGGCTGCAGCGCGCCAAAACCGGGACGGACAAGACCCTGGTGAAGGAGGTCGTACAG |
222 | 443 | GTGAAGGCGTTTGGGCCGGGGCTGCAGGGAGGCAGTGCGGGCTCCCCCGCCCGCTTCACCATCGACACCAAGGGCGCCGGCACAGGTGGCCTGGGCCTGACGGTGGAGGGCCCCTGTGAGGCGCAGCTCGAGTGCTTGGACAATGGGGAT | 444 | GTGGAGCCAGGCCTGGGGGCTGACAACAGTGTGGTGCGCTTCCTGCCCCGTGAGGAAGGGCCCTATGAGGTGGAGGTGACCTATGACGGCGTGCCCGTGCCTGGCAGCCCCTTTCCTCTGGAAGCTGTGGCCCCCACCAAGCCTAGCAAG |
223 | 445 | ATGGTCATCTTTAAGGTACCTGATTGCATGCACTTAAATGCAGATTATTTTGGAGTTTGAAAAGGGACTATTAATGAAATCTTTCTTTTCCCTCCTTTCTCTTTTTCCCTTCCCCGCCACTGATTCAGTGAGCTGGAGATTGGATCACAG | 446 | CCGAAGGAGTAAAGGTGCTGCAATGATGTTAGCTGTGGCCACTGTGGATTTTTCGCAAGAACATTAATAAACTAAAAACTTCATGTGTCTGGTTGTTTGAAATGTATTTGCAGTTTCCTGGGACTGCTAGGAGGTTAGTCTGCTGATTTC |
224 | 447 | GACTATCCATCCCTTGCCTTGCTTGGAGAGAAATTGGCAGAGAACAACATCAACCTCATCTTTGCAGTGACAAAAAACCATTATATGCTGTACAAGGTATGCTGGGAGGGAGGGAGGCTAGTGATTTGTGGGGTGAAGTGGGTGGTGAGG | 448 | TGGCGAAAGGATGCACTGCATTTGCTGGTGTTCACAACAGATGATGTGCCCCACATCGCATTGGATGGAAAATTGGGAGGCCTGGTGCAGCCACACGATGGCCAGTGCCACCTGAACGAGGCCAACGAGTACACTGCATCCAACCAGATG |
225 | 449 | ATGTGGATGAGGCTTCCAAGAAGGAGATCAAAGACATCCTCATCCAGTATGACCGGACCCTGCTGGTAGCTGACCCTCGTCGCTGCGAGTCCAAAAAGTTTGGAGGCCCTGGTGCCCGCGCTCGCTACCAGAAATCCTACCGATAAGCCC | 450 | ATTTATGGTGAGTCCCAGGAACTGGGCGCATGGAGGAGGTGGCTCTGGGAGGGAGGCCTTCACAGCGCTCCTGTACCCTTTAATTGTGTGTCTTTCTCACAGCTATCCGTCAGTCCATCTCCAAAGCCCTGGTGGCCTATTACCAGAAAT |
226 | 451 | AGCCTCCCAGGCTGGGCAGCTGCTCTGGTCTCACCTCTCTGCTTTCTGTAGGTATTGGCAAGCTTGCCAGTGTACCTGCTGGTGGGGCTGTAGCCGTCTCTGCTGCCCCAGGCTCTGCAGCCCCTGCTGCTGGTTCTGCCCCTGCTGCAG | 452 | CAGAGGAGAAGAAAGATGAGAAGAAGGAGGAGTCTGAAGAGTCAGATGATGACATGGGATTTGGCCTTTTTGATTAAATTCCTGCTCCCCTGCAAATAAAGCCTTTTTACACATCTCTCAAGTATTCCATGAGCACTTTGTCAAGGGTGG |
227 | 453 | GATGTTGTCTACAATGCATCTAATAACGAGCTGGTTCGTACCAAGACCCTGGTGAAGAATTGCATCGTGCTCATCGACAGCACACCGTACCGACAGTGGTACGAGTCCCACTATGCGCTGCCCCTGGGCCGCAAGAAGGGAGCCAAGCTG | 454 | ACTCCTGAGGAAGAAGAGATTTTAAACAAAAAACGATCTAAAAAAATTCAGAAGAAATATGATGAAAGGAAAAAGAATGCCAAAATCAGCAGTCTCCTGGAGGAGCAGTTCCAGCAGGGCAAGCTTCTTGGTGAGAAGGCTGTTGTGTTG |
228 | 455 | GTTACTGTGCTCTTTGCTGGCCAGCACATCGCCAAGAGCCCCTTCGAGGTGTACGTGGATAAGTCACAGGGTGACGCCAGCAAAGTGACAGCCCAAGGTCCCGGCCTGGAGCCCAGTGGCAACATCGCCAACAAGACCACCTACTTTGAG | 456 | CAGAGGCCCCGCAGCGCTCCCTTTCAGTGGGGCTGCTCTTAGCAAAGGCTCACAGGCTCCTTCCCACTGCAGGCAAAAGTGACCGCCAATAACGACAAGAACCGCACCTTCTCCGTCTGGTACGTCCCCGAGGTGACGGGGACTCATAAG |
229 | 457 | ACTTAATAATTTTTTAAAATGTAGAACAAGTGTGCAGAATACTGGCCGTCAATGGAAGAGGGCACTCGGGCTTTTGGAGATGTTGTTGTAAAGATCAACCAGCACAAAAGATGTCCAGATTACATCATTCAGAAATTGAACATTGTAAAT | 458 | AAAAAAGAAAAAGCAACTGGAAGAGAGGTGACTCACATTCAGTTCACCAGCTGGCCAGACCACGGGGTGCCTGAGGATCCTCACTTGCTCCTCAAACTGAGAAGGAGAGTGAATGCCTTCAGCAATTTCTTCAGTGGTCCCATTGTGGTG |
230 | 459 | GCTATCAAAGGAGGCTGACTTTGTACTATCTGATATGCATGTGTTTGTGGCCTGTGAGTCTGTGATGTAAGGCTCAATGTCCTTACAAAGCAGCATTCTCTCATCCATTTTTCTTCCCCTGTTTTCTTTCAGACTGTGGCTTCACCTCCG | 460 | AGTCTTACCAGCAAGGGGTCCTGTCTGCCACCATCCTCTATGAGATCTTGCTAGGGAAGGCCACCTTGTATGCCGTGCTGGTCAGTGCCCTCGTGCTGATGGCCATGGTAAGGAGGAGGGTGGGATAGGGCAGATGATGGGGGCAGGGGA |
231 | 461 | GCCTGGTGGAGCCAGTGGACGTGGTAGACAACGCTGATGGCACCCAGACCGTCAATTATGTGCCCAGCCGAGAAGGGCCCTACAGCATCTCAGTACTGTATGGAGATGAAGAGGTACCCCGGAGGTAAGAGGCAGGGCCTGCTGCCTGTG | 462 | TGCATGATGTGACAGATGCGTCCAAGGTCAAGTGCTCTGGGCCCGGCCTGAGCCCAGGCATGGTTCGTGCCAACCTCCCTCAGTCCTTCCAGGTGGACACAAGCAAGGCTGGTGTGGCCCCATTGCAGGTCAAAGTGCAAGGGCCCAAAG |
232 | 463 | CCCTCTCCCAGAGACTACAGAGAACGTGGTGTGTGCCCTGGGCCTGACTGTGGGTCTGGTGGGCATCATTATTGGGACCATCTTCATCATCAAGGGATTGCGCAAAAGCAATGCAGCAGAACGCAGGGGGCCTCTGTAAGGCACATGGAG | 464 | GTGATGGTGTTTCTTAGAGAGAAGATCACTGAAGAAACTTCTGCTTTAATGGCTTTACAAAGCTGGCAATATTACAATCCTTGACCTCAGTGAAAGCAGTCATCTTCAGCATTTTCCAGCCCTATAGCCACCCCAAGTGTGGATATGCCT |
233 | 465 | AATGTTTTACATTGTGATATATAATATATATATATATATAAATTCACATTAGCAAACTAATTATTTTATTTTTTGTTACTGAAATTCAGGCCTATTTTCACAATGGAGACTATCCTGGAGAACCCTTTATTTTACATCATTCAACATCTT | 466 | ATAATTCTAAGGCACTGATAGCATTTCTGGCATTTCTGATTATTGTGACATCAATAGCCCTGCTTGTTGTTCTCTACAAAATCTATGATCTACATAAGAAAAGATCCTGGTAAGAGTTGATTTTAAATTTTTAAATAATAATGGTATTAG |
234 | 467 | GACAATGAAGGCTGCCCTGTGGAGGCGTTGGTCAAGGACAACGGCAATGGCACTTACAGCTGCTCCTACGTGCCCAGGAAGCCGGTGAAGCACACAGCCATGGTGTCCTGGGGAGGCGTCAGCATCCCCAACAGCCCCTTCAGGGTGAGC | 468 | ACCTGGCCCCCTGACAGCTGGGTGGTCTCCCGCTAGGTGAAGGCACGTGGGCCTGGATTGGAGAAGACAGGTGTGGCCGTCAACAAGCCAGCAGAGTTCACAGTGGATGCCAAGCACGGTGGCAAGGCCCCACTTCGGGTCCAAGTCCAG |
235 | 469 | AAGTGTAGGCCTCCCAGGGACCGTAATGGCCCCCATGAACGTTACCATTTGGAAGTTGAAGCTGGAAATACTCTGGTTAGAAATGAGTCGCATAAGAATTGCGATTTCCGTGTAAAAGATCTTCAATATTCAACAGACTACACTTTTAAG | 470 | GCCTATTTTCACAATGGAGACTATCCTGGAGAACCCTTTATTTTACATCATTCAACATCTTGTAAGTTATCACTGGGCTATTTATTATATATATTAAGATATATATTAATGCTTATAAAGCTATATTATTTTACACTTATAATCACATTT |
236 | 471 | TGCATGTAAACCCACGCTTACGTCTCTTCCTTCCTTCCCACTACAGAGTTTTATGGACGAGGAGCCCCCTACAATGCCTTGACGGGGAAGGACTCCACTAGAGGGGTAGCCAAGATGTCCTTGGATCCTGCAGACCTCACCCATGACACT | 472 | ACGGGTCTCACGGCCAAGGAACTGGAGGCCCTGGATGAGGTCTTCACCAAAGTGTACAAAGCCAAATACCCCATCGTCGGCTACACTGCCCGGAGAATTCTCAATGAGGATGGCAGCCCTAACCTGGACTTCAAGCCTGAAGACCAGCCC |
237 | 473 | GCTGGAGGACCGCAGTCTGTCCTTCTAGCCTGACCCCTGCTGTCTTCCTAGGCCATCTGGCTGCTGTGCACAGGCGCTCGTGAGGCTGCCTTCCGGAACATTAAGACCATTGCTGAGTGCCTGGCAGATGAGCTCATCAATGCTGCCAAG | 474 | GGCTCCTCGAACTCCTATGCCATTAAGAAGAAGGACGAGCTGGAGCGTGTGGCCAAGTCCAACCGCTGATTTTCCCAGCTGCTGCCCAATAAACCTGTCTGCCCTTTGGGGCAGTCCCAGCCACCTGTGCTGTTGTCTGTCTTCGGTGGG |
238 | 475 | GAGGAGATGTTAAAGTAACCCATCTTGCAGGACGACATTGAAGATTGGTCTTCTGTTGATCTAAGATGATTATTTTGTAAAAGACTTTCTAGTGTACAAGACACCATTGTGTCCAACTGTATATAGCTGCCAATTAGTTTTCTTTGTTTT | 476 | TGTAAAGAAAGTTTTGTAGAAAACTATCTTAATGAGAATTCTGTGTTTTCAAAATAGGCCAACAGAGCACTTATGGCAAGGCATCTCGAGGGGGTGGCAATCACCAAAACAATTACCAGCCATACTAAAGGAGAACATTGGAGAAAACAG |
239 | 477 | ACACGGCACTCGTTGTGCGGGAGAAGTTGCTGCTTCAGCAAACAATTCCTACTGCATCGTGGGCATAGCGTACAATGCCAAAATAGGAGGTAAGGCCGGGCGTGGCAGCCTGCGAGCCGAGGGGCCTGGGGCAGGGGCAGCTGGGAGCTC | 478 | GGAACATAAAATCCATTTCAAACAGAGCTGTCACATGCCATTTCTCCTCACTCACCACGATTCCATTTCTTAGGATTCCTACGCCAGCTACGACGTGAACGGCAATGATTATGACCCATCTCCACGATATGATGCCAGCAATGAAAATAA |
240 | 479 | GACCTCTGGTCCTCAATCGAGTTCCTCTACGAAGAACACACCAGAAATTTGTCATTGCCACTTCAACCAAAATCGATATCAGCAATGTAAAAATCCCAAAACATCTTACTGATGCTTACTTCAAGAAGAAGAAGCTGCGGAAGCCCAGAC | 480 | GTAGAAATTTCCTTTACCCAAATTTAGATGCCTGTGATTTTATGAATTCAGAAGTCAGTTTTTAATTGCAGAAAACTAATTATTTTCTTTTTAACTTACAGAGGGTGGTTTTCCTGAAGCAGCTGGCTAGTGGCTTATTACTTGTGACTG |
241 | 481 | GACACTCTGGACTTCAGCCAACAGGTAATACCTTTTAATCCTCTTTTAGAAACAGACACAGTTTCCCTAGTGAGAGGTGAAGCCAGCTGGACTTCTGGGTGGGGTGGGGACTTGGAGAACTTTTCTTACAAGAGGTTTTTTTTTGTTTTT | 482 | TTTGGAATAAACATCACTAAACCTGGCTTCCTCTCTCAGGAGCACGGTCTGAATCTGCACAGAGCAAGATGCTGAGTGGAGTCGGGGGCTTTGTGCTGGGCCTGCTCTTCCTTGGGGCCGGGCTGTTCATCTACTTCAGGAATCAGAAAG |
242 | 483 | GCCTGGAGTGGTGTGTCTAAGGGACTGGCTGAGAGTCTGCAGCCAGACTACAGTGAACGACTCTGCCTCGTCAGTGAGATTCCCCCAAAAGGAGGGGCCCTTGGGGAGGGGCCTGGGGCCTCCCCATGCAACCAGCATAGCCCCTACTGG | 484 | TTTCATATGGGACAACTGGGAGAAGGGTGATAAAAAAGCTTTAACCTATGTGCTCCTGCTCCCTCTTTCTCCCCTGTCAGGACGATGCCCCGAATTCCCACCCTGAAGAACCTAGAGGATCTTGTTACTGAATACCACGGGAACTTTTCG |
243 | 485 | GCTGGGACAAGCGTTACTGTGAAGCGGGCTTCAGCTCCGTGGTCACTCAGGCGAGTAGGGAGCAAAAGCGCAGTGGGGGCGGCTCCCAAACAGGGCCCCCTCTCACCCTCAGGACTTCCCTTCCAGGCCGGAGAGCTGGTGCTTGGGGCT | 486 | AGCACTGGAACGTCCTAGAAAAGACTGAGGAGGCTGAGAAGACGCCCGTAGGTAGCTGCTTTTTGGCTCAGCCAGAGAGCGGCCGCCGCGCCGAGTACTCCCCCTGTCGCGGGAACACCCTGAGCCGCATTTACGTGGAAAATGATTTTA |
244 | 487 | TCTCACAGAAAGTTCTCCGCTCCCAGACATGGGTCCCTCGGCTTCCTGCCTCGGAAGCGCAGCAGCAGGCATCGTGGGAAGGTGAAGAGCTTCCCTAAGGATGACCCGTCCAAGCCGGTCCACCTCACAGCCTTCCTGGGATACAAGGCT | 488 | GCGTGGAAATGGCGCTCGGTACGTGCCCCCGACCTGTCGTCTGCCGCGGGGGCGCGCTCGCACGCCGGAAGGGGCGGGGCCAGATTTGGCTTTATATAGCGGACCCGTAAGGCCGACCGGCCTCTACCGGCGGGATTTGATGGCGTGATG |
245 | 489 | AAGAAAAGCAGCAAACAGAAAGGGTTACAAAAGAGATGAATGAATTTATCCATAAAGAGCAAAATAGTTTATCACTACTAGAAGCAAGAGAAGCAGACGGTGATGTGGTTAATGAAAAGAAGAGAACTCCAAATGAAACCACATCAGTTT | 490 | CAGGAGACAGTGAAGATGAGAGGAGTGACAGAGGATCTGAGTCATCTGACACTGATGATGAAGAATTACGGCATCGAATCCGGCAAAAACAGGAAGCTTTTTGGAGAAAAGAAAAAGAACAGCAGCTATTACATGATAAACAGATGGAAG |
246 | 491 | ATTTCCCATGAGCACCCACAGGCGTGCACGCAGCGCAGCCCTCCGTCGTCGCTCGCGCCCTTTATACTCACTTCCGCCCGCGAGCCACTTCCTTTCCTTTCAGCGGAGCGCGGCGGCAAGATGGCAGTGCAAATATCCAAGAAGAGGAAG | 492 | TTTGTCGCTGATGGCATCTTCAAAGCTGAACTGAATGAGTTTCTTACTCGGGAGCTGGCTGAAGATGGCTACTCTGGAGTTGAGGTGCGAGTTACACCAACCAGGACAGAAATCATTATCTTAGCCACCAGGTAAAACTCATTTGACTGG |
247 | 493 | TAAATGTTAATGAGAATGTGGAATGTGGAAACAATACTTGCACAAACAATGAGGTGCATAACCTTACAGAATGTAAAAATGCGTCTGTTTCCATATCTCATAATTCATGTACTGCTCCTGATAAGACATTAATATTAGATGTGCCACCAG | 494 | GGGTTGAAAAGTTTCAGTTACATGATTGTACACAAGTTGAAAAAGCAGATACTACTATTTGTTTAAAATGGAAAAATATTGAAACCTTTACTTGTGATACACAGAATATTACCTACAGATTTCAGTGTGGTAAGAATATAACATTGACCA |
248 | 495 | ATCAAGACACAGCCATCCGGGTCTTCGCCATCCCCCCATCCTTTGCCAGCATCTTCCTCACCAAGTCCACCAAGTTGACCTGCCTGGTCACAGACCTGACCACCTATGACAGCGTGACCATCTCCTGGACCCGCCAGAATGGCGAAGCTG | 496 | AGGCCAAAGAGTCTGGGCCCACGACCTACAAGGTGACCAGCACACTGACCATCAAAGAGAGCGACTGGCTCGGCCAGAGCATGTTCACCTGCCGCGTGGATCACAGGGGCCTGACCTTCCAGCAGAATGCGTCCTCCATGTGTGTCCCCG |
249 | 497 | GCTACATCCACGTGACGCAGACCTTCAGCATTATGGCTGTTCTGTGGGCCCTGGTGTCCGTGAGCTTCCTGGTCCTGTCCTGCTTCCCCTCACTGTTCCCCCCAGGCCACGGCCCGCTTGTCTCAACCACCGCAGCCTTTGCTGCAGGTA | 498 | TCTGCCGGTCCCTGGCCCTGCTGGGGGGCTCCCTGGGCCTGATGTTCTGCCTGATTGCTTTGAGCACCGATTTCTGGTTTGAGGCTGTGGGTCCCACCCACTCAGCTCACTCGGGCCTCTGGCCAACAGGGCATGGGGACATCATATCAG |
250 | 499 | AGCTGAATGGTGAAACAAATACACCCATTGAAGGAAACCAGGCGGGTGATGCAGCTGCCTCTGCCAGGAGTCTACCAAATGAAGAAATAGTGCAGAAGATAGAGGAAGTACTTTCTGGGGTCTTAGATACAGAACTACGATATAAGCCAG | 500 | ACTTGAAAGAGGGCTCCAGAAAAAGTAGATGCGTATCTGTACAAACAGATCCTACTGATGAAATTCCCACTAAAAAGTCAAAGAAGCATAAAAAGCACAAAAACAAAAAGAAGAAAAAGAAGAAAGAAAAGGAAAAAAAATATAAAAGAC |
251 | 501 | ATATCCCAGAAGAAACTGAAGAAACAAAAACTTATGGCACGGGAGTAAATTCAGCATTAAAATAAATGTAATTAAAAGGAAAAGAATGTTGGTTGTCTTTATTAGTGAACATATTTCAAGTGTCCTTACAAGATGGATCAAATGAGGATT | 502 | AACAAAGCACCTAAGATGCGCCGCCGGACCTACAGAGCTCATGGTCGGATTAACCCATACATGAGCTCTCCCTGCCACATTGAGATGATCCTTACGGAAAAGGAACAGATTGTTCCTAAACCAGAAGAGGAGGTTGCCCAGAAGAAAAAG |
252 | 503 | TCTCCTTCCGCCTGCGGAGGGGAAGCTGAAGTCTGGTCTTCCTCAGGTCTGGTCTTCTCTCGTCTGAGCCCTGAGTACTACGACCTGGCAAGAGCCCACCTGCGTGATGAGGAGAAATCCTGCCCGTGCCTGGCCCAGGAGGGCCCCCAG | 504 | GGTGACCTGTTGACCAAAACACAGGAGCTGGGCCGTGACTACAGGACCTGTCTGACGATAGTCCAAAAACTGAAGAAGATGGTGGATAAGCCCACCCAGGTGAGGCCAAGGGGCTACAGAGCCTCCTGTCTGCTGCTCAATGGAGGGGCC |
253 | 505 | AAATTAGCATCCAGGATATGACAGCCCAGGTGACCAGCCCATCGGGCAAGACCCATGAGGCCGAGATCGTGGAAGGGGAGAACCACACCTACTGCATCCGCTTTGTTCCCGCTGAGATGGGCACACACACAGTCAGCGTGAAGTACAAGG | 506 | GGCGTGGGCCGTGCTTTCTTCCTGCAGGCAGCCCCTTCTCTGTGAAGGTGACAGGCGAGGGCCGGGTGAAAGAGAGCATCACCCGCAGGCGTCGGGCTCCTTCAGTGGCCAACGTTGGTAGTCATTGTGACCTCAGCCTGAAAATCCCTG |
254 | 507 | CAGCTGGCCGACGTTGCGGAGAAATGGTGCTCCAACACGCCCTTCGAGCTCATCGCCACCGAGGAGACCGAACGCAGGATGGATTTCTACGCCGACCCCGGCGTCTCCTTCTATGTGCTGTGTCCGGACAACGGCTGCGGCGACAATTTT | 508 | TTACTGGGGCTTCCGGATGCAGATGACGATGCGTTTGAAGAGTACAGTGCTGACGTGGAAGAAGAGGAGCCAGAGGCGGACCACCCCCAGATGGGGGTCAGCCAGCAGTAAATCTGGGGGCTCCCCTGAGAAGGAGAGTGAGCCCCACAG |
255 | 509 | AGGGAGAGGAGCGCGTGAAGCAGAAGGACCTGGGGGATGGCGTGTATGGCTTCGAGTATTACCCCATGGTCCCTGGAACCTATATCGTCACCATCACGTGGGGTGGTCAGAACATCGGGCGCAGGTGAGGCCCCCAGGCATCCCTCTCCC | 510 | CTCTGCCTGCAGCCTGTAACCCGAGTGCCTGCCGGGCGGTTGGCCGGGGCCTCCAGCCCAAGGGTGTGCGGGTGAAGGAGACAGCTGACTTCAAGGTGTACACAAAGGGCGCTGGCAGTGGGGAGCTGAAGGTCACCGTGAAGGGCCCCA |
256 | 511 | AGGCTGCTGCTGAGAAGGCAGTGACCAAGGAGGAATTTCAGGGTGAATGGACTGCTCCCGCTCCTGAGTTCACTGCTACTCAGCCTGAGGTTGCAGACTGGTCTGAAGGTGTACAGGTGCCCTCTGTGCCTATTCAGCAATTCCCTACTG | 512 | AAGACTGGAGCGCTCAGCCTGCCACGGAAGACTGGTCTGCAGCTCCCACTGCTCAGGCCACTGAATGGGTAGGAGCAACCACTGACTGGTCTTAAGCTGTTCTTGCATAGGCTCTTAAGCAGCATGGAAAAATGGTTGATGGAAAATAAA |
257 | 513 | CAGCATCTATTACTAATTTCCATCCTAAGTACTGAGTTCATTAAGTCTTGGGTTCCTTTATTTTGGCTTGCATTATTGCATTTTCAGATCAACTAAAAAGGTCAGAGGCAGAGAAACTCACCCTGGTGACCAGAGTACAGCAGTTGCAGG | 514 | GTTTGCTTCAAAATCAATCCTTACAGCTTCAAGAACAGGAGAAACTCTTAACAAAGAAAGGTCAGCAAATTTATTACCACAAATTCTAAGATATTGCTCTTCTCTTACCTGCCTAGAGGCAGCGGGATGGACTACATGACCTCCTGGAGT |
258 | 515 | GGGACGTCAGCATCGGCATCAAGTGTGCCCCTGGAGTGGTAGGCCCCGCCGAAGCTGACATCGACTTCGACATCATCCGCAATGACAATGACACCTTCACGGTCAAGTACACGCCCCGGGGGGCTGGCAGCTACACCATTATGGTCCTCT | 516 | CTTGCCCTTGCCCCTGTGCCCTGCAGGTGAATGTGGGAGCTGGCAGCCACCCCAACAAGGTCAAAGTATACGGCCCCGGAGTAGCCAAGACAGGGCTCAAGGCCCACGAGCCCACCTACTTCACTGTGGACTGCGCCGAGGCTGGCCAGG |
259 | 517 | CTCCGGGTTGACAAATGGTGGGGTAACAGAAAGGAACTGGCTACCGTTCGGACTATTTGTAGTCATGTACAGAACATGATCAAGGGTGTTACACTGGTAAGCAGATGTATCAGACTTCCTTGTTTTGGAAAGGGAGGTTTCTCAAACCTG | 518 | TTGTGTGGCCTGACGAGTGTGTTCTCTCTTCTAGTCGACATTACTCTGAAGGGACGCACAGTTATCGTGAAGGGCCCCAGAGGAACCCTGCGGAGGGACTTCAATCACATCAATGTAGAACTCAGCCTTCTTGGAAAGAAAAAAAAGAGG |
260 | 519 | CATCCACTGCTGCCTCTGTTCTCTCCCCAGGCTGTCCAGATTATGAATGGGCTCTTCCACATTGCCCTGGGGGGTCTTCTGATGATCCCAGCAGGGATCTATGCACCCATCTGTGTGACTGTGTGGTACCCTCTCTGGGGAGGCATTATG | 520 | TATATTATTTCCGGATCACTCCTGGCAGCAACGGAGAAAAACTCCAGGAAGTGTTTGGCAAGTAACCATATGTCCTTCTTTCCCACATGTCAGAGAAGTACCTATTTTTTTCGGTTAAAAACTGAGACCCTTAAAAAGCCAAGGTATCAC |
261 | 521 | GGGGTATGATGGCATCTGACTCCTTGTTACCCACTTCCTGCAGCTAGATACACTGTCAGATCCTTTGGCATCCGGAGAAATGAAAAGATTGCTGTCCACTGCACAGTTCGAGGGGCCAAGGCAGAAGAAATCTTGGAGAAGGGTCTAAAG | 522 | GTGCGGGAGTATGAGTTAAGAAAAAACAACTTCTCAGATACTGGAAACTTTGGTTTTGGGATCCAGGAACACATCGATCTGGGTATCAAATATGACCCAAGCATTGGTATCTACGGCCTGGACTTCTATGTGGTATGAATATTTAATCTT |
262 | 523 | AAAAGCAACTGGAAGAGAGGTGACTCACATTCAGTTCACCAGCTGGCCAGACCACGGGGTGCCTGAGGATCCTCACTTGCTCCTCAAACTGAGAAGGAGAGTGAATGCCTTCAGCAATTTCTTCAGTGGTCCCATTGTGGTGCACTGCAG | 524 | TGCTGGTGTTGGGCGCACAGGAACCTATATCGGAATTGATGCCATGCTAGAAGGCCTGGAAGCCGAGAACAAAGTGGATGTTTATGGTTATGTTGTCAAGCTAAGGCGACAGAGATGCCTGATGGTTCAAGTAGAGGTATGTTCTAACCT |
263 | 525 | TGGTCTGGTCTCTCACTCCCCAGGCAATACTAGCCCCTCTGGAGCACGGAGCTCCTTCCCCAAAGACATGAAGCTATTGGAGAACTCGAGCTTTGAAGCCATCAACTCACAGCTGACTGTGGAGACTGGAGATGCCCACATCATTGGCAG | 526 | GATTGAGAGCTACTCATGTAAGATGGCAGGAGACGACAAACACATGTTCAAGCAGTTCTGCCAGGAGGGCCAGCCCCACGTGCTGGAGGCACTTTCTCCACCCCAGACTTCAGGACTGAGCCCCAGCAGGTGAGCCATGGTGGGGCCTAC |
264 | 527 | GTCTCAACATATGCACTAGTGGAAGTGCCACCTCATGTGAAGAATGTCTGCTAATCCACCCAAAATGTGCCTGGTGCTCCAAAGAGGTATGTAGGTGGGGGAGGGGAGGAAGAAGGGAAGGAATGCTGCGAGGGTGAGGGTGAGAAGGAG | 528 | CCCGCGCTCCGGCCCCAGCCCCGGCCGCCGGCCCCCGCGGAGTGCAGCGACCGCGCCGCCGCTGAGGGAGGCGCCCCACCATGCCGCGGGCCCCGGCGCCGCTGTACGCCTGCCTCCTGGGGCTCTGCGCGCTCCTGCCCCGGCTCGCAG |
265 | 529 | GTCCTCGCAAGGGCCGAGAGGTGTACCGGCATTTCGGCAAGGCCCCAGGAACCCCGCACAGCCACACCAAGTGAGTATCAGGCCCCCAGCCCTGCCCTCTCCCCAGACTCAGCCTGCAGGGCCAGGCCTGGCCACACTTGGGCTGCTTCT | 530 | CCCGCCAGCCTTGTCCTCTCCACCAGGTATGTGCACTGCGCGTGACCAGCCGGGCCCGCAGCCGCATCCTCAGGGCAGGGGGCAAGATCCTCACTTTCGACCAGCTGGCCCTGGACTCCCCTAAGGGCTGTGGCACTGTCCTGCTCTCCG |
266 | 531 | TGTTAAATCTAACTAGATAGACTTTATGAAGTAGAAGTATTGTAAATCAGCTTTCCCAAAAATGACATGGCAGATATTCTAAAGCAAAATTTTAATAATTTACATTTTTTTTCTCCATTACAGCTACTACTCCATCTAAGCCAACATGTG | 532 | ATGAAAAATATGCAAACATCACTGTGGATTACTTATATAACAAGGAAACTAAATTATTTACAGCAAAGCTAAATGTTAATGAGAATGTGGAATGTGGAAACAATACTTGCACAAACAATGAGGTGCATAACCTTACAGAATGTAAAAATG |
267 | 533 | AGTCTAAGTGTATTCCCTCTGGCTTCCATTTAGATTTTCCTGAGAGAACTGATTTCAAATGCTTCTGATGCTTTAGATAAGATAAGGCTAATATCACTGACTGATGAAAATGCTCTTTCTGGAAATGAGGAACTAACAGTCAAAATTAAG | 534 | TGTGATAAGGAGAAGAACCTGCTGCATGTCACAGACACCGGTGTAGGAATGACCAGAGAAGAGTTGGTTAAAAACCTTGGTACCATAGCCAAATCTGGGACAAGCGAGTTTTTAAACAAAATGACTGAAGCACAGGAAGATGGCCAGTCA |
268 | 535 | GCTTCCGGATGCAGATGACGATGCGTTTGAAGAGTACAGTGCTGACGTGGAAGAAGAGGAGCCAGAGGCGGACCACCCCCAGATGGGGGTCAGCCAGCAGTAAATCTGGGGGCTCCCCTGAGAAGGAGAGTGAGCCCCACAGTAACCTAG | 536 | GCATGTCGGTGTAAACCTGATTGTCTCGACATTTTCTGTTTAATTGATTGGTGCTGTGAGGAGTTCGGCTGCTCGTGGTAAAACAGCGTACTCCAGTTTTAAGTCATCGGGTAAAATAATAGGACAGTGATTTCCATCTGTGCTTCAGTA |
269 | 537 | TGCCGGGAGAGCCGCGGCGACGTCAGTTCCTCCTTTCGGGGCTCTGATTGGTCAGAGCGCCCGGCGCTTCTGGTTGGCCGGCCCTGCTATCATCCCAGAGTGCATTGCGGGGCCGCTTCCTTTCCGCTCGGCTGTTTTCCTGCGCAGGAG | 538 | CCGCAGGGCCGTAGGCAGCCATGGCGCCCAGCCGGAATGGCATGGTCTTGAAGCCCCACTTCCACAAGGACTGGCAGCGGCGCGTGGCCACGTGGTTCAACCAGCCGGCCCGTAAGATCCGCAGGTGAGCCCTGCGCTCGGGGCTGCCCC |
270 | 539 | AGCTGGTGGAGAACCCTGCTGACTTCTGTGGTTTCTGTGCTCTTCCCAGAAGTAAGGCTGTCACAAGGCTGGAAGCAGAGAACATCCCCATGGAACTGAAGACAGCATGCTGCATCCCTGGGAGGAGGGAGCTCTTAAGGAAGTTCCAAG | 540 | TGGATGTAAAGCTGGATCCCGCCACGGCGCACCCGAGTCTGCTCTTGACCGCCGACCTGCGCAGTGTGCAGGATGGAGAACCATGGAGGGATGTCCCCAACAACCCTGAGCGATTTGACACATGGCCCTGCATCCTGGGTTTGCAGAGCT |
271 | 541 | GGTCTGCACCTTTTCTCCACAGAGGGTGTCGCCCACCACCAGTCCCCGCCCTGCTAGCCCAGTCCCTGGGAGTGAGCGCCGGGGCTCCCGGCCTGAGATGACTCCCGTTAGCTTACGAAGCACAAAGGAGGGGCCCGAGACCCCACCCAG | 542 | GCCCCGGGATCAGCTGCCCCCCCATCCCCGCTCCCGCCGCTACTCCGGGGACAGTGACTCCTCAGCCTCCTCCGCCCAGAGCGGCCCCCTTGGTACCCGCAGTGATGACACAGGCACTGGCCCCCGGAGGGAGCGACCCAGCCGGCGGCT |
272 | 543 | GATGACAGCAAGGCTGGCATGGAGGAAGATCACACCTACGAGGTAAGGAGAGGGGCAGGCCCAGCAGCTCTGAGTCCTCGGGGTCAGTGGCCACTATCTGCTGGTGTGGTTGGGGTGTGGTCCCGGCCTGAGTTCCACTTAATGTCTCCA | 544 | CAGGCTCACAACCCCACCCCTGTCCCCGCAGGATTCAGCACCTTGGCACAGCTGAAGCAGAGGAACACGCTGAAGGATGGTATCATCATGATCCAGACGCTGCTGATCATCCTCTTCATCATCGTGCCTATCTTCCTGCTGCTGGACAAG |
273 | 545 | GCAAGAAATCCAAGAGAAGGTAAGTTTTATTAGTGGCGAGGAGTTTCCACATCTGCTGATTCATTCTCTACTTCCTTAAGTTACTTCCTGCTCTAGCTAGACACATTAACCCCATAGTAGTTTATTTACCTGGGGTCCTCATCCAAATGA | 546 | CCTTACTGATTTCTCTTTCAGAATTGGACAAAAGTTTCTCAATGATTAAGGAGGGTGATTATAACCCCCTCTTCATTCCAGTGGCAGTCATGGTTACTGCATTCTCTGGGTTGGCATTTATCATTTGGCTGGCAAGGAGATTAAAAAAAG |
274 | 547 | GATCTCCCTCAGCGATCCCTGGCCCTGGCAGAGCAGAAGTGTGAAGAGTGGAGGAGCCAGTATGAGGCTCTGAAGGAGGACTGGAGGACCCTTGGGACCCAGCACAGGGAGCTGGAGAGCCAACTCCACGTGCTTCAGTCCAAACTGCAG | 548 | GGAGCAGATAGCAGGGACTTACAGATGAACCAGGCCCTGCGATTTTTGGAAAATGAGCACCAGCAACTGCAGGCCAAGATTGAATGCCTGCAAGGGGACAGAGACCTGTGCAGCTTGGATACCCAGGACCTACAAGGTACTCTTCTCCTT |
275 | 549 | TTGCAGCAATGTGAATTGGGCCAGAAGATCAGAGTGTAATATGTGTAATACTCCAAAGTATGCTAAATTAGAAGAAAGAACAGGTATGATAAAACCACATTGTAACTAAATGATTTTTTTTAAAGCACTAAATATTGAAACGATAATTGT | 550 | TATTTCTCCAGGTTCATATTGCATGATTTTTCTGTTTTCAGAGAAAACAACTGAGGCCAAGATGATGAAAGCTGGGGGCACTGAAATAGGAAAGACACTTGCAGAAAAGAGCCGAGGCCTATTTAGTGCTAATGACTGGCAATGTAAAAC |
276 | 551 | CCGAGTCGTCCGGAAATCCATTGCCCGTGTTCTCACAGTTATTAACCAGACTCAGAAAGAAAACCTCAGGAAATTCTACAAGGTGAGTCTGCCTGGACATAGGGAGGGTTGGCTGCAGGAAGCCAAGTGCTAGCCGTCCCTGGCCGGGGA | 552 | CTCTTGCGCGCAGGCCAAGATCAAGGCTCGAGATCTTCGCGGGAAGAAGAAGGAGGAGCTGCTGAAACAGCTGGACGACCTGAAGGTGGAGCTGTCCCAGCTGCGCGTCGCCAAAGTGACAGGCGGTGCGGCCTCCAAGCTCTCTAAGAT |
277 | 553 | ATGAGGGCAGGACCTGGGGGGACCTGGGCGCCGCTGCCGGGGGCGGCACCCCCAGCAAGGGGGTCAACTTCGCCGAGGAGCCCATGCAGTCCGACTCCGAGGACGGGGAGGAGGAGGAGGCGGCGCCCGCGGACGCAGGGGCGTTCAATG | 554 | CTCCAGTAATAAACCGATTCACAAGGCGTGCCTCAGGTAAGTCTGATTATATTATGGATTTTGTTTATTAATGGTGACATTTAAAAAATGATAATATTGGACAAGAAGGTACAAAGAATAATTGCTGTATAGTAAACCTTTATTTGTCAG |
278 | 555 | TGTGAAATCAAAGTTGCACAACCCAAAGAGGTATATAGGCAGCAACAGCAACAACAAAAAGGTGGAAGAGGTGCTGCAGCTGGTGGACGAGGTGGTACGAGGGGTCGTGGCCGAGGTGAGACTTAATTCTTGAAATATGACTCCGTGGTT | 556 | TGTGCATCTCGATTTCAGATTGAAAATATTGAACTTCCCATGGATACAAAAACAAATGAAAGAAGAGGATTTTGTTTTATCACATATACTGATGAAGAGCCAGTAAAAAAATTGTTAGAAAGCAGATACCATCAAATTGGTTCTGGGAAG |
279 | 557 | GATCCCGAAGGCAAGCCGAAGAAGACACACATCCAAGACAACCATGACGGCACGTATACAGTGGCCTACGTGCCAGACGTGACAGGTCGCTACACCATCCTCATCAAGTACGGTGGTGACGAGATCCCCTTCTCCCCGTACCGCGTGCGT | 558 | AAGGTCAAGGTGCTGCCTACTCATGATGCCAGCAAGGTGAAGGCCAGTGGCCCCGGGCTCAACACCACTGGCGTGCCTGCCAGCCTGCCCGTGGAGTTCACCATCGATGCAAAGGACGCCGGGGAGGGCCTGCTGGCTGTCCAGATCACG |
280 | 559 | CCTGGTTCAAAAGCAGCTAAACCAAAAGAAGCCTCCAGACAGCCCTGAGATCACCTAAAAAGCTGCTACCAAGACAGCCACGAAGATCCTACCAAAATGAAGCGCTTCCTCTTCCTCCTACTCACCATCAGCCTCCTGGTTATGGTACAG | 560 | ATACAAACTGGACTCTCAGGACAAAACGACACCAGCCAAACCAGCAGCCCCTCAGCATCCAGCAACATAAGCGGAGGCATTTTCCTTTTCTTCGTGGCCAATGCCATAATCCACCTCTTCTGCTTCAGTTGAGGTGACACGTCTCAGCCT |
281 | 561 | AGGATCAAGGTGAAAAGGAGAACCCCATGCGGGAACTTCGCATCCGCAAACTCTGTCTCAACATCTGTGTTGGGGAGAGTGGAGACAGACTGACGCGAGCAGCCAAGGTGTTGGAGCAGCTCACAGGGCAGACCCCTGTGTTTTCCAAAG | 562 | CTAGATACACTGTCAGATCCTTTGGCATCCGGAGAAATGAAAAGATTGCTGTCCACTGCACAGTTCGAGGGGCCAAGGCAGAAGAAATCTTGGAGAAGGGTCTAAAGGTGAGCCTAATCCCCTAATGGAGTGATATTGATCAGCACTCCT |
282 | 563 | AGGAGTTGAAGCCAAACAGCCAAATTCTGCCATTAGGAAGTGTGTAAGGGTCCAGCTGATCAAGAATGGCAAGAAAATCACAGCCTTTGTACCCAATGACGGTTGCTTGAACTTTATTGAGGTGAGTATTTCAACTCTATCGTACCTTCT | 564 | TGGACTTCGTACTGCTAGGAAGCTCCGTAGTCACCGACGAGACCAGAAGTGGCATGATAAACAGTATAAGAAAGCTCATTTGGGCACAGCCCTAAAGGCCAACCCTTTTGGAGGTGCTTCTCATGCAAAAGGAATCGTGCTGGAAAAAGT |
283 | 565 | AGGACTTGACCTCTGACCCCTACCCTCTCTCTCTGGCCTCAGGTGAGGGAGATTCTGGGCCGCTGCACCTGCCCTGACCAGTTTCCCATGATCAAGGTCTCAGAGGGGAAGTACCGTGTGGGGGACTCGAGCCTGCTCATCTTTGTGCGG | 566 | GTGCTGAGGAGCCACGTGATGGTGCGAGTGGGTGGTGGCTGGGACACGCTGGAGCATTACCTGGACAAGCACGACCCGTGCCGCTGCTCCTCCACTGGTCAGTGCCAGGGTGGGGCTGGGGCTGGACGGGCAGGGGACTTGCTTCTGTGG |
284 | 567 | TGATTCAGTGTGAGCCTCTATCAGCACCAGATTTGGGGATCATGAACTGTAGCCATCCCCTGGCCAGCTTCAGCTTTACCTCTGCATGTACCTTCATCTGCTCAGAAGGAACTGAGTTAATTGGGAAGAAGAAAACCATTTGTGAATCAT | 568 | GTACCATGGACTGTACTCACCCTTTGGGAAACTTCAGCTTCAGCTCACAGTGTGCCTTCAGCTGCTCTGAAGGAACAAACTTAACTGGGATTGAAGAAACCACCTGTGGACCATTTGGAAACTGGTCATCTCCAGAACCAACCTGTCAAG |
285 | 569 | GGCCTGGACATTGACCAGACAGCCACCTATGAGGACATAGTGACGCTGCGGACAGGGGAAGTGAAGTGGTCTGTAGGTGAGCACCCAGGCCAGGAGTGAGAGCCAGGTCGCCCCATGACCTGGGTGCAGGCTCCCTGGCCTCAGTGACTG | 570 | CTCACTCCTGACCCCTCACCCCTCTCCCTGGCCCTCCCCAGCCTGGCCCAGCAGGGGATGGGGCTGGGGGACACTAACACTCTGATCTCCATCCCTCTCCGCCCCCAGGATGACAGCAAGGCTGGCATGGAGGAAGATCACACCTACGAG |
286 | 571 | GAAGAAATTCAAAAGAAAAGAACCCGCCGAGCAGTCAAATTCCAGAGGGCCATTACTGGTGCATCTCTTGCTGATATAATGGCCAAGAGGAATCAGAAACCTGAAGTTAGAAAGGCTCAACGAGAACAAGCTATCAGGTGAGGAATGCTT | 572 | TATAGTTAAATAGTAATTCCTTTGCATTTGTCACTCTAGGTTTTCCAGTTTCTTAATGCGAAATGCGAGTCGGCTTTCCTTTCCAAGAGGAATCCTCGGCAGATAAACTGGACTGTCCTCTACAGAAGGAAGCACAAAAAGGGACAGTCG |
287 | 573 | TACTTCCTGGCATCCAGGAGGGTCTGAAAGATATTCACCTCCCCCTGCTCACTGAGGCACCCACCCCACCCACCCCTACAGAAACGATGGCAGAACGAGAAGCTCGGGTTGGATGCCGGGGATGAATATGAAGATGAAAACCTTTATGAA | 574 | GGCCTGAACCTGGACGACTGCTCCATGTATGAGGACATCTCCCGGGGCCTCCAGGGCACCTACCAGGATGTGGGCAGCCTCAACATAGGAGATGTCCAGCTGGAGAAGCCGTGACACCCCTACTCCTGCCAGGCTGCCCCCGCCTGCTGT |
288 | 575 | CCTTGGGGGTGGGGGGATAGAGGCATGGAATAGGTGCTCTGACCTCTGACCCTCTAGCCCAGGGAGAAGGTGAGCAGTATTGATTTGGAGATCGACTCTCTGTCCTCACTGCTGGATGACATGACCAAGAATGATCCTTTCAAAGCCCGG | 576 | GTGTCATCTGGATATGTGCCCCCACCAGTGGCCACTCCATTCAGTTCCAAGTCCAGTACCAAGCCTGCAGCCGGGGGCACAGCACCCCTGCCTCCTTGGAAGTCCCCTTCCAGCTCCCAGCCTCTGCCCCAGGTTCCGGCTCCGGCTCAG |
289 | 577 | CCGTGACCCTAGGGGCCGGTTTGCGCCGGGAGCCGGGGCACGGTTCCGGCCGTACTCACGGCGCCGCGCGGTGACTCCCCAGGCGCAGCCCAGCCTCGAAATGCAGAACGACGCCGGCGAGTTCGTGGACCTGTACGTGCCGCGGAAATG | 578 | CTCCGCTAGCAATCGCATCATCGGTGCCAAGGACCACGCATCCATCCAGATGAACGTGGCCGAGGTGAGCTGGGAGCCCGGGAGGCGGGAAGGTTGTGATATATGTGCGGGAAAGGCAGGCTGTCCCATTGTGGAGGAGCCCCTGGGGTG |
290 | 579 | GCGAGTATTTCTAAGTAAGTTTCACTGTCCTTTCTCCTCCAATTTTAGGTGTTCAGGCGCTTCGTGGAGGTTGGCCGGGTGGCCTATGTCTCCTTTGGACCTCATGCCGGAAAATTGGTCGCGATTGTAGATGTTATTGATCAGAACAGG | 580 | GCTTTGGTCGATGGACCTTGCACTCAAGTGAGGAGACAGGCCATGCCTTTCAAGTGCATGCAGCTCACTGATTTCATCCTCAAGTTTCCGCACAGGTAACTGTCCACTAATCACTCCTCCCTCCCATCCCCAGATTTGTTTATGCTAGTA |
291 | 581 | TTTTATTCTTGTCTGTTCTGCCTCACTCCCGAGCTCTACTGACTCCCAACAGAGCGCCCAAGAAGAAAATGGCCATAAGTGGAGTCCCTGTGCTAGGATTTTTCATCATAGCTGTGCTGATGAGCGCTCAGGAATCATGGGCTATCAAAG | 582 | AAGAACATGTGATCATCCAGGCCGAGTTCTATCTGAATCCTGACCAATCAGGCGAGTTTATGTTTGACTTTGATGGTGATGAGATTTTCCATGTGGATATGGCAAAGAAGGAGACGGTCTGGCGGCTTGAAGAATTTGGACGATTTGCCA |
292 | 583 | TATACCTGAGGTGCACCGGAGGTGAAGTCGGTGCCACTTCTGCCCTGGCCCCCAAGATCGGCCCCCTGGGTCTGGTACGTTATCCCCTCCAAGGGGCATTTTTTTCACATTTGTTTCACTTTAAGCGCCGGCTCGTGGAGTCACGCCTGT | 584 | CTGGCTTGTCCGCGCGATTTCCGGCCTCTCGGCTTTCGGCTCGGAGGAGGCCAAGGTGCAACTTCCTTCGGTCGTCCCGAATCCGGGTTCATCCGACACCAGCCGCCTCCACCATGCCGCCGAAGTTCGACCCCAACGAGATCAAAGTCG |
293 | 585 | TTATACAGGTTTCTGGCCAGAAGAACCAACTCCACATTCAACCAGGTTGTGTTGAAGAGGTTGTTTATGAGTCGCACCAACCGGCCGCCTCTGTCCCTTTCCCGGATGGTGAGTGGCTGGTCCAGAGAGCACGGTAGACCTGGGAGCCGC | 586 | ATAAGTAATAATTGGCTATGGTTGGGGGTAATTGGGTCCATGGTTGCCTCTTCACCCCCACAGGGAGTGGACATCCGCCATAACAAGGACCGAAAGGTTCGGCGCAAGGAGCCCAAGAGCCAGGATATCTACCTGAGGCTGTTGGTCAAG |
294 | 587 | GATATGGTGGTGGTTTTAATGAAAGAGAAAATGTTGAATATATAGAAAGAGAAGAATCTGATGGTGAATATGATGAGGTAAGCTATATTTTGGTGTTCAGGTTGAATATAAATTAGAAAAACAGAAAAAATTCTTAAATGCAAAGGAAAA | 588 | AAATATCTAAAAATTTGATATTCATCTATATTATAGCCTACTAATTTAGTATTTTTCACTTCTAAAGTTGCAGCAATGTGAATTGGGCCAGAAGATCAGAGTGTAATATGTGTAATACTCCAAAGTATGCTAAATTAGAAGAAAGAACAG |
295 | 589 | CGGGCCCCTGGCTGGGCCCAGTTCGGGGTGTGTGGGAGCTGAGGACTCACTGGGCTTGAGGACTGACTGATGTGGGGTGCAGAGGAGGCTTGGGCCTGGAACCGAGTGCTTTGTTCCTAACAGGTGATGTCGAGCCTGGCAGAGCTGGAG | 590 | GACGACTTCAAAGAGGGCTACCTGGAGACAGTGGCGGCTTATTATGAGGAGCAGCACCCAGTGAGTATGACACACCCATCTGGGCACCTTGCCTTCCTTCACCTCTGCCCTGTCTTTTCTTTCTTTCTTTCTTTTTGTTTATTTGAGACA |
296 | 591 | GGTGACTTTTAAATATGACGGCTCCACCATCGTCCCCGGCGAGCAGGGAGCGGAGTACCAGCACTTCATCCAGCAGTGCACAGGTAGGGAGGCGCGCCTGCCGGGCGGATGCGCGGTCGTTGGGAGGTTGTCTGCACCCGGGGAGCCCCG | 592 | CGCTCCCATCCCCGCCGCCGGCCAGGGGCGCGCTCGGCCGCCCCGGACAGTGTCCCGCTGCGGCTCCGCGGCGATGGCCACCAAGATCGACAAAGAGGCTTGCCGGGCGGCGTACAACCTGGTGCGCGACGACGGCTCGGCCGTCATCTG |
297 | 593 | AAGTGTTACAAATCCTTCTGCCCTCACTTAGGCATCTATATCATAAATCTCAAGAGGACCTGGGAGAAGCTTCTGCTGGCAGCTCGTGCAATTGTTGCCATTGAAAACCCTGCTGATGTCAGTGTTATATCCTCCAGGAATACTGGCCAG | 594 | AGGGCTGTGCTGAAGTTTGCTGCTGCCACTGGAGCCACTCCAATTGCTGGCCGCTTCACTCCTGGAACCTTCACTAACCAGATCCAGGCAGCCTTCCGGGAGCCACGGCTTCTTGTGGTTACTGACCCCAGGGCTGACCACCAGCCTCTC |
298 | 595 | GTGTCTTCCGTGAGGCCACCACTGAGTTCAGTGTGGACGCCCGGGCTCTGACACAGACCGGAGGGCCGCACGTCAAGGCCCGTGTGGCCAACCCCTCAGGCAACCTGACGGAGACCTACGTTCAGGACCGTGGCGATGGCATGTACAAAG | 596 | TTACCTACATTCCCCTCTGCCCCGGGGCCTACACCGTCACCATCAAGTACGGCGGCCAGCCCGTGCCCAACTTCCCCAGCAAGCTGCAGGTGGAACCTGCGGTGGACACTTCCGGTGTCCAGTGCTATGGGCCTGGTATTGAGGGCCAGG |
299 | 597 | CAAAGCTGACACTCCTGCAGAGAAAGCTCCAACCGGCAGCAATGTATGAGATTAAAGTTCGATCCATCCCTGATCACTATTTTAAAGGCTTCTGGAGTGAATGGAGTCCAAGTTATTACTTCAGAACTCCAGAGATCAATAATAGCTCAG | 598 | GGGAGATGGATCCTATCTTACTAACCATCAGCATTTTGAGTTTTTTCTCTGTCGCTCTGTTGGTCATCTTGGCCTGTGTGTTATGGAAAAAAAGGTGACCTTCTTCAACTAATAAAGAGGGTGATTGTGTGGGATCACGGACAGTCAGAG |
300 | 599 | TTACCTGGCTACAGAAAGAAGATGCCAGATGACACTTAAGACCTACTTGTGATATTTAAATGATGCAATAAAAGACCTATTGATTTGGACCTTCTTCTTAAACCGGTTATCCTTTTTAGCTAGTTTTTTTCCCTCGTGGAACAAGGAGCT | 600 | GATGTATTAATTGCTTTATCTTCACTCCTATAGCGGCTTTGATTCAGCAAGCCACAACAGTTAAAAACAAGGATATCAGGAAATTTTTGGATGGTATCTATGTCTCTGAAAAAGGAACTGTTCAGCAGGCTGATGAATAAGATCTAAGAG |
301 | 601 | AATAATACCGAGTCGAGTCATGAAATGTGTCCCACCCCCTTGTCTCCCTTCAGGTTTAAGTTACTGAGCCAGGAGGAAGGCGAGTACTTCAATGTGCCTGTGCCACCAGAAGGAAGTGAGGCCAATGAAGAACTGCGGCAGAAATTTGAG | 602 | AGGGCCAAGATCAGTCAGGGAACCAAGGTCCCGGAAGAAAAGACGACCAACACTGTCTCCAAATTTGACAACAATGGCAACAGAGACCGGATGAAACTGACCGATTTTAACTTCCTAATGGTGCTGGGGAAAGGCAGCTTTGGCAAGGTA |
302 | 603 | TAGAATAGGAATATAGAGTCAAACTCTTTGCAGACTAGATTTTGCCCCAAGCTCATTAACTCATCCCATTTGCTCCAGGGACAGCTTAATGAAGACAAACTGAAGGGGAAACTGAGATCCTTAGAAAACCAGCTATACACCTGTACCCAG | 604 | AAATACTCCCCTTGGGGAATGAAAAAAGTACTACTGGAGATGGAAGACCAGAAAAACAGCTATGAGCAGAAGGCCAAGGAGTCACTGCAGAAAGTGCTGGAGGAGAAAATGAATGCAGAGCAGCAACTACAGAGCACACAGGTATGGGGA |
303 | 605 | CCAGCGACTCCTGCTCTTGCTTCTGGATCTGCAGGGCAGTCCCAGCAGGACCCATGGAGTGTCCTTCGTGCCAGCATGTCTCCAAGGAGGAAACCCCCAAGTTCTGCAGCCAGTGCGGAGAGAGGCTGCCTCCTGCAGCCCCCATAGCAG | 606 | ATTCTGAGAACAATAACTCCACAATGGCGTCGGCCTCGGAGGGTGAAATGGAGTGTGGGCAGGAGCTGAAGGAGGAAGGGGGCCCGTGCTTGTTCCCGGGCTCAGACAGTTGGCAAGAAAACCCCGAGGAGCCCTGTTCCAAAGCCTCCT |
304 | 607 | GAAAGCAAGGAATTTAATGCAGAAGTACATCGGAAGCACATCATGGGCCAGAATGTTGCAGATTACATGCGCTACTTAATGGAAGAAGATGAAGATGCTTACAAGAAACAGTTCTCTCAATACATAAAGAACAGCGTAACTCCAGACATG | 608 | ATGGAGGAGATGTATAAGAAAGCTCATGCTGCTATACGAGAGAATCCAGTCTATGAAAAGAAGCCCAAGAAAGAAGTTAAAAAGAAGAGGTATGTCGTCTTTTTTTTTGTCTTTTCAAGAAAACAGGTTGGGAATGGTTCCCACGTGGGG |
305 | 609 | TTGAGCTAAAAGGTATTTTTGCATTCTAAAAGGGAAACTAAGGCAAAAAACCCACTTTTGTTTCCCCTCCTGCCTTTTAGGGAAGACAAAGGCGCTTTGGCTAAGCTGGTGGAAGCTATCAGGACCAATTACAATGACAGATACGATGAG | 610 | ATCCGCCGTCACTGGGGTGGCAATGTCCTGGGTCCTAAGTCTGTGGCTCGTATCGCCAAGCTCGAAAAGGCAAAGGCTAAAGAACTTGCCACTAAACTGGGTTAAATGTACACTGTTGAGTTTTCTGTACATAAAAATAATTGAAATAAT |
306 | 611 | GCTGAGCCCAGCAGCTTCTTGTGACTAGAGCAGGCCCTGTGAGTGCTCACAAAGTGGTTGTGTGTTCTAGGAGTTAACACCGTCACCACCTTGGTGGAGAACAAGAAAGCTCAGCTGGTGGTGATTGCACACGACGTGGATCCCATCGAG | 612 | CTGGTTGTCTTCTTGCCTGCCCTGTGTCGTAAAATGGGGGTCCCTTACTGCATTATCAAGGGAAAGGCAAGACTGGGACGTCTAGTCCACAGGAAGACCTGCACCACTGTCGCCTTCACACAGGTGAACTCGTAAGTACACAGCCTGGCC |
307 | 613 | GAAGCCCCTTGCAGTTCTATGTGGATTACGTCAACTGTGGCCATGTCACTGCCTATGGGCCTGGCCTCACCCATGGAGTAGTGAACAAGCCTGCCACCTTCACCGTCAACACCAAGGATGCAGGAGAGGGTGAGCAATAGCTCTGGTCTT | 614 | GCCCCACAGGGGAGGTTCGGATGCCCTCAGGCAAGGTGGCGCAGCCCACCATCACTGACAACAAAGACGGCACCGTGACCGTGCGGTATGCACCCAGCGAGGCTGGCCTGCACGAGATGGACATCCGCTATGACAACATGCACATCCCAG |
308 | 615 | TGAGGGTCTCGGCCACCTTCTGGCAGAACCCCCGCAACCACTTCCGCTGTCAAGTCCAGTTCTACGGGCTCTCGGAGAATGACGAGTGGACCCAGGATAGGGCCAAACCTGTCACCCAGATCGTCAGCGCCGAGGCCTGGGGTAGAGCAG | 616 | ACTGTGGCTTCACCTCCGGTAAGTGAGTCTCTCCTTTTTCTCTCTATCTTTCGCCGTCTCTGCTCTCGAACCAGGGCATGGAGAATCCACGGACACAGGGGTGTGAGGGAGGCCAGAGCCACCTGTGCACAGGTACCTACATGCTCTGTT |
309 | 617 | GGATTTCATTCGTGCCCAAGGAGACGGGGGAGCACCTGGTGCATGTGAAGAAAAATGGCCAGCACGTGGCCAGCAGCCCCATCCCGGTGGTGATCAGCCAGTCGGAAATTGGGGATGCCAGTCGTGTTCGGGTCTCTGGTCAGGGCCTTC | 618 | TGCGTATGTCCCACCTAAAGGTCGGCTCTGCTGCCGACATCCCCATCAACATCTCAGAGACGGATCTCAGCCTGCTGACGGCCACTGTGGTCCCGCCCTCGGGCCGGGAGGAGCCCTGTTTGCTGAAGCGGCTGCGTAATGGCCACGTGG |
310 | 619 | GTCCCCTATAATTTTATCTCAATAATGTTTTGCAGTAGTTTCTAAGGAAATTTTTATGGGGTCTTCAGTCTGACTGTGAAAAGCAGTTCAGAGTCCAACTCACACTGTGTTTTTCTTTCCTTTTCAGGGTGTGAGAGGAGTCCTGAGCAG | 620 | AAGTAAGGCTGTCACAAGGCTGGAAGCAGAGAACATCCCCATGGAACTGAAGACAGCATGCTGCATCCCTGGGAGGAGGGAGCTCTTAAGGAAGTTCCAAGGTAGTTGCATCTTAGAGACTGGGAATTAGGCTGCCTGGGGTTTGAAGAA |
311 | 621 | GCATGGATTCCGTGAAGGAACAACACCTAAACCCAAGAGGGCAGCTGTTGCAGCATCCAGTTCATCTTAAGAATGTCAACGATTAGTCATGCAATAAATGTTCTGGTTTTAAAAAATACATATCTGGTTTTGGTAAGGTATTTTTAATCA | 622 | TTAACAACACCTACAAGGTGTGTGGGAGAACACCGTTTGAAATCTTTTCTGAACTTATGTTTTAGATAACTGGAGTGCCAAGGCTAAAAGACGAAATACCACCGGAACTGGTCGAATGAGGCACCTAAAAATTGTATACCGCAGATTCAG |
312 | 623 | CCATTCATATACCCCCAACCTCCCTCGTCCCCTCTTTCATTCTTACCGCCCAAGTCCCCTCTGCTCACTGCGCCCTTTCTCCACAGCTCCGCTAGCAATCGCATCATCGGTGCCAAGGACCACGCATCCATCCAGATGAACGTGGCCGAG | 624 | GTTGACAAGGTCACAGGCAGGTTTAATGGCCAGTTTAAAACTTATGCTATCTGCGGGGCCATTCGTAGGATGGTGAGTGTTTCCCTGGGCTTTGCTCATCACTTCGGGACATCGTGGACTTTACCGTGCGCATTGGAGTGTGTGATGGTG |
313 | 625 | ACCATTTTAATTGCTGTTAGATTTTGCACTGAAGTTCTTGATGTTTGTGTTCTAGGCTTTGGTCGATGGACCTTGCACTCAAGTGAGGAGACAGGCCATGCCTTTCAAGTGCATGCAGCTCACTGATTTCATCCTCAAGTTTCCGCACAG | 626 | TGCCCACCAGAAGTATGTCCGACAAGCCTGGCAGAAGGCAGACATCAATACAAAATGGGCAGCCACACGATGGGCCAAGAAGATTGAAGCCAGAGAAAGGGTAATAACTTAGGGTCATTTGAATTCTGGTCCTTTCTTTTTTTGGAGGGT |
314 | 627 | TGAATCATCTACAGCCTCTGCCCTGGTCGCATAAATTTGTCTGTGTACTCAAGCAATAAAATGATTGTTTAACTAAAAGCATGTTTCATATTTATTTTCCTAGAAGAAAAATTATATATATCAGTGGTTCATATGTGTTGATCTTGTTTG | 628 | TAGGTCATTTTGGGTGGTTTTCTTGAATTGCACCAAATTTTATTTTTAGGATAAGGATGCTAAATTCCGTCTGATTCTAATAGAGAGCCGGATTCACCGTTTGGCTCGATATTATAAGACCAAGCGAGTCCTCCCTCCCAATTGGAAATA |
315 | 629 | GAGTCACAGTGGCTCAAGCTTCCTTCCCCGCTTCCACATGCAGGCATCTCTCGGGACAACTGGCACAAGCGCCGCAAAACCGGGGGCAAGAGAAAGCCCTACCACAAGAAGCGGAAGTATGAGTTGGGGCGCCCAGCTGCCAACACCAAG | 630 | ATTGGCCCCCGCCGCATCCACACAGTCCGTGTGCGGGGAGGTAACAAGAAATACCGTGCCCTGAGGTTGGACGTGGGGAATTTCTCCTGGGGCTCAGAGTGTGAGTGAGGCCCTTTGGGAGTGGGTGGGAAAACGCACCTAAACGGTCTT |
316 | 631 | TTTGTGCCATTATTACATTTTCACCTTCATTCTTCTGTTGTTTTTCAGGGCATTTTGTCAGTGATGCTGATCTTTGCCTTCTTCCAGGAACTTGTAATAGCTGGCATCGTTGAGAATGAATGGAAAAGAACGTGCTCCAGACCCAAATCT | 632 | AACATAGTTCTCCTGTCAGCAGAAGAAAAAAAAGAACAGACTATTGAAATAAAAGAAGAAGTGGTTGGGCTAACTGAAACATCTTCCCAACCAAAGAATGAAGAAGACATTGAAATTATTCCAATCCAAGAAGAGGAAGAAGAAGAAACA |
317 | 633 | GCCACGCCCACCAGCCCCATCCGAGTCAAGGTGGAGCCCTCTCATGACGCCAGTAAGGTGAAGGCCGAGGGCCCTGGCCTCAGTCGCACTGGTGAGGACAGGTACCCCATGGCAGGTTGCGGGGCATCAAGGGTAGGAGGGCTTGGGGCA | 634 | ATCGGCATCAAGTGTGCCCCTGGAGTGGTAGGCCCCGCCGAAGCTGACATCGACTTCGACATCATCCGCAATGACAATGACACCTTCACGGTCAAGTACACGCCCCGGGGGGCTGGCAGCTACACCATTATGGTCCTCTTTGCTGACCAG |
318 | 635 | ACTATGTGTGGCCAAGGTATGCAGGCCTTTGACTACTTGGAAGCTAGCAAAGTCATCTACACCAATGGCTGTATTGACAAGTTGGTCAACTGGATACACAGCAACCTATTCTTACTTGGTGGTGTGGCTCTAGGCCTGGCCATCCCCCAG | 636 | CTGGTGGGAATTCTGCTGTCCCAGATCCTAGTGAATCAGATCAAAGATCAGATCAAGCTACAGCTCTACAACCAGCAGCACCGGGCTGACCCATGGTACTGAGAATCCATCCTGCACCTCCTCACCATGGAAACTGGCAAGCCTCATAAA |
319 | 637 | GCCTCAGAGTCTCTGATCAAGCAGATTCCACGAATCCTCGGCCCAGGTTTAAATAAGGCAGGAAAGTTCCCTTCCCTGCTCACACACAACGAAAACATGGTGGCCAAAGTGGATGAGGTGAAGTCCACAATCAAGTTCCAAATGAAGAAG | 638 | GTGTTATGTCTGGCTGTAGCTGTTGGTCACGTGAAGATGACAGACGATGAGCTTGTGTATAACATTCACCTGGCTGTCAACTTCTTGGTGTCATTGCTCAAGAAAAACTGGCAGAATGTCCGGGCCTTATATATCAAGAGCACCATGGGC |
320 | 639 | GCATTAATATAGTAGGGCACATGAAATGAAACCAAGTACTGTTTGCTTTCCTTTGTTTCAGATGGAGGAGATGTATAAGAAAGCTCATGCTGCTATACGAGAGAATCCAGTCTATGAAAAGAAGCCCAAGAAAGAAGTTAAAAAGAAGAG | 640 | GTGGAACCGTCCCAAAATGTCCCTTGCTCAGAAGAAGGATCGGGTAGCTCAAAAGAAGGCAAGCTTCCTCAGAGCTCAGGAGCGGGCTGCTGAGAGCTAAACCCAGCAATTTTCTATGATTTTTTCAGATATAGATAATAAACTTATGAA |
321 | 641 | AACAGATGTCTGTGAAAAAATTATTGGAGGAAATGAAGTAACTCCTCATTCAAGACCCTACATGGTCCTACTTAGTCTTGACAGAAAAACCATCTGTGCTGGGGCTTTGATTGCAAAAGACTGGGTGTTGACTGCAGCTCACTGTAACTT | 642 | GAACAAAAGGTCCCAGGTCATTCTTGGGGCTCACTCAATAACCAGGGAAGAGCCAACAAAACAGATAATGCTTGTTAAGAAAGAGTTTCCCTATCCATGCTATGACCCAGCCACACGCGAAGGTGACCTTAAACTTTTACAGGTACGTAT |
322 | 643 | GTGGTACTCTTTTTGTTCATTAAGTGTTAACGATGTACTCATTGTAGTATGGTTTTTGATGAAACAATCTTTAAGATGTTCATTTTTGTTTTTATAGTATGTGCAGAAGCTTATAATCCTGATGAAGAAGAAGATGATGCAGAGTCCAGG | 644 | ATTATACATCCAAAAACTGATGATCAAAGAAATAGGTTGCAAGAGGCTTGCAAAGACATCCTGCTGTTTAAGAATCTGGATCCGGTAAGATAAATCTTAATAATAGAAATGGCTTTGTTTTTTCCCCCAGTGACAGTGTCAAGAACTGTA |
323 | 645 | GAGAAGATTGGCTGGCGAAAGGATGCACTGCATTTGCTGGTGTTCACAACAGATGATGTGCCCCACATCGCATTGGATGGAAAATTGGGAGGCCTGGTGCAGCCACACGATGGCCAGTGCCACCTGAACGAGGCCAACGAGTACACTGCA | 646 | TGCGTCCCCTCCTTTGGGTTCCGCCATCTGCTGCCTCTCACAGACAGAGTGGACAGCTTCAATGAGGAAGTTCGGAAACAGAGGGTGTCCCGGAACCGAGATGCCCCTGAGGGGGGCTTTGATGCAGTACTCCAGGCAGCCGTCTGCAAG |
324 | 647 | AGGATTGGCTTTCAGAGTCTAATCATGTTTTCTGTGTGTCTAGTATGCTCAGGCTTCAGAAGAGGCTCGCCTCTAGTGTCCTCCGCTGTGGCAAGAAGAAGGTCTGGTTAGACCCCAATGAGACCAATGAAATCGCCAATGCCAACTCCC | 648 | GTCAGCAGATCCGGAAGCTCATCAAAGATGGGCTGATCATCCGCAAGCCTGTGACGGTCCATTCCCGGGCTCGATGCCGGAAAAACACCTTGGCCCGCCGGAAGGGCAGGCACATGGGCATAGGTAAGTGTGGTCATCTTCTCCTTAAGA |
325 | 649 | GAGTATCCTTTCTACAATTATTTTTTTCTTTCAGAGGGTAAAACTGATTATTATGCTCGGAAACGCTTGGTGATACAAGATAAAAATAAATACAACACACCCAAATACAGGATGATAGTTCGTGTGACAAACAGAGATATCATTTGTCAG | 650 | ATTGCTTATGCCCGTATAGAGGGGGATATGATAGTCTGCGCAGCGTATGCACACGAACTGCCAAAATATGGTGTGAAGGTTGGCCTGACAAATTATGCTGCAGCATATTGTACTGGCCTGCTGCTGGCCCGCAGGGTATGTACAAGATGA |
326 | 651 | AACTTAGGGTCATTTGAATTCTGGTCCTTTCTTTTTTTGGAGGGTTCAAGATAGTGTGAGAGGGATAATTTTTATTTGTTGTTTTTTTTTTAACAGAAAGCCAAGATGACAGATTTTGATCGTTTTAAAGTTATGAAGGCAAAGAAAATG | 652 | AGGAACAGAATAATCAAGAATGAAGTTAAGAAGCTTCAAAAGGCAGCTCTCCTGAAAGCTTCTCCCAAAAAAGCACCTGGTACTAAGGGTACTGCTGCTGCTGCTGCTGCTGCTGCTGCTGCTAAAGTTCCAGCAAAAAAGATCACCGCC |
327 | 653 | GGCAACAAACCATGGATTTCTCTTCCCCGAGGAAAGGGTATCCGCCTCACCATTGCTGAAGAGAGAGACAAAAGACTGGCGGCCAAACAGAGCAGTGGGTGAAATGGGTCCCTGGGTGACATGTCAGATCTTTGTACGTAATTAAAAATA | 654 | TGTATGGTGACTGGAGGTGCTAACCTAGGAAGAATTGGTGTGATCACCAACAGAGAGAGGCACCCTGGATCTTTTGACGTGGTTCACGTGAAAGATGCCAATGGCAACAGCTTTGCCACTCGACTTTCCAACATTTTTGTTATTGGCAAG |
328 | 655 | ATTTAACATATGAGCTGACAGTCAAAAGTTCGGAACAGACAGGTAAGAACTCCTCCCCAGAAGTAAATGACAGTAGGTTTCCCTTTGTGGTACGTGTTGGTGCCGTTTTCACTAGTCACACACTTAGGAGAAAATGCTCTTGCTGGGAAG | 656 | CAGGTCCTCTGGATGTCAGCATGGCAGCCACAAACCTGGAGAACCAGCTGCACAGCGCACAGAAGAACCTCCTGTTCCTTCAGCGGGAGCATGCCAGCACGCTCAAGGGGCTGCACTCCGAGATCAGGCGGCTGCAGCAGCACTGCACAG |
329 | 657 | GCAGTCCTTTCAAGGTCCCTGTGCATGATGTGACAGATGCGTCCAAGGTCAAGTGCTCTGGGCCCGGCCTGAGCCCAGGCATGGTTCGTGCCAACCTCCCTCAGTCCTTCCAGGTGGACACAAGCAAGGCTGGTGTGGCCCCATTGCAGG | 658 | CGGGCGGCCTGGGCCTGGCTGTAGAGGGCCCCTCCGAGGCCAAGATGTCCTGCATGGATAACAAGGACGGCAGCTGCTCGGTCGAGTACATCCCTTATGAGGCTGGCACCTACAGCCTCAACGTCACCTATGGTGGCCATCAAGTGCCAG |
330 | 659 | GTGCTGGCATCGGCCCCACCATTCAGATTGGGGAGGAGACGGTGATCACTGTGGACACTAAGGCGGCAGGCAAAGGCAAAGTGACGTGCACCGTGTGCACGCCTGATGGCTCAGAGGTGGATGTGGACGTGGTGGAGAATGAGGACGGCA | 660 | ACCATGACGGCACGTATACAGTGGCCTACGTGCCAGACGTGACAGGTCGCTACACCATCCTCATCAAGTACGGTGGTGACGAGATCCCCTTCTCCCCGTACCGCGTGCGTGCCGTGCCCACCGGGGACGCCAGCAAGTGCACTGTCACAG |
331 | 661 | CTCCTTCCGTCGCCCGTGGGCACGCAGCACGTGTGGAGTGAGAGCGAGGACTGCCTGCCTTTCTTGCAGCTAGCACAGGATTACATCTCCTCCTGCGGCAAGAAGACGCTCCACGAAGTCCTGGAAAAAGTCTTCAAGTCTTTCAGACCT | 662 | TTACTGGGGCTTCCGGATGCAGATGACGATGCGTTTGAAGAGTACAGTGCTGACGTGGAAGAAGAGGAGCCAGAGGCGGACCACCCCCAGATGGGGGTCAGCCAGCAGTAAATCTGGGGGCTCCCCTGAGAAGGAGAGTGAGCCCCACAG |
332 | 663 | GATGCCTTGTGCCGCCTCCTTCCCAGGAGCCCAATAACTTGAAGGCCCGCAATTCCTTCCGCTACAACGGACTGATTCACCGCAAGACTGTGGGCGTGGAGCCGGCAGCCGACGGCAAAGGTGTCGTGGTGGTCATTAAGCGGAGATCCG | 664 | GCCAGCGGAAGCCTGCCACCTCCTATGTGCGGACCACCATCAACAAGAATGCTCGCGCCACGCTCAGCAGCATCAGACACATGATCCGCAAGAACAAGTACCGCCCCGACCTGCGCATGGTGAGCTGGGGTTTGGGGATCAGGCTTGGGG |
333 | 665 | GACTGCACTCCGTGGACGTGACCTATGACGGCAGTCCCGTGCCCAGCAGCCCCTTCCAGGTGCCCGTGACCGAGGGCTGCGACCCCTCCCGGGTGCGTGTCCACGGGCCAGGCATCCAAAGTGGCACCACCAACAAGCCCAACAAGTTCA | 666 | AGTTCAGTGTGGACGCCCGGGCTCTGACACAGACCGGAGGGCCGCACGTCAAGGCCCGTGTGGCCAACCCCTCAGGCAACCTGACGGAGACCTACGTTCAGGACCGTGGCGATGGCATGTACAAAGTGGAGTACACGCCTTACGAGGAGG |
334 | 667 | CTGAACATCTCCTTCCCAGCCACTGGCTGCCAGAAACTCATTGAAGTGGACGATGAACGCAAACTTCGTACTTTCTATGAGAAGCGTATGGCCACAGAAGTTGCTGCTGACGCTCTGGGTGAAGAATGGAAGGTAAAAGTTGACAAATTG | 668 | CGGAAGTACCGCCCACCCATGCTCACTTCCGCTATCCCGTACTTCTGCTCATCTCGCGAGAACTGAAAGCGCCTATGTGACCTGCGCTAAGCGGAAGTTGGCCCTCTTTTCCGTGGCGCCTCGGAGGCGTTCAGCTGCTTCAAGATGAAG |
335 | 669 | TGCTAAAAACCTTGTACCTATGGACCCCAATGGCCTGTCAGATCCCTACGTAAAACTGAAACTGATTCCCGATCCCAAAAGTGAGAGCAAACAGAAGACCAAAACCATCAAATGCTCCCTCAACCCTGAGTGGAATGAGACATTTAGATT | 670 | TCAGCTGAAAGAATCGGACAAAGACAGAAGACTGTCAGTAGAGATTTGGGATTGGGATTTGACCAGCAGGAATGACTTCATGGGATCTTTGTCCTTTGGGATTTCTGAACTTCAGAAAGCCAGTGTTGATGGCTGGTAAGTAAGATTTTG |
336 | 671 | GGGCAGAATGATCTGGAAAAGATGACCAGCATCCTGGAAGCTGTGCCACAGGTTAAGTTTATTTGCCTGGATGTGGCCAATGGGTATTCAGAACATTTTGTGGAATTCGTGAAACTTGTCCGTGCCAAATTTCCTGAACACACCATTATG | 672 | GCAGGGAACGTGGTGACAGGAGAAATGGTAGAAGAGCTTATTCTTTCCGGAGCAGATATCATCAAAGTGGGAGTTGGACCAGGTAAGACTTGTTAGGAGCACAGCAGAGGACGTGTGTGGGGAAGAATGGGATCTGGGGCTTGCGGGGAC |
337 | 673 | GTGTAATCCTGAGAGATTCACATGGTGTTGCACAAGTACGTTTTGTGACAGGCAATAAAATTTTAAGAATTCTTAAGTCTAAGGGACTTGCTCCTGATCTTCCTGAAGATCTCTACCATTTAATTAAGAAAGCAGTTGCTGTTCGAAAGC | 674 | AAGGAGAGACCGCTGTTCTGCGGCGCCATTCCTGGGTTCTCATCCTAAGGCTGCTTTCTATTCCATAACAGTGGTTGAAGTTGACATCTGACGACGTGAAGGAGCAGATTTACAAACTGGCCAAGAAGGGCCTTACTCCTTCACAGATCG |
338 | 675 | CTGAGCTGGCTAGGTGACTGTTGGTTATTCCTGGGACAGGTGCTGGGTAGGCCAGGTTTCAGCATCGCAGACAAGAAGCGCAGGACAGGCTGCATTGGGGCCAAACACAGAATCAGCAAAGAGGAGGCCATGCGCTGGTTCCAGCAGAAG | 676 | TATGATGGGATCATCCTTCCTGGCAAATAAATTCCCGTTTCTATCCAAAAGAGCAATAAAAAGTTTTCAGTGAAATGTGCAATTCTGTTGTGTGTTCTGTGAAAGGATCCTGGCCATATTCAAGTCCTTGGACCTCAAGCCACTTAAAGC |
339 | 677 | TGGTGATGAATACAATGTGGAAAGCATTGATGGTCAGCCAGGTGCCTTCACCTGCTATTTGGATGCAGGCCTTGCCAGAACTACCACTGGCAATAAAGTTTTTGGTGCCCTGAAGGGAGCTGTGGATGGAGGCTTGTCTATCCCTCACAG | 678 | TACCAAACGATTCCCTGGTTATGATTCTGAAAGCAAGGAATTTAATGCAGAAGTACATCGGAAGCACATCATGGGCCAGAATGTTGCAGATTACATGCGCTACTTAATGGAAGAAGATGAAGATGCTTACAAGAAACAGTTCTCTCAATA |
340 | 679 | CCCCTTTCTGCTCAGAAAATCCGTTCTAAAGTAGAGCTGGAAGTGCGTGACCTCCCTGAAGAGTTGTCTCTATCCTTCAATGCCACCTGCCTCAACAATGAGGTCATCCCTGGCCTCAAGTCTTGTATGGGACTCAAGATTGGAGACACG | 680 | GTGAGCTTCAGCATTGAGGCCAAGGTGCGAGGCTGTCCCCAGGAGAAGGAGAAGTCCTTTACCATAAAGCCCGTGGGCTTCAAGGACAGCCTGATCGTCCAGGTCACCTTTGATTGTGACTGTGCCTGCCAGGCCCAAGCTGAACCTAAT |
341 | 681 | TTTTTGTCTAAAAAGAGCTACTGGAAACCTGAAGTGATGATTGCTGCTCAGGGACCACTGAAGGAGACCATTGGTGACTTTTGGCAGATGATCTTCCAAAGAAAAGTCAAAGTTATTGTTATGCTGACAGAACTGAAACATGGAGACCAG | 682 | GAAATCTGTGCTCAGTACTGGGGAGAAGGAAAGCAAACATATGGAGATATTGAAGTTGACCTGAAAGACACAGACAAATCTTCAACTTATACCCTTCGTGTCTTTGAACTGAGACATTCCAAGGTATGGAAACAATTTGGGGAGTATATT |
342 | 683 | TTTGACAATCGTTCTCTGAATGTATTATTTTTCATTTCTAGATAATTCTAAGGCACTGATAGCATTTCTGGCATTTCTGATTATTGTGACATCAATAGCCCTGCTTGTTGTTCTCTACAAAATCTATGATCTACATAAGAAAAGATCCTG | 684 | CAATTTAGATGAACAGCAGGAGCTTGTTGAAAGGGGTAAGTATGTATATTTTTGCTGATGACTATTCCTTCCCCTGCATTTGAATCCATTCATTTTATTTATTTATTTATTTATATTTATTTTAAGACAGAGTCTCATTCTGTCTCCCAG |
343 | 685 | TGCTGAATTCCCATATATTAGGCTACTTGATTATTCACTATTTCACTTGTTTATTTTTCTTTTCCTTAAACAGATGATTATAACCGTGTTGAACTCTCTGAGATAAACGGAGATGCAGGGTCAAACTACATAAATGCCAGCTATATTGAT | 686 | GGTTTCAAAGAACCCAGGAAATACATTGCTGCACAAGGTAATTTCTTTGATAATCCAATATTCTTTTTGAAAAATTTTTATAGCACTTTTAAGAAAATTTTTCTTATCAGCTTTTATTTGTTTACCTCCTAGGTCCCAGGGATGAAACTG |
344 | 687 | TGTATGGTATGTGCAAGTTTGCATGTTTATCTTTGTTTTCAACTTGTTGGTAATACGTTTTATTGTCTTCAATAGGCCGCTGGCCCAACAGGCAAAAATGAAGAAAAAATTCAGGTTCTAACAGACAAAATTGATGTACTTCTGCAACAG | 688 | ATTGAAGAATTAGGGTCTGAAGGAAAAGTAGAAGAAGCCCAGGGGATGATGAAATTAGTTGAGCAATTAAAAGAAGAGAGAGAACTGCTAAGGTCCACAACGTCGGTGAGTAAACCTTATTTCACATTATCTCATCTGTCTGTTAACAGT |
345 | 689 | TCTTTCTTTTATAGGGATGGATCTCAGCAAACGGGAATATTTTGTGCTTTGTTAAATCTCTTAGAAAGTGCGGAAACAGAAGAGGTAGTGGATATTTTTCAAGTGGTAAAAGCTCTACGCAAAGCTAGGCCAGGCATGGTTTCCACATTC | 690 | GAGCAATATCAATTCCTATATGACGTCATTGCCAGCACCTACCCTGCTCAGAATGGACAAGTAAAGAAAAACAACCATCAAGAAGATAAAATTGAATTTGATAATGAAGTGGACAAAGTAAAGCAGGATGCTAATTGTGTTAATCCACTT |
346 | 691 | TCTAAGGTCCACAGCTTTTTTTCACTGTTGACTTTCTAACCATCATCATTTTGGGGGTTTGGCTTTTAGCTGCAGTGTTGTGGTATAAATGGCACGAGTGATTGGACCAGTGGCCCACCAGCATCTTGCCCCTCAGATCGAAAAGTGGAG | 692 | GGTTGCTATGCGAAAGCAAGACTGTGGTTTCATTCCAATTTCCTGTATATCGGAATCATCACCATCTGTGTATGTGTGATTGAGGTAAGAGCTTAACCACAGGGTTATTGTGAGGATTACATGAGTTAAGTCAGGTAAGATTTCAGAATA |
347 | 693 | GTTCTGCCATTACAGGACCAGTAGCAAAGGAGTGTGCAGACTTGTGGCCCCGGATTGCATCCAATGCTGGCAGCATTGCATGATTCTCCAGTATATTTGTAAAAAATAAAAAAAAAAACTAAACCCATTAAAAAGTATTTGTTTGCAGTG | 694 | CAGGTTAATGACTGCTGTCCTTTTTTCTTCTCTCAGTACATCCAGCAGTGGTCATTCGACAACGAAAGTCATACCGTAGAAAAGATGGCGTGTTTCTTTATTTTGAAGATAATGCAGGAGTCATAGTGAACAATAAAGGCGAGATGAAAG |
348 | 695 | AATTGGACAAAAGTTTCTCAATGATTAAGGAGGGTGATTATAACCCCCTCTTCATTCCAGTGGCAGTCATGGTTACTGCATTCTCTGGGTTGGCATTTATCATTTGGCTGGCAAGGAGATTAAAAAAAGGTATGTGAGTTTAACTTCACA | 696 | GGATCATGAACTGTAGCCATCCCCTGGCCAGCTTCAGCTTTACCTCTGCATGTACCTTCATCTGCTCAGAAGGAACTGAGTTAATTGGGAAGAAGAAAACCATTTGTGAATCATCTGGAATCTGGTCAAATCCTAGTCCAATATGTCAAA |
349 | 697 | AATTTACAGAAATACTCCCCTTGGGGAATGAAAAAAGTACTACTGGAGATGGAAGACCAGAAAAACAGCTATGAGCAGAAGGCCAAGGAGTCACTGCAGAAAGTGCTGGAGGAGAAAATGAATGCAGAGCAGCAACTACAGAGCACACAG | 698 | CGATCCCTGGCCCTGGCAGAGCAGAAGTGTGAAGAGTGGAGGAGCCAGTATGAGGCTCTGAAGGAGGACTGGAGGACCCTTGGGACCCAGCACAGGGAGCTGGAGAGCCAACTCCACGTGCTTCAGTCCAAACTGCAGGTACCAGGCACT |
350 | 699 | TTTTTCTCTATGCAGTCAGCTGAAAGAATCGGACAAAGACAGAAGACTGTCAGTAGAGATTTGGGATTGGGATTTGACCAGCAGGAATGACTTCATGGGATCTTTGTCCTTTGGGATTTCTGAACTTCAGAAAGCCAGTGTTGATGGCTG | 700 | GTTTAAGTTACTGAGCCAGGAGGAAGGCGAGTACTTCAATGTGCCTGTGCCACCAGAAGGAAGTGAGGCCAATGAAGAACTGCGGCAGAAATTTGAGGTGAGGTTTCTTTTCTTTTTCTCTTCTTTCTTTTTTCTCTTTCTTTTTTCCTT |
351 | 701 | AGGTCACGTAGACGGCGCGCCCCGCCCCCGTACGCCTAAGTTCTCGCGCGACTCCCACTTCCGCCCTTTTGGCTCTCTGACCAGCACCATGGCGGTTGGCAAGAACAAGCGCCTTACGAAAGGCGGCAAAAAGGGAGCCAAGAAGAAAGT | 702 | GGTTGATCCATTTTCTAAGAAAGATTGGTATGATGTGAAAGCACCTGCTATGTTCAATATAAGAAATATTGGAAAGACGCTCGTCACCAGGACCCAAGGAACCAGTAAGTAGCTTATTCTTGGTTTGTATTTTCCTTAAGTTGGCGCTTG |
352 | 703 | AATGTCTATTAATGTGATTTTTTTTTTTTTTAACCTTTCTCCCAATAGGTTGATGACAACAAGAAACTAGGAGAATGGGTAGGCCTTTGTAAAATTGACAGAGAGGGGAAACCCCGTAAAGTGGTTGGTTGCAGTTGTGTAGTAGTTAAG | 704 | GACTATGGCAAGGAGTCTCAGGCCAAGGATGTCATTGAAGAGTATTTCAAATGCAAGAAATGAAGAAATAAATCTTTGGCTCACATTCCTCATGTCTGGCTTTTTATTTGGGGCAGTAAAATAAGGTCCCTGTTAGCAAAGTAAAATGTA |
353 | 705 | ACCGGCGGGAGGGCTAGCGAGCCAGCGGTGTGAGGCGCGAGGCGAGGCCGAGCCGCGAGCGACATGGGGGACCGGGAGCAGCTGCTGCAGCGGGCGCGGCTGGCCGAGCAGGCGGAGCGCTACGACGACATGGCCTCCGCTATGAAGGCG | 706 | GTGACAGAGCTGAATGAACCTCTCTCCAATGAAGATCGAAATCTCCTCTCTGTGGCCTACAAGAATGTGGTTGGTGCCAGGCGATCTTCCTGGAGGGTCATTAGCAGCATTGAGCAGAAAACCATGGCTGATGGAAACGAAAAGAAATTG |
354 | 707 | GGGAGAGACGTGGGCTGGTGGCACAGCTGACCTTCTGCCATCTCAGGCAGCCGGAGTGGAAATATTCTTAGTGTGCTTTTTTTTTTTTCTTAAGGGTGAGTCAGATGATTCCATTCTCCGATTGGCCAAGGCCGATGGCATCGTCTCAAA | 708 | GAACTTTTGACTGGAGAGAATCACAGATGTGGAATATTTGTCATAAATAAATAATGAAAACCTACCTGTGCAGGTTCATTCTGTGTCTGTAGGCCCAGGGTTGAGGTTTTGCTGTCAGTGGGTGACGGGTGGGGTAGGGTACCCAGTTAG |
355 | 709 | AGCTAATGCTTTCTTCCAGCTGGTTGTCTTCTTGCCTGCCCTGTGTCGTAAAATGGGGGTCCCTTACTGCATTATCAAGGGAAAGGCAAGACTGGGACGTCTAGTCCACAGGAAGACCTGCACCACTGTCGCCTTCACACAGGTGAACTC | 710 | GGAAGACAAAGGCGCTTTGGCTAAGCTGGTGGAAGCTATCAGGACCAATTACAATGACAGATACGATGAGGTAAGAGGCAGCTTTACACCAAAATACTGTCATTCACAAATCTTTCTCCCAAATAACTGGCTGGCTTAACCTATGAGAAG |
356 | 711 | GTTTTTCGTTGGAATATACGTTGCACATTTATGGCGATTCTGAGTGTGAGGGCAGACTTCTGCCAGGCTCAGCACAGCATTTTCGCTGACAAGTGAGCTTGGAGGTTCTATGTGCCATAATTAACATTGCCTTGAAGACTCCTGGACACC | 712 | CGGGTGTCGCGCGCCGAGGCTGGGGGGGAGTCGTCGCCGCCGCCGCCACCGCTACCGCCGCCGCCGCCGCCGCCGAGGTGACTGAGGAGAGAGGCGCCTCCTCGCTCCCGCCACCGCCGGACTTCAATGCCCAGTCCCCAGCTCGCCAGC |
357 | 713 | CCCCTTCAAGGTCAAGGTGCTGCCTACTCATGATGCCAGCAAGGTGAAGGCCAGTGGCCCCGGGCTCAACACCACTGGCGTGCCTGCCAGCCTGCCCGTGGAGTTCACCATCGATGCAAAGGACGCCGGGGAGGGCCTGCTGGCTGTCCA | 714 | TGCTCTGTCCCTGGGGCTGGGGCCAGGCCTGGTGGAGCCAGTGGACGTGGTAGACAACGCTGATGGCACCCAGACCGTCAATTATGTGCCCAGCCGAGAAGGGCCCTACAGCATCTCAGTACTGTATGGAGATGAAGAGGTACCCCGGAG |
358 | 715 | TGTCTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATAAATTATCCTGGAGGAAAGGTTAAGGTGACACATGGAGACTGAGTGTCACCGTTATTTCCGCAGGTCCTCTCTGATGACATGAAGAAGCTGAAGGCCCGAATG | 716 | CACCAGGCCATAGAAAGATTTTATGATAAAATGCAAAATGCAGAATCAGGACGTGGACAGGTGGGTGGATTTCCCCTCAGGCACCAGGTCACATGTCCCCGCCCCCAGGCACTCCACCCTGTGTGGGGCTCAGGGTGAGAAGGATGAAGA |
359 | 717 | CTGCATTTTTCTCCACAGGTGCGGGAGTATGAGTTAAGAAAAAACAACTTCTCAGATACTGGAAACTTTGGTTTTGGGATCCAGGAACACATCGATCTGGGTATCAAATATGACCCAAGCATTGGTATCTACGGCCTGGACTTCTATGTG | 718 | GTGCTGGGTAGGCCAGGTTTCAGCATCGCAGACAAGAAGCGCAGGACAGGCTGCATTGGGGCCAAACACAGAATCAGCAAAGAGGAGGCCATGCGCTGGTTCCAGCAGAAGGTAAAGCTGATTTATCTCAAGTGAAGTGGTGGAATGTGA |
360 | 719 | TCAGCTTGTGATGGGAGAAGATTTTACTAAGTTGCACTGGAAGAGCTGGCTCTTCCCTTCCTCTTCACAGCTTCTCCCCTGCTTTCTAGGAAGATCAGCCCATCTACTTGGCAGTGAAGGGAGTGGTGTTTGATGTCACCTCCGGAAAGG | 720 | AGTTTTATGGACGAGGAGCCCCCTACAATGCCTTGACGGGGAAGGACTCCACTAGAGGGGTAGCCAAGATGTCCTTGGATCCTGCAGACCTCACCCATGACACTGTGAGCCAGATTATAAGCCTTTGTAAAATCCTCTACCTCCTTGTCC |
361 | 721 | TGTTTACAAGTCACCTGGATGTACTCTTTTCTCATTCAGCATGGCCTGTATGAGAAGAAAAAGACCTCAAGAAAGCAACGAAAGGAACGCAAGAACAGAATGAAGAAAGTCAGGGGGACTGCAAAGGCCAATGTTGGTGCTGGCAAAAAG | 722 | TGAGCTGGAGATTGGATCACAGGTATAATTCAAGCTTTTCATGTAGTCATGTAGATCACTAGACTCCTTGGTGTACTGACGTAGCAATTTAAAAGCAGATCATGTGTAGTACATCTAGAAGTAGATTTACAAATATTCTGAAGAGTTGTA |
362 | 723 | TTATCCCTGACTTCCTTCCTTTGTTCCTTCAATATATTCATTAAATATAAGTAAAATACCAATTGAATTTTGTGCTTCTTGAGAATATAGAAACTTATTTTTCCTATTTTCACAGCAATTTAGATGAACAGCAGGAGCTTGTTGAAAGGG | 724 | ATGATGAAAAACAACTGATGAATGTGGAGCCAATCCATGCAGATATTTTGTTGGAAACTTATAAGAGGAAGATTGCTGATGAAGGAAGACTTTTTCTGGCTGAATTTCAGGTGTGTGTTGCTTTTGTTATATGATGATAAATTCGACATC |
363 | 725 | GACACATGTAACTAGTATTGAATCTTTAATATGTTTCCAGATGATGAAAAACAACTGATGAATGTGGAGCCAATCCATGCAGATATTTTGTTGGAAACTTATAAGAGGAAGATTGCTGATGAAGGAAGACTTTTTCTGGCTGAATTTCAG | 726 | AGCATCCCGCGGGTGTTCAGCAAGTTTCCTATAAAGGAAGCTCGAAAGCCCTTTAACCAGAATAAAAACCGTTATGTTGACATTCTTCCTTGTGAGTATTTATTGAGTGCTGAATTCCCATATATTAGGCTACTTGATTATTCACTATTT |
364 | 727 | TCCACCTAAACCCACTGTGTTCATCTCTGGGGTCATCGCCCGGGTAAGTCTGGGAGTGTCTGCGGTGGGTTTGAGGCTTTTGCATGGCAGCATGGAGTCTCCTGGTTGCTTCTGGTTTTGTTAAACTCTACCTGGAATAACCCAGCACCA | 728 | CCACAGGCAGCCCCATTTCCACATTAATCCCACACTCTCTGTTTTCATTTCAGTGAAAGCTGGTGGAATGCGAATTGTGCAGAAACACCCACATACAGGAGACACCAAAGAAGAGAAAGACAAGGATGACCAGGAATGGGAAAGCCCCAG |
365 | 729 | GCCCCGGAGAGGCCCCTGGTGGGTGTCAATGGGCTGGATGTGACCAGCCTGAGGCCCTTTGACCTTGTCATCCCCTTCACCATCAAGAAGGGCGAGATCACAGGTGAGTGGGGACTTGGGAAGGAGCTCGGGAGCCAAGGAGGCCAGACT | 730 | TGGGGCCCCTCTGTGACAACAGACTCTCCAGCAGCTCTCTGCTTTGCCCTGCAGGCTCTGGCTGGGGACCAGCCCTCGGTGCAGCCCCCTCTACGGTCTCAGCAGCTGGCCCCACAGTACACCTACGCCCAGGGCGGCCAGCAGACTTGG |
366 | 731 | GCCGCCCCGCGCAGGCGCCCCCGCCCCGCCGTCGCCGCCGCCGCAGCCAGGAGCCGCTGCACCATGCCCCGCATAGATGCGGACCTCAAGCTCGACTTCAAGGATGTCCTGCTCCGACCTAAGCGGAGCAGCCTCAAGAGCCGAGCCGAG | 732 | GTGGATCTTGAACGCACCTTCACGTTTCGAAATTCAAAGCAGACCTACTCAGGGATTCCCATCATCGTGGCCAACATGGACACTGTGGGCACGTTTGAGATGGCAGCCGTGATGTCACAGGTGAGGCGGTAGGCTTTTGTTTTTTCCCTT |
367 | 733 | GCGCTCCCAGAGTGTCTGAGAGACCATCATAAGGGCTTTCTTTCCTGACAGGGTGACCTGTTGACCAAAACACAGGAGCTGGGCCGTGACTACAGGACCTGTCTGACGATAGTCCAAAAACTGAAGAAGATGGTGGATAAGCCCACCCAG | 734 | AGAAGTGTTTCCAATGCTGCGACCCGGGTGTGTAGGACGGGGAGGTCACGATGGCGCGACGTCTGCAGAAATTTCATGAGGAGGTATCAGTCTAGAGTTACCCAGGGCCTCGTGGCCGGAGAAACTGCCCAGCAGATCTGTGAGGACCTC |
368 | 735 | TGTTTTTTTTATATCATGTGATTGTTTGTGTGTCCCCTTTCCTCTTCTTTGCTTAACACAATTATCTTGTGTTAAGGATCTCAAAGATTTCATGAGACAAGCTGGGGAAGTAACGTTTGCGGATGCACACCGACCTAAATTAAATGAAGG | 736 | GGTGGTTGAGTTTGCCTCTTATGGTGACTTAAAGAATGCTATTGAAAAACTTTCTGGAAAGGAAATAAATGGGAGAAAAATAAAATTAATTGAAGGCAGCAAAAGGCACAGGTATCTCTAATTTTTTAAAGTCAAAAGTTGTATTTAATG |
369 | 737 | TTTCTCTCTCTTCCTCCTCTCCGCTTTCCTTCTTCCCTTCCTCCCACCCTGATTTTCTCTTTTGCAGTTTGCTGCTTTGTGGTGCACAAGCGGTGCCATGAATTTGTCACATTCTCCTGCCCTGGCGCTGACAAGGGTCCAGCCTCCGAT | 738 | GACCCCCGCAGCAAACACAAGTTTAAGATCCACACGTACTCCAGCCCCACGTTTTGTGACCACTGTGGGTCACTGCTGTATGGACTCATCCACCAGGGGATGAAATGTGACAGTAAGTACTTTTTCTCTCTGGGGGCATCTGCTGATGGC |
370 | 739 | CCCTTTCCCCCCTAGCGTCTGACCAAACACACCAAGTTCGTGCGGGACATGATTCGGGAGGTGTGTGGCTTTGCCCCGTACGAGCGGCGCGCCATGGAGTTACTGAAGGTCTCCAAGGACAAACGGGCCCTCAAATTTATCAAGAAAAGG | 740 | GTGGGGACGCACATCCGCGCCAAGAGGAAGCGGGAGGAGCTGAGCAACGTACTGGCCGCCATGAGGAAAGCTGCTGCCAAGAAAGACTGAGCCCCTCCCCTGCCCTCTCCCTGAAATAAAGAACAGCTTGACAGAAGCCCTGGCTCTCCT |
371 | 741 | ATAACTGGAGTGCCAAGGCTAAAAGACGAAATACCACCGGAACTGGTCGAATGAGGCACCTAAAAATTGTATACCGCAGATTCAGGTACAGTTTGTATGTTCGATCATAATTGGTCCAGTGGGCTTGAATGAAACCCTCGTGTTTACTTG | 742 | CCTTTAATGTGCAGACGAAGGGAACGTCATCGTTTGGAAAGCGTCGCAATAAGACGCACACGTTGTGCCGCCGCTGTGGCTCTAAGGCCTACCACCTTCAGAAGTCGACCTGTGGCAAATGTGGCTACCCTGCCAAGCGCAAGAGAAAGT |
372 | 743 | TTCACAATCACAAAATGATGGATCTGAAGCAATTCCTCTAACTCACAATTTTTCCTGTTAATGAGTAATTGAATGTTCAGCAAATGACATATCTCTGCATGTGTTTTCAATAGGGTTTCAAAGAACCCAGGAAATACATTGCTGCACAAG | 744 | GTCCCAGGGATGAAACTGTTGATGATTTCTGGAGGATGATTTGGGAACAGAAAGCCACAGTTATTGTCATGGTCACTCGATGTGAAGAAGGAAACAGGGTAAGAACCAAGAAGATTCATAGTGTGGGTCTTGGGGTTAGGAAAACAAGGT |
373 | 745 | TTCCCATGTGGGGAACCCAGTCCTGCTGTGACTCTGGGAGGGAGAGGGCCGGGGATACAATCGTACATTCCTGGTAACAGCCCTGTGATTGTCTGCTTCAGGTTATCAGTGAGCTGAATGGAAAAAACATTGAAGACGTCATTGCCCAGG | 746 | GTATTGGCAAGCTTGCCAGTGTACCTGCTGGTGGGGCTGTAGCCGTCTCTGCTGCCCCAGGCTCTGCAGCCCCTGCTGCTGGTTCTGCCCCTGCTGCAGGTAAGTGGTGGCCTGGTGAGTGGGCAAGGGGCTGGGGCTCAGACGGTGTTG |
374 | 747 | GACATCCTGGTGGTCCTGCTCTCAGTGATGGGGGCCATTCTGCTCATTGGCCTTGCCGCCCTGCTCATCTGGAAACTCCTCATCACCATCCACGACCGAAAAGAATTCGCTAAATTTGAGGAAGAACGCGCCAGAGCAAAATGGGACACA | 748 | GCCAACAACCCACTGTATAAAGAGGCCACGTCTACCTTCACCAATATCACGTACCGGGGCACTTAATGATAAGCAGTCATCCTCAGATCATTATCAGCCTGTGCCACGATTGCAGGAGTCCCTGCCATCATGTTTACAGAGGACAGTATT |
375 | 749 | GCCAACAGAGCACTTATGGCAAGGCATCTCGAGGGGGTGGCAATCACCAAAACAATTACCAGCCATACTAAAGGAGAACATTGGAGAAAACAGGTGTGTATAAGAGTACAGGAAAACAGTAGAAATGTCTAATTTAATTTAAAGATCAAT | 750 | AAGGATTTAATAACTATTATGATCAAGGATATGGAAATTACAATAGTGCCTATGGTGGTGATCAAAACTATAGTGGCTATGGCGGATATGATTATACTGGGTATAACTATGGGAACTATGGATATGGACAGGGATATGCAGACTACAGTG |
376 | 751 | AGTTAAGCACAAAGGAAAACATTTCAATAAAGGATCATTTGACAACTGGTGGATTTTCTGGTGTGGCGTCTTCCTTGAGGGAGCTAGCTCCTTTGTGGGGTGGTCAGTGGGGTCAGGGTGGCAGAACCTGTGGAGAAGTAACAAGCACCT | 752 | GCCAGCACCTCCAATGCCACCATTTCTTTACTTAAAGGAACCATTAAAGAGATCCTGGGGACTGCCCAGTCAGTGGGCTGTAATGTTGATGGCCGCCATCCTCATGACATCATCGATGACATCAACAGTGGTGCTGTGGAATGCCCAGCC |
377 | 753 | ACGCAGTAAAACGCAGGACTCTTCCCGCTTGGATTCGCGAAGGTCTTGAAAAAATGGAACGTGAAAAGCAGAAGAAATTGGAGAAAGAAAGAATGGAACAACAACGTTCACAATTGTCCAAAAAAGAAAAAAAGGCCACAGAAGATGCTG | 754 | CGCAAGGTGGATTTCATCCTCCTTATTGGCAACCAGGACCTCCAGGACCTCCAGCACCTCCCCAGAATCGAAGAGAAAGGCCATCATCATTCAGGGATCGTCAGCGTTCACCTATTGCACTTCCTGTGAAGCAGGAGCCTCCACAAATTG |
378 | 755 | CTTTGTTACATGGTTAATTTATGTCAAAAGTATCATAGGCTAAGACATCAAAGTTTTAATAACATTCTTTTTTCTTTAAGGGGTTTGTTAAAGTTGTTAAGAATAAGGCCTACTTTAAGAGATACCAAGTGAAATTTAGAAGACGACGAG | 756 | AGGGTAAAACTGATTATTATGCTCGGAAACGCTTGGTGATACAAGATAAAAATAAATACAACACACCCAAATACAGGATGATAGTTCGTGTGACAAACAGAGATATCATTTGTCAGGTAAGTTGTATTCTAGACAGTCCCCTTTTTTTAT |
379 | 757 | TTTTTCCAGCTACTCAGCTGCTTAAGCTGGCCCACAAGTACAGACCAGAGACAAAGCAAGAGAAGAAGCAGAGACTGTTGGCCCGGGCCGAGAAGAAGGCTGCTGGCAAAGGGGACGTCCCAACGAAGAGACCACCTGTCCTTCGAGCAG | 758 | GAGTTAACACCGTCACCACCTTGGTGGAGAACAAGAAAGCTCAGCTGGTGGTGATTGCACACGACGTGGATCCCATCGAGGTGCGTTTGCCTGTTGACTGCTAACCCAAGGGCTTCTGGCAGTACCAGGAAGAGAGAGTAGACCTAATGC |
380 | 759 | ATATGCTAGTCTGTATTTTTGCTGTGCTATTGAGGATCAGGACAATGAACTAATTACCCTGGAAATAATTCATCGTTATGTGGAATTACTTGACAAGTATTTCGGCAGTGTGAGTAGTATTTTATTTTAGGAAATTGAATGCCATAGTAT | 760 | GGGAAAGCTTCGACTGCAAAAATGGTATGTCCCACTATCAGACAAAGAGAAGAAAAAGATCACAAGAGAACTTGTTCAGACCGTTTTAGCACGGAAACCTAAAATGTGCAGCTTCCTTGAGTGGCGAGATCTGAAGATTGTTTACAAAAG |
381 | 761 | GCTATGGTGGGCTCAGCCTGTCCATTGAGGGCCCCAGCAAGGTGGACATCAACACAGAGGACCTGGAGGACGGGACGTGCAGGGTCACCTACTGCCCCACAGAGCCAGGCAACTACATCATCAACATCAAGTTTGCCGACCAGCACGTGC | 762 | ATGGCCAGCACGTGGCCAGCAGCCCCATCCCGGTGGTGATCAGCCAGTCGGAAATTGGGGATGCCAGTCGTGTTCGGGTCTCTGGTCAGGGCCTTCACGAAGGCCACACCTTTGAGCCTGCAGAGTTTATCATTGATACCCGCGATGCAG |
382 | 763 | CTGACGCTGGCTCCTTCTGTTGTTTCTCTTGGCTCCAGGACCCCCGCAGCAAACACAAGTTTAAGATCCACACGTACTCCAGCCCCACGTTTTGTGACCACTGTGGGTCACTGCTGTATGGACTCATCCACCAGGGGATGAAATGTGACA | 764 | CCTGCATGATGAATGTGCACAAGCGCTGCGTGATGAATGTTCCCAGCCTGTGTGGCACGGACCACACGGAGCGCCGCGGCCGCATCTACATCCAGGCCCACATCGACAGGGACGTCCTCATTGTCCTCGGTAGGTGGCCCTGGGGCTCCA |
383 | 765 | CGGCAGGGTCCGCCCGGGCCGGCAGCGTCCGCCCGGCGGCGGGAGGAGGGAGCGGCGCAGACAAAGAGCGGCGCCTGGGCGGGCGCAGCGCGGCCACCGCCCCGGGACCCGCGCCGCTGCCCTCCGGCTCCGCGGGCGGCCCACGGCGAG | 766 | ATTTCATGTGTTCTTTGTATACAAGCGACGTCCCAGATTATAATTCTCTGCTGAGATTTGAGTTGGATTTGAGGATTTGGAGAATCCCTGCAGCTTTGTAACTTCAGAGGTGTAATTAGCTGAAAACATCATCGTTTTGAAGAGTTCTGC |
384 | 767 | CTCACTCGCTCCCCTCTCGTCCGCAGCCGCAGGGCCGTAGGCAGCCATGGCGCCCAGCCGGAATGGCATGGTCTTGAAGCCCCACTTCCACAAGGACTGGCAGCGGCGCGTGGCCACGTGGTTCAACCAGCCGGCCCGTAAGATCCGCAG | 768 | ACGTAAGGCCCGGCAAGCCAAGGCGCGCCGCATCGCCCCGCGCCCCGCGTCGGGTCCCATCCGGCCCATCGTGCGCTGCCCCACGGTTCGGTACCACACGAAGGTGCGCGCCGGCCGCGGCTTCAGCCTGGAGGAGCTCAGGGTGAGTAC |
385 | 769 | AGGGTGGTTTTCCTGAAGCAGCTGGCTAGTGGCTTATTACTTGTGACTGGTAAGAAAATCCTTGGATTGTGATGTTCTGTGAAACTTCCATTTTTAAATGCTTGCAGTATACACGTTTGTTTGCTACTGCCTACATGGTAGACACTTATT | 770 | CCACAGCCTAGATATTATCCTACTGAAGATGTGCCTCGAAAGCTGTTGAGCCACGGCAAAAAACCCTTCAGTCAGCACGTGAGAAAACTGCGAGCCAGCATTACCCCCGGGACCATTCTGATCATCCTCACTGGACGCCACAGGGGCAAG |
386 | 771 | AAGACAGCGACAGCTGTGGCGCACTGCAAACGCGGCAATGGTCTCATCAAGGTGAACGGGCGGCCCCTGGAGATGATTGAGCCGCGCACGCTACAGTACAAGGTGCTGGGATCCGGCACCGGCGTTGAGTGGATGGAGGACTCTTGGAGA | 772 | AGGCGCCTGCGCAGACCCTGAAAAGCGGCCAGGGTGGCCCCTAGCTTTCCTTTTCCGGTTGCGGCGCCGCGCGGTGAGGTTGTCTAGTCCACGCTCGGAGCCATGCCGTCCAAGGGCCCGCTGCAGTCTGTGCAGGTCTTCGGACGCAAG |
387 | 773 | TGCACAAATAATCACTTCAAGGTCCTGCTTTCAATTCTTGTGTCTACTCCCAAATTTTGAAAGTGCTTAATGTCTTGACATTTCATTTGTAGTGATGATGATGATGATTTTGATGATGAGGAAGCTGAAGAAAAAGCGCCAGTGAAGAAA | 774 | TCTATACGAGATACTCCAGCCAAAAATGCACAAAAGTCAAATCAGAATGGAAAAGACTCAAAACCATCATCAACACCAAGATCAAAAGTAAGTGGCTACATTTACACGTGGGTCTCATTGATCTAGTTGGGGAAAAAGATTCTACTGTGG |
388 | 775 | CTTTAGGAAATTGAAGTTGGTGGTGGTCGGAAAGCTATCATAATCTTTGTTCCCGTTCCTCAACTGAAATCTTTCCAGAAAATCCAAGTCCGGCTAGTACGCGAATTGGAGAAAAAGTTCAGTGGGAAGCATGTCGTCTTTATCGCTCAG | 776 | AGGAGAATTCTGCCTAAGCCAACTCGAAAAAGCCGTACAAAAAATAAGCAAAAGCGTCCCAGGAGGTGAGTATTTTAGTAGTTTCAGAAATGTGTGTACCCCTCTTATTAACAACTCTTAATTTGTTTAAGTTGTAGTTTATGAAAACAG |
389 | 777 | GTGACGACTCCATGCGTATGTCCCACCTAAAGGTCGGCTCTGCTGCCGACATCCCCATCAACATCTCAGAGACGGATCTCAGCCTGCTGACGGCCACTGTGGTCCCGCCCTCGGGCCGGGAGGAGCCCTGTTTGCTGAAGCGGCTGCGTA | 778 | GCCCGTCCAAAGCAGAAATCAGCTGCACTGACAACCAGGATGGGACATGCAGCGTGTCCTACCTGCCTGTGCTGCCGGGGGACTACAGCATTCTAGTCAAGTACAATGAACAGCACGTCCCAGGCAGCCCCTTCACTGCTCGGGTCACAG |
390 | 779 | GCAGCCCCTTCTCTGTGAAGGTGACAGGCGAGGGCCGGGTGAAAGAGAGCATCACCCGCAGGCGTCGGGCTCCTTCAGTGGCCAACGTTGGTAGTCATTGTGACCTCAGCCTGAAAATCCCTGGTAGGGGCTGTGGGAAGCCTGGGGAGG | 780 | ATGGTGGGCTCAGCCTGTCCATTGAGGGCCCCAGCAAGGTGGACATCAACACAGAGGACCTGGAGGACGGGACGTGCAGGGTCACCTACTGCCCCACAGAGCCAGGCAACTACATCATCAACATCAAGTTTGCCGACCAGCACGTGCCTG |
391 | 781 | CTACCTTAGGTGTTTCATCAGTACAGACGCCTCACCTTCCCACGCACGCAGACTCGCAGACGCCCTCTGCTGGAACTGACACGCAGACATTCAGCGGCTCCGCCGCCAATGCAAAACTCAACCCTACCCCAGGCAGCAATGCTATCTCAG | 782 | ATGTCCCAGGAGAGAGGAGTACAGCCAGCACCTTTCCTACAGACCCAGTTTCCCCATTGACAACCACCCTCAGCCTTGCACACCACAGCTCTGCTGCCTTACCTGCACGCACCTCCAACACCACCATCACAGCGAACACCTCAGGTCTGA |
392 | 783 | TCTCTCAAGTCCCGAGGCTACGTGAAGGAACAGTTTGCCTGGAGACATTTCTACTGGTACCTTACCAATGAGGGTATCCAGTATCTCCGTGATTACCTTCATCTGCCCCCGGAGATTGTGCCTGCCACCCTACGCCGTAGCCGTCCAGAG | 784 | ATGTTGATGCCTAAGAAGAACCGGATTGCCATTTATGAACTCCTTTTTAAGGAGGGAGTCATGGTGGCCAAGAAGGATGTCCACATGCCTAAGCACCCGGAGCTGGCAGACAAGAATGTGCCCAACCTTCATGTCATGAAGGCCATGCAG |
393 | 785 | TACATCCAGCAGTGGTCATTCGACAACGAAAGTCATACCGTAGAAAAGATGGCGTGTTTCTTTATTTTGAAGATAATGCAGGAGTCATAGTGAACAATAAAGGCGAGATGAAAGGTAGGAAATCAGTCCAGCTTGTTCCTTAGGTCTCTG | 786 | TTTATTTACATTCTTTTGTAGGAGCCAAAAACCTGTATATCATCTCCGTGAAGGGGATCAAGGGACGGCTGAACAGACTTCCCGCTGCTGGTGTGGGTGACATGGTGATGGCCACAGTCAAGAAAGGCAAACCAGAGCTCAGAAAAAAGG |
394 | 787 | CGTCGCGTCCTCTCCGCCCGCCTCAGGATGCGCTACGTCGCCTCCTACCTGCTGGCTGCCCTAGGGGGCAACTCCTCCCCCAGCGCCAAGGACATCAAGAAGATCTTGGACAGCGTGGGTATCGAGGCGGACGACGACCGGCTCAACAAG | 788 | GTTATCAGTGAGCTGAATGGAAAAAACATTGAAGACGTCATTGCCCAGGGTGAGTTGATGTGGACGGGCTTTCGTTTGTTTTCATGGTCCATCCTAATCCCTGCCGGTCCATCTGTGGCCTGCCAGGTTTCGCTTGTGGACCAGAGCACC |
395 | 789 | GTGAATGTGGGAGCTGGCAGCCACCCCAACAAGGTCAAAGTATACGGCCCCGGAGTAGCCAAGACAGGGCTCAAGGCCCACGAGCCCACCTACTTCACTGTGGACTGCGCCGAGGCTGGCCAGGGTAAGGCCTGGCTGTGGGTGGGAGGG | 790 | TGACAGGACAATGAAGGCTGCCCTGTGGAGGCGTTGGTCAAGGACAACGGCAATGGCACTTACAGCTGCTCCTACGTGCCCAGGAAGCCGGTGAAGCACACAGCCATGGTGTCCTGGGGAGGCGTCAGCATCCCCAACAGCCCCTTCAGG |
396 | 791 | CTAGCCAAAATGTACAAGACCACACCGGATGTCATCTTTGTATTTGGATTCAGAACTCATTTTGGTGGTGGCAAGACAACTGGCTTTGGCATGATTTATGATTCCCTGGATTATGCAAAGAAAAATGAACCCAAACATAGACTTGCAAGA | 792 | CATGGCCTGTATGAGAAGAAAAAGACCTCAAGAAAGCAACGAAAGGAACGCAAGAACAGAATGAAGAAAGTCAGGGGGACTGCAAAGGCCAATGTTGGTGCTGGCAAAAAGGTATAGTTCATTAAGGAAAATATAGAAACGTCATTAATT |
397 | 793 | ATCCGGAAGATGAAGCTTCCTGGCCGGGAAAACAAGACGGCCGTGGTTGTGGGGACCATAACTGATGATGTGCGGGTTCAGGAGGTACCCAAACTGAAGGTGAGCTGGCGGGGGCTGGGCAGACCCATCAGACCCTTGCTGTACTGTGCT | 794 | AGCAGGCTGTCCCAGCTTCTCACTGTCTTCCCGTCCCTCCAGTTATACAGGTTTCTGGCCAGAAGAACCAACTCCACATTCAACCAGGTTGTGTTGAAGAGGTTGTTTATGAGTCGCACCAACCGGCCGCCTCTGTCCCTTTCCCGGATG |
398 | 795 | GCTTCTCGGTGGAAGGGCCATCGCAGGCTAAGATCGAATGTGACGACAAGGGCGACGGCTCCTGTGATGTGCGCTACTGGCCGCAGGAGGCTGGCGAGTATGCCGTTCACGTGCTGTGCAACAGCGAAGACATCCGCCTCAGCCCCTTCA | 796 | CTCGCATTTGCAGTCCCTTCGAAGTGAAGGTGGGCACCGAGTGTGGCAATCAGAAGGTACGGGCCTGGGGCCCTGGGCTGGAGGGCGGCGTCGTTGGCAAGTCAGCAGACTTTGTGGTGGAGGCTATCGGGGACGACGTGGGCACGCTGG |
399 | 797 | CAGCTGAGGCGGCTGGAGGCGGAGGAGCGAGCGACGCTGCAGAGACTGCGGGAGAGCAAGAGCCGGCTGGTCCAGCAGAGCAAGGCCCTGAAGGAGCTGGCGGATGAGCTGCAGGAGAGGTGCCAGCGCCCGGCCCTGGGTCTGCTGGAG | 798 | GGTGTGAGAGGAGTCCTGAGCAGGTATGTGTGCTTTCTGAATTGGTGAAGGGATTGGGAGAGGCAGAGGAGCTGGTGGAGAACCCTGCTGACTTCTGTGGTTTCTGTGCTCTTCCCAGAAGTAAGGCTGTCACAAGGCTGGAAGCAGAGA |
400 | 799 | ACGAGCGGCTGTGGTTGCTGGACGACTCCAAGACGTGGTGGCGGGTGAGGAACGCGGCCAACAGGACGGGCTATGTACCGTCCAACTACGTGGAGCGGAAGAACAGCCTGAAGAAGGGCTCCCTCGTGAAGAACCTGAAGGACACACTAG | 800 | GCCTCGGCAAGACGCGCAGGAAGACCAGCGCGCGGGATGCGTCCCCCACGCCCAGCACGGACGCCGAGTACCCCGCCAATGGCAGCGGCGCCGACCGCATCTACGACCTCAACATCCCGGCCTTCGTCAAGTTCGCCTATGTGGCCGAGC |
401 | 801 | ATTGAGGAATTTTCTAAAGGTATCTCTCTCGGTGTATTTCTCTACTTACCTGTAATAATGCTTTTGTCTTAATAGGGTGGTTCTCTTCCCAAAGTGGAAGCCAAATTCATCAATTATGTGAAGAATTGCTTCCGGATGACTGACCAAGAG | 802 | GCTATTCAAGATCTCTGGCAGTGGAGGAAGTCTCTTTAAGAAAATAGTTTAAACAATTTGTTAAAAAATTTTCCGTCTTATTTCATTTCTGTAACAGTTGATATCTGGCTGTCCTTTTTATAATGCAGAGTGAGAACTTTCCCTACCGTG |
402 | 803 | GTGACAAGACCACCTTCCAGCTACAGGTTCGCCAGGTGGAGGACTATCCTGTGGACCTGTACTACCTGATGGACCTCTCCCTGTCCATGAAGGATGACTTGGACAATATCCGGAGCCTGGGCACCAAACTCGCGGAGGAGATGAGGAAGC | 804 | TCAAAAATGGCTGTGGAGGTGAGATAGAGAGCCCAGCCAGCAGCTTCCATGTCCTGAGGAGCCTGCCCCTCAGCAGCAAGGGTTCGGGCTCTGCAGGCTGGGACGTCATTCAGATGACACCACAGGAGATTGCCGTGAACCTCCGGCCCG |
403 | 805 | AGGTGGACGTTGGCAAAGACCAGGAGTTCACAGTCAAATCAAAGGGTGCTGGTGGTCAAGGCAAAGTGGCATCCAAGATTGTGGGCCCCTCGGGTGCAGCGGTGCCCTGCAAGGTGGAGCCAGGCCTGGGGGCTGACAACAGTGTGGTGC | 806 | AAATACCCCCTTCCCTTCTGCACCCTTCCCAGGGTCCAGTAGGCGTCAATGTCACTTATGGAGGGGATCCCATCCCTAAGAGCCCTTTCTCAGTGGCAGTATCTCCAAGCCTGGACCTCAGCAAGATCAAGGTGTCTGGCCTGGGAGAGA |
404 | 807 | GGGGCCTGTCTCTGGCCATTGAGGGCCCGTCCAAAGCAGAAATCAGCTGCACTGACAACCAGGATGGGACATGCAGCGTGTCCTACCTGCCTGTGCTGCCGGGGGACTACAGCATTCTAGTCAAGTACAATGAACAGCACGTCCCAGGCA | 808 | AGGCCCTTCTTCCTGCCTCAGGAAGCCCCTTGCAGTTCTATGTGGATTACGTCAACTGTGGCCATGTCACTGCCTATGGGCCTGGCCTCACCCATGGAGTAGTGAACAAGCCTGCCACCTTCACCGTCAACACCAAGGATGCAGGAGAGG |
405 | 809 | CCACAGATAAGCTACACCGGGCCCTCGTCCAGCGGGCGCTTCGCGGGCAGAGAGTGGTACTACGGGAACGTGACGCGGCACCAGGCCGAGTGCGCCCTCAACGAGCGGGGCGTGGAGGGCGACTTCCTCATTAGGGACAGCGAGTCCTCG | 810 | CCCAGCGACTTCTCCGTGTCCCTTAAAGCGTCAGGGAAGAACAAACACTTCAAGGTGCAGCTCGTGGACAATGTCTACTGCATTGGGCAGCGGCGCTTCCACACCATGGACGAGCTGGTGGAACACTACAAAAAGGCGCCCATCTTCACC |
406 | 811 | TACAGGCAGAGGCTGGCTTTGAGGATTGGTGTTTCCCAAACCTGGGGGAGTGGTTTGTGACCCTTCTTCTCTTTCTAGGTTGACAAGGTCACAGGCAGGTTTAATGGCCAGTTTAAAACTTATGCTATCTGCGGGGCCATTCGTAGGATG | 812 | GGTGAGTCAGATGATTCCATTCTCCGATTGGCCAAGGCCGATGGCATCGTCTCAAAGTAAGGTTGGGGGCTCACATTTGGGCAGAGTGAGTGGACTAGGACTGCTCCAGAGGCGTGGTCTTAACGTTGTCCTTTTCCCCTGGTTCTAGGA |
407 | 813 | ATAATCTGCCACTCTTGGCAGGGAGCTCACTCAGTGGGTTTGATGTGGTGGATGCTGGCTCGGGAAGTTCTGCGCATGCGTGGCACCATTTCCCGTGAACACCCATGGGAGGTCATGCCTGATCTGTACTTCTACAGAGATCCTGAAGAG | 814 | ATTGAAAAAGAAGAGCAGGCTGCTGCTGAGAAGGCAGTGACCAAGGAGGAATTTCAGGGTGAATGGACTGCTCCCGCTCCTGAGTTCACTGCTACTCAGCCTGAGGTTGCAGACTGGTCTGAAGGTGTACAGGTGCCCTCTGTGCCTATT |
408 | 815 | ACAGCTTTGTTTGCACTGTTGTTGGGGTCAGGGACAGTGATTAAGATAAATTTCTAATTGCAGTCTATACGAGATACTCCAGCCAAAAATGCACAAAAGTCAAATCAGAATGGAAAAGACTCAAAACCATCATCAACACCAAGATCAAAA | 816 | GGACAAGAATCCTTCAAGAAACAGGAAAAAACTCCTAAAACACCAAAAGGACCTAGTTCTGTAGAAGACATTAAAGCAAAAATGCAAGCAAGTATAGAAAAAGTGAGTAAAGTTATCTTAAAAAAACTTTGTCTCCCCCCTCAAATTGCA |
409 | 817 | TGCAATAAGCCAATATTTACATTTTAAAGGAGTTTTTCTGTTTTTTTTTTTTTTTTCAGAGACTTCCTTCATATAGGAGCTGGAGGACACAGCACATTGGAAATCAAGAAGAAAATAAAAGTAAAAACAGGAATTCTAATGTCATCCCAT | 818 | ATGACTATAACAGAGTGCCACTTAAACATGAGCTGGAAATGAGTAAAGAGAGTGAGCATGATTCAGATGAATCCTCTGATGATGACAGTGATTCAGAGGAACCAAGCAAATACATCAATGCATCTTTTATAATGGTAGGTACTTAAATTG |
410 | 819 | GTACACGGTCCTCTTCTCGCACGGCAATGCCGTGGACCTGGGCCAGATGAGCAGCTTCTACATTGGCCTGGGCTCCCGCCTCCACTGCAACATCTTCTCCTACGACTACTCCGGCTACGGTGCCAGCTCGGGCAGGCCTTCCGAGAGGAA | 820 | CGCACCCGGGCGCTGGAAGCTGCACCTGACGGAGCGTGCCGACTTCCAGTACAGCCAGCGCGAGCTGGACACCATCGAGGTCTTCCCCACCAAGAGCGCCCGCGGCAACCGCGTCTCCTGCATGTATGTTCGCTGCGTGCCTGGTGCCAG |
411 | 821 | CTTGAACTCACCTGCTTTTTACCATGTCTCCTCTGCTGGAATGTGCCTGCCCAGCTGAATGAGTATGTGGCTAAGGGTCTGACCGACAGCATCCACCGTTACCACTCAGACAATAGCACCAAGGCAGCGTGGGACTCCATCCAGTCATTT | 822 | CTGCAGTGTTGTGGTATAAATGGCACGAGTGATTGGACCAGTGGCCCACCAGCATCTTGCCCCTCAGATCGAAAAGTGGAGGTAATTTTGTCGGCAATGTTTCTGTTATTGACCTCTTTGTTTAAATGTTTAATTACCTCGGAAACTGCA |
412 | 823 | CAGGAGATTGAGCGGGAGCTGCGTGCTGCACCCCCAGCCCCCAACGCCCCTGCCGCTGGGGAGGACACCACTGAAACCGCCCCCGCACCAGGGACTCCTGCCCGCGGCCCCCGCATGACACCCAGCGACCTGCGCAACCTCGACGAGCTG | 824 | GTGAGGGAGATTCTGGGCCGCTGCACCTGCCCTGACCAGTTTCCCATGATCAAGGTCTCAGAGGGGAAGTACCGTGTGGGGGACTCGAGCCTGCTCATCTTTGTGCGGGTAAGGGCCTGGGGCCGCCCCAGCGGGCAGCAGCCAAGGTGG |
413 | 825 | TCGACATTACTCTGAAGGGACGCACAGTTATCGTGAAGGGCCCCAGAGGAACCCTGCGGAGGGACTTCAATCACATCAATGTAGAACTCAGCCTTCTTGGAAAGAAAAAAAAGAGGGTGAGGGTTTTTCTTCTGATAATTCAGTTGCTCG | 826 | AACTTCCGCCTGGCAGTCTCCAGTAGGAGTGGAGCTCTGTGCGGCGTAGTTTGGTGGAAAAACGGGCCTTGCGTCGGCCTCACCCCCAGTGTTTGTGTTTCAGAATGAAGACTATTCTCAGCAATCAGACTGTCGACATTCCAGAAAATG |
414 | 827 | GACTTCGGCAGTCTGTCCAACCTTCAGGTCACTCAGCCTACAGTTGGGATGAATTTCAAAACGCCTCGGGGACCTGTTTGAATTTTTTCTGTAGTGCTGTATTATTTTCAATAAATCTGGGACAACAGCCTTGCCTGTGTCATCTTTGCA | 828 | ATTTAAGAACCTGGGGAGAGGAGGAGGAGAGGTGAGTGATAATCTCATTGATTGGTATTTTGACCCTACCTCGTTTCCTTGTAGGTCTCAGCCTTGGATCAGGAGATTATTGAAGTAGATCCTGACACTAAGGAAATGCTGAAGCTTTTG |
415 | 829 | TGACCCCAGGACCTCCCTGACCCCCAACCAGGCCAGCGGAAGCCTGCCACCTCCTATGTGCGGACCACCATCAACAAGAATGCTCGCGCCACGCTCAGCAGCATCAGACACATGATCCGCAAGAACAAGTACCGCCCCGACCTGCGCATG | 830 | GCAGCCATCCGCAGGGCCAGCGCCATCCTGCGCAGCCAGAAGCCTGTGATGGTGAAGAGGAAGCGGACCCGCCCCACCAAGAGCTCCTGAGCCCCCTGCCCCCAGAGCAATAAAGTCAGCTGGCTTTCTCACCTGCCTCGACTGGGCCTC |
416 | 831 | GGCAAGAAGTACAAGCCCCTGGACCTGCGGCCTAAGAAGACACGTGCCATGCGCCGCCGGCTCAACAAGCACGAGGAGAACCTGAAGACCAAGAAGCAGCAGCGGAAGGAGCGGCTGTACCCGCTGCGGAAGTACGCGGTCAAGGCCTGA | 832 | GTGTGAGTCCTGAGTCTTGGGTAGTGCTATGTGTCTTCCCTATCTTCACTGACATCTCTGTTTTGTAGCCGAGTCGTCCGGAAATCCATTGCCCGTGTTCTCACAGTTATTAACCAGACTCAGAAAGAAAACCTCAGGAAATTCTACAAG |
417 | 833 | TATTCATCTTTTTCTTTAGTTAAATTATAAGATGGTTTACCAATTTGAGCCTTTTCAGGGTTTTGGGAAAATTAGACTTTTAATCTAATCATATTATTCTGCTTTTTCTTTTAGCTCCAGTAATAAACCGATTCACAAGGCGTGCCTCAG | 834 | TATGTGCAGAAGCTTATAATCCTGATGAAGAAGAAGATGATGCAGAGTCCAGGGTATGTAATTTACTGAATGAATGAATTTTAAATTGATGCCCTTGTCATATATAAGGAAAATAATCATAGAAAAGATTTAACAGGCATGTAGGTTAAC |
418 | 835 | GTATGTGCACTGCGCGTGACCAGCCGGGCCCGCAGCCGCATCCTCAGGGCAGGGGGCAAGATCCTCACTTTCGACCAGCTGGCCCTGGACTCCCCTAAGGGCTGTGGCACTGTCCTGCTCTCCGGTGAGTGATACGTGGTCGACGGGTTT | 836 | CTGAATGTAAACACCAGAACAACTTACGACGTACATCCTCCCCACCCTAAGATCCGGAAGATGAAGCTTCCTGGCCGGGAAAACAAGACGGCCGTGGTTGTGGGGACCATAACTGATGATGTGCGGGTTCAGGAGGTACCCAAACTGAAG |
419 | 837 | GGGAGGTTCGGATGCCCTCAGGCAAGGTGGCGCAGCCCACCATCACTGACAACAAAGACGGCACCGTGACCGTGCGGTATGCACCCAGCGAGGCTGGCCTGCACGAGATGGACATCCGCTATGACAACATGCACATCCCAGGTGGGCCTG | 838 | CAGATCCTCCCACTGTCCCTCACCCATGCCCTGTGTCTCCACTGCAGGCCCCGGAGAGGCCCCTGGTGGGTGTCAATGGGCTGGATGTGACCAGCCTGAGGCCCTTTGACCTTGTCATCCCCTTCACCATCAAGAAGGGCGAGATCACAG |
420 | 839 | GTCTCAGCCTTGGATCAGGAGATTATTGAAGTAGATCCTGACACTAAGGAAATGCTGAAGCTTTTGGTAAGTGTTTGCTGGATTCCTAAAGTGGTATTTTCCTGGTCAAAAACCATCAGTAGGTCTTATTATCCAAGGTCACCCAGCTAG | 840 | GATTTGTCTTCTCACTGTTCTCTTTGGCTGTGTGTGCTTTGTAGTTATGTCACGCATCTGATGAAGCGAATTCAGAGAGGCCCAGTAAGAGGTATCTCCATCAAGCTGCAGGAGGAGGAGAGAGAAAGGAGAGACAATTATGTTCCTGAG |
421 | 841 | TTACAAGTTGTTTCCAAATTGCGTCCCCTCCTTTGGGTTCCGCCATCTGCTGCCTCTCACAGACAGAGTGGACAGCTTCAATGAGGAAGTTCGGAAACAGAGGGTGTCCCGGAACCGAGATGCCCCTGAGGGGGGCTTTGATGCAGTACT | 842 | GGACAATATCCGGAGCCTGGGCACCAAACTCGCGGAGGAGATGAGGAAGCTCACCAGCAACTTCCGGTTGGGATTTGGGTCTTTTGTTGATAAGGACATCTCTCCTTTCTCCTACACGGCACCGAGGTACCAGACCAATCCGTGCATTGG |
422 | 843 | CCTGTAACCCGAGTGCCTGCCGGGCGGTTGGCCGGGGCCTCCAGCCCAAGGGTGTGCGGGTGAAGGAGACAGCTGACTTCAAGGTGTACACAAAGGGCGCTGGCAGTGGGGAGCTGAAGGTCACCGTGAAGGGCCCCAGTAAGTTGGCCT | 844 | AGGGCACGGTAGAGCCTCAGCTGGAGGCCCGGGGCGACAGCACATACCGCTGCAGCTACCAGCCCACCATGGAGGGCGTCCACACCGTGCACGTCACGTTTGCCGGCGTGCCCATCCCTCGCAGCCCCTACACTGTCACTGTTGGCCAAG |
423 | 845 | CTCGGGAACTGAGCCGGTACTCACCTCCGCCCCTTCTCCCCGTCGCTGTCCGCAGCCATGGCCCTACGCTACCCTATGGCCGTGGGCCTCAACAAGGGCCACAAAGTGACCAAGAACGTGAGCAAGCCCAGGCACAGCCGACGCCGCGGG | 846 | CGTCTGACCAAACACACCAAGTTCGTGCGGGACATGATTCGGGAGGTGTGTGGCTTTGCCCCGTACGAGCGGCGCGCCATGGAGTTACTGAAGGTCTCCAAGGACAAACGGGCCCTCAAATTTATCAAGAAAAGGGTAGGTGGGCGCTGC |
424 | 847 | CCCGGGTGTGTAGGACGGGGAGGTCACGATGGCGCGACGTCTGCAGAAATTTCATGAGGAGGTATCAGTCTAGAGTTACCCAGGGCCTCGTGGCCGGAGAAACTGCCCAGCAGATCTGTGAGGACCTCAGGTTGTGTATACCTTCTACAG | 848 | GTCCCCTCTGAGCCCTCTCACCTTGTCCTGTGGAAGAAGCACAGGCTCCTGTCCTCAGATCCCGGGAACCTCAGCAACCTCTGCCGGCTCCTCGCTTCCTCGATCCAGAATCCACTCTCCAGTCTCCCTCCCCTGACTCCCTCTGCTGTC |
425 | 849 | GAGCCAAAAACCTGTATATCATCTCCGTGAAGGGGATCAAGGGACGGCTGAACAGACTTCCCGCTGCTGGTGTGGGTGACATGGTGATGGCCACAGTCAAGAAAGGCAAACCAGAGCTCAGAAAAAAGGGTGAGTAAACACTGAGCCCAC | 850 | GGGAACTCCGAACCATGTCTAGATTGTGATCTCTTTATCCTGTTTCCCTTCCCTTTATATCCACAGGACGTGGTGGGTCCTCTGGTGCGAAATTCCGGATTTCCTTGGGTCTTCCGGTAGGAGCTGTAATCAATTGTGCTGACAACACAG |
426 | 851 | TTTATCAGTTATTATGAGTGAATATCATGTGAGAGTTACCTCTGGTTTGATCAGTTTCAGGAAAATGCCAGTGAAGGGAAGGCCCCTGCAGAAGACGTCTTTAAGAAGCCCCTGCCTCCTACTGTGAAGAAGGAAGAGAGTCCCCCTCCA | 852 | CCTAAAGTGGTAAACCCACTGATCGGCCTCTTGGGTGAATATGGAGGAGACAGTGACTATGAGGAGGAAGAAGAGGAGGAACAGACCCCTCCCCCACAGCCCCGCACAGCACAGCCCCAGAAGCGAGAGGAGCAAACCAAGAAGGAGAAT |
427 | 853 | CCCGTCAGCCTCCCGCTCGGGGTGCGCCGCCCTTCGTCTGGGTCTCCGCCCCCAGGACCCGCGGCCGAGAGCTCCGGAGCGCGGCTTCCCCGGCCGGCTGCGCGATGGGCTGCGGGAACTCCACCGCCACCAGCGCGGGCGCGGGCCAAG | 854 | GCCCTGCAGGAGCAGCCAAAGATGTGTAAGTATTGAATATTAATGATTTTATAAGCTGTCTTTCTGAGGAAGTTGCTGTTTTTCATGATTATGACCTTTAGATCTCTGTGGGTATGGCTGAAAAGACATGGAAATACTTTGTGTATAATG |
428 | 855 | GCAAAAGTGACCGCCAATAACGACAAGAACCGCACCTTCTCCGTCTGGTACGTCCCCGAGGTGACGGGGACTCATAAGGTGAGCCCTTGGCCAGGGGGGAGGCTTGTGACCTCAGGCAGTGGCTGGAGGCCCCCAGCCCTACCCTCACGG | 856 | GGCCCGTGGTTGGCTCGCCTTCCCCTGCCAGGCATCGAGCCCACAGGCAACATGGTGAAGAAGCGGGCAGAGTTCACTGTGGAGACCAGAAGTGCTGGCCAGGGAGAGGTGCTGGTGTACGTGGAGGACCCGGCCGGACACCAGGAGGAG |
429 | 857 | CCGGCACCCAGCGCCCCGCCGCCCGCAAGCCGCGCGCCCGTCCGCCGCGCCCCGAGCCCGCCGCTTCCTATCTCAGCGCCCTGCCGCCGCCGCCGCGGCCCAGCGAGCGGCCCTGATGCAGGCCATCAAGTGTGTGGTGGTGGGAGACGG | 858 | AGCTGTAGGTAAAACTTGCCTACTGATCAGTTACACAACCAATGCATTTCCTGGAGAATATATCCCTACTGTGTAAGTATCTTAAATTGGGAATTAACCTGTTTGTGTTACGGGTTTCACATTTCTTTGACCATTTGTTTTGCTGTAAAG |
430 | 859 | TGAGTGACTTCAAAGCTGATTTCTTAATCTGTGGTCTTGGCTCGTTCTAGTGCCCACCAGAAGTATGTCCGACAAGCCTGGCAGAAGGCAGACATCAATACAAAATGGGCAGCCACACGATGGGCCAAGAAGATTGAAGCCAGAGAAAGG | 860 | AAAGCCAAGATGACAGATTTTGATCGTTTTAAAGTTATGAAGGCAAAGAAAATGGTAAGATTTAAGATCTGTATTTTTGTGTAACTTAGCTTTAAATAATAAGGGAGCAGTAGCCAAATCCCATTTCAGGCTGCCAGCTTCTTGGAAGCT |
431 | 861 | CCTTTTTCTTTCTTTTTTTTTGGCCAACAGGTGGATCTTGAACGCACCTTCACGTTTCGAAATTCAAAGCAGACCTACTCAGGGATTCCCATCATCGTGGCCAACATGGACACTGTGGGCACGTTTGAGATGGCAGCCGTGATGTCACAG | 862 | CACTCCATGTTTACAGCAATTCATAAGCATTACTCCCTGGATGACTGGAAGCTCTTTGCCACAAATCACCCAGAATGCCTGCAGGTACGACTACAGCCTGGTTATCAATTACCAGTGCTGCAGGGGGGAACAAAATCTTCAGAGCTGTCA |
432 | 863 | TCTCTCTTACTATAGATTGCTTATGCCCGTATAGAGGGGGATATGATAGTCTGCGCAGCGTATGCACACGAACTGCCAAAATATGGTGTGAAGGTTGGCCTGACAAATTATGCTGCAGCATATTGTACTGGCCTGCTGCTGGCCCGCAGG | 864 | CTTCTCAATAGGTTTGGCATGGACAAGATCTATGAAGGCCAAGTGGAGGTGACTGGTGATGAATACAATGTGGAAAGCATTGATGGTCAGCCAGGTGCCTTCACCTGCTATTTGGATGCAGGCCTTGCCAGAACTACCACTGGCAATAAA |
433 | 865 | GGTGACAAAGATTTCCCCCCGGCGGCTGCGCAGGTGGCTCACCAGAAGCCGCATGCCTCCATGGACAAGCATCCTTCCCCAAGAACCCAGCACATCCAGCAGCCACGCAAGTGAGCCTGGAGTCCACCAGCCTGCCCCATGGCCCCGGCT | 866 | GCCTCCACGCCCTCCAGCCTGGCTCATCCACATACACATCGTCTGCCACCGTGTTCCCTGTTGTGGACTGTTTTGTTAAATCTGCCTTTTCCCTTCTTTTTTTCCAGTCCACCTAAACCCACTGTGTTCATCTCTGGGGTCATCGCCCGG |
434 | 867 | CAGATGGAGGAGGGAGGGGGGCTCTCACTTGGCTTCCCATCCTAATGGTGCTGTTTTGTTTTCTAGCACTCCATGTTTACAGCAATTCATAAGCATTACTCCCTGGATGACTGGAAGCTCTTTGCCACAAATCACCCAGAATGCCTGCAG | 868 | AATGTAGCCGTGAGTTCAGGCAGTGGGCAGAATGATCTGGAAAAGATGACCAGCATCCTGGAAGCTGTGCCACAGGTTAAGTTTATTTGCCTGGATGTGGCCAATGGGTATTCAGAACATTTTGTGGAATTCGTGAAACTTGTCCGTGCC |
435 | 869 | CTGCCGACACACAAGCTCTGTTGAGGAATGACCAGGTCTATCAGGTGAGCGTTGAGGGGAAGGAGGCAGGAATGAAGGGAGGGTAAGTGGGGATAGAGAGGCTCACACTGAATGCTGTTTGCACGTGGGAAGGGTCCTACTGGGGAGTTC | 870 | TCCTCCCTTCCCCCACAGTGTGCCAGAGCTGTGTGGAGCTGGATCCAGCCACCGTGGCTGGCATCATTGTCACTGATGTCATTGCCACTCTGCTCCTTGCTTTGGGAGTCTTCTGCTTTGCTGGACATGAGACTGGAAGGCTGTCTGGGG |
436 | 871 | GTTTTGGAACTGAAAGAACACAAACTGGATGGCAAATTGATAGATCCCAAAAGGGCCAAAGCTTTAAAAGGGAAAGAACCTCCCAAAAAGGTTTTTGTGGGTGGATTGAGCCCGGATACTTCTGAAGAACAAATTAAAGAATATTTTGGA | 872 | TTGAGCTGGGATACAAGCAAAAAAGATCTGACAGAGTACTTGTCTCGATTTGGGGAAGTTGTAGACTGCACAATTAAAACAGATCCAGTCACTGGGAGATCAAGAGGATTTGGATTTGTGCTTTTCAAAGATGCTGCTAGTGTTGATAAG |
437 | 873 | GCCGCCGAGCGAGGGCGAGGAGAGCACCGTGCGCTTCGCCCGCAAAGGCGCCCTCCGGCAGAAGAACGTGCATGAGGTCAAGAACCACAAATTCACCGCCCGCTTCTTCAAGCAGCCCACCTTCTGCAGCCACTGCACCGACTTCATCTG | 874 | GGGCTTCGGGAAGCAGGGATTCCAGTGCCAAGGTAGGCTCTGGGGCTTTGGGGATGCTATTTGTGGGAAGAGAGGGTGAAAAATACTTTATAGAAGAAGTTACTGAGTTAGGCAGAGAGTGAAAGAATCACGTTGGTCGGAGTGACCTCC |
438 | 875 | GGCACCAGCCCCAGAAGGTGGCCCGGCGCGTGTTCACCAACAGCCGGGAGCGCTGGCGGCAGCAGAACGTTAACGGCGCCTTCGCCGAGCTGAGGAAGCTGCTGCCGACGCACCCGCCCGACCGGAAGCTGAGCAAGAACGAGGTGCTCC | 876 | GGGATTGGGGGCCAGGGTCCTTGCCCACAAGGCATTAGTGACCCACGACCCCTTACAGTGTCTACATTGGGCCAGCAGGACCTTTTAGCATCTTCCCTAGCAGCCGGTTGAAGCGGAGACCAAGCCACTGTGAGCTGGACCTGGCTGAGG |
439 | 877 | GCATCGAGCCCACAGGCAACATGGTGAAGAAGCGGGCAGAGTTCACTGTGGAGACCAGAAGTGCTGGCCAGGGAGAGGTGCTGGTGTACGTGGAGGACCCGGCCGGACACCAGGAGGAGGTAGGGCCAGCTGCTGGCAGCAGAGGCCCCG | 878 | AGGTGATCACCCCCGAGGAGATTGTGGACCCCAACGTGGACGAGCACTCTGTCATGACCTACCTGTCCCAGTTCCCCAAGGCCAAGCTGAAGCCAGGGGCTCCCTTGCGGCCCAAACTGAACCCGAAGAAAGCCCGTGCCTACGGGCCAG |
440 | 879 | AATTTTACAGCCCTGATACCTGGAACAACGGTGGAGATTTTAGATGGAGACTCCAAAAATATTATTCAACTGATTATTAATGCATACAATGTAAGTCATCAGTTTCTTCCCCCACTGCCACCTCCCTTCCACCCTCTCCCACTGAGGCCC | 880 | TGGTATCTACACCTGTTAGGAATGTCATAGCCTTGACTTTTGCCTTGGCCCTAGGACTATCCATCCCTTGCCTTGCTTGGAGAGAAATTGGCAGAGAACAACATCAACCTCATCTTTGCAGTGACAAAAAACCATTATATGCTGTACAAG |
441 | 881 | CGTACTCTGACAGCTGTGCACGATGCCATCCTTGAGGACTTGGTCTTCCCAAGCGAAATTGTGGGCAAGAGAATCCGCGTCAAACTAGATGGCAGCCGGCTCATAAAGGTTCATTTGGACAAAGCACAGCAGAACAATGTGGAACACAAG | 882 | GTTGAAACTTTTTCTGGTGTCTATAAGAAGCTCACGGGCAAGGATGTTAATTTTGAATTCCCAGAGTTTCAATTGTAAACAAAAATGACTAAATAAAAAGTATATATTCACAGTACTCTGTTTCAGTTATGTTTTTCAAAATTCCAAATT |
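Example 2. Cancer diagnosis method using platelet-derived transcriptome data
In the present invention, platelet-derived transcriptome data, and in particular exon-junction count data, are used to determine the presence of cancer. An exon-junction denotes, for two different exons within one gene, the joining of the end (3' portion) of the upstream exon to the start (5' portion) of the downstream exon (FIG. 3). Exon-junction count data are obtained by counting, among the selected reads, those reads that contain at least one contiguous base pair of exon region extending from each of the outermost ends of the two different exons, that is, from the end of the upstream exon and from the start of the downstream exon. The two exons need not be immediately adjacent on the reference genome. When one or more intervening exons are skipped, each skipped exon must be skipped in its entirety, not in part. For example, given exons 1, 2 and 3, when the exon-junction connecting exon 1 and exon 3 is counted as a disease-related marker, no part of exon 2 may be included in the region to which the read maps. In addition, a read that also contains an untranslated intron portion is not counted toward the exon-junction count (FIG. 4).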
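The counting rule described above can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in the invention: it assumes each read has already been aligned and is represented as a list of aligned blocks in 1-based inclusive genomic coordinates, and the names `count_junction_reads`, `supports_junction` and `within_exons`, as well as the toy coordinates, are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive genomic coordinates


def within_exons(block: Interval, exons: List[Interval]) -> bool:
    """Return True if an aligned block lies entirely inside one annotated exon."""
    return any(ex_start <= block[0] and block[1] <= ex_end for ex_start, ex_end in exons)


def supports_junction(blocks: List[Interval], donor_end: int,
                      acceptor_start: int, exons: List[Interval]) -> bool:
    """Decide whether one read supports the junction donor_end -> acceptor_start.

    donor_end is the last base of the upstream exon and acceptor_start is the
    first base of the downstream exon (assumed representation).
    """
    # A read that also covers intronic bases is not counted.
    if not all(within_exons(block, exons) for block in blocks):
        return False
    # Two consecutive blocks must end exactly at the upstream exon end and begin
    # exactly at the downstream exon start. Because the blocks are consecutive,
    # nothing maps in between, so any intervening exon is skipped as a whole.
    for left, right in zip(blocks, blocks[1:]):
        if left[1] == donor_end and right[0] == acceptor_start:
            return True
    return False


def count_junction_reads(reads: List[List[Interval]], donor_end: int,
                         acceptor_start: int, exons: List[Interval]) -> int:
    """Count the reads that support the given exon-junction."""
    return sum(supports_junction(r, donor_end, acceptor_start, exons) for r in reads)


# Toy gene with three exons; the junction of interest joins exon 1 to exon 3.
exons = [(100, 200), (300, 400), (500, 600)]
reads = [
    [(180, 200), (500, 530)],  # exon 1 -> exon 3, exon 2 skipped entirely: counted
    [(200, 200), (500, 500)],  # one base on each side is enough: counted
    [(180, 200), (300, 330)],  # exon 1 -> exon 2, a different junction: not counted
    [(190, 210), (500, 520)],  # runs into the intron after exon 1: not counted
    [(150, 190)],              # unspliced read inside exon 1: not counted
]
print(count_junction_reads(reads, 200, 500, exons))  # prints 2
```

In practice the aligned blocks would come from a splice-aware aligner; how the reads are selected and aligned is outside the scope of this sketch.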
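The biomarkers of Example 1 are extracted from the exon-junction count data and applied to a pre-trained cancer discrimination model. Given the biomarker features as input, the discrimination model outputs a cancer-versus-normal discrimination score, and the subject's exon-junction information can be visualized, analyzed for importance, and reported to the subject.
Example 3. Confirmation of the performance of the present invention
The expression patterns of the 441 exon-junction libraries in the samples used to train the cancer discrimination model were plotted for the training dataset (FIG. 5a) and the validation dataset (FIG. 5b). Rows and columns represent the 441 exon-junction libraries and the samples, respectively. The expression value of each exon-junction library in each sample is shown by color, and samples and exon-junction libraries with similar patterns were arranged so that they cluster together.
As a result, as shown in FIG. 5, in the expression patterns of the 441 exon-junction libraries of the samples used to train the cancer discrimination model, cancer and normal samples form distinct clusters in both the training dataset (FIG. 5a) and the validation dataset (FIG. 5b).
Example 4. Comparison with a model using gene markers
The performance of the cancer discrimination model using the 441 exon-junction libraries of the present invention was compared with that of an existing marker set (an SVM model on 1,072 genes). The existing-marker model is a Support Vector Machine (SVM) model trained using, as features, the 1,072 genes from a prior study [6] that used the same dataset; its AUC score is shown in FIG. 6a. The AUC score of the SVM model trained using the 441 exon-junction libraries according to the present application as features is shown in FIG. 6b.
As shown in FIGS. 6a and 6b, although the number of features input to the model is greatly reduced compared with the prior study, the model shows an improved AUC score and excellent performance.
In addition, the accuracy, sensitivity, specificity and AUC score on the validation dataset were compared for the same pair of models.
As shown in FIG. 7, comparing the model based on the 441 exon-junction libraries according to the present application with the model using the 1,072 genes of the prior study showed that the model using the 441 exon-junction libraries can also accurately distinguish the presence of cancer, and thus achieves performance equal or superior to that of the prior study, which uses more features.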
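For reference, the four metrics compared here can be computed as in the following minimal Python sketch. The labels and scores are made-up toy values, not data from the study, and the function names `classification_metrics` and `auc_score` are hypothetical; cancer is assumed to be encoded as 1 and normal as 0. The AUC is computed through its rank interpretation, the probability that a randomly chosen cancer sample receives a higher score than a randomly chosen normal sample, with ties counted as one half.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from binary labels (1 = cancer, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of cancer samples detected
        "specificity": tn / (tn + fp),  # fraction of normal samples correctly cleared
    }


def auc_score(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy validation set: three cancer samples followed by three normal samples.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]           # model discrimination scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed decision threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc_score(y_true, scores)
print(metrics)
```

Unlike accuracy, sensitivity and specificity, the AUC does not depend on the chosen threshold, which is one reason it is reported alongside them.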
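Example 5. Analysis of one example of an exon-junction marker
FIG. 8 shows an example of the quantitative information for the exon-junction that, among the exon-junctions whose expression is lower in cancer samples than in normal samples, differs most between cancer and normal samples. It shows the reference-genome mapping results, displayed in the Integrative Genomics Viewer (IGV), for three cancer samples (red, top three) and three normal samples (blue, bottom three). IGV is a program for visualizing integrated genomic datasets; it loads data in various formats, such as sequencing data, and displays the results mapped to a reference genome. The exon-junction whose expression is lower in cancer samples and which differs most from normal samples is the region 22,549,683 to 22,550,556 of the gene TRAC, and this region is marked in the RefSeq Genes track at the top of FIG. 8. The six tracks below the RefSeq Genes track show, for the sample loaded in each track, the depth of the reads actually mapped to the region. This shows that the number of reads mapped to the region differs between cancer and normal samples.
FIG. 9 shows an example of the normalized expression values of the exon-junction that, among the exon-junctions whose expression is lower in cancer samples than in normal samples, differs most between cancer and normal samples; FIG. 9a is the graph for the training dataset and FIG. 9b is the graph for the validation dataset. Each graph is a bar graph in which all samples are sorted in descending order of the log2CPM value of the exon-junction, with cancer samples shown in red and normal samples in blue so that the expression values in cancer and normal samples can be compared. This shows that, in both the training dataset and the validation dataset, the expression value of the exon-junction is lower in cancer samples than in normal samples.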
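The ordering used in FIG. 9 can be sketched as below. The exact normalization is not spelled out in this section, so the formula here is an assumption: a common form of log2CPM, log2(count / library size × 10^6 + prior), with a prior count of 1 to avoid taking the logarithm of zero. The sample names, counts and library sizes are invented for illustration, and `log2_cpm` is a hypothetical helper.

```python
import math
from typing import List, Tuple


def log2_cpm(junction_count: int, library_size: int, prior: float = 1.0) -> float:
    """log2 counts-per-million for one exon-junction in one sample (assumed formula)."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    cpm = junction_count / library_size * 1_000_000
    return math.log2(cpm + prior)


# (sample name, group, junction read count, total counted reads) -- toy values only
samples: List[Tuple[str, str, int, int]] = [
    ("cancer_1", "cancer", 5, 1_000_000),
    ("normal_1", "normal", 40, 2_000_000),
    ("cancer_2", "cancer", 3, 1_500_000),
    ("normal_2", "normal", 30, 1_000_000),
]

# Sort all samples by descending log2CPM, as in the bar graphs of FIG. 9.
ranked = sorted(
    ((name, group, log2_cpm(count, size)) for name, group, count, size in samples),
    key=lambda row: row[2],
    reverse=True,
)
for name, group, value in ranked:
    print(f"{name}\t{group}\t{value:.3f}")
```

With these toy numbers the normal samples sort ahead of the cancer samples, mirroring the pattern described for FIG. 9; real data would of course show overlap between the groups.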
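Example 6. Feature reduction analysis of the exon-junction markers
FIG. 10 shows an example of the performance of cancer discrimination models that use all or some of the 441 exon-junction libraries according to the present application.
To measure the performance of a cancer discrimination model using only some of the 441 exon-junction libraries, the Shapley value, which indicates how much a given feature influenced the output, was used. In the cancer discrimination model with the 441 exon-junction libraries as features, the one exon-junction library with the smallest Shapley value, that is, the one with the least influence on the model, was removed, and a cancer discrimination model was then trained using only the remaining 440 exon-junction libraries as features. In this way, cancer discrimination models were trained while removing, one at a time, the exon-junction library with the least influence on the model, and their performance was plotted.
For the models trained while removing, one at a time, the exon-junction library with the least influence on the cancer discrimination model, the x-axis shows the number of exon-junction libraries used for training and the y-axis shows the performance on the validation dataset, with accuracy (acc), sensitivity (sen), specificity (spe) and AUC score each plotted. Here, the exon-junction library numbers were assigned according to the degree of influence on the cancer discrimination model (No. 1 being the exon-junction library with the greatest influence). That is, "removing, one at a time, the exon-junction library with the least influence on cancer discrimination" means removing libraries one at a time starting from exon-junction library 441. More specifically, it means that a model using exon-junction libraries 1 to 441, a model using exon-junction libraries 1 to 440, …, a model using exon-junction libraries 1 to 2, and a model using exon-junction library 1 were used. This also corresponds to the "single or plural exon-junction libraries" defined in the claims.
As the experimental results in FIG. 10 show, performance does not drop significantly even when the number of exon-junction libraries is reduced and only some are used. Therefore, when discriminating cancer from normal, all of the exon-junction libraries of Table 1 (441) may be used, or only some of them (single or plural exon-junction libraries) may be used.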
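The elimination procedure of this example can be sketched as a simple loop. This is a hedged illustration rather than the actual pipeline: `importance` stands in for a per-feature influence measure such as the mean absolute Shapley value, `evaluate` stands in for "train a model on these features and score it on the validation set", and both, together with the toy weights and the name `backward_elimination`, are hypothetical placeholders.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def backward_elimination(
    features: Sequence[str],
    importance: Callable[[List[str]], Dict[str, float]],
    evaluate: Callable[[List[str]], float],
) -> List[Tuple[Tuple[str, ...], float]]:
    """Repeatedly drop the least influential feature and record model performance.

    Returns one (feature subset, score) entry per model, from the full set
    down to a single remaining feature.
    """
    remaining = list(features)
    history: List[Tuple[Tuple[str, ...], float]] = []
    while remaining:
        history.append((tuple(remaining), evaluate(remaining)))
        if len(remaining) == 1:
            break
        scores = importance(remaining)                     # e.g. mean |Shapley value|
        weakest = min(remaining, key=lambda f: scores[f])  # least influence on the model
        remaining.remove(weakest)
    return history


# Toy stand-ins: fixed influence weights instead of a trained model.
toy_weights = {"J1": 0.50, "J2": 0.30, "J3": 0.05}


def toy_importance(feats: List[str]) -> Dict[str, float]:
    return {f: toy_weights[f] for f in feats}


def toy_evaluate(feats: List[str]) -> float:
    # Placeholder for validation performance of a model retrained on feats.
    return sum(toy_weights[f] for f in feats)


history = backward_elimination(["J1", "J2", "J3"], toy_importance, toy_evaluate)
for subset, score in history:
    print(len(subset), subset, round(score, 2))
```

Because the library numbers in Table 1 are assigned in order of influence, the subsets produced this way correspond to the nested sets "libraries 1 to n" described above. Plotting the recorded score against the subset size gives a curve of the kind shown in FIG. 10.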
[References]
1. Chen, Ming, and Hongyu Zhao. “Next-generation sequencing in liquid biopsy: cancer screening and early detection.” Human genomics 13.1 (2019): 1-10.
2. Pisapia, Pasquale, et al. “Next generation sequencing for liquid biopsy based testing in non-small cell lung cancer in 2021.” Critical Reviews in Oncology/Hematology 161 (2021): 103311.
3. Liu, Minetta C. “Transforming the landscape of early cancer detection using blood tests―Commentary on current methodologies and future prospects.” British journal of cancer 124.9 (2021): 1475-1477.
4. Ried, Karin, Peter Eng, and Avni Sali. “Screening for circulating tumour cells allows early detection of cancer and monitoring of treatment effectiveness: an observational study.” Asian Pacific journal of cancer prevention: APJCP 18.8 (2017): 2275.
5. Wan, Jonathan CM, et al. “Liquid biopsies come of age: towards implementation of circulating tumour DNA.” Nature Reviews Cancer 17.4 (2017): 223-238.
6. Best, Myron G., et al. “RNA-Seq of tumor-educated platelets enables blood-based pan-cancer, multiclass, and molecular pathway cancer diagnostics.” Cancer cell 28.5 (2015): 666-676.
Claims (24)
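- A method of analyzing a transcriptome to provide information necessary for diagnosing cancer in an individual, the method comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA, or a portion thereof, isolated from anucleated cells of the individual's blood, (ii) total RNA, or a portion thereof, isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA), or a portion thereof, isolated from the individual's blood; (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a); (c) obtaining nucleotide sequence information of the cDNA; (d) comparing the cDNA nucleotide sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and (e) determining the presence or absence of cancer based on the sequence expression information at each exon-junction.
- The method of claim 1, wherein the determination of the presence or absence of cancer determines the presence or absence of one cancer or of two or more types of cancer.
- The method of claim 2, wherein the presence or absence of the two or more types of cancer is determined simultaneously or sequentially in a single biological sample isolated from the individual.
- The method of claim 1, wherein the transcriptome analysis is performed by next-generation sequencing (NGS).
- The method of claim 5, wherein the sequence expression information at each exon-junction is sequence expression information aligned to a sequence that includes the respective bases at position 1 and position 2 and also includes two or more contiguous bases in the 5' direction and/or the 3' direction of each chromosome.
- The method of claim 5, wherein the exon-junction library comprises a single or a plurality of the exon-junctions listed in Table 1, the single or plurality of exon-junctions being exon-junction number 1, …, exon-junction library n-1 and exon-junction library n, wherein n is a natural number from 1 to 441.
- The method of claim 1, wherein determining the presence or absence of cancer based on the sequence expression information at each exon-junction is performed by a pre-trained cancer discrimination model.
- The method of claim 8, wherein the pre-training is performed by any one machine learning algorithm selected from the group consisting of Naive Bayes classification, logistic regression, decision tree, random forest, boosting (XGBoost, ensemble boosting, AdaBoost, Gradient Boost, LightGBM, CatBoost, etc.), perceptron, support vector machine, quadratic classifiers, clustering (K-means clustering, Bayesian network clustering, etc.) and deep neural network.
- The method of claim 1, wherein the cancer is selected from the group consisting of bladder cancer, bone cancer, blood cancer, breast cancer, melanoma, thyroid cancer, parathyroid cancer, bone marrow cancer, rectal cancer, throat cancer, laryngeal cancer, lung cancer, esophageal cancer, pancreatic cancer, colorectal cancer, gastric cancer, tongue cancer, skin cancer, brain tumor, uterine cancer, head or neck cancer, gallbladder cancer, oral cancer, colon cancer, perianal cancer, central nervous system tumor, liver cancer and colorectal cancer.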
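- A composition for diagnosing cancer comprising, as an active ingredient, a single or a plurality of exon-junctions selected from the group consisting of the exon-junctions of Table 1, wherein the exon-junctions are detected in (i) RNA, or a portion thereof, isolated from anucleated cells of an individual's blood, (ii) total RNA, or a portion thereof, isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA), or a portion thereof, isolated from the individual's blood.
- The composition for diagnosing cancer of claim 11, wherein the cancer diagnosis is a diagnosis of one cancer or of two or more types of cancer.
- The composition for diagnosing cancer of claim 12, wherein the two or more types of cancer are diagnosed simultaneously or sequentially in a single biological sample isolated from the individual.
- The diagnostic composition of claim 11, wherein the cancer is selected from the group consisting of bladder cancer, bone cancer, blood cancer, breast cancer, melanoma, thyroid cancer, parathyroid cancer, bone marrow cancer, rectal cancer, throat cancer, laryngeal cancer, lung cancer, esophageal cancer, pancreatic cancer, colorectal cancer, gastric cancer, tongue cancer, skin cancer, brain tumor, uterine cancer, head or neck cancer, gallbladder cancer, oral cancer, colon cancer, perianal cancer, central nervous system tumor, liver cancer and colorectal cancer.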
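- A cancer diagnostic kit comprising the composition of claim 11.
- The cancer diagnostic kit of claim 15, wherein the cancer diagnosis is a diagnosis of one cancer or of two or more types of cancer.
- The cancer diagnostic kit of claim 16, wherein the diagnosis of the two or more types of cancer is determined simultaneously or sequentially in a single biological sample isolated from the individual.
- The cancer diagnostic kit of claim 15, wherein the cancer is selected from the group consisting of bladder cancer, bone cancer, blood cancer, breast cancer, melanoma, thyroid cancer, parathyroid cancer, bone marrow cancer, rectal cancer, throat cancer, laryngeal cancer, lung cancer, esophageal cancer, pancreatic cancer, colorectal cancer, gastric cancer, tongue cancer, skin cancer, brain tumor, uterine cancer, head or neck cancer, gallbladder cancer, oral cancer, colon cancer, perianal cancer, central nervous system tumor, liver cancer and colorectal cancer.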
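- A composition for diagnosing cancer comprising an agent capable of detecting a single or a plurality of exon-junctions selected from the group consisting of the exon-junctions of Table 1, wherein the exon-junctions are detected in (i) total RNA, or a portion thereof, isolated from anucleated cells of an individual's blood, (ii) total RNA, or a portion thereof, isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA), or a portion thereof, isolated from the individual's blood.
- The composition for diagnosing cancer of claim 19, wherein the cancer diagnosis is a diagnosis of one cancer or of two or more types of cancer.
- The composition for diagnosing cancer of claim 20, wherein the diagnosis of the two or more types of cancer is determined simultaneously or sequentially in a single biological sample isolated from the individual.
- The composition for diagnosing cancer of claim 19, wherein the cancer is selected from the group consisting of bladder cancer, bone cancer, blood cancer, breast cancer, melanoma, thyroid cancer, parathyroid cancer, bone marrow cancer, rectal cancer, throat cancer, laryngeal cancer, lung cancer, esophageal cancer, pancreatic cancer, colorectal cancer, gastric cancer, tongue cancer, skin cancer, brain tumor, uterine cancer, head or neck cancer, gallbladder cancer, oral cancer, colon cancer, perianal cancer, central nervous system tumor, liver cancer and colorectal cancer.
- Use of an agent capable of detecting a single or a plurality of exon-junctions selected from the group consisting of the exon-junctions of Table 1, for the manufacture of a composition for diagnosing cancer.
- A method of diagnosing cancer, comprising: (a) isolating one or more RNAs selected from the group consisting of (i) total RNA, or a portion thereof, isolated from anucleated cells of an individual's blood, (ii) total RNA, or a portion thereof, isolated from exosomes of the individual's blood, and (iii) total cfRNA (cell-free RNA), or a portion thereof, isolated from the individual's blood; (b) synthesizing complementary DNA (cDNA) to the RNA isolated in step (a); (c) obtaining nucleotide sequence information of the cDNA; (d) comparing the cDNA nucleotide sequence information with a predetermined exon-junction library to obtain sequence expression information at each exon-junction; and (e) determining the presence or absence of cancer based on the sequence expression information at each exon-junction.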
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20220133331 | 2022-10-17 | ||
KR10-2022-0133331 | 2022-10-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024085608A1 (ko) | 2024-04-25 |
Family
ID=90738201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2023/016067 WO2024085608A1 (ko) | 2022-10-17 | 2023-10-17 | Cancer diagnosis method using exon-junction information of RNA in blood |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR20240054194A (ko) |
WO (1) | WO2024085608A1 (ko) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20200020728A (ko) * | 2017-06-27 | 2020-02-26 | The University of Tokyo | Probe and method for detecting transcription products resulting from fusion genes and/or exon skipping |
US20200199671A1 (en) * | 2018-12-18 | 2020-06-25 | Grail, Inc. | Methods for detecting disease using analysis of rna |
2023
- 2023-10-17 WO PCT/KR2023/016067 patent/WO2024085608A1/ko unknown
- 2023-10-17 KR KR1020230138881A patent/KR20240054194A/ko unknown
Non-Patent Citations (3)
Title |
---|
DAVID JULIANNE K, MADEN SEAN K, WEEDER BENJAMIN R, THOMPSON REID F, NELLORE ABHINAV: "Putatively cancer-specific exon–exon junctions are shared across patients and present in developmental and other non-cancer cells", NAR CANCER, OXFORD UNIVERSITY PRESS, vol. 2, no. 1, 1 March 2020 (2020-03-01), pages zcaa001 - zcaa001, XP009554402, ISSN: 2632-8674, DOI: 10.1093/narcan/zcaa001 * |
KRISHNA DONKENA: "Whole blood defensin mRNA expression is a predictive biomarker of docetaxel response in castration-resistant prostate cancer", ONCOTARGETS AND THERAPY, pages 1915, XP093162151, ISSN: 1178-6930, DOI: 10.2147/OTT.S86637 * |
RAHMAN FAISAL A; AZIZ NAVEED; COVERLEY DAWN: "Differential detection of alternatively spliced variants of Ciz1 in normal and cancer cells using a custom exon-junction microarray", BMC CANCER, BIOMED CENTRAL, LONDON, GB, vol. 10, no. 1, 10 September 2010 (2010-09-10), LONDON, GB , pages 482, XP021075304, ISSN: 1471-2407, DOI: 10.1186/1471-2407-10-482 * |
Also Published As
Publication number | Publication date |
---|---|
KR20240054194A (ko) | 2024-04-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018199589A1 (ko) | System for group classification and prognosis prediction based on biological characteristics of gastric cancer | |
Jin et al. | Crosstalk in competing endogenous RNA network reveals the complex molecular mechanism underlying lung cancer | |
KR101583546B1 (ko) | Method for predicting responsiveness to sorafenib treatment using gene polymorphism | |
Colombo et al. | Gene expression profiling reveals molecular marker candidates of laryngeal squamous cell carcinoma | |
CN105925681A (zh) | Composition for lung cancer screening and use thereof | |
WO2014163444A1 (ko) | Prognosis prediction system for locally advanced gastric cancer | |
US10889862B2 (en) | Methods for identifying and using small RNA predictors | |
US20110294684A1 (en) | Gene expression signatures for lung cancers | |
US10947592B2 (en) | Method for detecting cystic fibrosis | |
CN103923983B (zh) | Detection and use of a long non-coding RNA significantly upregulated in esophageal squamous cell carcinoma | |
Jonchère et al. | Identification of positively and negatively selected driver gene mutations associated with colorectal cancer with microsatellite instability | |
CA2696947A1 (en) | Methods and tools for prognosis of cancer in er- patients | |
KR102549013B1 (ko) | Methylation marker gene for diagnosing pancreatic cancer and use thereof | |
CN103923985B (zh) | Detection and use of a novel esophageal cancer marker gene | |
US10787711B2 (en) | Method for differentiating between lung squamous cell carcinoma and lung adenocarcinoma | |
WO2024085608A1 (ko) | Cancer diagnosis method using exon-junction information of RNA in blood | |
WO2017084027A1 (zh) | Kit and detection method for prognostic stratification of acute myeloid leukemia | |
Masson et al. | Copy number variants associated with 18p11. 32, DCC and the promoter 1B region of APC in colorectal polyposis patients | |
CN113862372B (zh) | PCR method for quantitative detection of ABI1-TSV-11 and primers used therein | |
Favis et al. | Harmonized microarray/mutation scanning analysis of TP53 mutations in undissected colorectal tumors | |
WO2018123764A1 (ja) | Reagent used for evaluating minimal residual disease of neuroblastoma, and method for analyzing a biological sample using the same | |
US10155993B2 (en) | Method or kit for determining lung cancer development | |
Eiberg et al. | A splice-site variant in the lncRNA gene RP1-140A9. 1 cosegregates in the large Volkmann cataract family | |
WO2020096247A1 (ko) | Probe preparation and detection method for detecting cell-derived mutations in breast cancer tissue | |
US11021756B2 (en) | MiRNA markers for the diagnosis of osteosarcoma |